Setting Up a Hadoop and Hive Environment on a Mac
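Setting up Hadoop on a Mac
1. SSH
Hadoop's start scripts log in to localhost over SSH, so set up passwordless login first: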
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
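Running the `cat ... >>` line more than once leaves duplicate entries in authorized_keys, and overly loose file permissions are a common reason the key is ignored. The helper below is a minimal sketch of an idempotent alternative; the function name `ensure_authorized_key` is made up for this example and is not part of OpenSSH or Hadoop.

```shell
# Append a public key to an authorized_keys file only if it is not already there,
# then restrict the file to its owner (sshd's StrictModes check rejects
# authorized_keys files that other users can write to).
ensure_authorized_key() {
  pub_file="$1"     # e.g. ~/.ssh/id_rsa.pub
  auth_file="$2"    # e.g. ~/.ssh/authorized_keys
  touch "$auth_file"
  # -x: whole-line match, -F: fixed string, -f: read the pattern (the key) from pub_file
  if ! grep -qxFf "$pub_file" "$auth_file"; then
    cat "$pub_file" >> "$auth_file"
  fi
  chmod 600 "$auth_file"
}
```

A typical call would be `ensure_authorized_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys`, followed by `ssh -o BatchMode=yes localhost true` to confirm that no password prompt appears.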
2. Install Hadoop
brew install hadoop
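3. Configuration
Most of the configuration files are in this directory (adjust the version number to the one Homebrew actually installed):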
/usr/local/Cellar/hadoop/3.1.0/libexec/etc/hadoop
a) hadoop-env.sh
Find the Java installation path:
localhost:~ sudo$ /usr/libexec/java_home
/Library/Java/JavaVirtualMachines/jdk1.8.0_221.jdk/Contents/Home
localhost:~ sudo$
Open hadoop-env.sh (in etc/hadoop/), find the line # export JAVA_HOME=, and change it as follows:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_221.jdk/Contents/Home
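The same edit can be scripted. This is only a sketch under the assumption that hadoop-env.sh should end up with exactly one active `export JAVA_HOME=` line; the function name `set_java_home` is hypothetical.

```shell
# Ensure the given env file contains exactly one active JAVA_HOME export.
# Commented-out lines such as "# export JAVA_HOME=" are left untouched.
set_java_home() {
  env_file="$1"     # path to hadoop-env.sh
  java_home="$2"    # e.g. the output of /usr/libexec/java_home
  tmp="$(mktemp)"
  # Drop any existing active export; "|| true" covers the case where grep selects no lines.
  grep -v '^[[:space:]]*export JAVA_HOME=' "$env_file" > "$tmp" || true
  printf 'export JAVA_HOME=%s\n' "$java_home" >> "$tmp"
  cat "$tmp" > "$env_file"   # write back in place so the file keeps its permissions
  rm -f "$tmp"
}
```

On the setup described here it could be called as `set_java_home /usr/local/Cellar/hadoop/3.1.0/libexec/etc/hadoop/hadoop-env.sh "$(/usr/libexec/java_home)"`.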
b) core-site.xml
Open the core-site.xml file (in etc/hadoop/) and set the default file system:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
c) hdfs-site.xml
Open the hdfs-site.xml file (in etc/hadoop/) and set the replication factor to 1, since there is only one DataNode:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
d) mapred-site.xml
Open the mapred-site.xml file (in etc/hadoop/) and set the following:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
e) yarn-site.xml
Open the yarn-site.xml file (in etc/hadoop/) and set the following:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
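The site files above all share the same shape: a configuration element wrapping name/value properties. As an illustration, the hypothetical helper below prints such a file from name=value arguments. It is a sketch, not a Hadoop tool, and it does not XML-escape values.

```shell
# Print a Hadoop-style configuration file with one <property> per name=value argument.
# Values are written verbatim (no XML escaping), so avoid &, < and > in them.
site_xml() {
  printf '<configuration>\n'
  for pair in "$@"; do
    name="${pair%%=*}"     # text before the first '='
    value="${pair#*=}"     # text after the first '='
    printf '<property>\n<name>%s</name>\n<value>%s</value>\n</property>\n' "$name" "$value"
  done
  printf '</configuration>\n'
}
```

For example, `site_xml dfs.replication=1 > hdfs-site.xml` writes the hdfs-site.xml content shown above into the current directory.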
4. Running
Format the file system (run from the libexec directory of the Hadoop installation):
bin/hdfs namenode -format
Start the NameNode and DataNode:
sbin/start-dfs.sh
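Troubleshooting:
Permission denied (publickey,password,keyboard-interactive).
Cause:
The passwordless login from the host to itself that was configured earlier is not working, so the start script still asks for a password.
Fix:
Set up passwordless login again.
Go to the ssh directory: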
cd ~/.ssh
Append the contents of id_rsa.pub to authorized_keys:
localhost:ssh sudo$ cd ~/.ssh
localhost:.ssh sudo$ cat id_rsa.pub >> authorized_keys
Restart Hadoop:
sbin/start-dfs.sh
Running jps again shows that all the processes have started.
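Checking the jps output by eye can be replaced with a small script. The sketch below assumes the jps output format shown later in this post (a PID followed by the class name); `missing_daemons` is a name invented for this example.

```shell
# Print every expected daemon name that does not appear in the given jps output.
# The comparison is on the whole second column, so "NameNode" does not
# accidentally match "SecondaryNameNode".
missing_daemons() {
  jps_output="$1"
  shift
  for name in "$@"; do
    if ! printf '%s\n' "$jps_output" |
        awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      printf '%s\n' "$name"
    fi
  done
}
```

After start-dfs.sh, `missing_daemons "$(jps)" NameNode DataNode SecondaryNameNode` should print nothing; any name it prints is a daemon that failed to start.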
Open the web UI:
NameNode - http://localhost:9870
Create the HDFS directories needed to run MapReduce jobs:
Note: sudo here is the user name.
localhost:libexec sudo$ bin/hdfs dfs -mkdir /user
localhost:libexec sudo$ bin/hdfs dfs -mkdir /user/sudo
Start the ResourceManager and NodeManager:
localhost:libexec sudo$ sbin/start-yarn.sh
Open the All Applications page:
ResourceManager - http://localhost:8088
5. Quick start and stop
From the sbin directory, ./start-all.sh and ./stop-all.sh start and stop the HDFS and YARN services together.
Troubleshooting
Hadoop prints the warning: WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
The suggested fix is to add the following to hadoop-env.sh:
export HADOOP_OPTS="-Djava.library.path=${HADOOP_HOME}/lib/native"
Setting up Hive
Overall approach
Run brew install hive to install Hive;
create the metastore database (the hive database) in MySQL;
then edit the Hive configuration.
1. Install
brew install hive
2. Create the metastore database (the hive database) in MySQL
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> create user 'hive' identifed by 'hive';
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'identifed by 'hive'' at line 1
mysql>
mysql>
mysql> create user 'hive' identified by 'hive';
Query OK, 0 rows affected (0.01 sec)
mysql> grant all on *.* to 'hive'@'localhost' identified by 'hive';
Query OK, 0 rows affected, 1 warning (0.01 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)
mysql> select host,user,authentication_string from mysql.user;
+-----------+---------------+-------------------------------------------+
| host | user | authentication_string |
+-----------+---------------+-------------------------------------------+
| localhost | root | *032197AE5731D4664921A6CCAC7CFCE6A0698693 |
| localhost | mysql.session | *THISISNOTAVALIDPASSWORDTHATCANBEUSEDHERE |
| localhost | mysql.sys | *THISISNOTAVALIDPASSWORDTHATCANBEUSEDHERE |
| % | hive | *4DF1D66463C18D44E3B001A8FB1BBFBEA13E27FC |
| localhost | hive | *4DF1D66463C18D44E3B001A8FB1BBFBEA13E27FC |
+-----------+---------------+-------------------------------------------+
5 rows in set (0.00 sec)
mysql>
mysql>
Log in as the hive user and create the database:
mysql> exit
Bye
localhost:~ sudo$ mysql -u hive -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.26 MySQL Community Server (GPL)
Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
mysql>
mysql> create database hive
-> ;
Query OK, 1 row affected (0.00 sec)
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| caiji_activity |
| db5 |
| employees |
| exam |
| fdm |
| groupon |
| hive |
| mysql |
| ods |
| performance_schema |
| sdm |
| sys |
| test |
| vhr |
+--------------------+
15 rows in set (0.01 sec)
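The interactive session above can also be captured as a repeatable script. The sketch below prints the SQL rather than running it, so it can be reviewed first. Two things differ from the session and are choices made for this example: it grants privileges only on the metastore database instead of on `*.*`, and it uses `IF NOT EXISTS` so it can be re-run (this needs MySQL 5.7.6 or later for `CREATE USER`). The function name `metastore_sql` is hypothetical.

```shell
# Print the SQL statements that create a metastore database and a dedicated user.
# Arguments are inserted verbatim, so use simple identifiers and passwords without quotes.
metastore_sql() {
  db="$1"
  user="$2"
  password="$3"
  cat <<EOF
CREATE DATABASE IF NOT EXISTS $db;
CREATE USER IF NOT EXISTS '$user'@'localhost' IDENTIFIED BY '$password';
GRANT ALL PRIVILEGES ON $db.* TO '$user'@'localhost';
EOF
}
```

To apply it with the names used in this post: `metastore_sql hive hive hive | mysql -u root -p`.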
3. Configure environment variables
export HIVE_HOME=/usr/local/Cellar/hive/3.1.2
export PATH=$HIVE_HOME/bin:$PATH
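Create hive-site.xml in Hive's configuration directory (with a Homebrew install this is normally $HIVE_HOME/libexec/conf):
touch hive-site.xml
Add the following content to hive-site.xml: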
<configuration>
<property>
<name>hive.metastore.local</name>
<value>true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
</property>
<!-- MySQL user name -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!-- MySQL password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>000000</value>
</property>
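<!-- Directory where Hive stores the execution plans for the different map/reduce stages,
as well as intermediate output. The default is /tmp/<user.name>/hive; in practice we usually separate by group and have each group create its own tmp directory. -->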
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/hive</value>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/hive</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/data/hive/warehouse</value>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/tmp/hive</value>
</property>
</configuration>
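Troubleshooting
Error: org.apache.hadoop.hive.metastore.HiveMetaException: Failed to load driver
Cause:
The JDBC driver is missing; the MySQL JDBC driver JAR has to be added to Hive's lib directory.
Fix:
Download the driver (the mysql-connector-java JAR).
After copying mysql-connector-java-8.0.21.jar into the lib directory, re-initialize the Hive metastore database.
Initialization creates a large number of tables in the database.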
cd /usr/local/Cellar/hive/3.1.2/libexec/bin
schematool -initSchema -dbType mysql
Initialization script completed
Thu Jul 23 16:17:10 CST 2020 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
schemaTool completed
The hive database has been initialized successfully.
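The driver fix described above can be sketched as a small helper that refuses to copy when either path is wrong, which makes a mistyped JAR name or lib directory obvious before schematool is run. The function name `install_jdbc_driver` is hypothetical.

```shell
# Copy a JDBC driver JAR into Hive's lib directory, failing loudly if either path is missing.
install_jdbc_driver() {
  jar="$1"       # path to the downloaded mysql-connector-java JAR
  lib_dir="$2"   # Hive's lib directory
  if [ ! -f "$jar" ]; then
    echo "driver jar not found: $jar" >&2
    return 1
  fi
  if [ ! -d "$lib_dir" ]; then
    echo "Hive lib directory not found: $lib_dir" >&2
    return 1
  fi
  cp "$jar" "$lib_dir/"
}
```

Assuming the JAR was saved to ~/Downloads and that the Homebrew layout puts Hive's libraries under libexec/lib (both are assumptions to verify locally), the call would be `install_jdbc_driver ~/Downloads/mysql-connector-java-8.0.21.jar /usr/local/Cellar/hive/3.1.2/libexec/lib`, followed by the schematool command shown above.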
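Inspect the hive database in MySQL, which now holds the Hive metadata.
Hive's metadata is stored in MySQL (rather than in the default embedded Derby database).
Run the hive command on the command line: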
hive
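That completes the Hadoop and Hive setup on the Mac; you can now start working with Hive.
For later startups, go straight to the sbin directory of the Hadoop installation
(/usr/local/Cellar/hadoop/3.3.0/sbin in the session below):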
localhost:3.3.0 sudo$ cd sbin/
localhost:sbin sudo$ ls
FederationStateStore refresh-namenodes.sh stop-balancer.sh
distribute-exclude.sh start-all.sh stop-dfs.sh
hadoop-daemon.sh start-balancer.sh stop-secure-dns.sh
hadoop-daemons.sh start-dfs.sh stop-yarn.sh
httpfs.sh start-secure-dns.sh workers.sh
kms.sh start-yarn.sh yarn-daemon.sh
mr-jobhistory-daemon.sh stop-all.sh yarn-daemons.sh
localhost:sbin sudo$
localhost:sbin sudo$
localhost:sbin sudo$
localhost:sbin sudo$ ./stop-all.sh
WARNING: Stopping all Apache Hadoop daemons as sudo in 10 seconds.
WARNING: Use CTRL-C to abort.
Stopping namenodes on [localhost]
Stopping datanodes
Stopping secondary namenodes [localhost]
2020-07-24 14:42:35,787 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping nodemanagers
Stopping resourcemanager
localhost:sbin sudo$ jps
3472 Jps
2519 RunJar
localhost:sbin sudo$
localhost:sbin sudo$
localhost:sbin sudo$ ./start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as sudo in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [localhost]
2020-07-24 14:43:25,668 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
Starting nodemanagers
localhost:sbin sudo$ jps
4242 RunJar
3653 NameNode
4086 ResourceManager
3894 SecondaryNameNode
4183 NodeManager
3754 DataNode
4362 Jps
localhost:sbin sudo$ pwd
/usr/local/Cellar/hadoop/3.3.0/sbin
localhost:sbin sudo$