Contents
- Characteristics of big data
- Getting to know Hadoop
- Important
- Click here to download Hadoop
- Note: choose hadoop-2.6.0-cdh5.7.0 here; everything downloaded later must also be 2.6.0 so the versions match
- The distributed file system HDFS
- Setting up the HDFS environment
- Building an HDFS environment with the CDH distribution of Hadoop
- [Manually installing or upgrading VMware Tools in a Linux virtual machine](https://docs.vmware.com/cn/VMware-Workstation-Pro/12.0/com.vmware.ws.using.doc/GUID-08BB9465-D40A-4E16-9E15-8C016CC8166F.html)
- Hadoop documentation (Chinese)
- Prerequisites before setup (normally everything is fine by default)
- The actual setup
- HDFS configuration files
- Formatting, starting, and stopping HDFS
- HDFS shell operations
Characteristics of big data
Getting to know Hadoop
Important
Click here to download Hadoop
Or go straight to the link below:
Apache Hadoop 2.6.0-cdh5.7.0: the page lists a lot of files; Ctrl+F for hadoop-2.6.0-cdh5.7.0.tar.gz to jump straight to the right download.
Note: choose hadoop-2.6.0-cdh5.7.0 here; everything downloaded later must also be 2.6.0 so the versions match.
That also means that when setting up the environment later, you should consult the 2.6.0 documentation.
The distributed file system HDFS
A node can be understood as a server.
A file is split into blocks; one block is 128 MB.
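A quick worked example of the block math: with a 128 MB block size, a 300 MB file is stored as three blocks of 128 MB + 128 MB + 44 MB. The final block only occupies the space it actually needs, so a small tail does not waste a full 128 MB on disk.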
Setting up the HDFS environment
Building an HDFS environment with the CDH distribution of Hadoop
At the login screen pick "Not listed?", then enter root and its password (log in as root).
Below, the installers are brought in through the VM's own shared folder; any other transfer tool works too.
They live under /mnt/hgfs/<the folder name you defined>. (After one reboot the shared folder contents could not be found;
a quick search suggested installing the tool below, and that did fix it.)
Manually installing or upgrading VMware Tools in a Linux virtual machine
The steps below are for a first-time install.
In the VM, log in to the guest operating system as root and open a terminal window.
mkdir /mnt/cdrom
mount /dev/cdrom /mnt/cdrom
cd /tmp
tar zxpf /mnt/cdrom/VMwareTools-x.x.x-yyyy.tar.gz
(type tar zxpf /mnt/cdrom/VMwareTools and press Tab; the full name completes by itself)
When a prompt asks yes/no, answer y; for everything else just press Enter.
cd vmware-tools-distrib
./vmware-install.pl
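If you would rather not answer every prompt one by one, the installer can also take all the defaults in one go; the -d (default) switch is documented for vmware-install.pl, though treat it as an assumption for your particular Tools version:

./vmware-install.pl -d    # accept the default answer to every question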
Installing the JDK
After logging in to the VM, as the root user:
su root   (or just log in as root directly; it amounts to the same thing)
useradd hadoop
passwd hadoop   # set the password to hadoop123 (it may complain that the password contains the user name in some form; just enter it twice anyway)
hostnamectl set-hostname hadoop000
hostname
(# grant sudo rights: around line 100 of the file, add the hadoop user with a line like: hadoop ALL=(ALL) ALL)
First make the file writable, then edit it to give our hadoop user root privileges:
chmod 777 /etc/sudoers
vim /etc/sudoers   (search with /ALL, or :set nu to see line numbers; around line 100, copy the root line and change root to hadoop)
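A heads-up before rebooting: leaving /etc/sudoers world-writable breaks sudo itself (sudo refuses to run when the file is writable by anyone; that exact error shows up later in the download section). A safer approach is visudo, or restoring the mode after a direct edit:

visudo                    # edits /etc/sudoers with a syntax check on save
# or, if you already edited the file directly:
chmod 440 /etc/sudoers    # 0440 is the mode sudo expects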
reboot
After the reboot: "Not listed?", then root and its password (log in as root).
su - hadoop
mkdir software
cp /mnt/hgfs/database/hadoop-2.6.0-cdh5.7.0.tar.tar ./software/
cp /mnt/hgfs/database/jdk-8u251-linux-x64.tar.gz ./software/
ll software/
mkdir app source data
ll
Install the JDK. First check the existing JDKs; they are all OpenJDK, so remove them (do this as the root user, otherwise the permissions are insufficient; even prefixing sudo does not help here):
rpm -qa| grep jdk
su root
[root@hadoop000 hadoop]# rpm -e --nodeps $(rpm -qa|grep jdk)
[root@hadoop000 hadoop]# rpm -qa| grep jdk
[root@hadoop000 hadoop]# su - hadoop
cd software/
ll
At this point you have to switch back to root, otherwise there are not enough permissions to extract into /usr/local:
su root
[root@hadoop000 software]# tar zxvf jdk-8u251-linux-x64.tar.gz -C /usr/local/
[root@hadoop000 software]# cd /usr/local/
[root@hadoop000 local]# ll
[root@hadoop000 local]# cd jdk1.8.0_251/
[root@hadoop000 jdk1.8.0_251]# ll
[root@hadoop000 jdk1.8.0_251]# pwd
/usr/local/jdk1.8.0_251
[root@hadoop000 jdk1.8.0_251]# vim /etc/profile
export java_home=/usr/local/jdk1.8.0_251
export PATH=$PATH:$java_home/bin
[root@hadoop000 jdk1.8.0_251]# source /etc/profile
[root@hadoop000 jdk1.8.0_251]# echo $java_home
/usr/local/jdk1.8.0_251
[root@hadoop000 jdk1.8.0_251]# java -version
java version "1.8.0_251"
Java(TM) SE Runtime Environment (build 1.8.0_251-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.251-b08, mixed mode)
[root@hadoop000 jdk1.8.0_251]#
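One note on style: the transcript above uses a lowercase java_home variable. Bash accepts any case, which is why echo $java_home and java -version both work, but uppercase JAVA_HOME is the convention, and tools such as Hadoop's hadoop-env.sh expect the uppercase name (we set it that way further down). A conventional version of the same two lines in /etc/profile:

export JAVA_HOME=/usr/local/jdk1.8.0_251
export PATH=$PATH:$JAVA_HOME/bin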
Hadoop documentation (Chinese)
Required software
The software required on Linux and Windows includes:
JavaTM 1.5.x, which must be installed; the Java release from Sun is recommended.
Prerequisites before setup (normally everything is fine by default)
ssh must be installed and sshd must be kept running, so that the Hadoop scripts can manage the remote Hadoop daemons.
[root@hadoop000 jdk1.8.0_251]# ssh
usage: ssh [-1246AaCfGgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
[-D [bind_address:]port] [-E log_file] [-e escape_char]
[-F configfile] [-I pkcs11] [-i identity_file]
[-J [user@]host[:port]] [-L address] [-l login_name] [-m mac_spec]
[-O ctl_cmd] [-o option] [-p port] [-Q query_option] [-R address]
[-S ctl_path] [-W host:port] [-w local_tun[:remote_tun]]
[user@]hostname [command]
[root@hadoop000 jdk1.8.0_251]# ps -ef | grep sshd
root 1322 1 0 17:43 ? 00:00:00 /usr/sbin/sshd -D
root 67824 11887 0 18:45 pts/0 00:00:00 grep --color=auto sshd
The actual setup
(This part covers the hostname and the IP-to-hostname mapping.)
# Linux host configuration: set the hostname; it lives in /etc/sysconfig/network (the hostname file)
# We already set the hostname to hadoop000 earlier
[root@hadoop000 jdk1.8.0_251]# hostname
hadoop000
[root@hadoop000 jdk1.8.0_251]# vim /etc/sysconfig/network
# Add this line: HOSTNAME=hadoop000
[root@hadoop000 jdk1.8.0_251]# cat /etc/sysconfig/network
# Created by anaconda
HOSTNAME=hadoop000
[root@hadoop000 jdk1.8.0_251]#
# Set up the mapping (add the IP-to-hostname mapping)
[root@hadoop000 ~]# vim /etc/hosts
192.168.79.152 hadoop000
192.168.79.152 localhost
[root@hadoop000 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.79.152 hadoop000
192.168.79.152 localhost
[root@hadoop000 ~]#
Passwordless SSH login (only the first login prompts; after that it is remembered)
su - hadoop   (at work you will not be running as root all the time; everything here is done as the hadoop user, and other users will not have these keys, so stay aware of which user you are)
ssh-keygen -t rsa   then press Enter three times (-t selects the key type)
ll -a
cd .ssh
ll   # there is now a private key and a public key; the .pub file is the public key. If you look inside it, it is a string of characters ending in hadoop@hadoop000, i.e. user@hostname
ssh-copy-id -i .ssh/id_rsa.pub hadoop@hadoop000
ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub root@hadoop000
ssh hadoop@hadoop000   # log in
If you made a mistake and want to start over:
ll -a
rm -rf ./.ssh
ll -a
ssh-keygen -t rsa   then press Enter three times
ll ./.ssh
ssh-copy-id -i ./.ssh/id_rsa.pub hadoop@hadoop000   # answer yes, then enter the hadoop password
ll ./.ssh   # three files is normal (id_rsa, id_rsa.pub, authorized_keys); if known_hosts is also there it is left over from earlier attempts and can be deleted
If you compare id_rsa.pub and authorized_keys now, their contents are identical.
ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub root@hadoop000
It later turned out that from the hadoop user you can reach both hadoop and root without a password, but going from root to hadoop still asks for one.
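That asymmetry is expected: we copied hadoop's public key into both hadoop's and root's authorized_keys, so the hadoop user can reach both accounts, but root never generated or distributed a key of its own. If you want root to hadoop to be passwordless too, a minimal sketch run as root:

ssh-keygen -t rsa                                       # press Enter through the prompts
ssh-copy-id -i /root/.ssh/id_rsa.pub hadoop@hadoop000   # add root's key to hadoop's authorized_keys
ssh hadoop@hadoop000                                    # should no longer ask for a password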
[root@hadoop000 ~]# su - hadoop
Last login: Tue Aug 11 17:58:05 CST 2020 on pts/0
[hadoop@hadoop000 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:GMkpLzGCPa+uFB5kyZOWvDT2W9IHcxSx+5EIRLflDzE hadoop@hadoop000
The key's randomart image is:
+---[RSA 2048]----+
| .o =oE |
|oo+ ..oo= o |
|.#+ ++=+ o |
|* =+.==oo + |
| + oooo+So . |
|. o.+.. . . |
| o.. . |
|.. |
|... |
+----[SHA256]-----+
[hadoop@hadoop000 ~]$ ll -a
total 16
drwx------. 10 hadoop hadoop 193 Aug 11 19:18 .
drwxr-xr-x. 4 root root 35 Aug 10 23:50 ..
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 app
-rw-------. 1 hadoop hadoop 40 Aug 11 00:20 .bash_history
-rw-r--r--. 1 hadoop hadoop 18 Apr 1 10:17 .bash_logout
-rw-r--r--. 1 hadoop hadoop 193 Apr 1 10:17 .bash_profile
-rw-r--r--. 1 hadoop hadoop 231 Apr 1 10:17 .bashrc
drwxrwxr-x. 3 hadoop hadoop 18 Aug 11 00:10 .cache
drwxrwxr-x. 3 hadoop hadoop 18 Aug 11 00:10 .config
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 data
drwxr-xr-x. 4 hadoop hadoop 39 Aug 10 09:06 .mozilla
drwxrwxr-x. 2 hadoop hadoop 77 Aug 11 17:49 software
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 source
drwx------. 2 hadoop hadoop 38 Aug 11 19:18 .ssh
[hadoop@hadoop000 ~]$ cd .ssh
[hadoop@hadoop000 .ssh]$ ll
total 8
-rw-------. 1 hadoop hadoop 1675 Aug 11 19:18 id_rsa
-rw-r--r--. 1 hadoop hadoop 398 Aug 11 19:18 id_rsa.pub
[hadoop@hadoop000 .ssh]$ cat ./id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDegEY2P6DeNefaoPNMQH3tI5X2Zbxb/Z1vobyMLWXZyxPgOAVTc8vFV85F0NIffSjsjXJh3oCLoA2xBz8Z6cyTdm+qt6lN2Qn4+OvKDLbP+cFQVc1bt62gASl/571AaakLJQsMTm19cY49wLFuQtTFZ5zAVDkEhVtXRZ/6/GGOJ84E8Sh4xa9BNEHOY3BaIsWpGWt6/LR56f7YJ7mjgAnvHOI3U4LunkEnR5znn5X/Bp3dK4s2zYTLozVFITsb8BSZIPvZa1r53V3i8DP4+cT+b9e21KmUIYMxTN2NVaSJYI8OP6YeRFZZgZfUR9JGHBMiA7k8DoNdWlB0p082j0AR hadoop@hadoop000
[hadoop@hadoop000 .ssh]$ exit
logout
[root@hadoop000 ~]# ssh hadoop@hadoop000
The authenticity of host 'hadoop000 (192.168.79.152)' can't be established.
ECDSA key fingerprint is SHA256:SqudvupI2u9X4DdADL0AjAGXk5UJIqXy9wvOm80hOsY.
ECDSA key fingerprint is MD5:ea:95:5d:90:17:5c:eb:1a:92:52:2b:4a:07:a1:aa:17.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop000,192.168.79.152' (ECDSA) to the list of known hosts.
hadoop@hadoop000's password:
Permission denied, please try again.
hadoop@hadoop000's password:
Last login: Tue Aug 11 19:17:28 2020
[hadoop@hadoop000 ~]$ exit
logout
Connection to hadoop000 closed.
[root@hadoop000 ~]# ssh hadoop@hadoop000
hadoop@hadoop000's password:
Last login: Tue Aug 11 19:24:13 2020 from hadoop000
[hadoop@hadoop000 ~]$ exit
logout
Connection to hadoop000 closed.
[root@hadoop000 ~]# su - hadoop
Last login: Tue Aug 11 19:24:58 CST 2020 from hadoop000 on pts/3
[hadoop@hadoop000 ~]$ ssh-copy-id -i .ssh/id_rsa.pub hadoop@hadoop000
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: ".ssh/id_rsa.pub"
The authenticity of host 'hadoop000 (192.168.79.152)' can't be established.
ECDSA key fingerprint is SHA256:SqudvupI2u9X4DdADL0AjAGXk5UJIqXy9wvOm80hOsY.
ECDSA key fingerprint is MD5:ea:95:5d:90:17:5c:eb:1a:92:52:2b:4a:07:a1:aa:17.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@hadoop000's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'hadoop@hadoop000'"
and check to make sure that only the key(s) you wanted were added.
[hadoop@hadoop000 ~]$
[hadoop@hadoop000 ~]$ ll -a
total 16
drwx------. 10 hadoop hadoop 193 Aug 11 21:56 .
drwxr-xr-x. 4 root root 35 Aug 10 23:50 ..
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 app
-rw-------. 1 hadoop hadoop 214 Aug 11 22:21 .bash_history
-rw-r--r--. 1 hadoop hadoop 18 Apr 1 10:17 .bash_logout
-rw-r--r--. 1 hadoop hadoop 193 Apr 1 10:17 .bash_profile
-rw-r--r--. 1 hadoop hadoop 231 Apr 1 10:17 .bashrc
drwxrwxr-x. 3 hadoop hadoop 18 Aug 11 00:10 .cache
drwxrwxr-x. 3 hadoop hadoop 18 Aug 11 00:10 .config
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 data
drwxr-xr-x. 4 hadoop hadoop 39 Aug 10 09:06 .mozilla
drwxrwxr-x. 2 hadoop hadoop 77 Aug 11 17:49 software
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 source
drwx------. 2 hadoop hadoop 80 Aug 11 22:08 .ssh
[hadoop@hadoop000 ~]$ rm -rf .ssh
[hadoop@hadoop000 ~]$ ll -a
total 16
drwx------. 9 hadoop hadoop 181 Aug 12 11:57 .
drwxr-xr-x. 4 root root 35 Aug 10 23:50 ..
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 app
-rw-------. 1 hadoop hadoop 214 Aug 11 22:21 .bash_history
-rw-r--r--. 1 hadoop hadoop 18 Apr 1 10:17 .bash_logout
-rw-r--r--. 1 hadoop hadoop 193 Apr 1 10:17 .bash_profile
-rw-r--r--. 1 hadoop hadoop 231 Apr 1 10:17 .bashrc
drwxrwxr-x. 3 hadoop hadoop 18 Aug 11 00:10 .cache
drwxrwxr-x. 3 hadoop hadoop 18 Aug 11 00:10 .config
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 data
drwxr-xr-x. 4 hadoop hadoop 39 Aug 10 09:06 .mozilla
drwxrwxr-x. 2 hadoop hadoop 77 Aug 11 17:49 software
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 source
[hadoop@hadoop000 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:g5KOQazHIMcJUYr4y/XTAAN1J70vZYkTA3OCdFs/YYQ hadoop@hadoop000
The key's randomart image is:
+---[RSA 2048]----+
|ooooo.Bo+o+ |
|o= o.o OEo . |
|* * o . =o. |
|.B + .+ +. |
|. = + o S= |
| o * o o... |
| + . o .. |
| . |
| |
+----[SHA256]-----+
[hadoop@hadoop000 ~]$ ll ./.ssh
total 8
-rw-------. 1 hadoop hadoop 1675 Aug 12 11:57 id_rsa
-rw-r--r--. 1 hadoop hadoop 398 Aug 12 11:57 id_rsa.pub
[hadoop@hadoop000 ~]$ ssh-copy-id -i ./.ssh/id_rsa.pub hadoop@hadoop000
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "./.ssh/id_rsa.pub"
The authenticity of host 'hadoop000 (192.168.79.152)' can't be established.
ECDSA key fingerprint is SHA256:SqudvupI2u9X4DdADL0AjAGXk5UJIqXy9wvOm80hOsY.
ECDSA key fingerprint is MD5:ea:95:5d:90:17:5c:eb:1a:92:52:2b:4a:07:a1:aa:17.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
hadoop@hadoop000's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'hadoop@hadoop000'"
and check to make sure that only the key(s) you wanted were added.
[hadoop@hadoop000 ~]$ ssh 'hadoop@hadoop000
> ^C
[hadoop@hadoop000 ~]$ ssh 'hadoop@hadoop000'
Last login: Wed Aug 12 11:56:43 2020 from 192.168.79.152
[hadoop@hadoop000 ~]$ ll ./.ssh
total 16
-rw-------. 1 hadoop hadoop 398 Aug 12 11:58 authorized_keys
-rw-------. 1 hadoop hadoop 1675 Aug 12 11:57 id_rsa
-rw-r--r--. 1 hadoop hadoop 398 Aug 12 11:57 id_rsa.pub
-rw-r--r--. 1 hadoop hadoop 186 Aug 12 11:58 known_hosts
[hadoop@hadoop000 ~]$ rm -rf ./.ssh/known_hosts
[hadoop@hadoop000 ~]$ chmod 700 .ssh
[hadoop@hadoop000 ~]$ chmod 600 authorized_keys
chmod: cannot access 'authorized_keys': No such file or directory
[hadoop@hadoop000 ~]$ chmod 600 ./.ssh/authorized_keys
[hadoop@hadoop000 ~]$ ll ./.ssh
total 12
-rw-------. 1 hadoop hadoop 398 Aug 12 11:58 authorized_keys
-rw-------. 1 hadoop hadoop 1675 Aug 12 11:57 id_rsa
-rw-r--r--. 1 hadoop hadoop 398 Aug 12 11:57 id_rsa.pub
[hadoop@hadoop000 ~]$ ll -a
total 16
drwx------. 10 hadoop hadoop 193 Aug 12 11:57 .
drwxr-xr-x. 4 root root 35 Aug 10 23:50 ..
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 app
-rw-------. 1 hadoop hadoop 214 Aug 11 22:21 .bash_history
-rw-r--r--. 1 hadoop hadoop 18 Apr 1 10:17 .bash_logout
-rw-r--r--. 1 hadoop hadoop 193 Apr 1 10:17 .bash_profile
-rw-r--r--. 1 hadoop hadoop 231 Apr 1 10:17 .bashrc
drwxrwxr-x. 3 hadoop hadoop 18 Aug 11 00:10 .cache
drwxrwxr-x. 3 hadoop hadoop 18 Aug 11 00:10 .config
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 data
drwxr-xr-x. 4 hadoop hadoop 39 Aug 10 09:06 .mozilla
drwxrwxr-x. 2 hadoop hadoop 77 Aug 11 17:49 software
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 source
drwx------. 2 hadoop hadoop 61 Aug 12 11:59 .ssh
[hadoop@hadoop000 ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub root@hadoop000
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/hadoop/.ssh/id_rsa.pub"
The authenticity of host 'hadoop000 (192.168.79.152)' can't be established.
ECDSA key fingerprint is SHA256:SqudvupI2u9X4DdADL0AjAGXk5UJIqXy9wvOm80hOsY.
ECDSA key fingerprint is MD5:ea:95:5d:90:17:5c:eb:1a:92:52:2b:4a:07:a1:aa:17.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@hadoop000's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@hadoop000'"
and check to make sure that only the key(s) you wanted were added.
[hadoop@hadoop000 ~]$ ssh root@hadoop000
Last login: Wed Aug 12 11:52:50 2020 from 192.168.79.1
[root@hadoop000 ~]# ssh hadoop@hadoop000
hadoop@hadoop000's password:
Last login: Wed Aug 12 11:58:48 2020 from 192.168.79.152
[hadoop@hadoop000 ~]$ ssh root@hadoop000
Last login: Wed Aug 12 12:02:18 2020 from 192.168.79.152
[root@hadoop000 ~]# ll /home/hadoop/
total 0
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 app
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 data
drwxrwxr-x. 2 hadoop hadoop 77 Aug 11 17:49 software
drwxrwxr-x. 2 hadoop hadoop 6 Aug 11 17:51 source
[root@hadoop000 ~]# ll /home/hadoop/.ssh/
total 16
-rw-------. 1 hadoop hadoop 398 Aug 12 11:58 authorized_keys
-rw-------. 1 hadoop hadoop 1675 Aug 12 11:57 id_rsa
-rw-r--r--. 1 hadoop hadoop 398 Aug 12 11:57 id_rsa.pub
-rw-r--r--. 1 hadoop hadoop 186 Aug 12 12:01 known_hosts
[root@hadoop000 ~]# chmod 700 /home/hadoop/.ssh
[root@hadoop000 ~]# chmod 600 /home/hadoop/.ssh/authorized_keys
[root@hadoop000 ~]# rm -rf /home/hadoop/.ssh/known_hosts
[root@hadoop000 ~]# ssh hadoop@hadoop000
hadoop@hadoop000's password:
Last login: Wed Aug 12 12:03:07 2020 from 192.168.79.152
[hadoop@hadoop000 ~]$
HDFS configuration files
Extract Hadoop:
su - hadoop
cd software/
tar zxvf hadoop-2.6.0-cdh5.7.0.tar.tar -C ~/app/
cd ~/app/
ll
cd hadoop-2.6.0-cdh5.7.0/
ll
su root   (the hadoop user cannot install packages; tree is a directory-listing tool, and this step is optional, it just makes directory structures easier to read)
yum install -y tree
ssh hadoop@hadoop000
cd ./app/hadoop-2.6.0-cdh5.7.0/
tree
tree -L 2   (a specific directory can be appended; -L limits the depth shown)
ll
tree bin
(The two deletions below just reduce clutter and make startup simpler: the .cmd scripts are Windows client scripts and are useless here. The biggest benefit is that each startup script becomes unique; otherwise, once the environment variables are configured, you keep getting asked whether you mean the .cmd or the shell version.)
rm -rf bin/*.cmd
tree bin
tree sbin
rm -rf sbin/*.cmd
tree sbin
(Side knowledge: to locate a file or directory, use whereis app or find / -name app.)
Next, edit configuration file 1 (hadoop-2.6.0-cdh5.7.0/etc/hadoop/hadoop-env.sh); it needs the Java installation directory, up to but not including bin:
cd etc/
ll
cd hadoop
ll
rm -rf *.cmd
ll
vim hadoop-env.sh   # uses the JDK path we set up earlier; save and quit with Shift+ZZ
[root@hadoop000 ~]# which java
/usr/local/jdk1.8.0_251/bin/java   # do not include the bin part
vim hadoop-env.sh
While we are here, we might as well add the Hadoop path too; it will be handy later.
[hadoop@hadoop000 hadoop]$ pwd
/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop
So the file should contain the JDK directory and the Hadoop directory:
export JAVA_HOME=/usr/local/jdk1.8.0_251
export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
Next, the user environment variables:
vim ~/.bash_profile
Add the following near the bottom:
export hadoop_home=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
export PATH=$PATH:$hadoop_home/bin:$hadoop_home/sbin
source ~/.bash_profile
Then cd ~ and type bi followed by Tab to confirm completion works (you should see bind, biosdecode, biosdevname)
[hadoop@hadoop000 ~]$ hdfs   # type hdfs and press Tab: hdfs and hdfs-config.sh should appear
If that does not work, something is wrong.
echo $hadoop_home
echo $PATH   # mine is fine here; if you set this up but the new paths do not appear, the fix is:
1. vim ~/.bash_profile, restore it to its original state, then source ~/.bash_profile
2. su root, then vim /etc/profile
The section to modify currently looks like this:
export java_home=/usr/local/jdk1.8.0_251
export PATH=$PATH:$java_home/bin
Change it to:
export java_home=/usr/local/jdk1.8.0_251
export hadoop_home=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
export PATH=$PATH:$java_home/bin:$hadoop_home/bin:$hadoop_home/sbin
source /etc/profile
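As with java_home earlier, lowercase hadoop_home is legal in bash and works for building PATH, but the uppercase names are conventional and some Hadoop scripts look specifically for them. A conventional equivalent of the same block:

export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin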
2. etc/hadoop/core-site.xml:
vim core-site.xml   # delete the original <configuration> element and paste in the following:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/app/tmp</value>
</property>
</configuration>
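After saving, you can confirm Hadoop picked the value up; hdfs getconf is part of the stock HDFS command-line tools:

hdfs getconf -confKey fs.defaultFS    # should print hdfs://localhost:9000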
3. etc/hadoop/hdfs-site.xml:
vim hdfs-site.xml   # same procedure as in step 2:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Note: we configured $hadoop_home above to point to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0, which saves typing that long prefix:
[hadoop@hadoop000 ~]$ cd $hadoop_home/etc/
[hadoop@hadoop000 etc]$ cd $hadoop_home
[hadoop@hadoop000 hadoop-2.6.0-cdh5.7.0]$ pwd
/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop000 hadoop-2.6.0-cdh5.7.0]$
(If you run into the situation below, go back and configure it under the root user.)
(The error below came from the JDK path in hadoop-env.sh having an extra bin on the end; I already removed the bin above, so it should not appear again.)
Formatting, starting, and stopping HDFS
Format:
cd ~
hdfs namenode -format
If you want to re-format later, you must delete the tmp directory first (/home/hadoop/app/tmp).
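A minimal re-format sequence, assuming the paths used in this setup (this destroys everything stored in HDFS, so only do it on a scratch machine):

stop-dfs.sh                    # stop HDFS first if it is running
rm -rf /home/hadoop/app/tmp    # wipe the NameNode/DataNode storage
hdfs namenode -format          # format a fresh filesystem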
Start:
start-dfs.sh   # answer yes twice
jps   # if it prints processes like these, the start succeeded:
7141 Jps
6727 NameNode
6826 DataNode
7019 SecondaryNameNode
Their relationship: the top node is the NameNode (it holds the metadata, the indexes, essentially the directory system). To keep it safe there is a node that backs it up, the SecondaryNameNode. Below the NameNode the data is stored on DataNodes, at least two of them (so there is a replica); if one dies, the other steps in immediately to restore the data.
You can also check success from a browser.
On the host machine, but turn off the VM's firewall first:
systemctl stop firewalld   # as the hadoop user this needs privileges; note which user's password it asks for, and once it is accepted, open the browser at:
http://192.168.79.152:50070/
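Stopping firewalld only lasts until the next boot. On a throwaway lab VM you can also keep it from coming back (standard systemd commands; run as root since sudo is broken at this point):

su -c 'systemctl stop firewalld'       # stop it now
su -c 'systemctl disable firewalld'    # do not start it at boot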
[hadoop@hadoop000 ~]$ ll /home/hadoop/app/tmp/dfs/name
total 8   # originally there was only current; now there is also an in_use.lock
drwxrwxr-x. 2 hadoop hadoop 4096 Aug 12 15:16 current
-rw-rw-r--. 1 hadoop hadoop 14 Aug 12 15:14 in_use.lock
Create directories:
[hadoop@hadoop000 ~]$ hdfs dfs -mkdir /user
20/08/12 15:54:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop000 ~]$
[hadoop@hadoop000 ~]$ hdfs dfs -mkdir /input
20/08/12 15:56:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Upload:
[hadoop@hadoop000 ~]$ hdfs dfs -put /etc/passwd /input
20/08/12 15:57:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Download:
[hadoop@hadoop000 ~]$ hdfs dfs -get /input/passwd /opt
20/08/12 16:06:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
get: /opt/passwd._COPYING_ (Permission denied)
[hadoop@hadoop000 ~]$
[hadoop@hadoop000 ~]$ sudo mkdir /opt/out
sudo: /etc/sudoers is world writable
sudo: no valid sudoers sources found, quitting
sudo: unable to initialize policy plugin
[hadoop@hadoop000 ~]$
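This sudo failure is the fallout from the chmod 777 /etc/sudoers done back during user setup: sudo refuses to run while the file is world-writable. The fix is to restore the expected mode as root:

su root
chmod 440 /etc/sudoers    # restore the mode sudo expects
exit
sudo mkdir /opt/out       # now works again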
Since that did not work at the time, fetch it into the tmp directory we created ourselves instead; that finally succeeds:
[hadoop@hadoop000 ~]$ hdfs dfs -get /input/passwd /home/hadoop/app/tmp/
20/08/12 16:10:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop000 ~]$ ll /home/hadoop/app/tmp/
total 4
drwxrwxr-x. 5 hadoop hadoop 51 Aug 12 15:15 dfs
-rw-r--r--. 1 hadoop hadoop 2414 Aug 12 16:10 passwd
[hadoop@hadoop000 ~]$
View:
[hadoop@hadoop000 ~]$ hdfs dfs -cat /input/passwd
Shut down with stop-dfs.sh:
[hadoop@hadoop000 ~]$ stop-dfs.sh
20/08/12 16:14:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
20/08/12 16:14:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop000 ~]$ jps
9697 Jps
[hadoop@hadoop000 ~]$
Replication is 1 because there is only one machine; each file is still split according to the 128 MB block size.
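Both settings can be seen on a real file with fsck (using the passwd file uploaded earlier):

hdfs fsck /input/passwd -files -blocks    # lists the file's blocks and shows repl=1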
HDFS shell operations
Running hdfs dfs with no arguments prints the common commands and their options:
[hadoop@hadoop000 ~]$ hdfs dfs
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] [-h] [-v] <path> ...]
[-cp [-f] [-p | -p[topax]] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] <src> <localdst>]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-usage [cmd ...]]
• List a directory: hdfs dfs -ls /
• List recursively: hdfs dfs -ls -R /
• Create a directory: hdfs dfs -mkdir
• Create directories recursively: hdfs dfs -mkdir -p
• Upload a file: hdfs dfs -put Hello.txt /
• Copy to local: hdfs dfs -get /Hello.txt
• View file contents: hdfs dfs -cat /Hello.txt
• Delete a file: hdfs dfs -rm /Hello.txt
• Delete a directory: hdfs dfs -rm -r /test
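A short end-to-end session tying these commands together (the file and directory names are just examples):

hdfs dfs -mkdir -p /test/a/b          # create nested directories in one go
echo hello > Hello.txt                # make a small local file
hdfs dfs -put Hello.txt /test/a/b     # upload it
hdfs dfs -ls -R /test                 # list the whole tree
hdfs dfs -cat /test/a/b/Hello.txt     # print its contents
hdfs dfs -rm -r /test                 # clean up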