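Contents

  • Common HDFS dfs shell commands
  • All operations of the hdfs dfs command
  • Permission operations
  • File operations
  • Uploading local files to HDFS
  • Downloading files from HDFS to the local file system
  • Creating a file: touch
  • Finding files: find
  • Viewing file contents
  • Renaming a file: mv
  • Deleting files: rm
  • Truncating a file to a given length: truncate + length
  • Directory operations
  • Creating a directory: mkdir
  • Deleting a directory: rmdir
  • Common hdfs commands
  • Showing the Hadoop version: hdfs version
  • Reading Hadoop configuration values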


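Common HDFS dfs shell commands

Hadoop's file system shell commands generally take one of two forms:

  • hadoop fs + operation
  • hdfs dfs + operation

The second form is the one generally used, and these notes use it throughout to summarise the common operations on Hadoop's distributed file system.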

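All operations of the hdfs dfs command

The shell does not recognise --help (the listed option is -help), but it still prints the complete usage summary: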

[root@k8s-node3 home]# hdfs dfs --help
--help: Unknown command
Usage: hadoop fs [generic options]
	[-appendToFile <localsrc> ... <dst>]
	[-cat [-ignoreCrc] <src> ...]
	[-checksum [-v] <src> ...]
	[-chgrp [-R] GROUP PATH...]
	[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
	[-chown [-R] [OWNER][:[GROUP]] PATH...]
	[-concat <target path> <src path> <src path> ...]
	[-copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst>]
	[-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
	[-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] [-e] [-s] <path> ...]
	[-cp [-f] [-p | -p[topax]] [-d] <src> ... <dst>]
	[-createSnapshot <snapshotDir> [<snapshotName>]]
	[-deleteSnapshot <snapshotDir> <snapshotName>]
	[-df [-h] [<path> ...]]
	[-du [-s] [-h] [-v] [-x] <path> ...]
	[-expunge [-immediate] [-fs <path>]]
	[-find <path> ... <expression> ...]
	[-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
	[-getfacl [-R] <path>]
	[-getfattr [-R] {-n name | -d} [-e en] <path>]
	[-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
	[-head <file>]
	[-help [cmd ...]]
	[-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...]]
	[-mkdir [-p] <path> ...]
	[-moveFromLocal [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
	[-moveToLocal <src> <localdst>]
	[-mv <src> ... <dst>]
	[-put [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst>]
	[-renameSnapshot <snapshotDir> <oldName> <newName>]
	[-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
	[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
	[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
	[-setfattr {-n name [-v value] | -x name} <path>]
	[-setrep [-R] [-w] <rep> <path> ...]
	[-stat [format] <path> ...]
	[-tail [-f] [-s <sleep interval>] <file>]
	[-test -[defswrz] <path>]
	[-text [-ignoreCrc] <src> ...]
	[-touch [-a] [-m] [-t TIMESTAMP (yyyyMMdd:HHmmss) ] [-c] <path> ...]
	[-touchz <path> ...]
	[-truncate [-w] <length> <path> ...]
	[-usage [cmd ...]]

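Permission operations

  • Change a file's permissions: chmod + mode + file name

To apply the change recursively to a directory and everything beneath it, use chmod -R + mode + directory name.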

[root@k8s-node3 home]# hdfs dfs -ls /root/data/test.txt
Found 1 items
-rw-r--r--   1 root supergroup          0 2022-05-25 03:09 /root/data/test.txt
[root@k8s-node3 home]# hdfs dfs -chmod 777 /root/data/test.txt 
[root@k8s-node3 home]# hdfs dfs -ls /root/data/
Found 1 items
-rwxrwxrwx   1 root supergroup          0 2022-05-25 03:09 /root/data/test.txt
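
The relationship between the octal mode passed to chmod and the permission string shown by -ls can be sketched in plain shell. This is only an illustration: mode_to_symbolic is a hypothetical helper name, it does not call HDFS at all, and it assumes a plain three-digit octal mode.

```shell
# Convert a three-digit octal mode such as 644 into the symbolic form
# (rw-r--r--) that `hdfs dfs -ls` prints after the file-type character.
# mode_to_symbolic is a hypothetical helper, not an HDFS command.
mode_to_symbolic() (
  mode="$1"
  out=""
  while [ -n "$mode" ]; do
    rest="${mode#?}"          # everything after the first character
    digit="${mode%"$rest"}"   # the first character on its own
    mode="$rest"
    # Each octal digit is a 3-bit mask: read=4, write=2, execute=1.
    if [ $(( $digit & 4 )) -ne 0 ]; then out="${out}r"; else out="${out}-"; fi
    if [ $(( $digit & 2 )) -ne 0 ]; then out="${out}w"; else out="${out}-"; fi
    if [ $(( $digit & 1 )) -ne 0 ]; then out="${out}x"; else out="${out}-"; fi
  done
  printf '%s\n' "$out"
)

mode_to_symbolic 644   # rw-r--r--  (the file before chmod)
mode_to_symbolic 777   # rwxrwxrwx  (the file after chmod 777)
```
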
  • Change the group of a file or directory: chgrp
[root@k8s-node3 home]# hdfs dfs -chgrp root2 /root/data
[root@k8s-node3 home]# hdfs dfs -ls  /root/
Found 4 items
drwxrwxrwx   - root root2               0 2022-05-25 03:43 /root/data
drwxr-xr-x   - root supergroup          0 2021-09-29 06:15 /root/mapreduce
drwxr-xr-x   - root supergroup          0 2021-09-03 00:53 /root/ouput
drwxr-xr-x   - root supergroup          0 2021-09-03 02:58 /root/ouput2
[root@k8s-node3 home]#
  • Change the owner (and, with owner:group, the group as well) of a file or directory: chown
[root@k8s-node3 home]# hdfs dfs -chown root2:root2 /root/data
[root@k8s-node3 home]# hdfs dfs -ls  /root/
Found 4 items
drwxrwxrwx   - root2 root2               0 2022-05-25 03:43 /root/data
drwxr-xr-x   - root  supergroup          0 2021-09-29 06:15 /root/mapreduce
drwxr-xr-x   - root  supergroup          0 2021-09-03 00:53 /root/ouput
drwxr-xr-x   - root  supergroup          0 2021-09-03 02:58 /root/ouput2
[root@k8s-node3 home]#

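File operations

Uploading local files to HDFS

  • put + local path + HDFS path

Uploads a local file to HDFS. In the example below, the local file nginx/test.txt is uploaded into the HDFS directory /root/data/nginx/, first under its original name and then a second time as test2.txt.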

[root@k8s-node3 home]# hdfs dfs -put ./nginx/test.txt /root/data/nginx/ 
[root@k8s-node3 home]# hdfs dfs -put ./nginx/test.txt /root/data/nginx/test2.txt 
[root@k8s-node3 home]# hdfs dfs -ls /root/data/nginx/
Found 2 items
-rw-r--r--   1 root supergroup          0 2022-05-25 03:24 /root/data/nginx/test.txt
-rw-r--r--   1 root supergroup          0 2022-05-25 03:27 /root/data/nginx/test2.txt
[root@k8s-node3 home]#
  • copyFromLocal + local path + HDFS path
[root@k8s-node3 home]# hdfs dfs -copyFromLocal ./data/ /root/data/data4
  • moveFromLocal + local path + HDFS path (the local copy is deleted once the upload succeeds)
[root@k8s-node3 home]# hdfs dfs -moveFromLocal ./data/ /root/data/data3
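
A repeatable upload usually needs two extra steps: making sure the target directory exists and deciding what happens if the destination file is already there. The sketch below wraps -mkdir -p and -put -f (both flags appear in the usage list earlier). upload_to_hdfs is a hypothetical helper name, and overwriting with -f is an assumption about what the caller wants.

```shell
# upload_to_hdfs LOCAL_FILE HDFS_DIR
# Hypothetical helper: creates HDFS_DIR if needed, then uploads LOCAL_FILE
# into it, overwriting any existing file of the same name (-f).
upload_to_hdfs() {
  up_src="$1"
  up_dst_dir="$2"
  if [ ! -e "$up_src" ]; then
    echo "missing local file: $up_src" >&2
    return 1
  fi
  # -p creates parent directories and does not fail if the directory exists.
  hdfs dfs -mkdir -p "$up_dst_dir" || return 1
  # Without -f, put fails when the destination already exists.
  hdfs dfs -put -f "$up_src" "$up_dst_dir/"
}

# Usage (needs a running cluster):
#   upload_to_hdfs ./nginx/test.txt /root/data/nginx
```

Dropping -f gives the stricter behaviour, where a second upload of the same file is reported as an error instead of silently replacing the first.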

Downloading files from HDFS to the local file system

  • get + HDFS path + local path
[root@k8s-node3 home]# hdfs dfs -get /root/data/student.txt ./student2.txt 
[root@k8s-node3 home]# ls
administrator  elasticsearch  k8s       logstash-config    logstash-pipeline   neo4j  software      tensorflow
conf           hadoop3.3      logstash  logstash-metedata  mapreduce_task_jar  nginx  student2.txt  www
[root@k8s-node3 home]#
  • copyToLocal + HDFS path + local path
[root@k8s-node3 home]# hdfs dfs -copyToLocal /root/data/student.txt ./student3.txt
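  • moveToLocal + HDFS path + local path (Hadoop's own help text describes this option as not implemented yet, so get or copyToLocal is the practical choice)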
[root@k8s-node3 home]# hdfs dfs -moveToLocal /root/data/student.txt ./student3.txt

Creating a file: touch

[root@k8s-node3 home]# hdfs dfs -touch /root/data/test.txt

Finding files: find

[root@k8s-node3 home]# hdfs dfs -find /root/data -name "*.txt"
/root/data/data3/student.txt
/root/data/nginx/test.txt
/root/data/student.txt
/root/data/test.txt

Viewing file contents

  • Show the entire file: cat
[root@k8s-node3 home]# hdfs dfs -cat /root/data/test.txt
  • Show the first 1 KB of a file: head
[root@k8s-node3 home]# hdfs dfs -head /root/data/test.txt
  • Show the last 1 KB of a file: tail
[root@k8s-node3 home]# hdfs dfs -tail /root/data/test.txt
  • Show file metadata (permissions, owner, size, modification time): ls
[root@k8s-node3 home]# hdfs dfs  -ls /root/data/student.txt
-rw-r--r--   1 root supergroup   55000000 2021-09-02 23:07 /root/data/student.txt
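
Because -head and -tail work on a fixed 1 KB window, viewing a given number of lines is usually done by piping -cat into the local head. A minimal sketch, with hdfs_first_lines as a hypothetical helper name and a default of 10 lines chosen only for illustration:

```shell
# hdfs_first_lines HDFS_PATH [COUNT]
# Hypothetical helper: print the first COUNT lines (default 10) of an HDFS
# file by streaming it through the local `head`.
hdfs_first_lines() {
  fl_path="$1"
  fl_count="${2:-10}"
  # head exits after COUNT lines; for large files the hdfs process may then
  # report on stderr that it could not finish writing its output.
  hdfs dfs -cat "$fl_path" | head -n "$fl_count"
}

# Usage (needs a running cluster):
#   hdfs_first_lines /root/data/student.txt 20
```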

Renaming a file: mv

[root@k8s-node3 home]# hdfs dfs -mv /root/data/nginx/test2.txt /root/data/nginx/test3.txt 
[root@k8s-node3 home]# hdfs dfs -ls /root/data/nginx/
Found 2 items
-rw-r--r--   1 root supergroup          0 2022-05-25 03:24 /root/data/nginx/test.txt
-rw-r--r--   1 root supergroup          0 2022-05-25 03:27 /root/data/nginx/test3.txt
[root@k8s-node3 home]#

Deleting files: rm

[root@k8s-node3 home]# hdfs dfs -rm /root/data/nginx/test2.txt /root/data/nginx/test3.txt 
rm: `/root/data/nginx/test2.txt': No such file or directory
Deleted /root/data/nginx/test3.txt
[root@k8s-node3 home]#

Truncating a file to a given length: truncate + length

[root@k8s-node3 home]# hdfs dfs -truncate 0 /root/data/student.txt
Truncated /root/data/student.txt to length: 0
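
Truncation can only shrink a file, so it can help to compare the requested length with the current size first. The sketch below reads the size with -stat %b (file size in bytes) and passes -w so the command waits for block recovery to finish before returning. truncate_hdfs_file is a hypothetical helper name.

```shell
# truncate_hdfs_file HDFS_PATH NEW_LENGTH
# Hypothetical helper: refuses to "truncate" to a length larger than the
# current file size, otherwise runs `hdfs dfs -truncate -w`.
truncate_hdfs_file() {
  tr_path="$1"
  tr_new_len="$2"
  # %b prints the file size in bytes.
  tr_cur_len="$(hdfs dfs -stat %b "$tr_path")" || return 1
  if [ "$tr_new_len" -gt "$tr_cur_len" ]; then
    echo "refusing: $tr_new_len is larger than current size $tr_cur_len" >&2
    return 1
  fi
  # -w waits until block recovery has completed, if any is needed.
  hdfs dfs -truncate -w "$tr_new_len" "$tr_path"
}

# Usage (needs a running cluster):
#   truncate_hdfs_file /root/data/student.txt 0
```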

Directory operations

Creating a directory: mkdir

[root@k8s-node3 home]# hdfs dfs -mkdir /root/data3
[root@k8s-node3 home]# hdfs dfs -ls /root
Found 5 items
drwxrwxrwx   - root2 root2               0 2022-05-25 03:43 /root/data
drwxr-xr-x   - root  supergroup          0 2022-05-25 04:26 /root/data3
drwxr-xr-x   - root  supergroup          0 2021-09-29 06:15 /root/mapreduce
drwxr-xr-x   - root  supergroup          0 2021-09-03 00:53 /root/ouput
drwxr-xr-x   - root  supergroup          0 2021-09-03 02:58 /root/ouput2

Deleting a directory: rmdir

  • Deleting a directory with rm -R
[root@k8s-node3 home]# hdfs dfs -rm -R /root/data3
Deleted /root/data3
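  • Using rmdir
    rmdir only removes empty directories. With --ignore-fail-on-non-empty the command no longer reports an error for a non-empty directory, but the directory itself is left in place; use rm -R to delete a directory together with its contents.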
[root@k8s-node3 home]# hdfs dfs -rmdir --ignore-fail-on-non-empty  /root/data
[root@k8s-node3 home]#
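
The two deletion styles can be combined so that recursive deletion only happens on request. In this sketch remove_hdfs_dir and its --force argument are hypothetical names introduced for illustration; only -rmdir and -rm -r are real HDFS shell options.

```shell
# remove_hdfs_dir HDFS_DIR [--force]
# Hypothetical helper: try the safe rmdir first; fall back to a recursive
# delete only when the caller explicitly passes --force.
remove_hdfs_dir() {
  rd_dir="$1"
  rd_force="${2:-}"
  if hdfs dfs -rmdir "$rd_dir" 2>/dev/null; then
    echo "removed empty directory $rd_dir"
  elif [ "$rd_force" = "--force" ]; then
    hdfs dfs -rm -r "$rd_dir"
  else
    echo "$rd_dir is not empty or does not exist; pass --force to delete recursively" >&2
    return 1
  fi
}

# Usage (needs a running cluster):
#   remove_hdfs_dir /root/data3
#   remove_hdfs_dir /root/data --force
```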

Common hdfs commands

Showing the Hadoop version: hdfs version

[root@k8s-node3 home]# hdfs version
Hadoop 3.3.1
Source code repository https://github.com/apache/hadoop.git -r a3b9c37a397ad4188041dd80621bdeefc46885f2
Compiled by ubuntu on 2021-06-15T05:13Z
Compiled with protoc 3.7.1
From source with checksum 88a4ddb2299aca054416d6b7f81ca55
This command was run using /home/software/hadoop-3.3.1/share/hadoop/common/hadoop-common-3.3.1.jar
[root@k8s-node3 home]#

Reading Hadoop configuration values

hdfs getconf reads values from the Hadoop configuration.

  • hdfs getconf -namenodes lists the NameNodes
[root@k8s-node3 home]# hdfs getconf -namenodes
k8s-node3 k8s-node8
  • hdfs getconf -secondaryNameNodes lists the secondary NameNodes
[root@k8s-node3 home]# hdfs getconf -secondaryNameNodes
Incorrect configuration: secondary namenode address dfs.namenode.secondary.http-address is not configured.
  • hdfs getconf -backupNodes lists the backup nodes
[root@k8s-node3 home]# hdfs getconf -backupNodes
Incorrect configuration: backup node address dfs.namenode.backup.address is not configured.
  • hdfs getconf -journalNodes lists the JournalNodes
[root@k8s-node3 home]# hdfs getconf -journalNodes
k8s-node5 k8s-node8 k8s-node3
  • hdfs getconf -confKey dfs.namenode.fs-limits.min-block-size returns the minimum block size in bytes
[root@k8s-node3 home]# hdfs getconf -confKey dfs.namenode.fs-limits.min-block-size
1048576

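The value is in bytes: 1048576 bytes is 1 MiB, the smallest block size the NameNode accepts. The 128 MB figure usually quoted for HDFS is the default block size (dfs.blocksize), which is a separate setting.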
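
The byte count printed by getconf is easier to read after converting it to MiB. A small sketch of that arithmetic, where bytes_to_mib is a hypothetical helper name and the input is assumed to be a plain integer number of bytes:

```shell
# bytes_to_mib BYTES
# Hypothetical helper: integer conversion from bytes to MiB.
bytes_to_mib() {
  echo $(( $1 / 1024 / 1024 ))
}

bytes_to_mib 1048576     # 1   (the min-block-size value printed above)
bytes_to_mib 134217728   # 128 (the stock default for dfs.blocksize)

# On a cluster node the configured block size can be checked the same way,
# assuming it is stored as a plain byte count and not with a suffix like 128m:
#   bytes_to_mib "$(hdfs getconf -confKey dfs.blocksize)"
```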