.hiverc file hive add file

Reposted

IT剑客行 2023-06-12 20:56:56


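Contents

DML Data Operations

1. Importing Data

1.1 Method 1: Loading data into a table with load data
1.2 Method 2: Inserting data into a table from a query (Insert)
1.3 Method 3: Creating a table and loading data from a query (As Select)
1.4 Method 4: Specifying the data path with location when creating a table
1.5 Method 5: Importing data into a specified Hive table (Import)

2. Exporting Data

2.1 Method 1: Exporting with insert
2.2 Method 2: Exporting to the local filesystem with Hadoop commands
2.3 Method 3: Exporting with Hive shell commands
2.4 Method 4: Exporting to HDFS with Export
2.5 Method 5: Exporting with Sqoop

3. Clearing Table Data (Truncate)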

DML Data Operations

1. Importing Data

1.1 Method 1: Loading data into a table with load data

-- Create a table first
hive(default)> create table student(id string, name string) row format delimited fields terminated by '\t';

-- Load data (clauses in square brackets are optional)
hive(default)> load data [local] inpath '' [overwrite] into table student [partition(month='201908')];

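Notes:

① If the table is created without specifying a field delimiter, Hive falls back to its default delimiter (Ctrl-A, '\001'), so a tab-delimited file is not split into the expected fields and the loaded rows show up as:
NULL NULL
NULL NULL
NULL NULL
② The table has two columns. A file whose lines contain three fields can still be loaded; the third field is simply not displayed. Data is matched against the table columns when it is read: a value that matches (by position and type) is displayed, and one that does not is displayed as NULL.
③ Loading student.txt a second time does not overwrite the student.txt already in HDFS; a new file, student_copy_1.txt, is created instead.
With overwrite into, the existing data is replaced: the files in HDFS are overwritten and no extra copy is created.
load data local inpath '/usr/local/hive-3.1.2/student.txt' overwrite into table student;
④ Data is not validated at load time. Every row is loaded, and values that do not fit the schema are displayed as NULL at query time.
⑤ Loading from HDFS (without local) is equivalent to an mv into the table directory: the file disappears from its original location.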
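
The notes above can be reproduced with a short session. This is only a sketch: the table name student_demo and the HDFS path /tmp/student.txt are hypothetical names chosen for illustration, and the local file path is the one used in note ③.

```sql
-- student_demo is a hypothetical table used only for this illustration
create table if not exists student_demo(id string, name string)
row format delimited fields terminated by '\t';

-- Load from the local filesystem: the file is copied into the table directory
load data local inpath '/usr/local/hive-3.1.2/student.txt' into table student_demo;

-- Loading the same file again adds a second copy instead of replacing the first
load data local inpath '/usr/local/hive-3.1.2/student.txt' into table student_demo;

-- overwrite replaces everything currently stored in the table
load data local inpath '/usr/local/hive-3.1.2/student.txt' overwrite into table student_demo;

-- Load from HDFS (no "local"): the file is moved, not copied.
-- /tmp/student.txt is an assumed HDFS path; it no longer exists there afterwards.
load data inpath '/tmp/student.txt' into table student_demo;

select * from student_demo;
```

Running select count(*) from student_demo after each step is a quick way to see when rows are appended and when they are replaced.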

1.2 Method 2: Inserting data into a table from a query (Insert)

-- Create a partitioned table first (drop the student table from 1.1 beforehand, since the name is reused)
hive(default)> create table student(id int, name string) partitioned by (month string) row format delimited fields terminated by '\t';

-- Basic insert
hive(default)> insert into table student partition(month='201907') values(1,'wangwu');

-- Insert from a query on a single table
hive(default)> insert into table student partition(month='201907') select id, name from student where month='201907';

-- Multi-insert: scan the source table once and write to several partitions
hive(default)> from student
          insert overwrite table student partition(month='201909')
          select id, name where month='201907'
          insert overwrite table student partition(month='201910')
          select id, name where month='201908';
-- Writes the July data to the September partition and the August data to the October partition.
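
To confirm where the rows ended up after these inserts, the partitions can be listed and counted. A minimal sketch, assuming the partitioned student table created above; the alias row_count is just an illustrative name.

```sql
-- List the partitions that now exist on the table
show partitions student;

-- Count the rows per partition to check that the data was written where expected
select month, count(*) as row_count
from student
group by month;
```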

1.3 Method 3: Creating a table and loading data from a query (As Select)

hive(default)> create table if not exists student3 as select id, name from student;

Note: an external table cannot be created this way.
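
A table created with as select does not take its storage settings from the source table, so the delimiter has to be stated explicitly if a tab-delimited text table is wanted. The following is a sketch under that assumption; student_ctas is a hypothetical table name.

```sql
-- student_ctas is a hypothetical name; the row format is set explicitly
create table if not exists student_ctas
row format delimited fields terminated by '\t'
stored as textfile
as select id, name from student;

-- The "Table Type" field in the output should read MANAGED_TABLE
describe formatted student_ctas;
```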

1.4 Method 4: Specifying the data path with location when creating a table

hive(default)> create table if not exists student3(id int, name string) row format delimited fields terminated by '\t'
location '/user/hive/warehouse/student3';
-- Upload the data to HDFS
hive(default)> dfs -put /opt/module/datas/student.txt  /user/hive/warehouse/student3;
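
Because the table is not partitioned, files placed under its location should become queryable without any further step. A quick check, run from the same Hive session and using only the paths shown above:

```sql
-- Confirm that the uploaded file is in the table directory
dfs -ls /user/hive/warehouse/student3;

-- The rows from student.txt should now be visible through the table
select * from student3 limit 10;
```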

1.5 Method 5: Importing data into a specified Hive table (Import)

Export the table first (the export directory also contains the table metadata), then import it. When both source and destination are on HDFS, these are copy-level operations.

-- Export the data of all partitions
hive(default)> export table dept_partition1 to '/user/hive/warehouse/exports/dept_partition1_all';

-- Create a new table first
hive(default)> create table dept_partition_dump(deptno int, dname string, loc string) partitioned by (month string) row format delimited fields terminated by '\t';

-- Import
hive(default)> import table dept_partition_dump from '/user/hive/warehouse/exports/dept_partition1_all';

-- Export a specific partition
hive(default)> export table dept_partition1 partition(month='201907') to '/user/hive/warehouse/exports/dept_partition1_201907';

-- Drop the partition that was just imported first; otherwise the import fails with "Partition already exists month=201907"
hive(default)> alter table dept_partition_dump drop partition(month='201907');
hive(default)> import table dept_partition_dump partition(month='201907') from '/user/hive/warehouse/exports/dept_partition1_201907';

Summary:
export table student [partition(month='201907')] to ''
import table student_dump [partition(month='201907')] from ''
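
Since the export directory carries the table metadata, import can also create the target table itself when it does not exist yet. A sketch assuming the full export written above; dept_partition_new is a hypothetical table name that must not already exist.

```sql
-- dept_partition_new is hypothetical and is created from the exported metadata
import table dept_partition_new from '/user/hive/warehouse/exports/dept_partition1_all';

-- Check that the schema and the partitions came across
describe formatted dept_partition_new;
show partitions dept_partition_new;
```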

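2. Exporting Data

2.1 Method 1: Exporting with insert

-- Export the query result to the local filesystem. Fields are separated by Hive's default delimiter, Ctrl-A ('\001'), which is invisible in most viewers, so the values appear to run together.
hive(default)> insert overwrite local directory '/usr/local/hive-3.1.2/test_files/dept_partition1' select * from dept_partition1;

-- Export the query result to the local filesystem with formatting: fields separated by "\t"
hive(default)> insert overwrite local directory '/usr/local/hive-3.1.2/test_files/dept_partition1_tab' 
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' select * from dept_partition1;

-- Export the query result to HDFS (no local keyword)
hive(default)> insert overwrite directory '/user/hive/warehouse/exports/dept_partition1' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' select * from dept_partition1;
-- Note: although source and destination are both on HDFS, this is not a copy operation; the query is executed and its result is written out as new files.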
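
The same statement can export only part of a table, since the source is an ordinary query. A sketch assuming the dept_partition1 columns match those of dept_partition_dump (deptno, dname, loc); the target directory dept_partition1_csv is a hypothetical path.

```sql
-- Export three columns of one partition as comma-separated text.
-- The target directory is an assumed path; overwrite replaces whatever it already contains.
insert overwrite local directory '/usr/local/hive-3.1.2/test_files/dept_partition1_csv'
row format delimited fields terminated by ','
select deptno, dname, loc
from dept_partition1
where month = '201907';
```

Because overwrite replaces the contents of the target directory, it is safer to point the export at a dedicated directory rather than one that holds other files.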

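2.2 Method 2: Exporting to the local filesystem with Hadoop commands

-- Running dfs commands inside the Hive CLI is usually much faster than running them from the shell, because no new JVM has to be started
hive(default)> dfs -get /user/hive/warehouse/during.db/dept_partition_dump/month=201907/000000_0  /usr/local/hive-3.1.2/test_files/dept_partition_dump_201907.txt;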

2.3 Method 3: Exporting with Hive shell commands

Basic syntax: hive -e 'statement' > file, or hive -f script > file

bin/hive -e 'select * from student;' > /usr/local/hive-3.1.2/test_files/student4.txt
bin/hive -f /opt/module/datas/hivef.sql > /opt/module/datas/hive_result.txt

2.4 Method 4: Exporting to HDFS with Export

hive(default)> export table student to '/user/hive/warehouse/export/student';

2.5 Method 5: Exporting with Sqoop

Use Sqoop to export data from Hive to MySQL.

sqoop export \
--connect "jdbc:mysql://topnpl200:3306/topdb_dev?characterEncoding=utf-8" \
--username root \
--password TOPtop123456 \
--export-dir /user/hive/warehouse/during.db/stu \
--table stu_hive \
--num-mappers 1 \
--input-fields-terminated-by "\t"

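Note: when exporting to MySQL, the target table is not created automatically if it does not exist, so create it in MySQL before running the export.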