Hive DDL Examples

Reposted

墨守成规de网工 2024-09-06 14:26:04


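1. Data Import

1) Loading data into a table (Load)

(1) Syntax

hive> load data [local] inpath 'path/to/data' [overwrite] into table student [partition (partcol1=val1,…)];

(1) load data: loads data into a table

(2) local: load from the local Linux file system into the Hive table; without it, the data is loaded from HDFS

(3) inpath: the path of the data to load

(4) overwrite: overwrite the data already in the table (every existing file is deleted, however many there are); without it, the data is appended

(5) into table: names the table to load into

(6) student: the target table itself

(7) partition: load into the specified partition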
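The partition clause is the only part of the syntax that the hands-on cases do not exercise, so here is a minimal sketch of it. The table name student_part, the partition column dt and the value '2024-09-06' are assumptions made up for illustration; the data file is the same student.txt used in the examples.

```sql
-- Hypothetical partitioned table (student_part and dt are illustrative names).
create table if not exists student_part(id int, name string)
partitioned by (dt string)
row format delimited fields terminated by '\t';

-- Load the local file into one partition; overwrite replaces only that partition.
load data local inpath '/opt/module/hive/datas/student.txt'
overwrite into table student_part partition (dt='2024-09-06');

-- Check that the partition exists and holds the rows.
show partitions student_part;
select * from student_part where dt = '2024-09-06';
```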

(2) Hands-on examples

(0) Create a table

hive (default)> create table student(id int, name string) row format delimited fields terminated by '\t';

(1) Load a local file into Hive (the file is copied)

hive (default)> load data local inpath '/opt/module/hive/datas/student.txt' into table student;

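(2) Load an HDFS file into Hive (the file in HDFS is moved, not copied)

Upload the file to HDFS

hive (default)> dfs -put /opt/module/hive/datas/student.txt /user/atguigu;

Load the data from HDFS; once the load completes, check in HDFS whether the source file still exists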

hive (default)> load data inpath '/user/atguigu/student.txt' into table student;

(3) Load data and overwrite the data already in the table

Upload the file to HDFS

hive (default)> dfs -put /opt/module/hive/datas/student.txt /user/atguigu;

Load the data, overwriting the data already in the table

hive (default)> load data inpath '/user/atguigu/student.txt' overwrite into table student;

2) Inserting data into a table with a query (Insert)

(1) Create a table

hive (default)> create table student2(id int, name string) row format delimited fields terminated by '\t';

(2) Basic insert

hive (default)> insert into table student2 values(1,'wangwu'),(2,'zhaoliu');

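(3) Insert data from a query result

hive (default)> insert overwrite table student2 select id, name from student where id < 1006;

insert into: appends the data to the table or partition; existing data is not deleted

insert overwrite: replaces the data already in the table (or in the targeted partition)

Note: in older Hive versions, insert cannot target only a subset of columns (Hive 1.2.0 and later accept a column list after the table name). Also, when insert is followed by a select statement, do not put as before the select; adding as raises an error. Do not confuse this with create table ... as select, described next.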
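The difference between the two modes is easiest to see by counting rows after each statement. This is a minimal sketch that reuses the student and student2 tables from above; the inserted row (3, 'sunqi') is made-up sample data.

```sql
-- Append: the rows already in student2 are kept.
insert into table student2 values (3, 'sunqi');
select count(*) from student2;

-- Overwrite: everything in student2 is replaced by the query result.
-- Note there is no "as" between the table name and select.
insert overwrite table student2
select id, name from student where id < 1006;
select count(*) from student2;
```

After the second statement the count should equal the number of matching rows in student, regardless of what student2 held before.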

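3) Creating a table and loading data from a query (As Select)

Create a table from a query result (the rows returned by the query are written into the new table)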

create table if not exists student3
as select id, name from student;
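A table created with as select takes its column names and types from the query, but it does not copy the source table's row format; unless told otherwise it uses Hive's default delimiter. If the new table should be tab-delimited like student, the format can be stated in the same statement. The sketch below assumes a hypothetical table name, student4, that is not used elsewhere in this article.

```sql
-- student4 is an illustrative name; the column list comes from the select.
create table if not exists student4
row format delimited fields terminated by '\t'
as select id, name from student;

-- Inspect the resulting columns and storage settings.
describe formatted student4;
```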

4) Specifying the data path with Location when creating a table

(1) Upload the data to HDFS

[atguigu@hadoop102 datas]$ hadoop fs -mkdir -p /stu3
[atguigu@hadoop102 datas]$ hadoop fs -put stu3.txt /stu3

(2) Create the table and point it at the HDFS location

hive (default)> create external table if not exists student5(
              id int, name string
              )
              row format delimited fields terminated by '\t'
              location '/stu3';

(3) Query the data

hive (default)> select * from student5;
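If the query returns no rows, the usual cause is that the table's location does not match the directory the file was uploaded to. A quick check, sketched here with standard Hive statements, is to read the table metadata and compare its Location field with the upload directory.

```sql
-- Look at the "Location" and "Table Type" fields in the output;
-- Table Type should read EXTERNAL_TABLE for a table created as external.
describe formatted student5;
```

Because the table is external, dropping it removes only the metadata and leaves the files in HDFS in place.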

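5) Importing data into a specified Hive table (Import)

Note: export the table first, then import the exported data. Because the export includes the table's metadata, the target table of the import must not already exist (at the very least it must not already contain data), otherwise the import fails. The student2 table created earlier therefore has to be dropped before running the import below, or a different target name has to be used.

hive (default)> export table default.student to
 '/user/hive/warehouse/export/student';

hive (default)> import table student2 from
 '/user/hive/warehouse/export/student';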
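To avoid clashing with a table that already holds data, the import can target a fresh name. The following sketch assumes a hypothetical table name, student_imported, and the export directory used above.

```sql
-- student_imported is an illustrative name for a table that does not exist yet.
import table student_imported from '/user/hive/warehouse/export/student';

-- The table is created from the exported metadata and filled with the exported data.
select * from student_imported;

-- The export directory holds the metadata alongside the data files.
dfs -ls /user/hive/warehouse/export/student;
```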

2. Data Export

1) Insert export

(1) Export query results to the local file system

hive (default)> insert overwrite local directory '/opt/module/hive/datas/export/student'
            select * from student;

(2) Export query results to the local file system with formatting

hive (default)> insert overwrite local directory '/opt/module/hive/datas/export/student1'
            ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
            select * from student;

(3) Export query results to HDFS (no local keyword)

hive (default)> insert overwrite directory '/user/atguigu/student2'
             ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' 
             select * from student;

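Note: with insert export you do not need to create the target directory in advance; Hive creates it automatically. Because the statement uses overwrite, however, always give a specific, dedicated export path, otherwise you may well delete existing data by mistake. Do not be careless at this step.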
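As a further illustration of both points, the sketch below exports a filtered, comma-delimited result into its own directory. The directory name student_csv is an assumption chosen for this example; because of overwrite, anything already in that directory is removed, which is why it should be used for nothing else.

```sql
-- student_csv is a hypothetical, dedicated directory; Hive creates it if missing
-- and clears its previous contents because of "overwrite".
insert overwrite local directory '/opt/module/hive/datas/export/student_csv'
row format delimited fields terminated by ','
select id, name from student where id < 1006;
```

Without the row format clause, as in case (1), the fields are separated by Hive's default delimiter, the non-printing character \001, which is hard to read in a text editor.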

2) Export to the local file system with a Hadoop command

hive (default)> dfs -get /user/hive/warehouse/student/student.txt /opt/module/hive/datas/export/student3.txt;

3) Export with a Hive shell command

Basic syntax: (hive -f/-e statement-or-script > file)

[atguigu@hadoop102 hive]$ bin/hive -e 'select * from default.student;' > /opt/module/hive/datas/export/student4.txt

4) Export to HDFS with Export

hive (default)> export table default.student to
 '/user/hive/warehouse/export/student';

export and import are mainly used to migrate Hive tables between two Hadoop clusters; they cannot export directly to the local file system.

