Deploying Impala SQL date parameters with Python: querying Impala from Python

Reposted

游侠小影 2023-12-06 07:00:24


Impala Operation Commands

I. Impala's external shell

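| Option | Description |
| --- | --- |
| -h, --help | Show help information. |
| -v, --version | Show version information. |
| -i hostname, --impalad=hostname | Connect to the host running the impalad daemon. The default port is 21000. |
| -q query, --query=query | Pass a query from the command line. The shell exits immediately after executing the statement. |
| -f query_file, --query_file=query_file | Run the SQL queries in a file. Statements in the file must be separated by semicolons. |
| -o filename, --output_file filename | Save all query results to the specified file, overwriting its existing content. Typically used to save the result of a single query run with -q. |
| -c | Continue executing when a query fails. |
| -d default_db, --database=default_db | Database to use on startup, equivalent to running a use statement after connecting. If omitted, the default database is used. |
| -r, --refresh_after_connect | Refresh Impala metadata after connecting. |
| -p, --show_profiles | Show the query execution plan for every query executed in the shell. |
| -B, --delimited | Print unformatted output (drops the table border around query results). |
| --output_delimiter=character | Set the field delimiter used with -B. |
| --print_header | Print column names. |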
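The options above can also be assembled from Python before handing them to a subprocess. The following is a minimal sketch, assuming the flags behave as described in the table; build_impala_shell_args and its parameter names are hypothetical helpers for illustration, not part of impala-shell.

```python
from typing import List, Optional


def build_impala_shell_args(
    host: Optional[str] = None,
    query: Optional[str] = None,
    query_file: Optional[str] = None,
    output_file: Optional[str] = None,
    database: Optional[str] = None,
    keep_going: bool = False,
    delimited: bool = False,
    delimiter: Optional[str] = None,
    print_header: bool = False,
) -> List[str]:
    """Return an argv list for impala-shell using only options from the table."""
    # Restriction of this helper (an assumption, not an impala-shell rule):
    # take the SQL either from -q or from -f, never both.
    if query is not None and query_file is not None:
        raise ValueError("pass either query (-q) or query_file (-f), not both")
    args = ["impala-shell"]
    if host:
        args += ["-i", host]
    if database:
        args += ["-d", database]
    if keep_going:
        args.append("-c")
    if query is not None:
        args += ["-q", query]
    if query_file is not None:
        args += ["-f", query_file]
    if output_file:
        args += ["-o", output_file]
    if delimited:
        args.append("-B")
        # The delimiter and header flags only matter for delimited output.
        if delimiter is not None:
            args.append("--output_delimiter=" + delimiter)
        if print_header:
            args.append("--print_header")
    return args
```

On a machine where impala-shell is installed, the returned list could be passed to subprocess.run(args, check=True). Building a list instead of a single string avoids shell quoting problems around the SQL text.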

1. Connect to the impalad host bigdata12 (-i)

[root@bigdata11 datas]# impala-shell -i bigdata12

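2. Query a table and write the results to a file (-q, -o)

The semicolon inside the quotes is optional, because -q accepts only a single statement.

[root@cdh2 ~]# impala-shell -q "select * from student"
[hdfs@bigdata12 ~]$ impala-shell -q 'select * from student' -o output.txt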
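When the query passed with -q depends on a date supplied from Python, the value should be validated before it is placed in the SQL text. The sketch below is one possible approach, not a definitive one: the table student_log and its column dt are hypothetical (the article's student table is only shown with an id and a name), and dated_query_args is an illustrative name.

```python
from datetime import date
from typing import List


def dated_query_args(day: str, output_file: str) -> List[str]:
    """Build an impala-shell -q command that filters on one validated date."""
    # fromisoformat raises ValueError for anything that is not a real
    # calendar date, so only a normalized YYYY-MM-DD string reaches the SQL.
    checked = date.fromisoformat(day).isoformat()
    # Hypothetical table and column names, used only for illustration.
    query = "select * from student_log where dt = '" + checked + "'"
    return ["impala-shell", "-q", query, "-o", output_file]
```

Because the date is parsed and re-serialized, stray quotes or extra SQL in the input are rejected before the command is built.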

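3. Continue executing after a query fails (-c)

The file impalasql contains three statements; the second one is invalid: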

select * from student;
select * from 111;
select id from student;

[root@cdh2 ~]$ impala-shell -c -f impalasql;

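Combining -f, -c, and -o:

[root@cdh2 ~]# impala-shell -c -f /root/impalasql -o /root/impalasqlout

The results produced by running the file that contains the faulty query are saved to the specified file, which can then be inspected: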

[root@cdh2 ~]# vi impalasqlout
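The same batch run can be prepared from Python. This is a minimal sketch under the assumption that the statements are held in a Python list; write_query_file and batch_args are hypothetical helper names, and the demo only prints the command instead of running impala-shell.

```python
import os
import tempfile
from typing import Iterable, List


def write_query_file(statements: Iterable[str], path: str) -> None:
    """Write one statement per line, each terminated by exactly one semicolon."""
    with open(path, "w", encoding="utf-8") as handle:
        for statement in statements:
            handle.write(statement.strip().rstrip(";") + ";\n")


def batch_args(query_file: str, output_file: str) -> List[str]:
    """-c keeps going after a failed statement; -o collects the results."""
    return ["impala-shell", "-c", "-f", query_file, "-o", output_file]


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    sql_path = os.path.join(workdir, "impalasql")
    write_query_file(
        ["select * from student", "select * from 111", "select id from student"],
        sql_path,
    )
    # Print the command; pass it to subprocess.run where impala-shell exists.
    print(batch_args(sql_path, os.path.join(workdir, "impalasqlout")))
```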

4. Execute a file (-f)

1) First create a file with the following contents:

select * from student;
select id from student;

2) Execute the file with -f:

[root@cdh2 ~]# impala-shell -f /root/impalasql

5. Unformatted output (-B):

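[root@cdh2 ~]# impala-shell -c -f /root/impalasql -o /root/impalasqlout -B
[root@cdh2 ~]# vi impalasqlout

Comparing this with the earlier file shows that the output file is overwritten rather than appended to.

Another example:

[root@bigdata12 ~]# impala-shell -q 'select * from student' -B --output_delimiter="\t" -o output.txt

Note: output.txt is a path relative to the current directory on the local Linux filesystem, and the file is overwritten on each run.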

[root@bigdata12 ~]# cat output.txt
1001    tignitgn
1002    yuanyuan
1003    haohao
1004    yunyun
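Tab-delimited output like the file above is straightforward to load back into Python. A minimal sketch, assuming two columns (an integer id and a name) as in the sample; parse_delimited is a hypothetical helper name.

```python
import csv
import io
from typing import List, Tuple


def parse_delimited(text: str, delimiter: str = "\t") -> List[Tuple[int, str]]:
    """Parse two-column -B output (id, name) into typed tuples."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Skip blank lines; convert the first column to int.
    return [(int(row[0]), row[1]) for row in reader if row]
```

To read the real file, pass the contents of open("output.txt", newline="", encoding="utf-8").read() to the function.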

6. Set a delimiter with --output_delimiter=

[root@cdh2 ~]# impala-shell -c -f /root/impalasql -o /root/impalasqlout -B --output_delimiter=,

Here a comma separates the columns of each row.
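If --print_header is added to a comma-delimited run of a single query, the first line of the output holds the column names and each row can be read as a dictionary. The sketch below assumes that layout and assumes the columns are named id and name; parse_with_header is a hypothetical helper name.

```python
import csv
import io
from typing import Dict, List


def parse_with_header(text: str, delimiter: str = ",") -> List[Dict[str, str]]:
    """Read delimited output whose first line is the column header."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # Values stay as strings; convert types where the caller needs them.
    return [dict(row) for row in reader]
```

Output that mixes the results of several queries, as a file run with -f may produce, would need to be split per query before being parsed this way.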

7. Refresh metadata after creating a table in Hive (-r)

hive> create table stu(id int, name string);
[bigdata12:21000] > show tables;
Query: show tables
+---------+
| name    |
+---------+
| student |
+---------+
[hdfs@bigdata12 ~]$ impala-shell -r
[bigdata12:21000] > show tables;
Query: show tables
+---------+
| name    |
+---------+
| stu     |
| student |
+---------+

8. Show the query execution plan (-p)

[hdfs@bigdata12 ~]$ impala-shell -p
[bigdata12:21000] > select * from student;

II. Impala's internal shell

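| Command | Description |
| --- | --- |
| help | Show help information. |
| explain | Show the execution plan. |
| profile | Run after a query completes: show low-level information about the most recent query. |
| shell | Run a shell command without leaving impala-shell. |
| version | Show version information (same as impala-shell -v). |
| connect | Connect to an impalad host, default port 21000 (same as impala-shell -i). |
| refresh | Incrementally refresh metadata (refreshes the specified table). |
| invalidate metadata | Fully refresh metadata (use with caution: refreshes all tables; same as impala-shell -r). |
| history | Show command history. |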

1. View the execution plan

explain select * from student;

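2. Show low-level information about the most recent query

[bigdata12:21000] > select count(*) from student;
[bigdata12:21000] > profile;

3. Browse HDFS and the local Linux filesystem

[bigdata12:21000] > shell hadoop fs -ls /;
[bigdata12:21000] > shell ls -al ./;

4. Refresh the metadata of a specific table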

hive> load data local inpath '/opt/module/datas/student.txt' into table student;
[bigdata12:21000] > select * from student;
[bigdata12:21000] > refresh student;
[bigdata12:21000] > select * from student;
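A script that loads data through Hive may want to issue the matching refresh non-interactively. The following is a minimal sketch, assuming the statement is sent with -q; metadata_refresh_args is a hypothetical helper, and its identifier check is a deliberately conservative rule of this sketch, not Impala's own naming rule.

```python
import re
from typing import List, Optional

# Conservative check used by this sketch only: name or db.name,
# made of letters, digits, and underscores.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def metadata_refresh_args(table: Optional[str] = None) -> List[str]:
    """Refresh one table, or invalidate all metadata when no table is given."""
    if table is None:
        # Full refresh of every table: use with caution.
        statement = "invalidate metadata"
    else:
        if not _IDENTIFIER.match(table):
            raise ValueError("unexpected table name: " + repr(table))
        statement = "refresh " + table
    return ["impala-shell", "-q", statement]
```

Preferring the per-table form keeps the cost low, in line with the caution attached to invalidate metadata above.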

5. View command history