Contents
- Creating tables
- Creating tables with CTAS
- Creating tables with the LIKE clause
- Viewing tables
- Viewing tables with SHOW
- Listing tables in a database
- Viewing detailed table information
- Viewing table columns
- Viewing the CREATE TABLE statement
- Viewing table properties
- Viewing table information with DESCRIBE
- Cleaning up tables
- DROP TABLE
- TRUNCATE TABLE
- Altering tables
- Renaming a table
- Modifying table properties
- Modifying a table's SerDe properties
- Modifying a table's file format
- Modifying a table's storage location
- Merging small files in a table
- Changing a column's name, type, position, or comment
- Adding/replacing columns
Creating tables
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name -- (Note: TEMPORARY available in Hive 0.14.0 and later)
[(col_name data_type [COMMENT col_comment], ... [constraint_specification])]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
[SKEWED BY (col_name, col_name, ...) -- (Note: Available in Hive 0.10.0 and later)]
ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
[STORED AS DIRECTORIES]
[
[ROW FORMAT row_format]
[STORED AS file_format]
| STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later)
]
[LOCATION hdfs_path]
[TBLPROPERTIES (property_name=property_value, ...)] -- (Note: Available in Hive 0.6.0 and later)
[AS select_statement]; -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)
CREATE TABLE creates a table with the specified name. If a table with the same name already exists, an error is raised; the IF NOT EXISTS option can be used to skip the error. Table and column names are case-insensitive, while SerDe and property names are case-sensitive.
EXTERNAL creates an external table; the default is an internal table (also called a managed table). When creating an external table, a path to the actual data (LOCATION) must be specified at the same time. When Hive creates an internal table, it moves the data into the path the warehouse points to; when it creates an external table, it only records where the data resides and does not change the data's location at all. When a table is dropped, an internal table's metadata and data are deleted together, whereas for an external table only the metadata is deleted and the data is kept.
TEMPORARY creates a temporary table. A temporary table is visible only to the current user session and is automatically dropped when the session ends; temporary tables support neither partitions nor indexes.
The COMMENT clause attaches a comment to a column or to the table; the comment text is enclosed in single quotes. Comments have no effect on computation over the table, but for the sake of later maintenance, real-world development should always include a COMMENT.
PARTITIONED BY specifies partitions for the table and determines whether it is a partitioned table. A partition in Hive effectively adds an extra column to the table, namely the partition name, so that when operating on the table's data you can filter by the partition column.
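As a hedged illustration of this clause (the table and partition names here are hypothetical, not part of the examples below):

```sql
-- Sketch: a table partitioned by date; dt behaves like an extra column
CREATE TABLE IF NOT EXISTS employee_partitioned (
    name STRING,
    gender_age STRUCT<gender:STRING,age:INT>
)
PARTITIONED BY (dt STRING COMMENT 'partition date')
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;

-- Queries can then prune data by filtering on the partition column:
-- SELECT name FROM employee_partitioned WHERE dt = '2019-09-25';
```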
[ROW FORMAT DELIMITED FIELDS TERMINATED BY ','] specifies the field delimiter of the table. The default is \001; here a comma is specified, and delimiters for other parts of a row can be specified as well.
STORED AS SEQUENCEFILE|TEXTFILE|RCFILE: if the data files are plain text, STORED AS TEXTFILE can be used; if the data needs to be compressed, use STORED AS SEQUENCEFILE.
CLUSTERED BY: for each table or partition, Hive can further organize the data into buckets, which are a finer-grained division of the data range. Bucketing is organized on a particular column: Hive hashes the column value and takes the remainder modulo the number of buckets to decide which bucket a record is stored in.
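A minimal sketch of a bucketed table (the names here are hypothetical; hash(name) mod 4 decides which bucket a row lands in):

```sql
-- Sketch: rows are hashed on the name column into 4 buckets,
-- each bucket stored as a separate file, sorted by name within the bucket
CREATE TABLE IF NOT EXISTS employee_bucketed (
    name STRING,
    gender_age STRUCT<gender:STRING,age:INT>
)
CLUSTERED BY (name) SORTED BY (name ASC) INTO 4 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
```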
LOCATION specifies the table's storage path on HDFS. For an internal table the default path is used if none is specified, and a specified path that does not exist is created automatically; for an external table a path must be given explicitly.
TBLPROPERTIES specifies the table's properties. Some predefined properties, such as last_modified_user and last_modified_time, are added automatically when the table is created.
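A short sketch of setting custom properties at creation time (the property keys here are made up for illustration):

```sql
-- Sketch: attach arbitrary key/value metadata to the table
CREATE TABLE IF NOT EXISTS employee_props (
    name STRING
)
TBLPROPERTIES ('created.by'='hadoop', 'purpose'='demo');
```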
Create a sample data file
$ vim /home/hadoop/employee.txt
Michael|Montreal,Toronto|Male,30|DB:80|Product:Developer^DLead
Will|Montreal|Male,35|Perl:85|Product:Lead,Test:Lead
Shelley|New York|Female,27|Python:80|Test:Lead,COE:Architect
Lucy|Vancouver|Female,57|Sales:89,HR:94|Sales:Lead
Example: create an internal table
> CREATE TABLE IF NOT EXISTS employee_internal (
name STRING COMMENT 'this is optional column comments',
work_place ARRAY<STRING>, -- column names are case-insensitive
gender_age STRUCT<gender:STRING,age:INT>,
skills_score MAP<STRING,INT>,
depart_title MAP<STRING,ARRAY<STRING>> -- no trailing "," after the last column
)
COMMENT 'This is an internal table' -- table comment
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|' -- field delimiter
COLLECTION ITEMS TERMINATED BY ',' -- delimiter between collection items
MAP KEYS TERMINATED BY ':' -- delimiter between MAP keys and values
STORED AS TEXTFILE; -- file format of the table
Load the data
> LOAD DATA INPATH '/tmp/hive/employee.txt'
OVERWRITE INTO TABLE employee_internal;
Example: create an external table
> CREATE EXTERNAL TABLE employee_external ( -- the EXTERNAL keyword is used
name string,
work_place ARRAY<string>,
gender_age STRUCT<gender:string,age:int>,
skills_score MAP<string,int>,
depart_title MAP<STRING,ARRAY<STRING>>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':'
STORED as TEXTFILE
LOCATION '/tmp/employee'; -- path to the data directory
Load the data
> LOAD DATA INPATH '/tmp/employee.txt'
OVERWRITE INTO TABLE employee_external;
Example: create a temporary table
> CREATE TEMPORARY TABLE IF NOT EXISTS tmp_emp1 ( -- the TEMPORARY keyword is used
name string,
work_place ARRAY<string>,
gender_age STRUCT<gender:string,age:int>,
skills_score MAP<string,int>,
depart_title MAP<STRING,ARRAY<STRING>>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':'
STORED as TEXTFILE;
Load the data
> LOAD DATA INPATH '/tmp/employee.txt'
OVERWRITE INTO TABLE tmp_emp1;
Creating tables with CTAS
Besides the statements above, Hive can also create tables with CTAS (Create-Table-As-Select). A table created with CTAS is not visible to other users until the query results have been fully produced. CTAS has a few restrictions:
- The created table cannot be a partitioned table
- The created table cannot be an external table
- The created table cannot be a bucketed table
A CTAS statement usually triggers a yarn job, even though a plain SELECT * statement by itself does not trigger any yarn job.
Example: creating a table with CTAS
> CREATE TEMPORARY TABLE tmp_emp2 as SELECT * FROM tmp_emp1;
The table tmp_emp2 above contains data as soon as it is created. To create an empty table instead, a statement like the following can be used:
> CREATE TEMPORARY TABLE tmp_emp3 as SELECT * FROM tmp_emp1 where 1=2; -- the WHERE condition is false
CTAS can also be used together with a CTE (Common Table Expression). A CTE is a temporary result set derived from a simple SELECT query specified in a WITH clause, followed by the SELECT or INSERT statement that consumes the result set. A CTE is defined only within the execution scope of a single statement. One or more CTEs can be nested or chained with keywords such as SELECT, INSERT, CREATE TABLE AS SELECT, or CREATE VIEW AS SELECT. Compared with writing complex nested queries, CTEs make HQL queries clearer and more concise.
Example
> CREATE TABLE cte_employee as
WITH r1 as (
SELECT name FROM r2 WHERE name = 'Michael'
),
r2 as (
SELECT name FROM employee_external WHERE gender_age.gender= 'Male'
),
r3 as (
SELECT name FROM employee_external WHERE gender_age.gender= 'Female'
)
SELECT * FROM r1
UNION ALL
SELECT * FROM r3;
> select * from cte_employee;
+--------------------+
| cte_employee.name |
+--------------------+
| Michael |
| Shelley |
| Lucy |
+--------------------+
Creating tables with the LIKE clause
A table can also be created with the LIKE clause, which copies the structure of an existing table into a new table:
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
LIKE existing_table_or_view_name
[LOCATION hdfs_path];
Example
> CREATE TEMPORARY TABLE tmp_emp4 like tmp_emp1;
Viewing tables
Viewing tables with SHOW
Listing tables in a database
List the tables in the current database, or in the database specified with the IN clause:
SHOW TABLES [IN database_name] ['identifier_with_wildcards'];
Example
> show tables;
> show tables in default;
> show tables like "tmp*";
> show tables like "tmp*|emp*";
Viewing detailed table information
SHOW TABLE EXTENDED [IN|FROM database_name] LIKE 'identifier_with_wildcards' [PARTITION(partition_spec)];
Example
> SHOW TABLE EXTENDED LIKE tmp_emp1;
+----------------------------------------------------+
| tab_name |
+----------------------------------------------------+
| tableName:tmp_emp1 |
| owner:hadoop |
| location:hdfs://ns001/user/hive/tmp/hadoop/33874d02-b862-4099-83c0-ec1c2b19479f/_tmp_space.db/a64106b0-2d37-4705-804c-3ea305f4dad2 |
| inputformat:org.apache.hadoop.mapred.TextInputFormat |
| outputformat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |
| columns:struct columns { string name, list<string> work_place, struct<gender:string,age:i32> gender_age, map<string,i32> skills_score, map<string,list<string>> depart_title} |
| partitioned:false |
| partitionColumns: |
| totalNumberFiles:1 |
| totalFileSize:228 |
| maxFileSize:228 |
| minFileSize:228 |
| lastAccessTime:1569397077750 |
| lastUpdateTime:1569397092995 |
| |
+----------------------------------------------------+
Viewing table columns
SHOW COLUMNS (FROM|IN) table_name [(FROM|IN) db_name];
SHOW COLUMNS (FROM|IN) table_name [(FROM|IN) db_name] [ LIKE 'pattern_with_wildcards']; --Added in Hive 3.0 by HIVE-18373.
Example
> SHOW COLUMNS IN employee_internal;
+---------------+
| field |
+---------------+
| depart_title |
| gender_age |
| name |
| skills_score |
| work_place |
+---------------+
Viewing the CREATE TABLE statement
Example
> SHOW CREATE TABLE employee_internal;
+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
| CREATE TABLE `employee_internal`( |
| `name` string COMMENT 'this is optional column comments', |
| `work_place` array<string>, |
| `gender_age` struct<gender:string,age:int>, |
| `skills_score` map<string,int>, |
| `depart_title` map<string,array<string>>) |
| COMMENT 'This is an internal table' |
| ROW FORMAT SERDE |
| 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' |
| WITH SERDEPROPERTIES ( |
| 'collection.delim'=',', |
| 'field.delim'='|', |
| 'mapkey.delim'=':', |
| 'serialization.format'='|') |
| STORED AS INPUTFORMAT |
| 'org.apache.hadoop.mapred.TextInputFormat' |
| OUTPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
| LOCATION |
| 'hdfs://ns001/tmp/hive/employee_internal' |
| TBLPROPERTIES ( |
| 'bucketing_version'='2', |
| 'transient_lastDdlTime'='1569395848') |
+----------------------------------------------------+
Viewing table properties
SHOW TBLPROPERTIES tblname;
SHOW TBLPROPERTIES tblname("foo");
Example
> SHOW TBLPROPERTIES employee_internal;
+------------------------+----------------------------+
| prpt_name | prpt_value |
+------------------------+----------------------------+
| bucketing_version | 2 |
| comment | This is an internal table |
| numFiles | 1 |
| numRows | 0 |
| rawDataSize | 0 |
| totalSize | 228 |
| transient_lastDdlTime | 1569395848 |
+------------------------+----------------------------+
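The second syntax form above returns a single named property; against the table above it might be used as follows (a sketch, not run here):

```sql
-- Return only the value of the 'comment' property
SHOW TBLPROPERTIES employee_internal('comment');
```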
Viewing table information with DESCRIBE
In Hive 1.x.x and 0.x.x there are two syntax forms for describing a table/view/column; the main difference is whether the database is specified.
If no database is specified, the optional column information is given after a dot (.):
DESCRIBE [EXTENDED|FORMATTED]
table_name[.col_name ( [.field_name] | [.'$elem$'] | [.'$key$'] | [.'$value$'] )* ];
If a database is specified, the optional column information is given after a space:
DESCRIBE [EXTENDED|FORMATTED]
[db_name.]table_name[ col_name ( [.field_name] | [.'$elem$'] | [.'$key$'] | [.'$value$'] )* ];
Since Hive 2.0 the DESCRIBE syntax has changed, and the change is backward-incompatible:
-- separating the column name from the table name with a dot (DOT) is no longer supported
-- the partition specification now comes after the table name and before the column; in earlier versions the column information came between the table name and the partition specification
DESCRIBE [EXTENDED | FORMATTED]
[db_name.]table_name [PARTITION partition_spec] [col_name ( [.field_name] | [.'$elem$'] | [.'$key$'] | [.'$value$'] )* ];
The DESCRIBE statement shows all columns of the given table, including the partition columns. With the EXTENDED keyword it displays the table's metadata in Thrift-serialized form, which is usually only useful for debugging, not for general use. With the FORMATTED keyword it displays the metadata in tabular form.
Example
> DESCRIBE employee_internal;
+---------------+--------------------------------+----------------------------------+
| col_name | data_type | comment |
+---------------+--------------------------------+----------------------------------+
| name | string | this is optional column comments |
| work_place | array<string> | |
| gender_age | struct<gender:string,age:int> | |
| skills_score | map<string,int> | |
| depart_title | map<string,array<string>> | |
+---------------+--------------------------------+----------------------------------+
> DESCRIBE EXTENDED employee_internal;
> DESCRIBE FORMATTED employee_internal;
For columns of complex types, the pseudo-elements '$elem$' (for array elements), '$key$' (for map keys), and '$value$' (for map values) can be used to inspect the column's components. This syntax can be applied recursively to dissect even more complex column types.
Example
> DESCRIBE employee_internal skills_score.$key$;
+-----------+------------+--------------------+
| col_name | data_type | comment |
+-----------+------------+--------------------+
| $key$ | string | from deserializer |
+-----------+------------+--------------------+
> DESCRIBE employee_internal skills_score.$value$;
+-----------+------------+--------------------+
| col_name | data_type | comment |
+-----------+------------+--------------------+
| $value$ | int | from deserializer |
+-----------+------------+--------------------+
> DESCRIBE employee_internal gender_age.gender;
+-----------+------------+--------------------+
| col_name | data_type | comment |
+-----------+------------+--------------------+
| gender | string | from deserializer |
+-----------+------------+--------------------+
> DESCRIBE employee_internal work_place.$elem$;
+-----------+------------+--------------------+
| col_name | data_type | comment |
+-----------+------------+--------------------+
| $elem$ | string | from deserializer |
+-----------+------------+--------------------+
Cleaning up tables
There are two ways to clean up a table: DROP TABLE and TRUNCATE TABLE.
DROP TABLE
For an internal table, DROP TABLE deletes both the table's metadata and its data; the deleted data is normally moved to .Trash in the current user's home directory (if the trash is configured). For an external table, DROP TABLE deletes only the metadata and keeps the data.
DROP TABLE [IF EXISTS] table_name [PURGE]; -- (Note: PURGE available in Hive 0.14.0 and later)
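As a sketch of both variants (tmp_emp4 was created earlier with the LIKE clause; PURGE bypasses the trash, so the data cannot be recovered):

```sql
-- Move the table's data to the trash (if configured), then delete metadata
DROP TABLE IF EXISTS tmp_emp4;

-- Delete the data immediately, bypassing the trash (Hive 0.14.0 and later)
-- DROP TABLE IF EXISTS tmp_emp4 PURGE;
```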
TRUNCATE TABLE
TRUNCATE TABLE removes all of a table's data but keeps the table itself, turning it into an empty table. TRUNCATE TABLE can only be used on internal tables.
TRUNCATE TABLE table_name [PARTITION partition_spec];
Example
> TRUNCATE TABLE cte_employee;
> SELECT name FROM cte_employee;
+--------------------+
| cte_employee.name |
+--------------------+
+--------------------+
Altering tables
Tables can be modified with ALTER: renaming the table, changing table properties, adding columns, changing column data types, and so on. Note that ALTER TABLE modifies only the table's metadata and has no effect on the table data itself; users must ensure that the actual data matches the metadata definition, otherwise queries will not return the expected results.
Renaming a table
ALTER TABLE table_name RENAME TO new_table_name;
Example
> ALTER TABLE cte_employee RENAME TO cte_employee_backup;
Modifying table properties
ALTER TABLE table_name SET TBLPROPERTIES table_properties;
table_properties:
: (property_name = property_value, property_name = property_value, ... )
Example: modify the table comment
> ALTER TABLE employee_internal SET TBLPROPERTIES ('comment' = 'Alter comment test');
> SHOW TBLPROPERTIES employee_internal;
+------------------------+---------------------+
| prpt_name | prpt_value |
+------------------------+---------------------+
| bucketing_version | 2 |
| comment | Alter comment test |
| last_modified_by | hadoop |
| last_modified_time | 1569463293 |
| numFiles | 1 |
| numRows | 0 |
| rawDataSize | 0 |
| totalSize | 228 |
| transient_lastDdlTime | 1569463293 |
+------------------------+---------------------+
Modifying a table's SerDe properties
ALTER TABLE table_name [PARTITION partition_spec] SET SERDE serde_class_name [WITH SERDEPROPERTIES serde_properties];
ALTER TABLE table_name [PARTITION partition_spec] SET SERDEPROPERTIES serde_properties;
serde_properties:
: (property_name = property_value, property_name = property_value, ... )
The statements above change the table's SerDe, or add user-defined metadata to the table's SerDe object. Both property_name and property_value must be quoted.
Example: change the table's field delimiter (irrelevant output is omitted)
> create table employee_external_copy1 like employee_external;
> show create table employee_external_copy1;
+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
...
| WITH SERDEPROPERTIES ( |
| 'collection.delim'=',', |
| 'field.delim'='|', |
| 'mapkey.delim'=':', |
| 'serialization.format'='|') |
...
+----------------------------------------------------+
> ALTER TABLE employee_external_copy1 SET SERDEPROPERTIES ('field.delim'='&');
> show create table employee_external_copy1;
+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
...
| WITH SERDEPROPERTIES ( |
| 'collection.delim'=',', |
| 'field.delim'='&', |
| 'mapkey.delim'=':', |
| 'serialization.format'='|') |
...
+----------------------------------------------------+
Modifying a table's file format
ALTER TABLE table_name [PARTITION partition_spec] SET FILEFORMAT file_format;
Example
> show create table employee_external_copy1;
+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
...
| STORED AS INPUTFORMAT |
| 'org.apache.hadoop.mapred.TextInputFormat' |
| OUTPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
...
+----------------------------------------------------+
> ALTER TABLE employee_external_copy1 SET FILEFORMAT RCFILE;
> show create table employee_external_copy1;
+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
...
| STORED AS INPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.RCFileInputFormat' |
| OUTPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.RCFileOutputFormat' |
...
+----------------------------------------------------+
Modifying a table's storage location
ALTER TABLE table_name [PARTITION partition_spec] SET LOCATION "new location";
Note: changing a table's storage location does not move any existing data files; users must make sure themselves that the data files match the table's location.
Example
> show create table employee_external_copy1;
+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
...
| LOCATION |
| 'hdfs://ns001/tmp/hive/employee_external_copy1' |
...
> ALTER TABLE employee_external_copy1 SET LOCATION '/tmp/employee_external_copy1';
> show create table employee_external_copy1;
+----------------------------------------------------+
| createtab_stmt |
+----------------------------------------------------+
...
| LOCATION |
| 'hdfs://ns001/tmp/employee_external_copy1' |
...
+----------------------------------------------------+
Merging small files in a table
For tables stored as RCFile or ORC, the CONCATENATE option can be used to merge many small files. Starting with Hive 0.8.0, RCFile supports fast block-level merging of small RCFiles via the CONCATENATE option; starting with 0.14.0, ORC supports fast stripe-level merging of small ORC files via CONCATENATE. Other file formats are not supported yet. Because RCFiles are merged at the block level and ORC files at the stripe level, the overhead of decompressing and decoding the data is avoided. Merging small files triggers a yarn job.
ALTER TABLE table_name [PARTITION (partition_key = 'partition_value' [, ...])] CONCATENATE;
Example
> ALTER TABLE employee_external_copy1 CONCATENATE;
Changing a column's name, type, position, or comment
ALTER TABLE table_name [PARTITION partition_spec] CHANGE [COLUMN] col_old_name col_new_name column_type
[COMMENT col_comment] [FIRST|AFTER column_name] [CASCADE|RESTRICT];
The CASCADE|RESTRICT clause is available in Hive 1.1.0 and later. ALTER TABLE CHANGE COLUMN with CASCADE changes the column information in the table's metadata and makes the same change in the metadata of all partitions; RESTRICT, the default, modifies only the table's own column information.
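A hedged sketch of the CASCADE variant (employee_partitioned is a hypothetical partitioned table; the tables in this article's examples are unpartitioned):

```sql
-- Rename the column in the table metadata AND in every partition's metadata
ALTER TABLE employee_partitioned
CHANGE COLUMN name full_name STRING COMMENT 'renamed everywhere' CASCADE;
```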
Example 1: rename a column, changing name to employee_name
> desc employee_external_copy1
+---------------+--------------------------------+----------+
| col_name | data_type | comment |
+---------------+--------------------------------+----------+
| name | string | |
| work_place | array<string> | |
| gender_age | struct<gender:string,age:int> | |
| skills_score | map<string,int> | |
| depart_title | map<string,array<string>> | |
+---------------+--------------------------------+----------+
> ALTER TABLE employee_external_copy1 CHANGE COLUMN name employee_name string;
> desc employee_external_copy1
+----------------+--------------------------------+----------+
| col_name | data_type | comment |
+----------------+--------------------------------+----------+
| employee_name | string | |
| work_place | array<string> | |
| gender_age | struct<gender:string,age:int> | |
| skills_score | map<string,int> | |
| depart_title | map<string,array<string>> | |
+----------------+--------------------------------+----------+
Example 2: change the column name back to the original and modify its comment
> ALTER TABLE employee_external_copy1 CHANGE COLUMN employee_name name string COMMENT 'employ name';
> desc employee_external_copy1
+---------------+--------------------------------+--------------+
| col_name | data_type | comment |
+---------------+--------------------------------+--------------+
| name | string | employ name |
| work_place | array<string> | |
| gender_age | struct<gender:string,age:int> | |
| skills_score | map<string,int> | |
| depart_title | map<string,array<string>> | |
+---------------+--------------------------------+--------------+
Example 3: change a column's position (this fails with an error because the column types are incompatible in their new positions)
> ALTER TABLE employee_external_copy1 CHANGE COLUMN name employee_name string after work_place;
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions :
work_place,employee_name (state=08S01,code=1)
Adding/replacing columns
ALTER TABLE table_name
[PARTITION partition_spec] -- (Note: Hive 0.14.0 and later)
ADD|REPLACE COLUMNS (col_name data_type [COMMENT col_comment], ...)
[CASCADE|RESTRICT] -- (Note: Hive 1.1.0 and later)
ADD COLUMNS adds new columns after the existing columns and before the partition columns.
REPLACE COLUMNS removes all existing columns and adds a new set of columns. Replacing columns is only supported for tables that use a native SerDe (DynamicSerDe, MetadataTypedColumnsetSerDe, LazySimpleSerDe, or ColumnarSerDe). REPLACE COLUMNS can also be used to drop columns.
The CASCADE|RESTRICT clause is available in Hive 1.1.0 and later. ALTER TABLE ADD|REPLACE COLUMNS with CASCADE changes the column information in the table's metadata and makes the same change in the metadata of all partitions; RESTRICT, the default, modifies only the table's own column information.
Example 1: add two columns to the table
> ALTER TABLE employee_external_copy1 ADD COLUMNS (c1 string COMMENT 'test1',c2 int COMMENT 'test2');
> DESC employee_external_copy1
+---------------+--------------------------------+--------------+
| col_name | data_type | comment |
+---------------+--------------------------------+--------------+
| name | string | employ name |
| work_place | array<string> | |
| gender_age | struct<gender:string,age:int> | |
| skills_score | map<string,int> | |
| depart_title | map<string,array<string>> | |
| c1 | string | test1 |
| c2 | int | test2 |
+---------------+--------------------------------+--------------+
Example 2: replace (drop) the table's columns
> create table employee_external_copy2 like employee_external;
> DESC employee_external_copy2
+---------------+--------------------------------+----------+
| col_name | data_type | comment |
+---------------+--------------------------------+----------+
| name | string | |
| work_place | array<string> | |
| gender_age | struct<gender:string,age:int> | |
| skills_score | map<string,int> | |
| depart_title | map<string,array<string>> | |
+---------------+--------------------------------+----------+
> ALTER TABLE employee_external_copy2 REPLACE COLUMNS (name string);
> DESC employee_external_copy2;
+-----------+------------+----------+
| col_name | data_type | comment |
+-----------+------------+----------+
| name | string | |
+-----------+------------+----------+
Note: whether you change a column's attributes (type, position, etc.) or add/replace columns, these operations only modify Hive's metadata and do not modify the actual table data. Users must therefore ensure themselves that the actual data layout matches the metadata definition.