元数据
- 引言
- 1、Hive 元数据
- 2、Hive 元数据表
- 2.1 Hive 库
- 2.2 VERSION 表
- 2.3 Hive数据库相关的元数据表
- 2.3.1 DBS—数据库表
- 2.3.2 DATABASE_PARAMS表—数据库参数表
- 2.4 Hive数据表相关的元数据表
- 2.4.1 TBLS表—数据表的元数据表
- 2.4.2 TABLE_PARAMS表—数据表属性信息表
- 2.4.3 TBL_PRIVS—授权表
- 2.5 Hive文件存储信息相关的元数据表
- 2.5.1 SDS表
- 2.5.2 SD_PARAMS表
- 2.5.3 SERDES表
- 2.5.4 SERDE_PARAMS表
- 2.6 Hive表字段元数据表—COLUMNS_V2表
- 2.7 Hive表分区相关的元数据表
- 2.7.1 PARTITIONS表
- 2.7.2 PARTITION_KEYS表
- 2.7.3 PARTITION_KEY_VALS表
- 2.7.4 PARTITION_PARAMS表
- 2.8 角色表—ROLES 表
- 2.9 其他
引言
元数据
描述数据的数据,例如数据表的大小是100KB,数据表是数据,表大小是数据的数据。
1、Hive 元数据
Hive元数据包括hive库信息、表信息(表的属性、表的名称、表的列、分区及其属性)以及表数据所在的目录等。
2、Hive 元数据表
- 进入Mysql终端
- 查看数据库信息
show databases;
- 看是不是有一个Hive库。
2.1 Hive 库
- 切到Hive库
use hive;
- 查看库中所有表
mysql> show tables;
+---------------------------+
| Tables_in_hive |
+---------------------------+
| BUCKETING_COLS |
| CDS |
| COLUMNS_V2 |
| DATABASE_PARAMS |
| DBS |
| DB_PRIVS |
| FUNCS |
| FUNC_RU |
| GLOBAL_PRIVS |
| IDXS |
| INDEX_PARAMS |
| PARTITIONS |
| PARTITION_KEYS |
| PARTITION_KEY_VALS |
| PARTITION_PARAMS |
| PART_COL_PRIVS |
| PART_COL_STATS |
| PART_PRIVS |
| ROLES |
| SDS |
| SD_PARAMS |
| SEQUENCE_TABLE |
| SERDES |
| SERDE_PARAMS |
| SKEWED_COL_NAMES |
| SKEWED_COL_VALUE_LOC_MAP |
| SKEWED_STRING_LIST |
| SKEWED_STRING_LIST_VALUES |
| SKEWED_VALUES |
| SORT_COLS |
| TABLE_PARAMS |
| TAB_COL_STATS |
| TBLS |
| TBL_COL_PRIVS |
| TBL_PRIVS |
| VERSION |
+---------------------------+
36 rows in set (0.00 sec)
2.2 VERSION 表
- 存储Hive版本的元数据表,该表中数据只有一条,如果存在多条,会造成hive启动不起来。
mysql> select * from VERSION;
+--------+----------------+---------------------------------------+
| VER_ID | SCHEMA_VERSION | VERSION_COMMENT |
+--------+----------------+---------------------------------------+
| 1 | 1.2.0 | Set by MetaStore root@192.168.233.130 |
+--------+----------------+---------------------------------------+
1 row in set (0.02 sec)
2.3 Hive数据库相关的元数据表
2.3.1 DBS—数据库表
- 该表存储Hive中所有数据库的基本信息
- 表结构
mysql> show create table DBS;
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| DBS | CREATE TABLE `DBS` (
`DB_ID` bigint(20) NOT NULL,
`DESC` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`DB_LOCATION_URI` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
`NAME` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`OWNER_NAME` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`OWNER_TYPE` varchar(10) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
PRIMARY KEY (`DB_ID`),
UNIQUE KEY `UNIQUE_DATABASE` (`NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)
mysql> select * from DBS;
+-------+-----------------------+------------------------------------------------+---------+------------+------------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+------------------------------------------------+---------+------------+------------+
| 1 | Default Hive database | hdfs://master:9000/user/hive/warehouse | default | public | ROLE |
| 11 | NULL | hdfs://master:9000/user/hive/warehouse/test.db | test | root | USER |
+-------+-----------------------+------------------------------------------------+---------+------------+------------+
2 rows in set (0.00 sec)
2.3.2 DATABASE_PARAMS表—数据库参数表
- 该表存储数据库的相关参数,在CREATE DATABASE时候用WITH DBPROPERTIES (property_name=property_value, …)指定的参数。
- 表结构
mysql> show create table DATABASE_PARAMS;
+-----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| DATABASE_PARAMS | CREATE TABLE `DATABASE_PARAMS` (
`DB_ID` bigint(20) NOT NULL,
`PARAM_KEY` varchar(180) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
`PARAM_VALUE` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
PRIMARY KEY (`DB_ID`,`PARAM_KEY`),
KEY `DATABASE_PARAMS_N49` (`DB_ID`),
CONSTRAINT `DATABASE_PARAMS_FK1` FOREIGN KEY (`DB_ID`) REFERENCES `DBS` (`DB_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)
2.4 Hive数据表相关的元数据表
2.4.1 TBLS表—数据表的元数据表
- 该表中存储Hive表、视图、索引表的基本信息。
- 表结构
mysql> show create table TBLS;
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TBLS | CREATE TABLE `TBLS` (
`TBL_ID` bigint(20) NOT NULL,
`CREATE_TIME` int(11) NOT NULL,
`DB_ID` bigint(20) DEFAULT NULL,
`LAST_ACCESS_TIME` int(11) NOT NULL,
`OWNER` varchar(767) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`RETENTION` int(11) NOT NULL,
`SD_ID` bigint(20) DEFAULT NULL,
`TBL_NAME` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`TBL_TYPE` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`VIEW_EXPANDED_TEXT` mediumtext,
`VIEW_ORIGINAL_TEXT` mediumtext,
PRIMARY KEY (`TBL_ID`),
UNIQUE KEY `UNIQUETABLE` (`TBL_NAME`,`DB_ID`),
KEY `TBLS_N50` (`SD_ID`),
KEY `TBLS_N49` (`DB_ID`),
CONSTRAINT `TBLS_FK1` FOREIGN KEY (`SD_ID`) REFERENCES `SDS` (`SD_ID`),
CONSTRAINT `TBLS_FK2` FOREIGN KEY (`DB_ID`) REFERENCES `DBS` (`DB_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+---------------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+-------+-----------+-------+---------------+---------------+--------------------+--------------------+
| 11 | 1608470832 | 11 | 0 | root | 0 | 11 | orders | MANAGED_TABLE | NULL | NULL |
| 12 | 1608515462 | 11 | 0 | root | 0 | 12 | trains | MANAGED_TABLE | NULL | NULL |
| 16 | 1609238892 | 11 | 0 | root | 0 | 16 | prior | MANAGED_TABLE | NULL | NULL |
| 29 | 1609383196 | 11 | 0 | root | 0 | 42 | prior_bucket | MANAGED_TABLE | NULL | NULL |
| 30 | 1609566691 | 11 | 0 | root | 0 | 43 | employee | MANAGED_TABLE | NULL | NULL |
| 31 | 1609923077 | 11 | 0 | root | 0 | 46 | grade | MANAGED_TABLE | NULL | NULL |
| 36 | 1609986997 | 11 | 0 | root | 0 | 51 | article | MANAGED_TABLE | NULL | NULL |
| 38 | 1609990640 | 11 | 0 | root | 0 | 53 | student_score | MANAGED_TABLE | NULL | NULL |
+--------+-------------+-------+------------------+-------+-----------+-------+---------------+---------------+--------------------+--------------------+
8 rows in set (0.00 sec)
2.4.2 TABLE_PARAMS表—数据表属性信息表
- 该表存储表/视图的属性信息。
- 表结构
2.4.3 TBL_PRIVS—授权表
- 该表存储表/视图的授权信息
- 表结构
2.5 Hive文件存储信息相关的元数据表
由于HDFS支持的文件格式很多,而建Hive表时候也可以指定各种文件格式,Hive在将HQL解析成MapReduce时候,需要知道去哪里,使用哪种格式去读写HDFS文件,而这些信息就保存在这几张表中。
2.5.1 SDS表
- 该表保存文件存储的基本信息,如INPUT_FORMAT、OUTPUT_FORMAT、是否压缩等。
- 表结构
mysql> show create table SDS;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| SDS | CREATE TABLE `SDS` (
`SD_ID` bigint(20) NOT NULL,
`CD_ID` bigint(20) DEFAULT NULL,
`INPUT_FORMAT` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`IS_COMPRESSED` bit(1) NOT NULL,
`IS_STOREDASSUBDIRECTORIES` bit(1) NOT NULL,
`LOCATION` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`NUM_BUCKETS` int(11) NOT NULL,
`OUTPUT_FORMAT` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`SERDE_ID` bigint(20) DEFAULT NULL,
PRIMARY KEY (`SD_ID`),
KEY `SDS_N50` (`CD_ID`),
KEY `SDS_N49` (`SERDE_ID`),
CONSTRAINT `SDS_FK1` FOREIGN KEY (`CD_ID`) REFERENCES `CDS` (`CD_ID`),
CONSTRAINT `SDS_FK2` FOREIGN KEY (`SERDE_ID`) REFERENCES `SERDES` (`SERDE_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
- TBLS表中的SD_ID与该表关联,可以获取Hive表的存储信息。
select
d.NAME,
t.TBL_NAME,
s.INPUT_FORMAT,
s.OUTPUT_FORMAT
from TBLS t
join DBS d
join SDS s
where t.DB_ID = d.DB_ID
and t.SD_ID = s.SD_ID
and d.NAME='test'
and s.INPUT_FORMAT like '%TextInputFormat%';
2.5.2 SD_PARAMS表
- 该表存储Hive存储的属性信息,在创建表时候使用
- STORED BY ‘storage.handler.class.name’ [WITH SERDEPROPERTIES (…)指定。
- 表结构
2.5.3 SERDES表
- 该表存储序列化使用的类信息
- 表结构
2.5.4 SERDE_PARAMS表
- 该表存储序列化的一些属性、格式信息,比如:行、列分隔符
- 表结构
2.6 Hive表字段元数据表—COLUMNS_V2表
- 该表存储表对应的字段信息
- 表结构
- comment注释要想支持中文的话,可在其设置
2.7 Hive表分区相关的元数据表
2.7.1 PARTITIONS表
- 该表存储表分区的基本信息。
- 表结构
mysql> show create table PARTITIONS;
+------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PARTITIONS | CREATE TABLE `PARTITIONS` (
`PART_ID` bigint(20) NOT NULL,
`CREATE_TIME` int(11) NOT NULL,
`LAST_ACCESS_TIME` int(11) NOT NULL,
`PART_NAME` varchar(767) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`SD_ID` bigint(20) DEFAULT NULL,
`TBL_ID` bigint(20) DEFAULT NULL,
PRIMARY KEY (`PART_ID`),
UNIQUE KEY `UNIQUEPARTITION` (`PART_NAME`,`TBL_ID`),
KEY `PARTITIONS_N50` (`SD_ID`),
KEY `PARTITIONS_N49` (`TBL_ID`),
CONSTRAINT `PARTITIONS_FK1` FOREIGN KEY (`SD_ID`) REFERENCES `SDS` (`SD_ID`),
CONSTRAINT `PARTITIONS_FK2` FOREIGN KEY (`TBL_ID`) REFERENCES `TBLS` (`TBL_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
2.7.2 PARTITION_KEYS表
- 该表存储分区的字段信息。
- 表结构
mysql> show create table PARTITION_KEYS;
+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PARTITION_KEYS | CREATE TABLE `PARTITION_KEYS` (
`TBL_ID` bigint(20) NOT NULL,
`PKEY_COMMENT` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`PKEY_NAME` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
`PKEY_TYPE` varchar(767) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
`INTEGER_IDX` int(11) NOT NULL,
PRIMARY KEY (`TBL_ID`,`PKEY_NAME`),
KEY `PARTITION_KEYS_N49` (`TBL_ID`),
CONSTRAINT `PARTITION_KEYS_FK1` FOREIGN KEY (`TBL_ID`) REFERENCES `TBLS` (`TBL_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
2.7.3 PARTITION_KEY_VALS表
- 该表存储分区字段值。
- 表结构
mysql> show create table PARTITION_KEY_VALS;
+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PARTITION_KEY_VALS | CREATE TABLE `PARTITION_KEY_VALS` (
`PART_ID` bigint(20) NOT NULL,
`PART_KEY_VAL` varchar(255) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`INTEGER_IDX` int(11) NOT NULL,
PRIMARY KEY (`PART_ID`,`INTEGER_IDX`),
KEY `PARTITION_KEY_VALS_N49` (`PART_ID`),
CONSTRAINT `PARTITION_KEY_VALS_FK1` FOREIGN KEY (`PART_ID`) REFERENCES `PARTITIONS` (`PART_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
2.7.4 PARTITION_PARAMS表
- 该表存储分区的属性信息
- 表结构
mysql> show create table PARTITION_PARAMS;
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PARTITION_PARAMS | CREATE TABLE `PARTITION_PARAMS` (
`PART_ID` bigint(20) NOT NULL,
`PARAM_KEY` varchar(256) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
`PARAM_VALUE` varchar(4000) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
PRIMARY KEY (`PART_ID`,`PARAM_KEY`),
KEY `PARTITION_PARAMS_N49` (`PART_ID`),
CONSTRAINT `PARTITION_PARAMS_FK1` FOREIGN KEY (`PART_ID`) REFERENCES `PARTITIONS` (`PART_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
2.8 角色表—ROLES 表
- 该表存储操作表的主角
- 表结构
mysql> show create table ROLES ;
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ROLES | CREATE TABLE `ROLES` (
`ROLE_ID` bigint(20) NOT NULL,
`CREATE_TIME` int(11) NOT NULL,
`OWNER_NAME` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
`ROLE_NAME` varchar(128) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
PRIMARY KEY (`ROLE_ID`),
UNIQUE KEY `ROLEENTITYINDEX` (`ROLE_NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
2.9 其他
- hive库中不止以上表,还有一些,不再展示
参考资料: