Summary of commonly used Hive set parameters

Reposted

mob604756eb4476 2019-04-25 17:47:00


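1. set hive.auto.convert.join = true;

A map join applies when one of the two tables being joined is fairly small and the other is very large. The small table is loaded entirely into memory, and the large table is then processed by the map tasks. The join happens in the map phase: for each row scanned from the large table, the mapper looks up the matching rows in the in-memory small table and joins them. No reduce phase is involved, so the main advantage of a map-side join is that there is no shuffle. Note that a map join does not skip MapReduce altogether: it still runs as a map-only job, and what it eliminates is the shuffle and reduce stages. With hive.auto.convert.join set to true, Hive converts a join to a map join automatically when the small table is small enough.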
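As a rough illustration of the behaviour described above, the sketch below enables automatic conversion and joins a large fact table to a small dimension table. The table and column names (orders, dim_region, region_id) are hypothetical, and the threshold shown is the commonly documented default for hive.mapjoin.smalltable.filesize; check the value on your own cluster.

```sql
-- Let Hive convert eligible joins to map joins automatically.
set hive.auto.convert.join = true;
-- Size threshold (in bytes) below which a table counts as "small".
-- 25000000 is the usual default; it is shown here only for illustration.
set hive.mapjoin.smalltable.filesize = 25000000;

-- Hypothetical tables: orders (large), dim_region (small).
EXPLAIN
SELECT o.order_id, d.region_name
FROM orders o
JOIN dim_region d
  ON o.region_id = d.region_id;
```

If the conversion takes effect, the plan printed by EXPLAIN should show a Map Join Operator rather than a reduce-side join.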

2. set mapred.job.priority = VERY_HIGH; -- set the job priority

3. set mapred.output.compress = true;

set hive.exec.compress.output = true;

These two settings compress the final query output.
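The two switches above turn compression on but do not choose a codec. A minimal sketch, assuming Gzip is acceptable for the final output and that the cluster still honours the older mapred.* property names used in this article:

```sql
-- Compress the files written by the final stage of a query.
set hive.exec.compress.output = true;
set mapred.output.compress = true;
-- The codec choice is an assumption; GzipCodec ships with Hadoop.
set mapred.output.compression.codec = org.apache.hadoop.io.compress.GzipCodec;
```

On Hadoop 2 and later, the equivalent property names are mapreduce.output.fileoutputformat.compress and mapreduce.output.fileoutputformat.compress.codec.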

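4. SET hive.default.fileformat = Orc; -- set the default file format

ORC File, short for Optimized Row Columnar (ORC) file, is essentially an optimized successor to RCFile. According to the official documentation, this file format provides a highly efficient way to store Hive data.

It was designed to overcome the limitations of the other Hive file formats. Using ORC files improves Hive performance when reading, writing, and processing data.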
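The default file format only affects tables created without an explicit format. As a sketch, a table can also be declared as ORC directly; the table name, columns, and the choice of Snappy compression below are assumptions made for illustration:

```sql
-- Explicit storage format, independent of hive.default.fileformat.
CREATE TABLE page_views_orc (
  user_id BIGINT,
  url     STRING,
  ts      TIMESTAMP
)
STORED AS ORC
-- orc.compress is optional; SNAPPY is one of the supported values.
TBLPROPERTIES ("orc.compress" = "SNAPPY");
```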

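5. set hive.exec.dynamic.partition=true; enables dynamic partitioning.

set hive.exec.dynamic.partition.mode=nonstrict; The default value of this property is strict, which requires at least one partition column to be given a static value. Setting it to nonstrict allows every partition column to be dynamic.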
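The difference between the two modes is easiest to see in an insert statement. The following is a sketch only; the tables sales_staging and sales (partitioned by dt and region) and their columns are hypothetical:

```sql
set hive.exec.dynamic.partition = true;
set hive.exec.dynamic.partition.mode = nonstrict;

-- Both partition values are taken from the last columns of the SELECT,
-- in the same order as they appear in the PARTITION clause.
INSERT OVERWRITE TABLE sales PARTITION (dt, region)
SELECT order_id, amount, dt, region
FROM sales_staging;

-- Under strict mode at least one partition must be static, for example:
-- INSERT OVERWRITE TABLE sales PARTITION (dt = '2019-04-25', region)
-- SELECT order_id, amount, region FROM sales_staging WHERE dt = '2019-04-25';
```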

6. Dynamic partition limits

set hive.exec.max.dynamic.partitions = 130000;

set hive.exec.max.dynamic.partitions.pernode = 130000;

set hive.exec.max.created.files = 200000;

If these dynamic partition limits are not raised, the job fails with the following error:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveFatalException: [Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. Maximum was set to: 5000
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:877)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:657)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
... 7 more

The job tried to create more dynamic partitions than the configured maximum allows.

Solution:

set hive.exec.dynamic.partition=true;

set hive.exec.dynamic.partition.mode=nonstrict;

set hive.exec.max.dynamic.partitions.pernode=600000;

set hive.exec.max.dynamic.partitions=6000000;

set hive.exec.max.created.files=6000000;
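Rather than choosing very large limits blindly, it can help to check how many partitions a load will actually create. A sketch, assuming a hypothetical source table sales_staging whose dt and region columns become the partition columns:

```sql
-- Count the distinct partition combinations the insert would produce.
SELECT COUNT(*) AS partition_count
FROM (
  SELECT dt, region
  FROM sales_staging
  GROUP BY dt, region
) p;
```

hive.exec.max.dynamic.partitions then needs to be at least this number, and hive.exec.max.dynamic.partitions.pernode at least the number of partitions a single mapper or reducer may write.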