Hadoop component | Description | Default value | Modified value
hdfs | dfs.datanode.data.dir | /data/dfs/dn | /data/dfs/dn
hdfs | dfs.journalnode.edits.dir | /data/dfs/jn | /data/dfs/jn
hdfs | dfs.namenode.name.dir | /data/dfs/nn | /data/dfs/nn
hdfs | hadoop.log.dir | /var/log/hadoop-hdfs | /var/log/hadoop-hdfs
hdfs | dfs.blocksize | 128m | 128m
hdfs | dfs.namenode.handler.count | 56 | 32
hdfs | dfs.namenode.service.handler.count | 56 | 35
hdfs | dfs.datanode.handler.count | 4950
hdfs | dfs.replication | 3 | 3
hdfs | dfs.datanode.max.transfer.threads | 4096 | 4096
hdfs | NameNode Nameservice | nameservice1 | T3-ns1
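hdfs | dfs.balancer.getBlocks.min-block-size | Minimum size of a block considered for balancing; default 10485760 (10 MB) | 10M
hdfs | dfs.balancer.getBlocks.size | Total size of the blocks fetched per request; default 2147483648 (2 GB) | 2G
hdfs | dfs.balancer.max-size-to-move | Maximum amount of data moved in each balancer iteration; default 10737418240 (10 GB) | 10G
hdfs | dfs.balancer.moverThreads | Size of the thread pool that executes block moves; default 1000 | 1000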
hdfs | dfs.datanode.balance.bandwidthPerSec | 10 | 50M
hdfs | dfs.datanode.balance.max.concurrent.moves | 50 | 50

No configuration needed
hdfs | dfs.ha.fencing.methods | shell(true) | shell(true)
hdfs | dfs.replication.max | 6512
hdfs | dfs.image.transfer.bandwidthPerSec | 0 | 0
hdfs | dfs.image.transfer.timeout | 1 minute | 1 minute
hdfs | dfs.thrift.threads.max | 20 | 20
hdfs | Maximum Process File Descriptors | 102400 | 102400
hdfs | dfs.namenode.replication.max-streams | 20 | 20
hdfs | dfs.namenode.replication.max-streams-hard-limit | 404040
hdfs | dfs.datanode.balance.bandwidthPerSec | 10MB | 50MB / TBD
hdfs | hadoop.tmp.dir | /tmp | /tmp
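
The HDFS rows mix shorthand sizes (10M, 2G, 128m) with raw byte counts (10485760, 2147483648). The sketch below shows one way to check that the two forms agree. The helper name `parse_size` is hypothetical, and the binary multipliers (1k = 1024 bytes) are an assumption chosen because they reproduce the byte counts quoted in the table.

```python
import re

# Binary multipliers (1k = 1024 bytes). This convention is an assumption;
# it matches the byte counts quoted in the balancer rows.
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(text: str) -> int:
    """Convert shorthand such as '128m', '10G', or '50MB' to a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([kmgt]?)b?\s*", text.lower())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


if __name__ == "__main__":
    # Print the byte count behind each shorthand value used in the table.
    for value in ("10M", "2G", "10G", "128m", "50MB"):
        print(value, "=", parse_size(value), "bytes")
```

Whether a given property accepts the shorthand directly or needs the plain byte count depends on the Hadoop version and on the management console in use, so it is worth confirming before applying a value.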
hive | hive.exec.reducers.bytes.per.reducer | 500
hive | hive.metastore.warehouse.dir | /user/hive/warehouse | /user/hive/warehouse
hive | hive.warehouse.subdir.inherit.perms | FALSE | TRUE
hive | Maximum number of retries during initialization | 1 | 1
hive | Number of tables per RPC during initialization | 100 | 100
hive | mapred.reduce.tasks | -1 | -1
hive | sentry.hdfs.sync.metastore.cache.init.threads | 10 | 10
hive | hive.exec.reducers.bytes.per.reducer | 500M | 64M
hive | hive.exec.reducers.max | 1099 | 1099
hive | sentry.metastore.service.users | hive,impala,hdfs,hue,kudu,yarn,spark,sdc | hive,impala,hdfs,hue,kudu
hive | hive.optimize.index.filter | TRUE | TRUE
hive | hive.vectorized.execution.enabled | TRUE | TRUE
hive | hive.merge.mapfiles | TRUE | TRUE
hive | hive.merge.sparkfiles | TRUE | TRUE
hive | hive.optimize.reducededuplication | TRUE | TRUE
hive | hive.map.aggr | TRUE | TRUE
hive | hive.execution.engine | spark | Set to mr for MapReduce or spark for Spark.
hive | spark.executor.memory | 4g | 2g
hive | spark.yarn.driver.memoryOverhead | 1409m
hive | hive.exec.copyfile.maxsize | 32m | 32m
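
With mapred.reduce.tasks left at -1, the reducer count is commonly described as being estimated from the input size, hive.exec.reducers.bytes.per.reducer, and hive.exec.reducers.max. The sketch below is a simplified model of that estimate, not Hive's actual planner code. The function name `estimate_reducers` and the 10 GiB input are hypothetical.

```python
MIB = 1024**2
GIB = 1024**3


def estimate_reducers(input_bytes: int, bytes_per_reducer: int, max_reducers: int) -> int:
    """Simplified estimate: ceil(input / bytes_per_reducer), capped at max_reducers."""
    if bytes_per_reducer <= 0:
        raise ValueError("bytes_per_reducer must be positive")
    wanted = -(-input_bytes // bytes_per_reducer)  # ceiling division
    return max(1, min(max_reducers, wanted))


if __name__ == "__main__":
    sample_input = 10 * GIB  # hypothetical job input size
    # 500M and 64M are the two per-reducer sizes in the table; 1099 is the cap.
    print("500M per reducer:", estimate_reducers(sample_input, 500 * MIB, 1099))
    print("64M per reducer:", estimate_reducers(sample_input, 64 * MIB, 1099))
```

Under this model, lowering the per-reducer size from 500M to 64M raises the estimate for the same input by roughly a factor of eight, until the 1099 cap applies.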
yarn | mapreduce.map.memory.mb | 0 | 1g
yarn | Cgroup Memory Soft Limit | 300064g
yarn | Cgroup Memory Hard Limit | 300066g
yarn | mapreduce.map.cpu.vcores | 1g | 1g
yarn | mapreduce.reduce.memory.mb | 4g | 1g
yarn | mapreduce.reduce.cpu.vcores | 1g | 1g
yarn | yarn.nodemanager.resource.memory-mb | 128g | 18g
yarn | yarn.nodemanager.resource.cpu-vcores | 7264
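
The NodeManager memory and vcore settings, together with the per-task requests, bound how many task containers can run on one node at a time. The sketch below is a rough capacity check under simplifying assumptions: it ignores scheduler minimum and increment allocations and the ApplicationMaster container. The function name `max_containers` is hypothetical, and the vcore count of 64 is an illustrative assumption, not a value confirmed by the table.

```python
def max_containers(node_mem_mb: int, node_vcores: int,
                   task_mem_mb: int, task_vcores: int) -> int:
    """Upper bound on concurrent task containers per NodeManager."""
    if task_mem_mb <= 0 or task_vcores <= 0:
        raise ValueError("task requests must be positive")
    by_memory = node_mem_mb // task_mem_mb   # limit imposed by memory
    by_vcores = node_vcores // task_vcores   # limit imposed by CPU
    return min(by_memory, by_vcores)


if __name__ == "__main__":
    # 18 GiB of NodeManager memory and 1 GiB map tasks, as in the modified column.
    # 64 vcores is an assumed figure used only for illustration.
    print(max_containers(node_mem_mb=18 * 1024, node_vcores=64,
                         task_mem_mb=1024, task_vcores=1))
```

With these inputs memory is the limiting resource, so raising the vcore count alone would not allow more concurrent tasks.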
hue | max_number_of_sessions | 10
hue | server_conn_timeout | 3 minutes | 3 minutes
kudu | maintenance_manager_num_threads | 8 | 8
kudu | block_cache_capacity_mb | 4g | 4g
kudu | Kudu Tablet Server WAL Directory (fs_wal_dir) | /data2/kudu/ktsw | /data/kudu/tablet/waldir/
kudu | Kudu Master WAL Directory (fs_wal_dir) | /data/kudu/ktsw | /data/kudu/master/waldir/
kudu | memory_limit_hard_bytes | 4812g
impala | Impala Daemon Memory Limit (mem_limit) | 12864g
impala | Impala Daemon JVM Heap | 88g
impala | Java Heap Size of Catalog Server | 88g
impala | state_store_num_server_worker_threads | 10 | 10
impala | stacks_collection_frequency | 5 | 5
zookeeper | tickTime | 3000 | 2000
zookeeper | initLimit | 10 | 5
zookeeper | syncLimit | 5 | 10
zookeeper | dataDir | /var/lib/zookeeper | /var/lib/zookeeper
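
In ZooKeeper, initLimit and syncLimit are counted in ticks, so the effective timeouts depend on tickTime as well. The sketch below converts the settings to milliseconds, assuming the rows above read as tickTime 3000 to 2000, initLimit 10 to 5, and syncLimit 5 to 10. The function name `quorum_timeouts_ms` is hypothetical.

```python
def quorum_timeouts_ms(tick_time_ms: int, init_limit: int, sync_limit: int) -> tuple[int, int]:
    """Return (initial sync timeout, follower sync timeout) in milliseconds."""
    return init_limit * tick_time_ms, sync_limit * tick_time_ms


if __name__ == "__main__":
    # Values as read from the default and modified columns (see assumption above).
    print("default: ", quorum_timeouts_ms(3000, 10, 5))
    print("modified:", quorum_timeouts_ms(2000, 5, 10))
```

Read this way, the change shortens the time a follower has to connect and sync with the leader at startup, and lengthens the time it may lag behind before being dropped.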
StreamSets | Pipeline runner thread pool | 200100
StreamSets | Maximum Process File Descriptors | 204,800 | 102400
StreamSets | Data Collector Advanced Configuration for sdc.properties | max.stage.private.classloaders=200 | max.stage.private.classloaders=200
200
sentry | sentry.service.admin.group | hive,impala,hue,solr,kafka,hbase,kudu,spark,yarn,sdc | hive,impala,hue,solr,kafka,hbase,kudu,spark,yarn,hdfs
sentry | sentry.service.allow.connect | hive,impala,hue,solr,kafka,hbase,kudu,spark,yarn,sdc,hdfs | hive,impala,hue,hdfs,solr,kafka,hbase,kudu,spark,yarn,t3cx
sentry | Java Configuration Options for Sentry Server | -XX:+UseG1GC -Xmx1024m -Xms1024m | -XX:+UseG1GC
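Cloudera Management Service | Java Heap Size of Activity Monitor (bytes) | 1G | 256MB
Cloudera Management Service | Java Heap Size of Alert Publisher (bytes) | 1256MB
Cloudera Management Service | Java Heap Size of EventServer (bytes) | 11g
Cloudera Management Service | Java Heap Size of Host Monitor (bytes) | 1G | 1G
Cloudera Management Service | Maximum Non-Java Memory of Host Monitor | 4G | 2G
Cloudera Management Service | Java Heap Size of Service Monitor (bytes) | 4G | 2G
Cloudera Management Service | Maximum Non-Java Memory of Service Monitor | 12G | 12G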