CDH Component Tuning Reference

| Hadoop component | Parameter | Default | Modified |
| --- | --- | --- | --- |
| hdfs | dfs.datanode.data.dir | /data/dfs/dn | /data/dfs/dn |
| | dfs.journalnode.edits.dir | /data/dfs/jn | /data/dfs/jn |
| | dfs.namenode.name.dir | /data/dfs/nn | /data/dfs/nn |
| | hadoop.log.dir | /var/log/hadoop-hdfs | /var/log/hadoop-hdfs |
| | dfs.blocksize | 128m | 128m |
| | dfs.namenode.handler.count | 56 | 32 |
| | dfs.namenode.service.handler.count | 56 | 35 |
| | dfs.datanode.handler.count | 49 | 50 |
| | dfs.replication | 3 | 3 |
| | dfs.datanode.max.transfer.threads | 4096 | 4096 |
| | NameNode Nameservice | nameservice1 | T3-ns1 |
| | dfs.balancer.getBlocks.min-block-size (minimum block size considered for balancing) | 10485760 (10 MB) | 10M |
| | dfs.balancer.getBlocks.size (total size of blocks fetched per getBlocks call) | 2147483648 (2 GB) | 2G |
| | dfs.balancer.max-size-to-move (maximum data moved per balancer iteration) | 10737418240 (10 GB) | 10G |
| | dfs.balancer.moverThreads (thread pool size for executing block moves) | 1000 | 1000 |
| | dfs.datanode.balance.bandwidthPerSec (see the hdfs-site.xml sketch after the table) | 10 | 50M |
| | dfs.datanode.balance.max.concurrent.moves | 50 | 50 |
| | (no configuration needed) | | |
| | dfs.ha.fencing.methods | shell(true) | shell(true) |
| | dfs.replication.max | 6 | 512 |
| | dfs.image.transfer.bandwidthPerSec | 0 | 0 |
| | dfs.image.transfer.timeout | 1 minute | 1 minute |
| | dfs.thrift.threads.max | 20 | 20 |
| | Maximum process file descriptors | 102400 | 102400 |
| | dfs.namenode.replication.max-streams | 20 | 20 |
| | dfs.namenode.replication.max-streams-hard-limit | 40 | 40 |
| | dfs.datanode.balance.bandwidthPerSec | 10MB | 50MB (TBD) |
| | hadoop.tmp.dir | /tmp | /tmp |
| hive | hive.exec.reducers.bytes.per.reducer | 500M | 64M |
| | hive.metastore.warehouse.dir | /user/hive/warehouse | /user/hive/warehouse |
| | hive.warehouse.subdir.inherit.perms | FALSE | TRUE |
| | Maximum retries during initialization | 1 | 1 |
| | Tables per RPC during initialization | 100 | 100 |
| | mapred.reduce.tasks | -1 | -1 |
| | sentry.hdfs.sync.metastore.cache.init.threads | 10 | 10 |
| | hive.exec.reducers.max | 1099 | 1099 |
| | sentry.metastore.service.users | hive,impala,hdfs,hue,kudu,yarn,spark,sdc | hive,impala,hdfs,hue,kudu |
| | hive.optimize.index.filter | TRUE | TRUE |
| | hive.vectorized.execution.enabled | TRUE | TRUE |
| | hive.merge.mapfiles | TRUE | TRUE |
| | hive.merge.sparkfiles | TRUE | TRUE |
| | hive.optimize.reducededuplication | TRUE | TRUE |
| | hive.map.aggr | TRUE | TRUE |
| | hive.execution.engine (see the hive-site.xml sketch after the table) | spark | mr for MapReduce, spark for Spark |
| | spark.executor.memory | 4g | 2g |
| | spark.yarn.driver.memoryOverhead | 1 | 409m |
| | hive.exec.copyfile.maxsize | 32m | 32m |
| yarn | mapreduce.map.memory.mb | 0 | 1g |
| | Cgroup memory soft limit | 3000 | 64g |
| | Cgroup memory hard limit | 3000 | 66g |
| | mapreduce.map.cpu.vcores | 1 | 1 |
| | mapreduce.reduce.memory.mb | 4g | 1g |
| | mapreduce.reduce.cpu.vcores | 1 | 1 |
| | yarn.nodemanager.resource.memory-mb (see the YARN sketch after the table) | 128g | 18g |
| | yarn.nodemanager.resource.cpu-vcores | 72 | 64 |
| hue | max_number_of_sessions | 10 | |
| | server_conn_timeout | 3 minutes | 3 minutes |
| kudu | maintenance_manager_num_threads | 8 | 8 |
| | block_cache_capacity_mb | 4g | 4g |
| | Kudu Tablet Server WAL directory (fs_wal_dir) | /data2/kudu/ktsw | /data/kudu/tablet/waldir/ |
| | Kudu Master WAL directory (fs_wal_dir) | /data/kudu/ktsw | /data/kudu/master/waldir/ |
| | memory_limit_hard_bytes | 48 | 12g |
| impala | Impala Daemon memory limit (mem_limit) | 128 | 64g |
| | Impala Daemon JVM heap | 8 | 8g |
| | Catalog Server Java heap size | 8 | 8g |
| | state_store_num_server_worker_threads | 10 | 10 |
| | stacks_collection_frequency | 5 | 5 |
| zookeeper | tickTime | 3000 | 2000 |
| | initLimit | 10 | 5 |
| | syncLimit | 5 | 10 |
| | dataDir | /var/lib/zookeeper | /var/lib/zookeeper |
| StreamSets | Pipeline runner thread pool | 200 | 100 |
| | Maximum process file descriptors | 204800 | 102400 |
| | Data Collector advanced configuration snippet for sdc.properties | | max.stage.private.classloaders=200 |
| sentry | sentry.service.admin.group | hive,impala,hue,solr,kafka,hbase,kudu,spark,yarn,sdc | hive,impala,hue,solr,kafka,hbase,kudu,spark,yarn,hdfs |
| | sentry.service.allow.connect | hive,impala,hue,solr,kafka,hbase,kudu,spark,yarn,sdc,hdfs | hive,impala,hue,hdfs,solr,kafka,hbase,kudu,spark,yarn,t3cx |
| | Sentry Server Java configuration options | -XX:+UseG1GC -Xmx1024m -Xms1024m | -XX:+UseG1GC |
| Cloudera Management Service | Activity Monitor Java heap size (bytes) | 1G | 256MB |
| | Alert Publisher Java heap size (bytes) | 1 | 256MB |
| | Event Server Java heap size (bytes) | 1 | 1g |
| | Host Monitor Java heap size (bytes) | 1G | 1G |
| | Host Monitor maximum non-Java memory | 4G | 2G |
| | Service Monitor Java heap size (bytes) | 4G | 2G |
| | Service Monitor maximum non-Java memory | 12G | 12G |
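
For reference, the balancer-related HDFS rows above can be expressed as ordinary Hadoop configuration. The following is a minimal sketch, assuming the values from the "Modified" column are applied via hdfs-site.xml (on a Cloudera Manager cluster they would normally be set through the service configuration or a safety valve rather than a hand-edited file):

```xml
<!-- hdfs-site.xml (sketch): balancer tuning values from the table above. -->
<configuration>
  <property>
    <!-- Per-DataNode bandwidth cap for balancing, in bytes per second:
         50 MB/s = 52428800. -->
    <name>dfs.datanode.balance.bandwidthPerSec</name>
    <value>52428800</value>
  </property>
  <property>
    <!-- Maximum concurrent block moves per DataNode. -->
    <name>dfs.datanode.balance.max.concurrent.moves</name>
    <value>50</value>
  </property>
  <property>
    <!-- Thread pool size the balancer uses to execute block moves. -->
    <name>dfs.balancer.moverThreads</name>
    <value>1000</value>
  </property>
  <property>
    <!-- Maximum data moved in one balancer iteration: 10 GB. -->
    <name>dfs.balancer.max-size-to-move</name>
    <value>10737418240</value>
  </property>
</configuration>
```

The dfs.balancer.* properties can also be supplied per run, e.g. `hdfs balancer -Ddfs.balancer.moverThreads=1000 -threshold 10`.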
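
The Hive on Spark rows lend themselves to the same treatment. A sketch of the corresponding hive-site.xml fragment, again using the "Modified" column (64M written out as bytes; spark.executor.memory is valid here because Hive on Spark picks up spark.* properties from hive-site.xml):

```xml
<!-- hive-site.xml (sketch): Hive on Spark values from the table above. -->
<configuration>
  <property>
    <!-- mr for MapReduce, spark for Spark. -->
    <name>hive.execution.engine</name>
    <value>spark</value>
  </property>
  <property>
    <!-- Target input size per reducer: 64 MB (the default above is 500M). -->
    <name>hive.exec.reducers.bytes.per.reducer</name>
    <value>67108864</value>
  </property>
  <property>
    <!-- Enable vectorized query execution, as in the table. -->
    <name>hive.vectorized.execution.enabled</name>
    <value>true</value>
  </property>
  <property>
    <!-- Memory per Spark executor when Hive runs on Spark. -->
    <name>spark.executor.memory</name>
    <value>2g</value>
  </property>
</configuration>
```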
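
Finally, the YARN container-sizing rows. Note that the `*.memory.mb` properties take plain megabytes, so "1g" and "18g" from the table become 1024 and 18432; this is a sketch of the intent, not a drop-in file:

```xml
<!-- yarn-site.xml / mapred-site.xml (sketch): container sizing from the table above. -->
<configuration>
  <property>
    <!-- Total memory a NodeManager may allocate to containers: 18 GB. -->
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>18432</value>
  </property>
  <property>
    <!-- Total vcores a NodeManager may allocate to containers. -->
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>64</value>
  </property>
  <property>
    <!-- Memory per map container: 1 GB. -->
    <name>mapreduce.map.memory.mb</name>
    <value>1024</value>
  </property>
  <property>
    <!-- Memory per reduce container: 1 GB. -->
    <name>mapreduce.reduce.memory.mb</name>
    <value>1024</value>
  </property>
  <property>
    <!-- One vcore per map task. -->
    <name>mapreduce.map.cpu.vcores</name>
    <value>1</value>
  </property>
  <property>
    <!-- One vcore per reduce task. -->
    <name>mapreduce.reduce.cpu.vcores</name>
    <value>1</value>
  </property>
</configuration>
```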