A First Look at Memory Management and Allocation in Spark on YARN

Introduction:

Depending on how the driver is deployed, Spark on YARN runs in one of two modes: yarn-client mode or yarn-cluster mode. When a Spark job runs on YARN, each Spark executor runs as a YARN container, and Spark can run multiple tasks inside the same container.

Note that the computed values may differ slightly across Spark versions.

1. Spark Memory Management

1.1 Spark memory management source (SparkEnv.scala):

val useLegacyMemoryManager = conf.getBoolean("spark.memory.useLegacyMode", false)
val memoryManager: MemoryManager =
  if (useLegacyMemoryManager) {
    new StaticMemoryManager(conf, numUsableCores)
  } else {
    UnifiedMemoryManager(conf, numUsableCores)
  }

As the source shows, Spark now has two memory managers: the static memory manager (StaticMemoryManager) and the unified memory manager (UnifiedMemoryManager). Which one is used is controlled by the spark.memory.useLegacyMode parameter (false by default, i.e. the unified manager).
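For example, to switch back to the legacy static memory manager, the flag can be set on the SparkConf (a minimal sketch; the application name is a placeholder):

import org.apache.spark.SparkConf

// Opt back into the legacy StaticMemoryManager; the default (false) selects UnifiedMemoryManager.
val conf = new SparkConf()
  .setAppName("memory-demo")                    // placeholder app name
  .set("spark.memory.useLegacyMode", "true")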

1.2 Static memory management formulas (source: StaticMemoryManager.scala):

private def getMaxExecutionMemory(conf: SparkConf): Long = {
    val systemMaxMemory = conf.getLong("spark.testing.memory", Runtime.getRuntime.maxMemory)

    if (systemMaxMemory < MIN_MEMORY_BYTES) {
      throw new IllegalArgumentException(s"System memory $systemMaxMemory must " +
        s"be at least $MIN_MEMORY_BYTES. Please increase heap size using the --driver-memory " +
        s"option or spark.driver.memory in Spark configuration.")
    }
    if (conf.contains("spark.executor.memory")) {
      val executorMemory = conf.getSizeAsBytes("spark.executor.memory")
      if (executorMemory < MIN_MEMORY_BYTES) {
        throw new IllegalArgumentException(s"Executor memory $executorMemory must be at least " +
          s"$MIN_MEMORY_BYTES. Please increase executor memory using the " +
          s"--executor-memory option or spark.executor.memory in Spark configuration.")
      }
    }
    val memoryFraction = conf.getDouble("spark.shuffle.memoryFraction", 0.2)
    val safetyFraction = conf.getDouble("spark.shuffle.safetyFraction", 0.8)
    (systemMaxMemory * memoryFraction * safetyFraction).toLong
  }

ExecutionMemory = systemMaxMemory * spark.shuffle.memoryFraction * spark.shuffle.safetyFraction = executor-memory * 0.2 * 0.8

private def getMaxStorageMemory(conf: SparkConf): Long = {
    val systemMaxMemory = conf.getLong("spark.testing.memory", Runtime.getRuntime.maxMemory)
    val memoryFraction = conf.getDouble("spark.storage.memoryFraction", 0.6)
    val safetyFraction = conf.getDouble("spark.storage.safetyFraction", 0.9)
    (systemMaxMemory * memoryFraction * safetyFraction).toLong
  }

StorageMemory = systemMaxMemory * spark.storage.memoryFraction * spark.storage.safetyFraction = executor-memory * 0.6 * 0.9
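Plugging in a hypothetical 2 GB executor heap, the two static-mode regions work out roughly as follows (a quick sketch; it ignores the small gap between the configured executor-memory and the actual JVM heap):

// Rough static-mode sizing for a hypothetical 2 GB executor heap (values in MB).
val systemMaxMemory = 2048.0

val executionMemory = systemMaxMemory * 0.2 * 0.8   // memoryFraction * safetyFraction ≈ 327.68 MB
val storageMemory   = systemMaxMemory * 0.6 * 0.9   // memoryFraction * safetyFraction ≈ 1105.92 MB

println(f"execution ≈ $executionMemory%.2f MB, storage ≈ $storageMemory%.2f MB")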

1.3 Unified memory management formulas (source: UnifiedMemoryManager.scala):

private def getMaxMemory(conf: SparkConf): Long = {
    val systemMemory = conf.getLong("spark.testing.memory", Runtime.getRuntime.maxMemory)
    val reservedMemory = conf.getLong("spark.testing.reservedMemory",
      if (conf.contains("spark.testing")) 0 else RESERVED_SYSTEM_MEMORY_BYTES)
    val minSystemMemory = (reservedMemory * 1.5).ceil.toLong
    if (systemMemory < minSystemMemory) {
      throw new IllegalArgumentException(s"System memory $systemMemory must " +
        s"be at least $minSystemMemory. Please increase heap size using the --driver-memory " +
        s"option or spark.driver.memory in Spark configuration.")
    }
    // SPARK-12759 Check executor memory to fail fast if memory is insufficient
    if (conf.contains("spark.executor.memory")) {
      val executorMemory = conf.getSizeAsBytes("spark.executor.memory")
      if (executorMemory < minSystemMemory) {
        throw new IllegalArgumentException(s"Executor memory $executorMemory must be at least " +
          s"$minSystemMemory. Please increase executor memory using the " +
          s"--executor-memory option or spark.executor.memory in Spark configuration.")
      }
    }
    val usableMemory = systemMemory - reservedMemory
    val memoryFraction = conf.getDouble("spark.memory.fraction", 0.6)
    (usableMemory * memoryFraction).toLong
  }

  def apply(conf: SparkConf, numCores: Int): UnifiedMemoryManager = {
    val maxMemory = getMaxMemory(conf)
    new UnifiedMemoryManager(
      conf,
      maxHeapMemory = maxMemory,
      onHeapStorageRegionSize =
        (maxMemory * conf.getDouble("spark.memory.storageFraction", 0.5)).toLong,
      numCores = numCores)
  }

Reserved memory: reservedMemory = 300 MB (RESERVED_SYSTEM_MEMORY_BYTES).
Suppose the executor memory allocated to the Spark application is systemMemory = 2 GB (set via --executor-memory 2g); the actual systemMemory is slightly smaller than the configured executor-memory.

(ExecutionMemory + StorageMemory) = (systemMemory - reservedMemory) * spark.memory.fraction = (2048 - 300) * 0.6 = 1048.8 MB (the actual value is slightly smaller)

StorageMemory = (systemMemory - reservedMemory) * spark.memory.fraction * spark.memory.storageFraction = (2048 - 300) * 0.6 * 0.5 = 524.4 MB
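The same arithmetic can be reproduced directly in Scala (a sketch mirroring getMaxMemory and apply above; the 2 GB heap is an assumed example):

// Unified-memory sizing for a hypothetical 2 GB executor heap (values in bytes).
val systemMemory   = 2048L * 1024 * 1024
val reservedMemory = 300L * 1024 * 1024             // RESERVED_SYSTEM_MEMORY_BYTES

val usableMemory  = systemMemory - reservedMemory
val maxMemory     = (usableMemory * 0.6).toLong     // spark.memory.fraction
val storageRegion = (maxMemory * 0.5).toLong        // spark.memory.storageFraction

println(s"execution + storage = ${maxMemory / 1024.0 / 1024.0} MB")      // ≈ 1048.8 MB
println(s"storage region      = ${storageRegion / 1024.0 / 1024.0} MB")  // ≈ 524.4 MB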

However, the Storage Memory value shown on the web UI is computed with a formula that differs slightly from the one above:

function formatBytes(bytes, type) {
    if (type !== 'display') return bytes;
    if (bytes == 0) return '0.0 B';
    var k = 1000;
    var dm = 1;
    var sizes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'];
    var i = Math.floor(Math.log(bytes) / Math.log(k));
    return parseFloat((bytes / Math.pow(k, i)).toFixed(dm)) + ' ' + sizes[i];
}


scala> Runtime.getRuntime.maxMemory
res0: Long = 954728448

# systemMemory can be obtained by evaluating Runtime.getRuntime.maxMemory
Applying the formula above, the value passed into formatBytes is maxMemory, i.e. (systemMemory - reservedMemory) * 0.6:
 parseFloat(((954728448 -300*1024*1024)*0.6 / Math.pow(1000, Math.floor(Math.log((954728448 -300*1024*1024)*0.6) / Math.log(1000)))).toFixed(1))
This expression can be run directly in a browser console and yields 384.1, i.e. the 384.1 MB shown in the UI.
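The same value can be cross-checked from the Scala REPL (a small sketch; the base-1000 division matches the formatBytes helper above):

// Storage Memory as displayed by the UI, for the driver heap shown above (954728448 bytes).
val maxHeap  = 954728448L                       // Runtime.getRuntime.maxMemory on this JVM
val reserved = 300L * 1024 * 1024
val uiValue  = (maxHeap - reserved) * 0.6 / math.pow(1000, 2)   // base-1000 MB, as in formatBytes
println(f"$uiValue%.1f MB")                     // ≈ 384.1 MB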

2. Spark on YARN Memory Allocation

1. Related parameters:

For the complete set of Spark on YARN configuration parameters, refer to the Spark configuration documentation. This article focuses on memory allocation, so only the following memory-related parameters need attention:

spark.driver.memory: default 512m

spark.executor.memory: default 512m

(amount of memory to use per executor)

spark.yarn.am.memory: default 512m

(amount of memory to use for the YARN Application Master in client mode; in cluster mode, use spark.driver.memory instead)

spark.yarn.executor.memoryOverhead: defaults to executorMemory * 0.1, with a minimum of 384

(amount of off-heap memory, in MB, to allocate per executor; it tends to grow with the executor size, typically 6-10%)

spark.yarn.driver.memoryOverhead: defaults to driverMemory * 0.1, with a minimum of 384

(amount of off-heap memory, in MB, to allocate for the driver in cluster mode; it grows with the container size, typically 6-10%)

spark.yarn.am.memoryOverhead: defaults to AM memory * 0.1, with a minimum of 384

(same as spark.yarn.driver.memoryOverhead, but for the YARN Application Master in client mode)
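As a concrete illustration, a yarn-client application might set these values programmatically as below (a hypothetical sketch; the sizes are examples, not recommendations):

import org.apache.spark.SparkConf

// Hypothetical memory settings for a yarn-client application (sizes are illustrative only).
val conf = new SparkConf()
  .set("spark.executor.memory", "2g")                 // heap per executor
  .set("spark.yarn.executor.memoryOverhead", "384")   // off-heap overhead per executor, in MB
  .set("spark.yarn.am.memory", "512m")                // AM heap in client mode
  .set("spark.yarn.am.memoryOverhead", "384")         // AM off-heap overhead, in MB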

2. Spark on YARN memory calculation:

1. Number of containers requested by Spark on YARN = num-executors + 1 (the AM occupies one container, and each executor occupies one container)

2. Total memory each executor requests: total = executor-memory + max(executor-memory * 0.1, 384)

3. The memory actually requested for the container that hosts an executor falls into two cases (see the sketch after this list):

(1) total <= yarn.scheduler.minimum-allocation-mb

If the executor's total request is no larger than the minimum memory YARN allocates to a single container, the container is allocated yarn.scheduler.minimum-allocation-mb.

(2) total > yarn.scheduler.minimum-allocation-mb

If the executor's total request exceeds the minimum container allocation, the container is allocated yarn.scheduler.minimum-allocation-mb plus (total - yarn.scheduler.minimum-allocation-mb) rounded up to a whole multiple of yarn.scheduler.increment-allocation-mb (the step by which a container's memory is increased when more is needed).

Example: yarn.scheduler.minimum-allocation-mb = 2G, yarn.scheduler.increment-allocation-mb = 1G, --executor-memory 2G.
The container hosting each executor is then allocated 2G + 1G = 3G, the container hosting the AM is allocated 2G, so with one executor the total requested memory is 5G.
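The rounding rule can be sketched in a few lines of Scala (an illustration of the rules above, not Spark or YARN source; parameter values are passed in explicitly):

// Sketch of the container-size rule described above (all values in MB).
def containerSize(executorMemoryMb: Long, minAllocMb: Long, incrementMb: Long): Long = {
  val total = executorMemoryMb + math.max((executorMemoryMb * 0.1).toLong, 384L)  // heap + overhead
  if (total <= minAllocMb) minAllocMb
  else {
    // Round the excess over the minimum up to a whole number of increments.
    val increments = math.ceil((total - minAllocMb).toDouble / incrementMb).toLong
    minAllocMb + increments * incrementMb
  }
}

// --executor-memory 2g, minimum-allocation 2 GB, increment 1 GB  =>  3072 MB (3 GB)
println(containerSize(2048, 2048, 1024))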