Spark Shuffle: SortShuffleWriter

阿新 • Published: 2018-12-04


The core logic of SortShuffleWriter lives in its write method. The code is as follows:

  /** Write a bunch of records to this task's output */
  override def write(records: Iterator[Product2[K, V]]): Unit = {
    sorter = if (dep.mapSideCombine) {
      require(dep.aggregator.isDefined, "Map-side combine without Aggregator specified!")
      new ExternalSorter[K, V, C](
        context, dep.aggregator, Some(dep.partitioner), dep.keyOrdering, dep.serializer)
    } else {
      // In this case we pass neither an aggregator nor an ordering to the sorter, because we don't
      // care whether the keys get sorted in each partition; that will be done on the reduce side
      // if the operation being run is sortByKey.
      new ExternalSorter[K, V, V](
        context, aggregator = None, Some(dep.partitioner), ordering = None, dep.serializer)
    }
    sorter.insertAll(records)

    // Don't bother including the time to open the merged output file in the shuffle write time,
    // because it just opens a single file, so is typically too fast to measure accurately
    // (see SPARK-3570).
    val output = shuffleBlockResolver.getDataFile(dep.shuffleId, mapId)
    val tmp = Utils.tempFileWith(output)
    try {
      val blockId = ShuffleBlockId(dep.shuffleId, mapId, IndexShuffleBlockResolver.NOOP_REDUCE_ID)
      val partitionLengths = sorter.writePartitionedFile(blockId, tmp)
      shuffleBlockResolver.writeIndexFileAndCommit(dep.shuffleId, mapId, partitionLengths, tmp)
      mapStatus = MapStatus(blockManager.shuffleServerId, partitionLengths)
    } finally {
      if (tmp.exists() && !tmp.delete()) {
        logError(s"Error while deleting temp file ${tmp.getAbsolutePath}")
      }
    }
  }
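
The partitionLengths array returned by writePartitionedFile is what lets a reducer locate its slice of the single data file. As a minimal sketch, assuming the index file stores cumulative byte offsets derived from those lengths, the lookup can be modeled as below. The names IndexLayoutSketch, offsets, and range are hypothetical and are not part of Spark.

```scala
object IndexLayoutSketch {
  // Hypothetical helper: turn per-partition byte lengths into cumulative
  // offsets. The result has one more entry than the input and starts at 0.
  def offsets(partitionLengths: Array[Long]): Array[Long] =
    partitionLengths.scanLeft(0L)(_ + _)

  // Byte range [start, end) of one reduce partition inside the data file.
  def range(partitionLengths: Array[Long], reduceId: Int): (Long, Long) = {
    val off = offsets(partitionLengths)
    (off(reduceId), off(reduceId + 1))
  }
}
```

For example, with lengths 10, 0, and 25, partition 2 would occupy bytes 10 (inclusive) to 35 (exclusive), and the empty partition 1 would map to an empty range.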

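In ExternalSorter's insertAll method, the sorter first checks whether aggregation is needed. If it is, records are combined by key and written into an in-memory map. When the memory used by that map exceeds the allowed threshold, the map's contents are spilled to disk, and each spill produces a separate file. If aggregation is not needed, records are written directly into the in-memory buffer.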

  def insertAll(records: Iterator[Product2[K, V]]): Unit = {
    // TODO: stop combining if we find that the reduction factor isn't high
    val shouldCombine = aggregator.isDefined

    if (shouldCombine) {
      // Combine values in-memory first using our AppendOnlyMap
      val mergeValue = aggregator.get.mergeValue
      val createCombiner = aggregator.get.createCombiner
      var kv: Product2[K, V] = null
      val update = (hadValue: Boolean, oldValue: C) => {
        if (hadValue) mergeValue(oldValue, kv._2) else createCombiner(kv._2)
      }
      while (records.hasNext) {
        // Update the result once for each record processed
        addElementsRead()
        kv = records.next()
        map.changeValue((getPartition(kv._1), kv._1), update)
        maybeSpillCollection(usingMap = true)
      }
    } else {
      // Stick values into our buffer
      while (records.hasNext) {
        // Update the result once for each record processed
        addElementsRead()
        val kv = records.next()
        buffer.insert(getPartition(kv._1), kv._1, kv._2.asInstanceOf[C])
        maybeSpillCollection(usingMap = false)
      }
    }
  }
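
To make the combine branch easier to follow in isolation, here is a simplified stand-alone model of it. It is only a sketch: a plain mutable.Map stands in for PartitionedAppendOnlyMap, partitioning is assumed to be a non-negative hash modulo, and spilling is left out entirely. CombineSketch and combineAll are hypothetical names.

```scala
import scala.collection.mutable

object CombineSketch {
  // Simplified stand-in for the combine path of insertAll: values are merged
  // in memory under a (partitionId, key) slot. No spilling happens here.
  def combineAll[K, V, C](
      records: Iterator[(K, V)],
      numPartitions: Int,
      createCombiner: V => C,
      mergeValue: (C, V) => C): mutable.Map[(Int, K), C] = {
    val map = mutable.Map.empty[(Int, K), C]

    // Assumption for this sketch: hash partitioning with a non-negative modulo.
    def getPartition(key: K): Int = {
      val raw = key.hashCode % numPartitions
      if (raw < 0) raw + numPartitions else raw
    }

    records.foreach { case (k, v) =>
      val slot = (getPartition(k), k)
      // Same decision as the `update` closure: merge into the existing
      // combiner if the slot had a value, otherwise create a new combiner.
      val updated = map.get(slot) match {
        case Some(old) => mergeValue(old, v)
        case None      => createCombiner(v)
      }
      map(slot) = updated
    }
    map
  }
}
```

Calling combineAll with createCombiner = identity and mergeValue = addition gives a per-key sum, which is roughly what a reduceByKey(_ + _) does on the map side.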

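Spilling to disk
1. If map-side aggregation is required:
Estimate the size of the map and use that estimate to decide whether a spill is needed. If a spill happens, a new PartitionedAppendOnlyMap is initialized afterwards.

2. If map-side aggregation is not required:
Estimate the size of the buffer and use that estimate to decide whether a spill is needed. If a spill happens, a new PartitionedPairBuffer is initialized afterwards.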

  /**
   * Spill the current in-memory collection to disk if needed.
   *
   * @param usingMap whether we're using a map or buffer as our current in-memory collection
   */
  private def maybeSpillCollection(usingMap: Boolean): Unit = {
    var estimatedSize = 0L
    if (usingMap) {
      estimatedSize = map.estimateSize()
      if (maybeSpill(map, estimatedSize)) {
        map = new PartitionedAppendOnlyMap[K, C]
      }
    } else {
      estimatedSize = buffer.estimateSize()
      if (maybeSpill(buffer, estimatedSize)) {
        buffer = new PartitionedPairBuffer[K, C]
      }
    }

    if (estimatedSize > _peakMemoryUsedBytes) {
      _peakMemoryUsedBytes = estimatedSize
    }
  }
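
The spill-then-reset pattern can be illustrated with a toy model. This is a sketch under loud assumptions: the "estimated size" is just the record count, a "spill" is an in-memory batch rather than a file on disk, and the memory-acquisition step that the real maybeSpill performs before deciding to spill is omitted. SpillSketch and SpillingBuffer are hypothetical names.

```scala
import scala.collection.mutable.ArrayBuffer

object SpillSketch {
  // Toy model of maybeSpillCollection: size is the record count and each
  // spill is kept as an in-memory batch instead of being written to a file.
  final class SpillingBuffer[T](threshold: Int) {
    private var buffer = ArrayBuffer.empty[T]
    private var peak = 0
    val spills = ArrayBuffer.empty[Vector[T]]

    def insert(record: T): Unit = {
      buffer += record
      maybeSpillCollection()
    }

    private def maybeSpillCollection(): Unit = {
      val estimatedSize = buffer.size
      if (estimatedSize >= threshold) {
        // "Spill" the current contents, then start a fresh collection,
        // mirroring `buffer = new PartitionedPairBuffer[K, C]`.
        spills += buffer.toVector
        buffer = ArrayBuffer.empty[T]
      }
      // Track the peak using the size measured before any reset.
      if (estimatedSize > peak) peak = estimatedSize
    }

    def inMemory: Vector[T] = buffer.toVector
    def peakSize: Int = peak
  }
}
```

Inserting the numbers 1 to 7 with a threshold of 3 leaves two spilled batches and one record still in memory, which matches the statement that each spill produces its own file.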
