Spark 1.6 Source Code Walkthrough: Starting the TaskScheduler

阿新 • Published: 2018-12-16

The TaskScheduler does nothing until it has been started.

SparkContext, line 530:

    _taskScheduler.start()

This actually calls the start method of TaskSchedulerImpl, at line 143:

  override def start() {
    backend.start()

    if (!isLocal && conf.getBoolean("spark.speculation", false)) {
      logInfo("Starting speculative execution thread")
      speculationScheduler.scheduleAtFixedRate(new Runnable {
        override def run(): Unit = Utils.tryOrStopSparkContext(sc) {
          checkSpeculatableTasks()
        }
      }, SPECULATION_INTERVAL_MS, SPECULATION_INTERVAL_MS, TimeUnit.MILLISECONDS)
    }
  }
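The speculation branch above is an instance of the standard `ScheduledExecutorService.scheduleAtFixedRate` pattern, guarded by a configuration flag. The following is a minimal, self-contained sketch of that pattern, not Spark code: `SpeculationSketch` and `countChecks` are hypothetical names, and counting down a latch stands in for `checkSpeculatableTasks()`.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

// Hypothetical stand-in for the speculation thread started in TaskSchedulerImpl.start().
object SpeculationSketch {
  /** Runs a periodic check every `intervalMs` ms when `enabled`; returns how many runs were seen. */
  def countChecks(enabled: Boolean, intervalMs: Long, runs: Int): Int =
    if (!enabled) 0 // nothing is scheduled, as when spark.speculation is false
    else {
      val scheduler = Executors.newSingleThreadScheduledExecutor()
      val latch = new CountDownLatch(runs)
      try {
        // Same call shape as above: initial delay and period are both the interval.
        scheduler.scheduleAtFixedRate(new Runnable {
          override def run(): Unit = latch.countDown() // placeholder for checkSpeculatableTasks()
        }, intervalMs, intervalMs, TimeUnit.MILLISECONDS)
        latch.await(10, TimeUnit.SECONDS) // wait until `runs` ticks have happened
        runs - latch.getCount.toInt
      } finally scheduler.shutdownNow()
    }
}
```

Because the first argument pair passes the interval as both initial delay and period, the first check only fires one full interval after start() returns.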

That in turn calls the start method of the backend. This walkthrough uses LocalBackend as the example; its start method is at line 123:

  override def start() {
    val rpcEnv = SparkEnv.get.rpcEnv
    // Create the LocalEndpoint
    val executorEndpoint = new LocalEndpoint(rpcEnv, userClassPath, scheduler, this, totalCores)
    localEndpoint = rpcEnv.setupEndpoint("LocalBackendEndpoint", executorEndpoint)
    listenerBus.post(SparkListenerExecutorAdded(
      System.currentTimeMillis,
      executorEndpoint.localExecutorId,
      new ExecutorInfo(executorEndpoint.localExecutorHostname, totalCores, Map.empty)))
    launcherBackend.setAppId(appId)
    launcherBackend.setState(SparkAppHandle.State.RUNNING)
  }

LocalEndpoint is defined at line 45 of LocalBackend:

/**
 * Calls to LocalBackend are all serialized through LocalEndpoint. Using an RpcEndpoint makes the
 * calls on LocalBackend asynchronous, which is necessary to prevent deadlock between LocalBackend
 * and the TaskSchedulerImpl.
 */
private[spark] class LocalEndpoint(
    override val rpcEnv: RpcEnv,
    userClassPath: Seq[URL],
    scheduler: TaskSchedulerImpl,
    executorBackend: LocalBackend,
    private val totalCores: Int)
  extends ThreadSafeRpcEndpoint with Logging {
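The doc comment above says that routing every call through an endpoint makes the calls asynchronous and serialized. A rough way to picture this, assuming nothing about Spark's actual RPC implementation, is a mailbox backed by a single worker thread: callers return immediately and messages are handled one at a time, in order. `SerialMailbox`, `send` and `drain` are hypothetical names used only for this sketch.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import scala.collection.mutable.ArrayBuffer

// Hypothetical miniature of the idea, not Spark's RpcEndpoint machinery.
final class SerialMailbox {
  private val worker = Executors.newSingleThreadExecutor()
  private val handled = ArrayBuffer.empty[String] // only the worker thread appends to this

  /** Enqueues a message and returns at once; the caller never waits for the handler. */
  def send(message: String): Unit =
    worker.execute(new Runnable {
      override def run(): Unit = handled += message
    })

  /** Stops accepting messages, waits for the queued ones, and returns them in handling order. */
  def drain(): List[String] = {
    worker.shutdown()
    worker.awaitTermination(10, TimeUnit.SECONDS)
    handled.toList
  }
}
```

Since the sender never blocks on the handler, two components that message each other this way cannot end up waiting on each other's locks, which is the deadlock the comment refers to.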

At line 58, this class creates an Executor on the driver side:

  private val executor = new Executor(
    localExecutorId, localExecutorHostname, SparkEnv.get, userClassPath, isLocal = true)

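The main steps in constructing an Executor are:

Create the thread pool in which the Executor runs tasks.

Create the ExecutorSource and register it with the metrics system (registration is skipped in local mode).

Obtain information from SparkEnv.

Create urlClassLoader, which loads the jars shipped along with tasks.

Obtain a reference to the driver's HeartbeatReceiver endpoint.

Start the Executor heartbeat thread, which sends heartbeats to the driver.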

  // Start worker thread pool
  private val threadPool = ThreadUtils.newDaemonCachedThreadPool("Executor task launch worker")
  private val executorSource = new ExecutorSource(threadPool, executorId)

  if (!isLocal) {
    env.metricsSystem.registerSource(executorSource)
    env.blockManager.initialize(conf.getAppId)
  }

  // Whether to load classes in user jars before those in Spark jars
  private val userClassPathFirst = conf.getBoolean("spark.executor.userClassPathFirst", false)

  // Create our ClassLoader
  // do this after SparkEnv creation so can access the SecurityManager
  private val urlClassLoader = createClassLoader()
  private val replClassLoader = addReplClassLoaderIfNeeded(urlClassLoader)

  // Set the classloader for serializer
  env.serializer.setDefaultClassLoader(replClassLoader)

  // Akka's message frame size. If task result is bigger than this, we use the block manager
  // to send the result back.
  private val akkaFrameSize = AkkaUtils.maxFrameSizeBytes(conf)

  // Limit of bytes for total size of results (default is 1GB)
  private val maxResultSize = Utils.getMaxResultSize(conf)

  // Maintains the list of running tasks.
  private val runningTasks = new ConcurrentHashMap[Long, TaskRunner]

  // Executor for the heartbeat task.
  private val heartbeater = ThreadUtils.newDaemonSingleThreadScheduledExecutor("driver-heartbeater")

  // must be initialized before running startDriverHeartbeat()
  private val heartbeatReceiverRef =
    RpcUtils.makeDriverRef(HeartbeatReceiver.ENDPOINT_NAME, conf, env.rpcEnv)

  startDriverHeartbeater()

Let's look at the heartbeat thread in detail:

  private def startDriverHeartbeater(): Unit = {
    val intervalMs = conf.getTimeAsMs("spark.executor.heartbeatInterval", "10s")

    // Wait a random interval so the heartbeats don't end up in sync
    val initialDelay = intervalMs + (math.random * intervalMs).asInstanceOf[Int]

    val heartbeatTask = new Runnable() {
      override def run(): Unit = Utils.logUncaughtExceptions(reportHeartBeat())
    }
    heartbeater.scheduleAtFixedRate(heartbeatTask, initialDelay, intervalMs, TimeUnit.MILLISECONDS)
  }
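The initial delay computed above always lies between one and two heartbeat intervals, so executors started at the same moment do not all report at once. The sketch below isolates that formula so it can be checked by hand. `HeartbeatDelaySketch` and `initialDelay` are hypothetical names, and the random draw is passed in as a parameter (an assumption made for testability; the original calls `math.random` directly).

```scala
// Hypothetical helper isolating the delay formula from startDriverHeartbeater().
object HeartbeatDelaySketch {
  /** `random01` plays the role of math.random, a value in [0, 1). */
  def initialDelay(intervalMs: Long, random01: Double): Long =
    intervalMs + (random01 * intervalMs).asInstanceOf[Int]
}
```

With the default interval of 10s, a draw of 0.0 gives a 10000 ms delay and a draw of 0.5 gives 15000 ms; the delay stays below 20000 ms for any draw.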

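The real work in the code above is done by reportHeartBeat, which serves two purposes:

Update the metrics of the tasks that are currently running.

Tell the BlockManagerMaster that the BlockManager on this Executor is still alive.

The implementation of reportHeartBeat: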

  /** Reports heartbeat and metrics for active tasks to the driver. */
  private def reportHeartBeat(): Unit = {
    // list of (task id, metrics) to send back to the driver
    val tasksMetrics = new ArrayBuffer[(Long, TaskMetrics)]()
    val curGCTime = computeTotalGcTime()
    // This loop only collects the metrics of the running tasks
    for (taskRunner <- runningTasks.values().asScala) {
      if (taskRunner.task != null) {
        taskRunner.task.metrics.foreach { metrics =>
          metrics.updateShuffleReadMetrics()
          metrics.updateInputMetrics()
          metrics.setJvmGCTime(curGCTime - taskRunner.startGCTime)
          metrics.updateAccumulators()

          if (isLocal) {
            // JobProgressListener will hold an reference of it during
            // onExecutorMetricsUpdate(), then JobProgressListener can not see
            // the changes of metrics any more, so make a deep copy of it
            val copiedMetrics = Utils.deserialize[TaskMetrics](Utils.serialize(metrics))
            tasksMetrics += ((taskRunner.taskId, copiedMetrics))
          } else {
            // It will be copied by serialization
            tasksMetrics += ((taskRunner.taskId, metrics))
          }
        }
      }
    }
    // Wrap the collected metrics in a Heartbeat message
    val message = Heartbeat(executorId, tasksMetrics.toArray, env.blockManager.blockManagerId)
    try {
      // Send the message to heartbeatReceiverRef.
      // The executorHeartbeatReceived method in TaskSchedulerImpl receives it
      // and forwards it to the DAGScheduler, which forwards it to the BlockManagerMaster
      // to report that the BlockManager on this Executor is still alive.
      val response = heartbeatReceiverRef.askWithRetry[HeartbeatResponse](
          message, RpcTimeout(conf, "spark.executor.heartbeatInterval", "10s"))
      if (response.reregisterBlockManager) {
        logInfo("Told to re-register on heartbeat")
        env.blockManager.reregister()
      }
    } catch {
      case NonFatal(e) => logWarning("Issue communicating with driver in heartbeater", e)
    }
  }
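The isLocal branch in reportHeartBeat deep-copies each metrics object by serializing and then deserializing it, so that the listener holding the copy does not share state with the running task. Below is a minimal sketch of that round-trip using plain Java serialization. `Counter` is a hypothetical stand-in for TaskMetrics and `deepCopy` is a hypothetical helper; this is not the body of Spark's `Utils.serialize` / `Utils.deserialize`.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream,
  ObjectOutputStream, ObjectStreamClass}

// Hypothetical stand-in for TaskMetrics: a mutable, serializable holder.
final class Counter(var value: Long) extends java.io.Serializable

object DeepCopySketch {
  /** Deep-copies `obj` by writing it to a byte array and reading it back. */
  def deepCopy[T <: java.io.Serializable](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()

    // Resolve classes with the loader that loaded `obj`, so the sketch also
    // works when the class is not visible to the default loader.
    val loader = obj.getClass.getClassLoader
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray)) {
      override def resolveClass(desc: ObjectStreamClass): Class[_] =
        Class.forName(desc.getName, false, loader)
    }
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

After the copy is taken, later updates to the original are no longer visible through the copy, which is the behaviour the local-mode branch is after.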
