
Spark Executor Startup Source Code Analysis

A source-code walkthrough of how Spark's CoarseGrainedExecutorBackend is started


Executor startup flow (diagram not reproduced here): SparkContext → Master → Worker → ExecutorRunner → CoarseGrainedExecutorBackend

Executor Startup

SparkContext Sends a Message to the Master

  • The SparkContext, via the driver-side application client, asynchronously sends a RegisterApplication message to each configured Master:
    /**
     *  Register with all masters asynchronously and returns an array `Future`s for cancellation.
     */
    private def tryRegisterAllMasters(): Array[JFuture[_]] = {
      for (masterAddress <- masterRpcAddresses) yield {
        registerMasterThreadPool.submit(new Runnable {
          override def run(): Unit = try {
            if (registered.get) {
              return
            }
            logInfo("Connecting to master " + masterAddress.toSparkURL + "...")
            val masterRef =
              rpcEnv.setupEndpointRef(Master.SYSTEM_NAME, masterAddress, Master.ENDPOINT_NAME)
            masterRef.send(RegisterApplication(appDescription, self))
          } catch {
            case ie: InterruptedException => // Cancelled
            case NonFatal(e) => logWarning(s"Failed to connect to master $masterAddress", e)
          }
        })
      }
    }
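
For reference, RegisterApplication is a plain case class among the deploy messages; a minimal sketch of its shape (following the Spark 1.x DeployMessages in this code base):

    // Driver -> Master: register a new application.
    // appDescription carries the app's name, resource requirements and the
    // command used to start executors; driver is the endpoint the Master
    // replies to (RegisteredApplication, ExecutorAdded, ...).
    case class RegisterApplication(
        appDescription: ApplicationDescription,
        driver: RpcEndpointRef)
      extends DeployMessage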

Master Sends a Message to the Worker

  • When the Master handles the RegisterApplication message, it calls its resource-scheduling method, schedule() (a sketch of the handler follows this list)
  • The scheduling path in turn calls the launchExecutor method shown below
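A simplified sketch of that handler, modeled on the Spark 1.6-era Master.receive (logging and recovery details trimmed):

    case RegisterApplication(description, driver) =>
      if (state == RecoveryState.STANDBY) {
        // A standby Master ignores registrations and sends no response
      } else {
        val app = createApplication(description, driver)
        registerApplication(app)
        persistenceEngine.addApplication(app)
        // Acknowledge the driver, then run resource scheduling, which
        // eventually reaches launchExecutor (shown below)
        driver.send(RegisteredApplication(app.id, self))
        schedule()
      }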
 private def launchExecutor(worker: WorkerInfo, exec: ExecutorDesc): Unit = {
    logInfo("Launching executor " + exec.fullId + " on worker " + worker.id)
    worker.addExecutor(exec)
    worker.endpoint.send(LaunchExecutor(masterUrl,
      exec.application.id, exec.id, exec.application.desc, exec.cores, exec.memory))
    exec.application.driver.send(
      ExecutorAdded(exec.id, worker.id, worker.hostPort, exec.cores, exec.memory))
  }
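
Both messages sent here are again simple deploy-message case classes; a sketch of their shape (following the Spark 1.x DeployMessages):

    // Master -> Worker: start an executor process for this application
    case class LaunchExecutor(
        masterUrl: String,
        appId: String,
        execId: Int,
        appDesc: ApplicationDescription,
        cores: Int,
        memory: Int)
      extends DeployMessage

    // Master -> driver: an executor has been allocated on a worker
    case class ExecutorAdded(
        id: Int,
        workerId: String,
        hostPort: String,
        cores: Int,
        memory: Int)
      extends DeployMessage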

Worker Handles the LaunchExecutor Message

  • Creates a new ExecutorRunner and starts a new thread to launch the executor process (see the sketch after the code)
  • Sends an executor state change message, ExecutorStateChanged, to the Master
 case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
      if (masterUrl != activeMasterUrl) {
        logWarning("Invalid Master (" + masterUrl + ") attempted to launch executor.")
      } else {
        try {
          logInfo("Asked to launch executor %s/%d for %s".format(appId, execId, appDesc.name))

          // Create the executor's working directory
          val executorDir = new File(workDir, appId + "/" + execId)
          if (!executorDir.mkdirs()) {
            throw new IOException("Failed to create directory " + executorDir)
          }

          // Create local dirs for the executor. These are passed to the executor via the
          // SPARK_EXECUTOR_DIRS environment variable, and deleted by the Worker when the
          // application finishes.
          val appLocalDirs = appDirectories.get(appId).getOrElse {
            Utils.getOrCreateLocalRootDirs(conf).map { dir =>
              val appDir = Utils.createDirectory(dir, namePrefix = "executor")
              Utils.chmod700(appDir)
              appDir.getAbsolutePath()
            }.toSeq
          }
          appDirectories(appId) = appLocalDirs
          val manager = new ExecutorRunner(
            appId,
            execId,
            appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
            cores_,
            memory_,
            self,
            workerId,
            host,
            webUi.boundPort,
            publicAddress,
            sparkHome,
            executorDir,
            workerUri,
            conf,
            appLocalDirs, ExecutorState.RUNNING)
          executors(appId + "/" + execId) = manager
          manager.start()
          coresUsed += cores_
          memoryUsed += memory_
          sendToMaster(ExecutorStateChanged(appId, execId, manager.state, None, None))
        } catch {
          case e: Exception => {
            logError(s"Failed to launch executor $appId/$execId for ${appDesc.name}.", e)
            if (executors.contains(appId + "/" + execId)) {
              executors(appId + "/" + execId).kill()
              executors -= appId + "/" + execId
            }
            sendToMaster(ExecutorStateChanged(appId, execId, ExecutorState.FAILED,
              Some(e.toString), None))
          }
        }
      }
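
manager.start() above spawns a dedicated thread that builds the executor command and forks the executor JVM. A simplified sketch of the two relevant ExecutorRunner methods (modeled on the Spark 1.6-era ExecutorRunner; the shutdown hook, stream redirection and error handling are trimmed):

    private[worker] def start() {
      // Launch the process on its own thread so the Worker's message loop
      // is not blocked while files are fetched and the JVM is forked
      workerThread = new Thread("ExecutorRunner for " + fullId) {
        override def run() { fetchAndRunExecutor() }
      }
      workerThread.start()
    }

    private def fetchAndRunExecutor() {
      // Substitute runtime values (executor id, hostname, cores, ...) into
      // the app-supplied command; for a standalone app this command starts
      // CoarseGrainedExecutorBackend
      val builder = CommandUtils.buildProcessBuilder(appDesc.command,
        new SecurityManager(conf), memory, sparkHome.getAbsolutePath, substituteVariables)
      builder.directory(executorDir)
      builder.environment.put("SPARK_EXECUTOR_DIRS", appLocalDirs.mkString(File.pathSeparator))
      process = builder.start()
      // Wait for the executor process to exit, then report the final state
      val exitCode = process.waitFor()
      state = ExecutorState.EXITED
      worker.send(ExecutorStateChanged(appId, execId, state,
        Some("exit code " + exitCode), Some(exitCode)))
    }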

CoarseGrainedExecutorBackend Startup

  • The main method is the entry point of the executor process; it parses the arguments the Worker placed on the command line and then calls run (sketched after the code):
def main(args: Array[String]) {
    var driverUrl: String = null
    var executorId: String = null
    var hostname: String = null
    var cores: Int = 0
    var appId: String = null
    var workerUrl: Option[String] = None
    val userClassPath = new mutable.ListBuffer[URL]()

    var argv = args.toList
    while (!argv.isEmpty) {
      argv match {
        case ("--driver-url") :: value :: tail =>
          driverUrl = value
          argv = tail
        case ("--executor-id") :: value :: tail =>
          executorId = value
          argv = tail
        case ("--hostname") :: value :: tail =>
          hostname = value
          argv = tail
        case ("--cores") :: value :: tail =>
          cores = value.toInt
          argv = tail
        case ("--app-id") :: value :: tail =>
          appId = value
          argv = tail
        case ("--worker-url") :: value :: tail =>
          // Worker url is used in spark standalone mode to enforce fate-sharing with worker
          workerUrl = Some(value)
          argv = tail
        case ("--user-class-path") :: value :: tail =>
          userClassPath += new URL(value)
          argv = tail
        case Nil =>
        case tail =>
          // scalastyle:off println
          System.err.println(s"Unrecognized options: ${tail.mkString(" ")}")
          // scalastyle:on println
          printUsageAndExit()
      }
    }

    if (driverUrl == null || executorId == null || hostname == null || cores <= 0 ||
      appId == null) {
      printUsageAndExit()
    }

    run(driverUrl, executorId, hostname, cores, appId, workerUrl, userClassPath)
  }
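
run then wires the new JVM into the application: it creates the executor-side SparkEnv (and with it an RpcEnv) and registers the CoarseGrainedExecutorBackend endpoint, whose onStart sends RegisterExecutor back to the driver at driverUrl. A simplified sketch following the Spark 1.6-era run (user switching and the bootstrap that fetches the driver's SparkConf are trimmed; driverConf and port come from those trimmed steps):

    private def run(
        driverUrl: String,
        executorId: String,
        hostname: String,
        cores: Int,
        appId: String,
        workerUrl: Option[String],
        userClassPath: Seq[URL]) {

      // driverConf and port are produced by the trimmed bootstrap steps
      val env = SparkEnv.createExecutorEnv(
        driverConf, executorId, hostname, port, cores, isLocal = false)

      // Registering the endpoint triggers onStart(), which sends
      // RegisterExecutor to the driver's CoarseGrainedSchedulerBackend
      env.rpcEnv.setupEndpoint("Executor", new CoarseGrainedExecutorBackend(
        env.rpcEnv, driverUrl, executorId, hostname + ":" + port, cores, userClassPath, env))

      // Standalone mode only: exit if the Worker that launched us dies
      workerUrl.foreach { url =>
        env.rpcEnv.setupEndpoint("WorkerWatcher", new WorkerWatcher(env.rpcEnv, url))
      }
      env.rpcEnv.awaitTermination()
    }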