Kubernetes and Big Data, Part 2: Compiling and Running Scala-based Spark Programs (WordCount and SparkPi)
阿新 • Published: 2018-12-13
1. Introduction
Compile Scala programs with SBT, then submit the resulting jar as a Spark task on a Kubernetes cluster (the run shown below executes SparkPi).
2. Environment Setup and Compilation
2.1 Installing SBT
mv bintray-sbt-rpm.repo /etc/yum.repos.d/
yum install -y sbt
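The mv assumes the repo definition file is already in the current directory; at the time this was written it would have been fetched from Bintray (a service that has since been retired), roughly as follows:
curl -L https://bintray.com/sbt/rpm/rpm -o bintray-sbt-rpm.repo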
2.2 Creating the Project Files
mkdir -p spark-example-project/src/main/scala/
cd spark-example-project
These two jars in the local Spark 2.1.1 distribution confirm the Scala and Spark versions that simple.sbt must pin:
ls /opt/spark/spark-2.1.1-bin-hadoop2.7/jars/scala-library-2.11.8.jar
ls /opt/spark/spark-2.1.1-bin-hadoop2.7/jars/spark-core_2.11-2.1.1.jar
simple.sbt:
name := "Simple Project"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.1"
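Because scalaVersion is already pinned to 2.11.8, the same dependency can equivalently be declared with sbt's %% operator, which appends the Scala binary version to the artifact name automatically:
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.1"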
src/main/scala/SparkPi.scala:
package org.apache.spark.examples

import scala.math.random
import org.apache.spark._

/** Computes an approximation to pi */
object SparkPi {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark Pi")
    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow
    val count = spark.parallelize(1 until n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x*x + y*y < 1) 1 else 0
    }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}
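The build log in section 2.3 compiles two Scala sources and warns about multiple main classes; the second source is not shown in the original post, but given the title it was presumably a WordCount. A minimal sketch of what it may have looked like (the input path and its default are assumptions, not the author's code):

package org.apache.spark.examples

import org.apache.spark._

/** Counts word occurrences in a text file (illustrative sketch, not the original source) */
object WordCount {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Word Count")
    val spark = new SparkContext(conf)
    // hypothetical input path; pass a real file as the first argument
    val input = if (args.length > 0) args(0) else "/opt/spark/README.md"
    val counts = spark.textFile(input)
      .flatMap(_.split("\\s+"))  // split each line into words
      .map(word => (word, 1))    // emit (word, 1) pairs
      .reduceByKey(_ + _)        // sum the counts per word
    counts.collect().foreach { case (w, c) => println(w + ": " + c) }
    spark.stop()
  }
}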
2.3 Compiling
sbt clean
sbt package
[info] Loading project definition from /root/spark-example-project/project
[info] Loading settings for project spark-example-project from simple.sbt ...
[info] Set current project to Simple Project (in build file:/root/spark-example-project/)
[info] Updating ...
[info] Done updating.
[warn] There may be incompatibilities among your library dependencies.
[warn] Run 'evicted' to see detailed eviction warnings
[info] Compiling 2 Scala sources to /root/spark-example-project/target/scala-2.11/classes ...
[info] Done compiling.
[warn] Multiple main classes detected. Run 'show discoveredMainClasses' to see the list
[info] Packaging /root/spark-example-project/target/scala-2.11/simple-project_2.11-1.0.jar ...
[info] Done packaging.
[success] Total time: 27 s, completed Sep 10, 2018 1:48:22 PM
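As a quick sanity check (not part of the original walkthrough), the packaged classes can be listed with the JDK's jar tool to confirm that both main classes made it into the artifact:
jar tf target/scala-2.11/simple-project_2.11-1.0.jar | grep '\.class$'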
3. Packaging and Running
3.1 Building the Docker Image
Copy the compiled jar into the examples/jars directory of the Spark 2.3.0 distribution so that the stock Kubernetes Dockerfile bakes it into the image. (The jar was compiled against Spark 2.1.1 libraries but runs on a 2.3.0 image; the skew is tolerated here since the example only touches stable core APIs.)
cp spark-example-project/target/scala-2.11/simple-project_2.11-1.0.jar /root/spark-2.3.0-bin-hadoop2.7/examples/jars/sparkpi_2.11-1.0.jar
cd /root/spark-2.3.0-bin-hadoop2.7
docker build -t 192.168.56.10:5000/spark:2.3.0.4 -f kubernetes/dockerfiles/spark/Dockerfile .
docker push 192.168.56.10:5000/spark:2.3.0.4
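Before submitting, it can be worth confirming that the jar actually landed inside the image; the --entrypoint override is needed because the Spark image's default entrypoint expects driver/executor arguments:
docker run --rm --entrypoint ls 192.168.56.10:5000/spark:2.3.0.4 /opt/spark/examples/jars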
3.2 Running the Task
/root/spark/bin/spark-submit \
--master k8s://https://192.168.56.10:6443 \
--deploy-mode cluster \
--name spark-pi \
--class org.apache.spark.examples.SparkPi \
--conf spark.executor.instances=3 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.container.image=192.168.56.10:5000/spark:2.3.0.4 \
local:///opt/spark/examples/jars/sparkpi_2.11-1.0.jar
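The submit command assumes a service account named spark with permission to manage pods in the target namespace. If it does not exist yet, it can be created along the lines of the Spark-on-Kubernetes documentation:
kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole=edit \
  --serviceaccount=default:spark --namespace=default
Once the job completes, the result appears in the driver pod's log (kubectl logs on the spark-pi driver pod), which should contain a line like "Pi is roughly 3.14...".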
Reposted from https://blog.csdn.net/cloudvtech