sbt的安裝測試
1.下載
wget https://github.com/sbt/sbt/releases/download/v0.13.15/sbt-0.13.15.tgz
2.安裝
tar -zxvf sbt-0.13.15.tgz -C /root/scala/sbt
3.在/root/scala/sbt目錄下面添加文件sbt:
vim sbt
SBT_OPTS="-Xms512M -Xmx1536M -Xss1M -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=1024M"
java $SBT_OPTS -jar /root/scala/sbt/bin/sbt-launch.jar "$@"
4.修改sbt權限
$ chmod u+x sbt
5.配置環境變量
$ vim ~/.bashrc
# 在文件尾部添加如下代碼後,保存退出
export PATH=/root/scala/sbt/:$PATH
$ source ~/.bashrc
配置配置文件的目錄在./sbt/conf/sbtconfig.txt
-Dhttp.proxyHost=proxy.zte.com.cn
-Dhttp.proxyPort=80
6.測試
[root@host sbt]# sbt sbt-version
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=1024M; support was removed in 8.0
WARN: No sbt.version set in project/build.properties, base directory: /root/scala/sbt
[warn] Executing in batch mode.
[warn] For better performance, hit [ENTER] to switch to interactive mode, or
[warn] consider launching sbt without any commands, or explicitly passing ‘shell‘
[info] Set current project to sbt (in build file:/root/scala/sbt/)
[info] 0.13.15
7.編寫scala應用程序
cd # 進入用戶主文件
mkdir sparkapp # 創建應用程序根目
mkdir -p sparkapp/src/main/scala # 創建所需的文件夾結構
創建 vim sparkapp/src/main/scala/SimpleApp.scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
object SimpleApp {
def main(args: Array[String]) {
val logFile = "/tmp/20171024/20171024.txt" // 文件路徑
val conf = new SparkConf().setAppName("Simple Application")
val sc = new SparkContext(conf)
val logData = sc.textFile(logFile, 2).cache()
logData.foreach(println)
}
}
8..使用sbt打包Scala程序
該程序依賴 Spark API,因此我們需要通過 sbt 進行編譯打包。 請在sparkapp 中新建文件 simple.sbt(vim sparkapp/simple.sbt
),添加內容如下,聲明該獨立應用程序的信息以及與 Spark 的依賴關系:
name := "Simple Project"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0"
文件 simple.sbt 需要指明 Spark 和 Scala 的版本。在上面的配置信息中,scalaVersion用來指定scala的版本,sparkcore用來指定spark的版本,這兩個版本信息都可以在之前的啟動 Spark shell 的過程中,從屏幕的顯示信息中找到。
我們就可以通過如下代碼將整個應用程序打包成 JAR:
[root@host ~]# cd sparkapp
[root@host sparkapp]# /root/scala/sbt/sbt package
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=1024M; support was removed in 8.0
[warn] Executing in batch mode.
[warn] For better performance, hit [ENTER] to switch to interactive mode, or
[warn] consider launching sbt without any commands, or explicitly passing ‘shell‘
[info] Loading project definition from /root/sparkapp/project
[info] Set current project to Simple Project (in build file:/root/sparkapp/)
[info] Compiling 1 Scala source to /root/sparkapp/target/scala-2.11/classes...
[info] Packaging /root/sparkapp/target/scala-2.11/simple-project_2.11-1.0.jar ...
[info] Done packaging.
[success] Total time: 4 s, completed Nov 6, 2017 11:23:49 AM
9.通過 spark-submit 運行程序(紅色字體為結果)
[root@host sparkapp]# $SPARK_HOME/bin/spark-submit --class "SimpleApp" /root/sparkapp/target/scala-2.11/simple-project_2.11-1.0.jar
17/11/06 11:24:00 INFO spark.SparkContext: Running Spark version 2.2.0
.................................................
17/11/06 11:24:03 INFO rdd.HadoopRDD: Input split: hdfs://localhost:9000/tmp/20171024/20171024.txt:23+23
17/11/06 11:24:03 INFO rdd.HadoopRDD: Input split: hdfs://localhost:9000/tmp/20171024/20171024.txt:0+23
17/11/06 11:24:04 INFO memory.MemoryStore: Block rdd_1_1 stored as values in memory (estimated size 16.0 B, free 366.0 MB)
17/11/06 11:24:04 INFO memory.MemoryStore: Block rdd_1_0 stored as values in memory (estimated size 160.0 B, free 366.0 MB)
17/11/06 11:24:04 INFO storage.BlockManagerInfo: Added rdd_1_1 in memory on 192.168.120.140:56693 (size: 16.0 B, free: 366.3 MB)
17/11/06 11:24:04 INFO storage.BlockManagerInfo: Added rdd_1_0 in memory on 192.168.120.140:56693 (size: 160.0 B, free: 366.3 MB)
10 20 30 50 80 100 60 90 60 60 31 80 70 51 50
.......................................
17/11/06 11:24:04 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-dcdfa23e-2972-49d0-9e1d-9643d5f25eb4
sbt的安裝測試