
Installing Spark 2.2.0 on Windows

Prerequisite: install scala-2.11.8.msi

1. Set up and deploy the Hadoop environment as described in the previous post.

2. Download Spark from http://spark.apache.org/downloads.html

3. Extract the archive to D:\BigDataApp\spark-2.2.0-bin-hadoop2.7

4. Configure the system environment variables: create a system variable SPARK_HOME with the value D:\BigDataApp\spark-2.2.0-bin-hadoop2.7, then append %SPARK_HOME%\bin and %SPARK_HOME%\sbin to the system PATH variable.

5. Run spark-shell from a cmd prompt.

Verification: if you see "Spark context available as 'sc'", the shell started successfully.

Verify with a Scala word count:

// Read the input file into an RDD of lines
val textFile = sc.textFile("C:\\logs\\1.txt")
// Split each line on spaces, yielding one element per word
val tokenizedFileData = textFile.flatMap(line => line.split(" "))
// Pair each word with an initial count of 1
val countPrep = tokenizedFileData.map(word => (word, 1))
// Sum the counts for each word
val counts = countPrep.reduceByKey((accumValue, newValue) => accumValue + newValue)
// Sort by count, descending
val sortedCounts = counts.sortBy(kvPair => kvPair._2, ascending = false)
//sortedCounts.saveAsTextFile("file:///SparkOutputData/ReadMeWordCount")
// Write the results out (the target directory must not already exist)
sortedCounts.saveAsTextFile("C:\\logs\\test")
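To see what each stage of the pipeline computes without a running Spark cluster, the same flatMap → map → reduce-by-key → sort sequence can be sketched with plain Scala collections. This is only an illustration of the logic, not Spark itself: the hard-coded sample lines stand in for the contents of C:\logs\1.txt, and `groupBy` plus a per-group sum plays the role of `reduceByKey`.

```scala
object WordCountSketch {
  def main(args: Array[String]): Unit = {
    // Sample lines standing in for the input file
    val lines = Seq("spark spark hadoop", "hadoop spark")

    // flatMap(line => line.split(" ")): one element per word
    val tokens = lines.flatMap(_.split(" "))

    // map(word => (word, 1)) followed by a reduceByKey-style sum:
    // group the pairs by word, then add up the 1s in each group
    val counts = tokens
      .map(word => (word, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

    // sortBy on the count, descending (the `false` in the Spark version)
    val sorted = counts.toSeq.sortBy(-_._2)

    sorted.foreach(println)
  }
}
```

Running this prints `(spark,3)` before `(hadoop,2)`, mirroring the descending order the Spark job writes to C:\logs\test.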