Installing Spark 2.2.0 on Windows
阿新 · Published: 2019-01-22
Prerequisite: install scala-2.11.8.msi
1. Set up and deploy the Hadoop environment as described in the previous post.
2. Download Spark: http://spark.apache.org/downloads.html
3. Extract it to D:\BigDataApp\spark-2.2.0-bin-hadoop2.7
4. Configure system environment variables: create a system variable SPARK_HOME with the value D:\BigDataApp\spark-2.2.0-bin-hadoop2.7, then add %SPARK_HOME%\bin and %SPARK_HOME%\sbin to the system PATH variable.
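If you prefer the command line to the System Properties dialog, the same variables can be set from a cmd window with `setx` (a sketch; paths assume the extract location from step 3, and the values take effect only in newly opened windows):

```shell
:: Create the SPARK_HOME variable
setx SPARK_HOME "D:\BigDataApp\spark-2.2.0-bin-hadoop2.7"
:: Append the Spark bin and sbin directories to PATH
:: (caution: setx truncates values longer than 1024 characters)
setx PATH "%PATH%;D:\BigDataApp\spark-2.2.0-bin-hadoop2.7\bin;D:\BigDataApp\spark-2.2.0-bin-hadoop2.7\sbin"
```

Open a fresh cmd window afterwards so the updated PATH is picked up.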
5. In a cmd window, run: spark-shell
Verification: if you see "Spark context available as 'sc'", the shell started successfully.
Verify with a Scala word count:
// Read the input file into an RDD of lines
val textFile = sc.textFile("C:\\logs\\1.txt")
// Split each line into words
val tokenizedFileData = textFile.flatMap(line => line.split(" "))
// Pair each word with an initial count of 1
val countPrep = tokenizedFileData.map(word => (word, 1))
// Sum the counts per word
val counts = countPrep.reduceByKey((accumValue, newValue) => accumValue + newValue)
// Sort by count, descending
val sortedCounts = counts.sortBy(kvPair => kvPair._2, false)
// sortedCounts.saveAsTextFile("file:///SparkOutputData/ReadMeWordCount")
sortedCounts.saveAsTextFile("C:\\logs\\test")
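To see what each RDD transformation above computes, the same pipeline can be traced on a plain Scala collection, with no Spark or Hadoop installation required. This is an illustrative sketch, not the Spark API: the sample lines are made up, and `groupBy` plus `sum` stands in for `reduceByKey`:

```scala
object WordCountSketch {
  def main(args: Array[String]): Unit = {
    // Stand-in for the lines read by sc.textFile (hypothetical sample data)
    val lines = Seq("spark on windows", "spark shell on windows")
    // flatMap: split each line into words
    val tokenized = lines.flatMap(line => line.split(" "))
    // map: pair each word with an initial count of 1
    val countPrep = tokenized.map(word => (word, 1))
    // reduceByKey equivalent: group the pairs by word, then sum the 1s
    val counts = countPrep.groupBy(_._1).map { case (word, pairs) =>
      (word, pairs.map(_._2).sum)
    }
    // sortBy(kvPair => kvPair._2, false): order by count, descending
    val sortedCounts = counts.toSeq.sortBy(-_._2)
    sortedCounts.foreach(println)
  }
}
```

In the real snippet the result is written out with saveAsTextFile instead of printed, but the per-word pairs are built the same way.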