Spark Series (1): Standalone Spark Installation and Testing
By 阿新 · Published 2019-02-05
Spark is widely seen as the distributed computing framework most likely to replace MapReduce, and it is attracting a lot of attention right now. I have started following it as well, and am trying to move from Hadoop+Mahout to Spark.
1. Local Environment
The local machine runs Ubuntu 14.04 with JDK 1.7.
2. Downloading the Source
I am using version 0.9.1; the source package is spark-0.9.1.tgz.
3. Building
Unpack the source package and build with SBT:
$ tar xzvf spark-0.9.1.tgz
$ cd spark-0.9.1
$ sbt/sbt assembly
Once the build succeeds, you are ready to play!
4. Running an Example
The simplest example computes pi:
$ ./bin/run-example org.apache.spark.examples.SparkPi local[3]
Here local means run locally, and [3] means run with 3 threads. The output looks like this:
Pi is roughly 3.13486
14/05/08 10:26:15 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/05/08 10:26:15 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/,null}
14/05/08 10:26:15 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/static,null}
14/05/08 10:26:15 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/metrics/json,null}
14/05/08 10:26:15 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/executors,null}
14/05/08 10:26:15 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/environment,null}
14/05/08 10:26:15 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/stages,null}
14/05/08 10:26:15 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/stages/pool,null}
14/05/08 10:26:15 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/stages/stage,null}
14/05/08 10:26:15 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/storage,null}
14/05/08 10:26:15 INFO handler.ContextHandler: stopped o.e.j.s.h.ContextHandler{/storage/rdd,null}
14/05/08 10:26:16 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
14/05/08 10:26:16 INFO network.ConnectionManager: Selector thread was interrupted!
14/05/08 10:26:16 INFO network.ConnectionManager: ConnectionManager stopped
14/05/08 10:26:16 INFO storage.MemoryStore: MemoryStore cleared
14/05/08 10:26:16 INFO storage.BlockManager: BlockManager stopped
14/05/08 10:26:16 INFO storage.BlockManagerMasterActor: Stopping BlockManagerMaster
14/05/08 10:26:16 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
14/05/08 10:26:16 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
14/05/08 10:26:16 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
14/05/08 10:26:16 INFO spark.SparkContext: Successfully stopped SparkContext
14/05/08 10:26:16 INFO Remoting: Remoting shut down
14/05/08 10:26:16 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
See? Pi is roughly 3.13486.
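For context, the bundled SparkPi example is a Monte Carlo estimator: it throws random points at the unit square and counts how many land inside the unit circle. Below is a minimal self-contained sketch in Scala of the same idea, using the 0.9-era SparkContext(master, appName) constructor; the object name SparkPiSketch and the sample count n are my own choices, not from the Spark source.

import org.apache.spark.SparkContext

object SparkPiSketch {
  def main(args: Array[String]) {
    // Run locally with 3 threads, matching local[3] above
    val sc = new SparkContext("local[3]", "SparkPiSketch")
    val n = 100000
    // Sample n random points in the square [-1, 1] x [-1, 1]
    // and count how many fall inside the unit circle
    val inside = sc.parallelize(1 to n, 3).map { _ =>
      val x = math.random * 2 - 1
      val y = math.random * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.reduce(_ + _)
    // Circle/square area ratio is pi/4, so pi is roughly 4 * inside / n
    println("Pi is roughly " + 4.0 * inside / n)
    sc.stop()
  }
}

This also explains why the printed value (3.13486) is only approximately pi: the estimate converges as n grows.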
5. Running Interactively
$ ./bin/spark-shell
This drops you into interactive mode.
A classic interactive-mode exercise is processing Spark's README.md file, sketched below.
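Here is a minimal spark-shell session along those lines, assuming the shell was started from the Spark source directory so that README.md resolves as a relative path (the shell predefines sc as the SparkContext):

scala> val textFile = sc.textFile("README.md")
scala> textFile.count()    // number of lines in the file
scala> textFile.first()    // first line of the file
scala> textFile.filter(line => line.contains("Spark")).count()    // lines mentioning Spark

Press Ctrl+D to leave the shell when you are done.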