Notes on Testing a Spark Cluster with the Bundled Examples
Date: 2015-02-10
Work log: today I planned to test the cluster with Spark's bundled SparkPi example, mainly to understand the cluster startup process and the load on each machine. I did not expect to run into quite so many problems. Thanks to the group members for their help, especially hali.
The main problems:
1. The difference between running against the Spark cluster and running in local mode, and how IPs vs. hostnames are handled when submitting to the cluster
2. Handling the problem of the Spark cluster failing to restart
1. The difference between running against the Spark cluster and running in local mode
1.1 Local launch
Launching with ./run-example org.apache.spark.examples.SparkPi 2 spark://10.7.12.117:7077 actually runs in local mode, as you can see by reading the run-example script, and ./run-example org.apache.spark.examples.SparkPi 2 local does not work either. Note that when launching locally, ...
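This matches how the run-example wrapper behaves in Spark 1.x: it delegates to spark-submit and takes the master URL only from the MASTER environment variable, defaulting to local[*]; every trailing argument is passed to the example program itself (for SparkPi, the first one is the slice count, extras are ignored). A simplified paraphrase of the relevant script logic (not a verbatim copy of bin/run-example):

# paraphrased core of Spark 1.x bin/run-example
EXAMPLE_CLASS=$1
shift
EXAMPLE_MASTER=${MASTER:-"local[*]"}   # master comes ONLY from $MASTER, default local[*]
"$SPARK_HOME"/bin/spark-submit \
  --master "$EXAMPLE_MASTER" \
  --class "$EXAMPLE_CLASS" \
  "$SPARK_EXAMPLES_JAR" \
  "$@"                                 # leftover args go to the example, e.g. SparkPi's slice count

So in the first command above, spark://10.7.12.117:7077 was silently treated as a second program argument. To run the example on the cluster, either export MASTER=spark://... before calling run-example, or invoke spark-submit directly as in 1.2 below.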
1.2 Cluster launch
./bin/spark-submit --master spark://jt-host-kvm-17:7077 --class org.apache.spark.examples.SparkPi --executor-memory 300m ./lib/spark-examples-1.1.0-hadoop2.4.0.jar 1
Submitting with the IP address here (spark://10.7.12.117:7077 instead of the hostname) causes a problem; the error is as follows:
15/02/10 13:45:53 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
15/02/10 13:45:53 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from Stage 0 (MappedRDD[1] at map at SparkPi.scala:31)
15/02/10 13:45:53 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
15/02/10 13:46:08 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/02/10 13:46:13 INFO client.AppClient$ClientActor: Connecting to master spark://10.7.12.117:7077...
15/02/10 13:46:23 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/02/10 13:46:33 INFO client.AppClient$ClientActor: Connecting to master spark://10.7.12.117:7077...
15/02/10 13:46:38 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/02/10 13:46:53 ERROR cluster.SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
15/02/10 13:46:53 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
15/02/10 13:46:53 INFO scheduler.TaskSchedulerImpl: Cancelling stage 0
15/02/10 13:46:53 INFO scheduler.DAGScheduler: Failed to run reduce at SparkPi.scala:35
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: All masters are unresponsive! Giving up.
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
Other group members also shared supporting material on this issue.
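The likely cause: in Spark standalone mode, the Master identifies itself by the exact host string it was started with, so a driver connecting via spark://10.7.12.117:7077 is not recognized when the Master registered itself as jt-host-kvm-17. A sketch of two ways to keep the addresses consistent (SPARK_MASTER_IP is a standard Spark 1.x spark-env.sh variable; the /etc/hosts mapping simply restates the IP/hostname pair from this log):

# Option 1: submit with exactly the URL shown at the top of the Master web UI (port 8080):
./bin/spark-submit --master spark://jt-host-kvm-17:7077 \
  --class org.apache.spark.examples.SparkPi \
  --executor-memory 300m \
  ./lib/spark-examples-1.1.0-hadoop2.4.0.jar 1

# Option 2: pin the Master's bind address in conf/spark-env.sh on the master node ...
export SPARK_MASTER_IP=jt-host-kvm-17

# ... and make sure every node (including the submitting machine) resolves that name,
# e.g. with an /etc/hosts entry:
10.7.12.117  jt-host-kvm-17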
2. Handling the problem that the Spark cluster cannot be restarted:
After running stop-all.sh to stop the Spark cluster, it prints the following:
jt-host-kvm-17: no org.apache.spark.deploy.worker.Worker to stop
jt-host-kvm-19: no org.apache.spark.deploy.worker.Worker to stop
jt-host-kvm-18: no org.apache.spark.deploy.worker.Worker to stop
no org.apache.spark.deploy.master.Master to stop
Preliminary analysis: the Worker and Master PID files (named like spark-<user>-org.apache.spark.deploy.worker.Worker-1.pid) are written to /tmp by default, and they were probably deleted, since on RHEL6 the system cleans the /tmp directory automatically (with a default age limit of 30 days).
The fix is to configure the SPARK_PID_DIR environment variable so the PID files are written somewhere persistent.
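A minimal sketch, assuming a persistent directory such as /var/spark/pids (both the path and the sparkuser account name below are placeholders, not from the original notes); set it in conf/spark-env.sh on the master and every worker:

# conf/spark-env.sh on all nodes; restart the cluster afterwards
export SPARK_PID_DIR=/var/spark/pids   # any directory outside /tmp survives the periodic /tmp cleanup

# create it once on each node, owned by the user that runs the Spark daemons:
mkdir -p /var/spark/pids && chown sparkuser:sparkuser /var/spark/pids

Note that once the PID files are gone, stop-all.sh has no way to find the running daemons; kill the leftover Master/Worker JVMs by hand (jps will list them) before running start-all.sh again.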