
Spark 2.1 Cluster Installation

Plan

cancer01 master/worker

cancer02 worker

cancer03 worker

cancer04 worker

cancer05 worker
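Every node should resolve the others by hostname. A minimal /etc/hosts sketch: 192.168.11.134 is the master address used later in spark-env.sh; the remaining IPs are placeholders to replace with your own:

192.168.11.134 cancer01

192.168.11.135 cancer02

192.168.11.136 cancer03

192.168.11.137 cancer04

192.168.11.138 cancer05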

Preparation

su hadoop

Install Scala

On every machine:

cd /usr/local

wget http://downloads.lightbend.com/scala/2.11.8/scala-2.11.8.tgz

tar zxf scala-2.11.8.tgz

mv scala-2.11.8 scala

chown -R hadoop:hadoop scala

vim /etc/profile

export SCALA_HOME=/usr/local/scala

export PATH=$PATH:$SCALA_HOME/bin

source /etc/profile
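Check the install:

scala -version

This should report "Scala code runner version 2.11.8".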

Install Spark

wget http://d3kbcqa49mib13.cloudfront.net/spark-2.0.1-bin-hadoop2.7.tgz

tar zxf spark-2.0.1-bin-hadoop2.7.tgz

mv spark-2.0.1-bin-hadoop2.7 /usr/local/spark

chown -R hadoop:hadoop spark

vim /etc/profile

export SPARK_HOME=/usr/local/spark

export PATH=$PATH:$SPARK_HOME/bin

source /etc/profile
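Check that the Spark binaries are on the PATH:

spark-submit --version

This prints the Spark version banner.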

Configuration

cd /usr/local/spark/conf

mv spark-env.sh.template spark-env.sh

vim spark-env.sh

export SCALA_HOME=/usr/local/scala

export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop

export SPARK_MASTER_IP=192.168.11.134

export SPARK_MASTER_PORT=12345

export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)
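start-slaves.sh reads the list of workers from conf/slaves, a step this walkthrough otherwise skips. Per the plan above (cancer01 is both master and worker), still in /usr/local/spark/conf:

mv slaves.template slaves

vim slaves

cancer01

cancer02

cancer03

cancer04

cancer05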

Copy to the workers

Create the /usr/local/spark directory on cancer02/03/04/05, then copy from cancer01:

scp -r spark hadoop@cancer02:/usr/local/

scp -r spark hadoop@cancer03:/usr/local/

scp -r spark hadoop@cancer04:/usr/local/

scp -r spark hadoop@cancer05:/usr/local/
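start-slaves.sh logs into each worker over SSH, so the hadoop user on cancer01 needs passwordless SSH to every worker (skip this if it was already set up for Hadoop):

ssh-keygen -t rsa

ssh-copy-id hadoop@cancer02 (repeat for cancer03/04/05)

The SCALA_HOME/SPARK_HOME additions to /etc/profile must also be made on each worker.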

Start

$HADOOP_HOME/sbin/start-all.sh

$SPARK_HOME/sbin/start-all.sh

or

$SPARK_HOME/sbin/start-master.sh

$SPARK_HOME/sbin/start-slaves.sh
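Confirm the daemons with jps: cancer01 should show a Master (plus a Worker, since it doubles as one) and each other node a Worker. The standalone master web UI defaults to port 8080:

jps

http://cancer01:8080/

All five workers should appear there as ALIVE.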

Verification

Run:

./bin/run-example SparkPi 2>&1 | grep "Pi is roughly"

./bin/spark-submit examples/src/main/python/pi.py 2>&1 | grep "Pi is roughly"
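Both commands above run the examples locally. To exercise the standalone cluster itself, a sketch submitting SparkPi to the master; the spark:// URL matches the SPARK_MASTER_IP/SPARK_MASTER_PORT set earlier, and the examples jar name assumes the 2.0.1/Scala 2.11 tarball:

./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://192.168.11.134:12345 examples/jars/spark-examples_2.11-2.0.1.jar 100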

Run the interactive shell (Scala / Python):

./bin/spark-shell
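The Python counterpart lives alongside it:

./bin/pyspark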

Scala example:

val textFile = sc.textFile("file:///usr/local/spark/README.md")

textFile.count()

textFile.first()

val linesWithSpark = textFile.filter(line => line.contains("Spark"))

linesWithSpark.count()

textFile.filter(line => line.contains("Spark")).count()
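Continuing in the same shell, the word count from the Spark quick start exercises a shuffle as well:

val wordCounts = textFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey((a, b) => a + b)

wordCounts.collect()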

A fuller conf/spark-env.sh (this example is from a CDH deployment, hence the /opt/cloudera paths):

export SPARK_HOME=/var/lib/myspark/spark

export JAVA_HOME=/usr/java/jdk1.7.0_80

export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop

export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop

export SPARK_LIBRARY_PATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$HADOOP_HOME/lib/native

SPARK_MASTER_HOST=10.20.24.199

# Web UI port

SPARK_MASTER_WEBUI_PORT=28686

# Spark local (scratch) directory

SPARK_LOCAL_DIRS=/hadoopdata1/sparkdata/local

# Worker directory

SPARK_WORKER_DIR=/hadoopdata1/sparkdata/work

# Driver memory

SPARK_DRIVER_MEMORY=4G

# CPU cores per worker

SPARK_WORKER_CORES=16

# Memory per worker

SPARK_WORKER_MEMORY=64g

# Spark log directory

SPARK_LOG_DIR=/var/lib/myspark/spark/logs
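After editing conf/spark-env.sh, copy it to every worker (adjust the destination if Spark lives elsewhere, as in this example) and restart the cluster so the settings take effect:

scp conf/spark-env.sh hadoop@cancer02:/usr/local/spark/conf/ (repeat for each worker)

$SPARK_HOME/sbin/stop-all.sh

$SPARK_HOME/sbin/start-all.sh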