Spark Distributed Environment Setup
1. Scala environment setup
1) Download the Scala package scala-2.12.10.tgz and extract it under /usr/scala:
[root@hadoop001 scala]# tar -zxvf scala-2.12.10.tgz
[root@hadoop001 scala]# ln -s scala-2.12.10 scala
2) Add the Scala environment variables to /etc/profile:
export SCALA_HOME=/usr/scala/scala
export PATH=$SCALA_HOME/bin:$PATH
3) Save the file and reload it:
[root@hadoop001 scala]# source /etc/profile
4) Verify with the scala -version command:
[root@hadoop001 scala]# scala -version
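If the environment variables are set correctly, the command prints the installed version; for this package the output should look roughly like:
Scala code runner version 2.12.10 -- Copyright 2002-2019, LAMP/EPFL and Lightbend, Inc.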
2. Spark installation
2.1 Extract
[hadoop@hadoop001 software]$ tar -zxvf spark-2.4.6-bin-2.6.0-cdh5.16.2.tgz -C ~/app/
Create a symlink:
[hadoop@hadoop001 app]$ ln -s spark-2.4.6-bin-2.6.0-cdh5.16.2/ spark
2.2 Update the environment configuration
[hadoop@hadoop001 app]$ vi /home/hadoop/.bashrc
#spark
export SPARK_HOME=/home/hadoop/app/spark
export PATH=$PATH:$SPARK_HOME/bin
Edit the Spark configuration file:
[hadoop@hadoop001 conf]$ cp spark-env.sh.template spark-env.sh
Then add the following to spark-env.sh:
export JAVA_HOME=/usr/java/jdk
export SCALA_HOME=/usr/scala/scala
export HADOOP_HOME=/home/hadoop/app/hadoop
export HADOOP_CONF_DIR=/home/hadoop/app/hadoop/etc/hadoop
export SPARK_MASTER_IP=192.168.1.148
export SPARK_MASTER_HOST=192.168.1.148
#export SPARK_LOCAL_IP=11.24.24.112
#export SPARK_LOCAL_IP=11.24.24.113
export SPARK_LOCAL_IP=0.0.0.0
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_CORES=2
export SPARK_HOME=/home/hadoop/app/spark
export SPARK_DIST_CLASSPATH=$(/home/hadoop/app/hadoop/bin/hadoop classpath)
2.3 Edit slaves
[hadoop@hadoop001 conf]$ mv slaves.template slaves
[hadoop@hadoop001 conf]$ vim slaves
Remove localhost and list the worker hosts:
hadoop001
hadoop002
hadoop003
2.4 Configure the environment on hadoop002 and hadoop003
Append the same entries to /home/hadoop/.bashrc on each node:
#spark
export SPARK_HOME=/home/hadoop/app/spark
export PATH=$PATH:$SPARK_HOME/bin
source .bashrc
2.5 Copy Spark to hadoop002 and hadoop003
[hadoop@hadoop001 ~]$ scp -r /home/hadoop/app/spark-2.4.6-bin-2.6.0-cdh5.16.2 hadoop002:/home/hadoop/app/
[hadoop@hadoop001 ~]$ scp -r /home/hadoop/app/spark-2.4.6-bin-2.6.0-cdh5.16.2 hadoop003:/home/hadoop/app/
Create the symlink on each of the two nodes:
[hadoop@hadoop002 app]$ ln -s spark-2.4.6-bin-2.6.0-cdh5.16.2/ spark
2.6 Adjust spark-env.sh on hadoop002 and hadoop003
[hadoop@hadoop002 conf]$ pwd
/home/hadoop/app/spark/conf
[hadoop@hadoop002 conf]$ vim spark-env.sh
Change SPARK_LOCAL_IP to each node's own IP:
export SPARK_LOCAL_IP=192.168.1.183
export SPARK_LOCAL_IP=192.168.1.175
3. Distribute Scala
[root@hadoop001 usr]# scp -r /usr/scala/ hadoop002:/usr/
[root@hadoop001 usr]# scp -r /usr/scala/ hadoop003:/usr/
[root@hadoop001 usr]# scp /etc/profile hadoop002:/etc/
profile 100% 2016 890.7KB/s 00:00
[root@hadoop001 usr]# scp /etc/profile hadoop003:/etc/
profile
[root@hadoop002 ~]# source /etc/profile
[root@hadoop003 ~]# source /etc/profile
4. Start the cluster
[hadoop@hadoop001 spark]$ sbin/start-all.sh
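To confirm the cluster came up, you can check the running daemons with jps and open the master's web UI; the URL below assumes the master IP configured in spark-env.sh and the default standalone UI port 8080:
[hadoop@hadoop001 spark]$ jps
# expect a Master process here (plus a Worker, since hadoop001 is also listed in slaves)
[hadoop@hadoop002 ~]$ jps
# expect a Worker process on each worker node
Master web UI: http://192.168.1.148:8080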
Spark IDEA configuration
Check the official site for the Scala version that matches your Spark version.
Create a Spark module in IDEA, then configure the pom file:
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>2.4.6</version>
</dependency>
</dependencies>
<build>
<plugins>
<!-- This plugin compiles the Scala sources into class files -->
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.2</version>
<executions>
<execution>
<!-- Bind the compile and testCompile goals to Maven's build lifecycle -->
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>3.0.0</version>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
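With the assembly plugin bound to the package phase as above, a normal Maven build also produces a fat jar next to the regular artifact; the exact file names depend on your module's artifactId and version:
mvn clean package
# target/<artifactId>-<version>.jar
# target/<artifactId>-<version>-jar-with-dependencies.jar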
After importing the project, download and install Scala:
https://www.scala-lang.org/download/
Then install the Scala plugin from IDEA's settings:
Open Settings > Plugins, search for scala, and install it.
If the installation fails, install the plugin from a local file instead (downloading through a VPN can be faster):
https://plugins.jetbrains.com/plugin/1347-scala
In the Plugins page, click the gear button in the top-right corner and choose "Install Plugin from Disk...".
Pick the plugin version that matches your IDEA version.
Then configure the Scala SDK:
Press Ctrl+Shift+Alt+S to open Project Structure,
and add the Scala SDK under Global Libraries.
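Once the module builds, one way to test it against the standalone cluster is to submit the fat jar with spark-submit. The main class and jar path below are placeholders for whatever you actually write and build; the master URL assumes the master IP configured earlier and the default standalone port 7077:
[hadoop@hadoop001 spark]$ bin/spark-submit \
  --master spark://192.168.1.148:7077 \
  --class com.example.WordCount \
  /home/hadoop/app/jars/spark-demo-jar-with-dependencies.jar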