
Compiling Spark from Source


Preparing the build environment

      1. Install JDK 1.8 and configure its environment variables
      2. Install Maven and configure its environment variables
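
Before starting a long build it can help to confirm that both tools are actually reachable from the shell. The following is a minimal sketch, not part of the original steps; the helper name check_tool is hypothetical.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: it reports whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1 -> $(command -v "$1")"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# The two prerequisites named above: the JDK (java) and Maven (mvn).
for tool in java mvn; do
    check_tool "$tool" || echo "install $tool and add it to PATH before building"
done
```

If both lines start with "found:", running java -version and mvn -version afterwards shows which versions will be used for the build.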

Download and unpack the Spark source

[root@MySQL ~]# wget https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0.tgz

[root@MySQL ~]# tar zxvf spark-2.4.0.tgz 

[root@MySQL ~]# cd spark-2.4.0

Modify the configuration

[root@MySQL ~]# cd spark-2.4.0/dev

Edit the file make-distribution.sh

Add the following lines

VERSION=2.4.0
SCALA_VERSION=2.12.8
SPARK_HADOOP_VERSION=2.6.0-cdh5.14.0
SPARK_HIVE=1

Comment out the following lines

#VERSION=$("$MVN" help:evaluate -Dexpression=project.version $@ 2>/dev/null | grep -v "INFO" | tail -n 1)
#SCALA_VERSION=$("$MVN" help:evaluate -Dexpression=scala.binary.version $@ 2>/dev/null\
#    | grep -v "INFO"\
#    | tail -n 1)
#SPARK_HADOOP_VERSION=$("$MVN" help:evaluate -Dexpression=hadoop.version $@ 2>/dev/null\
#    | grep -v "INFO"\
#    | tail -n 1)
#SPARK_HIVE=$("$MVN" help:evaluate -Dexpression=project.activeProfiles -pl sql/hive $@ 2>/dev/null\
#    | grep -v "INFO"\
#    | fgrep --count "<id>hive</id>";\
#    # Reset exit status to 0, otherwise the script stops here if the last grep finds nothing\
#    # because we use "set -o pipefail"
#    echo -n)
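
Hardcoding these four values means the script no longer asks Maven to evaluate them, which is the point of the edit. A quick way to check that the edit took effect is sketched below. This is an illustration only: check_hardcoded is a hypothetical helper, and it assumes each variable is assigned at the start of a line.

```shell
#!/bin/sh
# check_hardcoded is a hypothetical helper. It succeeds only if every one of
# the four variables has an active (uncommented) assignment whose value does
# not start with "$", i.e. it is a literal rather than a $(mvn ...) call.
check_hardcoded() {
    # $1 = path to make-distribution.sh
    for var in VERSION SCALA_VERSION SPARK_HADOOP_VERSION SPARK_HIVE; do
        if ! grep -Eq "^${var}=[^\$]" "$1"; then
            echo "not hardcoded: $var" >&2
            return 1
        fi
    done
}

# Run from the top of the source tree; skipped if the script is not there.
script=dev/make-distribution.sh
if [ -f "$script" ]; then
    check_hardcoded "$script" && echo "all four variables are hardcoded"
fi
```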

Add Maven repositories by editing pom.xml

Change the Hadoop, Flume, and ZooKeeper versions

<hadoop.version>2.6.0-cdh5.14.0</hadoop.version>

<flume.version>1.6.0-cdh5.14.0</flume.version>

<zookeeper.version>3.4.5-cdh5.14.0</zookeeper.version>
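
To confirm the three properties were changed, the values can be read back from pom.xml. The sketch below assumes each property sits on a single line, as shown above; pom_property is a hypothetical helper, and it returns only the first occurrence, so a value overridden later in a profile would not be seen.

```shell
#!/bin/sh
# pom_property is a hypothetical helper: it prints the text between
# <name> and </name> for the first line in the file that contains both.
pom_property() {
    # $1 = property name, $2 = path to pom.xml
    sed -n "s:.*<$1>\(.*\)</$1>.*:\1:p" "$2" | head -n 1
}

# Run from the top of the source tree; skipped if pom.xml is not there.
if [ -f pom.xml ]; then
    for prop in hadoop.version flume.version zookeeper.version; do
        echo "$prop = $(pom_property "$prop" pom.xml)"
    done
fi
```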

Add the CDH (Cloudera) and Aliyun repositories inside the <repositories> element

<repository>
  <id>cloudera</id>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
<repository>
  <id>aliyun</id>
  <name>aliyun</name>
  <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
  <layout>default</layout>
  <releases>
    <enabled>true</enabled>
    <updatePolicy>never</updatePolicy>
  </releases>
  <snapshots>
    <enabled>true</enabled>
    <updatePolicy>never</updatePolicy>
  </snapshots>
</repository>

Add a pluginRepository inside the <pluginRepositories> element

<pluginRepository>
  <id>cloudera</id>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</pluginRepository>

Run the build command

[root@MySQL ~]# cd spark-2.4.0/

[root@MySQL ~]# export MAVEN_OPTS="-Xmx4g -XX:ReservedCodeCacheSize=1024m"

[root@MySQL ~]# ./dev/make-distribution.sh --name 2.6.0-cdh5.14.0 --tgz  -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver -Dhadoop.version=2.6.0-cdh5.14.0 -DskipTests
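
With --tgz, the script should leave a tarball in the top of the source tree whose name combines the hardcoded VERSION and the value passed to --name. The sketch below assumes the spark-VERSION-bin-NAME.tgz naming used by make-distribution.sh; dist_tarball is a hypothetical helper.

```shell
#!/bin/sh
# dist_tarball is a hypothetical helper: it builds the expected tarball
# name from the Spark version and the value given to --name.
dist_tarball() {
    # $1 = VERSION, $2 = --name value
    echo "spark-$1-bin-$2.tgz"
}

tarball=$(dist_tarball 2.4.0 2.6.0-cdh5.14.0)
if [ -f "$tarball" ]; then
    # List the first few entries to confirm the archive is readable.
    tar tzf "$tarball" | head -n 5
else
    echo "not found (build may still be running or may have failed): $tarball"
fi
```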
