Compiling Spark from Source
阿新 • Published: 2019-03-06
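Preparing the build environment
- Install JDK 1.8 and configure its environment variables
- Install Maven and configure its environment variables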
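Before going further, it can help to confirm that both tools are actually reachable from the shell. The following is a minimal sketch, not part of the original procedure; the helper name check_tool is hypothetical, and it only checks that the commands are on PATH, not which versions they are.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command can be found on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check the two tools this build needs; keep going even if one is missing.
check_tool java || true
check_tool mvn || true

# JAVA_HOME is expected to be set after configuring the JDK.
if [ -n "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME=$JAVA_HOME"
else
    echo "JAVA_HOME is not set"
fi
```

If both tools are found, running java -version and mvn -version shows the versions in use.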
Download the Spark source and extract it
[root@MySQL ~]# wget https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0.tgz
[root@MySQL ~]# tar zxvf spark-2.4.0.tgz
[root@MySQL ~]# cd spark-2.4.0
Modify the configuration
[root@MySQL ~]# cd spark-2.4.0/dev
Edit the file make-distribution.sh
Add the following lines
VERSION=2.4.0
SCALA_VERSION=2.12.8
SPARK_HADOOP_VERSION=2.6.0-cdh5.14.0
SPARK_HIVE=1
Comment out the following lines
#VERSION=$("$MVN" help:evaluate -Dexpression=project.version $@ 2>/dev/null | grep -v "INFO" | tail -n 1)
#SCALA_VERSION=$("$MVN" help:evaluate -Dexpression=scala.binary.version $@ 2>/dev/null
#    | grep -v "INFO"
#    | tail -n1)
#SPARK_HADOOP_VERSION=$("$MVN" help:evaluate -Dexpression=hadoop.version $@ 2>/dev/null
#    | grep -v "INFO"
#    | tail -n 1)
#SPARK_HIVE=$("$MVN" help:evaluate -Dexpression=project.activeProfiles -pl sql/hive $@ 2>/dev/null
#    | grep -v "INFO"
#    | fgrep --count "<id>hive</id>";\
#    # Reset exit status to 0, otherwise the script stops here if the last grep finds nothing
#    # because we use "set -o pipefail"
#    echo -n)
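One way to sanity-check the edit is to count the Maven help:evaluate calls that are still active in the script. This is only a sketch: the function name active_evaluate_calls is hypothetical, and it looks only at lines that contain help:evaluate, so it will not notice a continuation line (such as a stray "| grep" line) that was left uncommented.

```shell
#!/bin/sh
# Hypothetical helper: count lines that mention help:evaluate and are
# not commented out (i.e. do not start with optional whitespace and '#').
active_evaluate_calls() {
    grep -v '^[[:space:]]*#' "$1" | grep -c 'help:evaluate' || true
}

# Path assumed relative to the spark-2.4.0 source directory.
SCRIPT=dev/make-distribution.sh
if [ -f "$SCRIPT" ]; then
    echo "active help:evaluate calls: $(active_evaluate_calls "$SCRIPT")"
    # The hard-coded values added earlier should show up here.
    grep -n -E '^(VERSION|SCALA_VERSION|SPARK_HADOOP_VERSION|SPARK_HIVE)=' "$SCRIPT" || true
else
    echo "not found: $SCRIPT"
fi
```

A count of 0 suggests that all four lookups have been commented out.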
Add Maven repositories by editing pom.xml
Change the Hadoop, Flume, and ZooKeeper versions
<hadoop.version>2.6.0-cdh5.14.0</hadoop.version>
<flume.version>1.6.0-cdh5.14.0</flume.version>
<zookeeper.version>3.4.5-cdh5.14.0</zookeeper.version>
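To confirm that the version properties were changed as intended, the values can be read back out of pom.xml. The sketch below uses a hypothetical helper, pom_property, which assumes each property sits on a single line in the form <name>value</name>; it is a plain text match, not a real XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the first value of a single-line
# <name>value</name> property found in the given file.
#   $1 = path to pom.xml   $2 = property name, e.g. hadoop.version
pom_property() {
    sed -n "s|.*<$2>\(.*\)</$2>.*|\1|p" "$1" | head -n 1
}

# Run from the spark-2.4.0 source directory, where pom.xml lives.
if [ -f pom.xml ]; then
    for prop in hadoop.version flume.version zookeeper.version; do
        echo "$prop = $(pom_property pom.xml "$prop")"
    done
else
    echo "pom.xml not found in $(pwd)"
fi
```

After the edit, each of the three properties should print a value ending in cdh5.14.0.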
Add the CDH and Aliyun repository addresses
<repository>
  <id>cloudera</id>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
<repository>
  <id>aliyun</id>
  <name>aliyun</name>
  <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
  <layout>default</layout>
  <releases>
    <enabled>true</enabled>
    <updatePolicy>never</updatePolicy>
  </releases>
  <snapshots>
    <enabled>true</enabled>
    <updatePolicy>never</updatePolicy>
  </snapshots>
</repository>
Add a pluginRepository
<pluginRepository>
  <id>cloudera</id>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</pluginRepository>
Run the build command
[root@MySQL ~]# cd spark-2.4.0/
[root@MySQL ~]# export MAVEN_OPTS="-Xmx4g -XX:ReservedCodeCacheSize=1024m"
[root@MySQL ~]# ./dev/make-distribution.sh --name 2.6.0-cdh5.14.0 --tgz -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver -Dhadoop.version=2.6.0-cdh5.14.0 -DskipTests
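Once the build finishes, the packaged distribution should appear in the source directory. The sketch below assumes the script names the archive spark-VERSION-bin-NAME.tgz, where VERSION is the value hard-coded earlier and NAME is the --name argument; that pattern is an assumption about make-distribution.sh and is worth checking against the script itself. The helper expected_tarball is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the archive name from VERSION and --name.
# Assumes the pattern spark-<version>-bin-<name>.tgz.
expected_tarball() {
    printf 'spark-%s-bin-%s.tgz\n' "$1" "$2"
}

TARBALL=$(expected_tarball 2.4.0 2.6.0-cdh5.14.0)

# Run from the spark-2.4.0 source directory after the build.
if [ -f "$TARBALL" ]; then
    echo "built: $TARBALL"
    # List the first few entries to confirm the archive is readable.
    tar tzf "$TARBALL" | head -n 5
else
    echo "not found yet: $TARBALL"
fi
```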