
Spark Job Server 0.7.0 Deployment and Usage

## Installing Scala

Download a suitable version from the Scala official site, extract it to /usr/local/scala (the directory can be changed as needed), and add it to the PATH on Linux:

export PATH="$PATH:/usr/scala/bin"

Run the scala command to check that the installation succeeded.
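For example, assuming the export line above was appended to /etc/profile:

# reload the profile in the current shell, then print the Scala version
source /etc/profile
scala -version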

## Installing sbt Manually

Download sbt from the official site (either the zip or the tgz archive works), extract it to /usr/local/sbt, and create a new file named sbt in /usr/local/sbt:

cd /usr/local/sbt
vi sbt

Enter the following (the -XX:MaxPermSize=256M option can be removed on Java 1.8):

SBT_OPTS="-Xms512M -Xmx1536M -Xss1M -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=256M"
java $SBT_OPTS -jar /usr/local/sbt/bin/sbt-launch.jar "
[email protected]
"

Configure the repositories:

vi ~/.sbt/repositories

Enter the following:

[repositories]
  local
  aliyun-nexus: http://maven.aliyun.com/nexus/content/groups/public/
  #或者oschina: http://maven.oschina.net/content/groups/public/
  jcenter: http://jcenter.bintray.com/
  typesafe-ivy-releases: http://repo.typesafe.com/typesafe/ivy-releases/, [organization]/[module]/[revision]/[type]s/[artifact](-[classifier]).[ext], bootOnly
  maven-central: http://repo1.maven.org/maven2/

Configure the environment variables:

export SBT_HOME=/usr/local/sbt
export PATH=$PATH:$SBT_HOME

Run the sbt command to check that the installation succeeded. On its first run sbt downloads its dependencies automatically; once the sbt console prompt appears, the setup is complete:

sbt:sbt> 

## Deploying spark-jobserver

### Configuration

cd /home/hadoop/application/spark-jobserver/conf
cp local.conf.template local.conf
cp local.sh.template local.sh
Then edit local.sh:

#!/usr/bin/env bash

# Environment and deploy file
# For use with bin/server_deploy, bin/server_package etc.
# Hosts to deploy to over SSH; IP addresses can also be used
DEPLOY_HOSTS="dashuju213
              dashuju214"

# User and group used for the SSH deployment
APP_USER=hadoop
APP_GROUP=hadoop
JMX_PORT=9999
# optional SSH Key to login to deploy server
#SSH_KEY=/path/to/keyfile.pem
# Installation directory on the deploy hosts
INSTALL_DIR=/home/hadoop/application/jobserver
# Log directory
LOG_DIR=/home/hadoop/application/jobserver/logs
PIDFILE=spark-jobserver.pid
JOBSERVER_MEMORY=1G
# Spark version
SPARK_VERSION=2.3.2
MAX_DIRECT_MEMORY=512M
# Spark home directory
SPARK_HOME=/home/hadoop/application/spark
SPARK_CONF_DIR=$SPARK_HOME/conf
# Scala version
SCALA_VERSION=2.11.12

Configure the database. vi local.conf; only the settings that need to be modified are listed:

# also add the following line at the root level.
flyway.locations="db/mysql/migration"

spark {
  # local[...], yarn, mesos://... or spark://...
  master = "spark://dashuju213:6066,dashuju214:6066"

  # client or cluster deployment
  submit.deployMode = "cluster"

  # Default # of CPUs for jobs to use for Spark standalone cluster
  job-number-cpus = 2
  
  jobserver {
    ...
    sqldao {
      # Slick database driver, full classpath
      slick-driver = slick.driver.MySQLDriver

      # JDBC driver, full classpath
      jdbc-driver = com.mysql.jdbc.Driver

      jdbc {
        url = "jdbc:mysql://db_host/spark_jobserver"
        user = "jobserver"
        password = "secret"
      }

      dbcp {
        maxactive = 20
        maxidle = 10
        initialsize = 10
      }
    }
  }
}
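The database and account referenced above must exist before the first start. A minimal sketch, assuming the placeholder values from the config (database spark_jobserver, user jobserver, password secret) and that the mysql client is available on the database host:

# create the job server database and account on the MySQL server
mysql -u root -p -e "CREATE DATABASE spark_jobserver;"
mysql -u root -p -e "CREATE USER 'jobserver'@'%' IDENTIFIED BY 'secret';"
mysql -u root -p -e "GRANT ALL PRIVILEGES ON spark_jobserver.* TO 'jobserver'@'%'; FLUSH PRIVILEGES;"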

Configure passwordless SSH login (see the sketch after the loop below). Also configure the SSH port: port 22 is used by default and can be changed as needed. vi server_deploy.sh:

for host in $DEPLOY_HOSTS; do
  # We assume that the deploy user is APP_USER and has permissions
  ssh -p 2222 -o StrictHostKeyChecking=no $ssh_key_to_use  ${APP_USER}@$host mkdir -p $INSTALL_DIR
  scp -P 2222 -o StrictHostKeyChecking=no $ssh_key_to_use  $FILES ${APP_USER}@$host:$INSTALL_DIR/
  scp -P 2222 -o StrictHostKeyChecking=no $ssh_key_to_use  "$CONFIG_DIR/$ENV.conf" ${APP_USER}@$host:$INSTALL_DIR/
  scp -P 2222 -o StrictHostKeyChecking=no $ssh_key_to_use  "$configFile" ${APP_USER}@$host:$INSTALL_DIR/settings.sh
done
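Passwordless login itself can be set up roughly as follows, assuming the hadoop user and the two hosts from local.sh, plus the non-default port 2222 used above:

# generate a key pair once (accept the defaults), then push the public key to each deploy host
ssh-keygen -t rsa
ssh-copy-id -p 2222 hadoop@dashuju213
ssh-copy-id -p 2222 hadoop@dashuju214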

### Deployment

Go to the bin directory and run the deploy command:

./server_deploy.sh local

When it finishes, go to the directory specified by INSTALL_DIR and use server_start.sh and server_stop.sh to start and stop the server.
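For example (8090 is the spark-jobserver default port; adjust it if you changed it in local.conf):

cd /home/hadoop/application/jobserver
./server_start.sh
# check that the REST API is answering
curl http://dashuju213:8090/binaries
./server_stop.sh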

### Problems Encountered

#### Startup Problem

Because I had configured master and deployMode in spark-defaults.conf, server_start.sh needs to be modified to use the desired settings:

cmd='$SPARK_HOME/bin/spark-submit --master local[1] --deploy-mode

#### Database Initialization Failure

Modify spark-jobserver\spark-jobserver-master\job-server\src\main\resources\db\mysql\migration\V0_7_2\V0_7_2__convert_binaries_table_to_use_milliseconds.sql. You can then rerun the deploy command, or modify the file directly inside the jar:

ALTER TABLE `BINARIES` MODIFY COLUMN `UPLOAD_TIME` TIMESTAMP;

Validate failed. Migration Checksum mismatch for migration 0.7.2: this is caused by the failed initialization. Drop all tables in the database and reinitialize.
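A quick way to reset, assuming the dedicated spark_jobserver database from local.conf and that nothing else lives in it:

# drop and recreate the whole database, then restart the job server so the migrations run again
mysql -u root -p -e "DROP DATABASE spark_jobserver; CREATE DATABASE spark_jobserver;"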

#### java.lang.ClassNotFoundException: akka.event.slf4j.Slf4jLogger

Modify project/Dependencies.scala:

    "com.typesafe.akka" %% "akka-slf4j" % akka % "provided",
    ...
    "io.spray" %% "spray-routing" % spray,

Change it to:

    "com.typesafe.akka" %% "akka-slf4j" % akka,
    ...
    "io.spray" %% "spray-routing-shapeless23" % "1.3.4",

Add the following to project/Versions.scala:

  lazy val mysql = "5.1.42"

### Usage

#### Running spark-sql

Modify local.conf:

spark {
  jobserver {
    # Automatically load a set of jars at startup time.  Key is the appName, value is the path/URL.
    job-binary-paths {    # NOTE: you may need an absolute path below
      sql = job-server-extras/target/scala-2.10/job-server-extras_2.10-0.6.2-SNAPSHOT-tests.jar
    }
  }
  
  contexts {
    sql-context {
      num-cpu-cores = 1           # Number of cores to allocate.  Required.
      memory-per-node = 512m         # Executor memory per node, -Xmx style eg 512m, 1G, etc.
      context-factory = spark.jobserver.context.HiveContextFactory
    }
  }  
}
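With this in place, the SQL context and a test job can be driven over the REST API. This is only a sketch: 8090 is the default port, and spark.jobserver.HiveTestJob is one of the test jobs shipped in the job-server-extras tests jar loaded above under appName sql, so verify the class path against your build:

# create the pre-configured sql-context (its settings come from the contexts block above)
curl -d "" 'http://dashuju213:8090/contexts/sql-context?context-factory=spark.jobserver.context.HiveContextFactory'
# run a SQL statement synchronously in that context
curl -d 'sql = "show databases"' 'http://dashuju213:8090/jobs?appName=sql&classPath=spark.jobserver.HiveTestJob&context=sql-context&sync=true'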