TensorFlowOnSpark: Introduction and Setup

阿新 • Published: 2018-11-01

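1. Introduction

TensorFlowOnSpark brings scalable deep learning to Apache Hadoop and Apache Spark clusters. By combining the salient features of the deep learning framework TensorFlow with those of the big-data frameworks Apache Spark and Apache Hadoop, TensorFlowOnSpark enables distributed deep learning on clusters of GPU and CPU servers.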

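2. What use case does it address?

It lets you run deep learning with TensorFlow on existing Spark and Hadoop clusters, without setting up a separate cluster just for deep learning.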

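3. What are the core technical points?
Easily migrate all existing TensorFlow programs with fewer than 10 lines of code changed;
Support all TensorFlow functionality: synchronous/asynchronous training, model/data parallelism, inference, and TensorBoard;
Direct server-to-server communication gives faster learning when available;
Allow datasets on HDFS and other sources to be pushed by Spark or pulled by TensorFlow;
Easily integrate with your existing data processing pipelines and machine learning algorithms (e.g., MLlib, CaffeOnSpark);
Easily deploy in the cloud or on-premises: CPU and GPU, Ethernet and InfiniBand.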
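
The "fewer than 10 lines" point refers to wrapping an existing TensorFlow main function so that Spark executors can run it. Here is a minimal sketch, assuming the TFCluster.run / train / shutdown calls as described in the project's README; the helper name run_on_spark and its parameter names are hypothetical, and the exact signature may differ between releases.

```python
def run_on_spark(sc, data_rdd, main_fun, args, num_executors, num_ps, epochs):
    """Launch an existing TensorFlow main function on Spark executors.

    Hypothetical helper: sc is a SparkContext, data_rdd is an RDD of training
    records, and main_fun(args, ctx) is the existing TensorFlow program.
    """
    # Imported lazily so this file can be loaded on a machine without the library.
    from tensorflowonspark import TFCluster

    # Start one TensorFlow node per executor; num_ps of them act as parameter servers.
    cluster = TFCluster.run(
        sc, main_fun, args, num_executors, num_ps,
        tensorboard=False,
        input_mode=TFCluster.InputMode.SPARK,  # Spark pushes RDD partitions to TensorFlow
    )
    cluster.train(data_rdd, epochs)  # feed the RDD for the given number of epochs
    cluster.shutdown()               # wait for the workers to finish, then stop
```

With InputMode.SPARK the data is pushed by Spark; the other mode lets TensorFlow pull its own input (for example directly from HDFS), in which case the train call is not needed.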

4. Comparison with similar projects

Compared with the Caffe-based CaffeOnSpark, the TensorFlow-based TensorFlowOnSpark supports more models.

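5. Advantages and disadvantages

Advantage: TensorFlowOnSpark builds on Google's TensorFlow, which comes with a complete, content-rich set of tutorials.
Disadvantage: it has not been open source for long and has not yet been thoroughly validated.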

6. Outlook

Because TensorFlow has a large user base, people who need to do deep learning on Spark or HDFS will tend to choose TensorFlowOnSpark as well. The outlook should be good.

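7. Setup
The official example is full of pitfalls and is hard to get working. The notes below mainly cover the problems I ran into.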

Download the source with git clone:

git@github.com:yahoo/tensorflow.git
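Running the official example
1. Install Python 2.7, then:
    -- install pip
    -- install the pydoop library (for using Python on Hadoop)
    -- install the numpy library
    -- install the TensorFlow library
Problems:
1. pip cannot install pydoop. It fails when Hadoop is not installed, and it may still fail after Hadoop is installed. In that case, download the matching package and install it via its setup script.
2. pip cannot install TensorFlow. pip checks whether numpy is installed during the process; if the failure is caused by numpy, install numpy with pip first.
3. import tensorflow fails with glibc version errors. A newer OS is recommended: CentOS 6 only supports glibc up to 2.12, and even if the installation succeeds you may still see errors such as GLIBC.XXX.
4. Installing pip fails. The Python build was missing dependency libraries (such as zlib); the install process reports which ones. Once they are installed, it works.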
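
Before moving on, it can help to check in one pass which of the libraries actually import and which libc the interpreter reports. The following is only a sketch: check_imports and version_tuple are hypothetical helper names, and it relies solely on the standard library, so it runs on both Python 2.7 and Python 3.

```python
import importlib
import platform


def check_imports(names):
    """Try to import each module; return {name: error message or None}."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:
            # A glibc mismatch usually shows up here as an ImportError
            # mentioning a GLIBC_x.y version that was not found.
            results[name] = str(exc)
    return results


def version_tuple(text):
    """Turn '2.12' into (2, 12) so versions compare numerically."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


if __name__ == "__main__":
    libc_name, libc_version = platform.libc_ver()
    print("libc: %s %s" % (libc_name or "unknown", libc_version or "unknown"))
    report = check_imports(["numpy", "pydoop", "tensorflow"])
    for name in sorted(report):
        error = report[name]
        print("%s: %s" % (name, "OK" if error is None else "FAILED: " + error))
```

Comparing version_tuple(libc_version) against the GLIBC version named in the import error shows whether the operating system is simply too old for the TensorFlow binary.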

2. Install and compile TensorFlow w/ RDMA support (the links in parentheses are references)

    -- Install protoc 3.1 (https://github.com/google/protobuf/releases)
        -- 1. Download the matching package (java)
        -- 2. Build and install, running in order: ./autogen.sh, ./configure --prefix=/usr/local/protobuf, make, make check, make install, ldconfig (http://www.itdadao.com/articles/c15a1006495p0.html)
    -- Compile TensorFlow's protos (https://github.com/tensorflow/ecosystem/tree/master/hadoop)
        -- protoc --proto_path=/opt/TensorFlowOnSpark/tensorflow/ --java_out=src/main/java/ /opt/TensorFlowOnSpark/tensorflow/tensorflow/core/example/{example,feature}.proto  (run this under ecosystem/hadoop/)
        -- mvn clean package, then mvn install
        -- hadoop fs -put tensorflow-hadoop-1.0-SNAPSHOT.jar

Run the example with the following command:

${SPARK_HOME}/bin/spark-submit \
    --master yarn-cluster \
    --deploy-mode cluster \
    --queue ${QUEUE} \
    --num-executors 4 \
    --executor-memory 1G \
    --archives hdfs:///user/${USER}/Python.zip#Python,/root/mnist/mnist.zip#mnist \
    TensorFlowOnSpark-master/examples/mnist/mnist_data_setup.py \
    --output mnist/csv \
    --format csv
(http://www.jianshu.com/p/72cb5816a0f7)
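
If you end up re-running this step many times while debugging, it may be easier to assemble the same arguments in a small script. This is a sketch only: build_submit_cmd is a hypothetical helper, and its values simply mirror the spark-submit command from this section.

```python
import os


def build_submit_cmd(spark_home, queue, user, num_executors=4, executor_memory="1G"):
    """Return the spark-submit argument list for the MNIST data-setup step."""
    archives = ",".join([
        # "path#alias": the unpacked archive is exposed under the name "alias"
        # in each executor's working directory.
        "hdfs:///user/%s/Python.zip#Python" % user,
        "/root/mnist/mnist.zip#mnist",
    ])
    return [
        os.path.join(spark_home, "bin", "spark-submit"),
        "--master", "yarn-cluster",
        "--deploy-mode", "cluster",
        "--queue", queue,
        "--num-executors", str(num_executors),
        "--executor-memory", executor_memory,
        "--archives", archives,
        "TensorFlowOnSpark-master/examples/mnist/mnist_data_setup.py",
        "--output", "mnist/csv",
        "--format", "csv",
    ]
```

The returned list can be passed directly to subprocess.check_call, which avoids shell quoting problems with the # characters in the archive aliases.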
