Keras Chinese Documentation Notes 17: Using Keras as a Simplified Interface to TensorFlow

阿新 · Published: 2019-01-01

Using Keras as a Simplified Interface to TensorFlow

Calling Keras layers on TensorFlow tensors

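Let's start with a simple example: MNIST digit classification. We will build a TensorFlow classifier from a stack of Keras fully connected (Dense) layers. First, create a TensorFlow session and register it with Keras: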

import tensorflow as tf
sess = tf.Session()

from keras import backend as K
K.set_session(sess)

Next, we start building the model with TensorFlow:

# this placeholder will contain our input digits, as flat vectors
img = tf.placeholder(tf.float32, shape=(None, 784))

We can then use Keras layers to speed up the model definition:

from keras.layers import Dense

# Keras layers can be called on TensorFlow tensors:
x = Dense(128, activation='relu')(img)  # fully-connected layer with 128 units and ReLU activation
x = Dense(128, activation='relu')(x)
preds = Dense(10, activation='softmax')(x)  # output layer with 10 units and a softmax activation

Define a placeholder for the labels and the loss function:

labels = tf.placeholder(tf.float32, shape=(None, 10))

from keras.objectives import categorical_crossentropy
loss = tf.reduce_mean(categorical_crossentropy(labels, preds))
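
To make the loss concrete, here is a minimal pure-Python sketch of what the mean categorical cross-entropy computes for one-hot labels. The helper names (`categorical_crossentropy_py`, `mean_loss`) and the toy batch are assumptions made for illustration and are not part of Keras or TensorFlow; the real ops work on tensors and apply their own numerical safeguards.

```python
import math

def categorical_crossentropy_py(label, pred, eps=1e-7):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    # Clip predictions away from 0 so that log() is always defined.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(label, pred))

def mean_loss(labels, preds):
    """Average the per-sample losses, mirroring tf.reduce_mean over the batch."""
    losses = [categorical_crossentropy_py(t, p) for t, p in zip(labels, preds)]
    return sum(losses) / len(losses)

# Toy batch of two samples with three classes (hypothetical data).
labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
preds = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]]
loss_value = mean_loss(labels, preds)
print(loss_value)
```

Because each true class is predicted with probability 0.5 in this toy batch, the mean loss is simply -log(0.5).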

Then we can train the model with a TensorFlow optimizer:

from tensorflow.examples.tutorials.mnist import input_data
mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
with sess.as_default():
    for i in range(100):
        batch = mnist_data.train.next_batch(50)
        train_step.run(feed_dict={img: batch[0],
                                  labels: batch[1]})

Finally, let's evaluate the model:

from keras.metrics import categorical_accuracy as accuracy

acc_value = accuracy(labels, preds)
with sess.as_default():
    print(acc_value.eval(feed_dict={img: mnist_data.test.images,
                                    labels: mnist_data.test.labels}))
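
As a rough illustration of what a categorical accuracy metric measures, the sketch below compares the arg-max of each prediction with the arg-max of the matching one-hot label. The function names and the toy data are hypothetical; this is not the Keras implementation, only the idea behind it.

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def categorical_accuracy_py(labels, preds):
    """Fraction of samples whose predicted class matches the one-hot label."""
    hits = [argmax(t) == argmax(p) for t, p in zip(labels, preds)]
    return sum(hits) / float(len(hits))

# Four toy samples with three classes (hypothetical data).
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
preds = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
acc = categorical_accuracy_py(labels, preds)
print(acc)
```

In this toy batch the third sample is misclassified, so three of the four predictions count as hits.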

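Here we are only using Keras as a shortcut for building functions (ops) that map tensors to tensors. The optimization itself is done entirely by a native TensorFlow optimizer rather than a Keras optimizer; we do not even need a Keras Model.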

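A note on native TensorFlow optimizers versus Keras optimizers: somewhat counterintuitively, Keras optimizers are reported to be roughly 5-10% faster than TensorFlow optimizers. In practice, a difference of this size hardly matters.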

Different behavior during training and testing

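Some Keras layers, such as BN (BatchNormalization) and Dropout, behave differently during training and testing. You can check whether a layer is one of them by printing layer.uses_learning_phase, a boolean that is True when the layer's behavior depends on the learning phase.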

If your model contains such layers, you need to specify which mode it should run in. The learning phase is accessible through the Keras backend:

from keras import backend as K
print(K.learning_phase())

Pass 1 (training mode) or 0 (test mode) for it in feed_dict to set the mode:

# train mode
train_step.run(feed_dict={img: batch[0], labels: batch[1], K.learning_phase(): 1})

For example, the following code adds Dropout layers to the previous model:

from keras.layers import Dropout
from keras import backend as K

img = tf.placeholder(tf.float32, shape=(None, 784))
labels = tf.placeholder(tf.float32, shape=(None, 10))

x = Dense(128, activation='relu')(img)
x = Dropout(0.5)(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
preds = Dense(10, activation='softmax')(x)

loss = tf.reduce_mean(categorical_crossentropy(labels, preds))

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
with sess.as_default():
    for i in range(100):
        batch = mnist_data.train.next_batch(50)
        train_step.run(feed_dict={img: batch[0],
                                  labels: batch[1],
                                  K.learning_phase(): 1})

acc_value = accuracy(labels, preds)
with sess.as_default():
    print(acc_value.eval(feed_dict={img: mnist_data.test.images,
                                    labels: mnist_data.test.labels,
                                    K.learning_phase(): 0}))
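
To see why the learning phase matters for a layer like Dropout, here is a small pure-Python sketch of "inverted" dropout. It assumes the usual formulation (drop units with probability `rate` during training and rescale the survivors, pass values through unchanged at test time); the `dropout` function and its `learning_phase` argument are illustrative stand-ins, not the Keras layer itself.

```python
import random

def dropout(values, rate, learning_phase, rng=random):
    """Inverted dropout on a list of floats.

    learning_phase=1 (training): zero each unit with probability `rate`
    and scale the survivors by 1 / (1 - rate).
    learning_phase=0 (testing): return the inputs unchanged.
    """
    if learning_phase == 0:
        return list(values)
    keep_prob = 1.0 - rate
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)  # fixed seed so that the run is repeatable
activations = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(activations, rate=0.5, learning_phase=1, rng=rng)
test_out = dropout(activations, rate=0.5, learning_phase=0)
print(train_out)
print(test_out)
```

In training mode every output is either zero or twice the input (for `rate=0.5`), while in test mode the activations are returned as they are. This is the switch that the value fed for K.learning_phase() controls.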

Compatibility with name scopes and device scopes

Keras layers and models are fully compatible with TensorFlow name scopes. For example:

from keras.layers import LSTM

x = tf.placeholder(tf.float32, shape=(None, 20, 64))
with tf.name_scope('block1'):
    y = LSTM(32, name='mylstm')(x)

The weights of our LSTM layer will then be named block1/mylstm_W_i, block1/mylstm_U, and so on.
Similarly, device scopes work as you would expect:

with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops / variables in the LSTM layer will live on GPU:0

Compatibility with graph scopes

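Any Keras layer or model that you define inside a TensorFlow graph scope will have all of its variables and operations created as part of that graph. For example, the following code works as you would expect: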

from keras.layers import LSTM
import tensorflow as tf

my_graph = tf.Graph()
with my_graph.as_default():
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops / variables in the LSTM layer are created as part of our graph

Compatibility with variable scopes

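Variable sharing should be done by calling the same Keras layer (or model) instance multiple times, not through TensorFlow variable scopes. A TensorFlow variable scope has no effect on a Keras layer or model. For more on weight sharing in Keras, see the section on shared layers in the Keras functional API guide.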

Keras shares weights by reusing the same layer or model object. Here is an example:

# instantiate a Keras layer
lstm = LSTM(32)

# instantiate two TF placeholders
x = tf.placeholder(tf.float32, shape=(None, 20, 64))
y = tf.placeholder(tf.float32, shape=(None, 20, 64))

# encode the two tensors with the *same* LSTM weights
x_encoded = lstm(x)
y_encoded = lstm(y)
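
The idea of sharing weights by reusing one object can be sketched without any framework. `TinyDense` below is a hypothetical stand-in for a layer (it is not a Keras class): its weights are created once in the constructor, and every call reads that same storage.

```python
class TinyDense(object):
    """Hypothetical stand-in for a layer: y = w * x + b on a list of floats."""

    def __init__(self, w, b):
        # The weights are created once, when the layer object is instantiated.
        self.weights = {'w': w, 'b': b}

    def __call__(self, inputs):
        # Every call reads the same weight storage; nothing is re-created here.
        return [self.weights['w'] * v + self.weights['b'] for v in inputs]

shared = TinyDense(w=2.0, b=1.0)
x_encoded = shared([1.0, 2.0])
y_encoded = shared([10.0, 20.0])

# Changing the weights through the single instance affects every later call.
shared.weights['w'] = 3.0
x_encoded_after = shared([1.0, 2.0])
print(x_encoded, y_encoded, x_encoded_after)
```

Creating a second `TinyDense` object would give a second, independent set of weights, which is why sharing is tied to the instance rather than to any naming scope.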

Collecting trainable weights and state updates

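Some Keras layers, such as stateful RNNs and BN layers, have internal updates that must be run as part of each training step. These updates are stored as a list of tensor tuples in layer.updates. You should generate assign ops for them so that they are executed at every training step. Here is an example: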

from keras.layers import BatchNormalization

layer = BatchNormalization()
y = layer(x)  # the layer must be called before layer.updates is populated

update_ops = []
for old_value, new_value in layer.updates:
    update_ops.append(tf.assign(old_value, new_value))

Note that if you are using a Keras model, model.updates behaves in the same way (it collects the updates of all layers in the model).
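
The shape of this "collect updates, then assign" pattern can be sketched in plain Python. The example assumes a normalization layer that tracks a running mean and variance with an exponential moving average; the `state` dictionary, the `momentum` value and both helper functions are hypothetical and only mimic the role of the (old_value, new_value) pairs.

```python
def moving_average_update(old_value, batch_value, momentum=0.99):
    """New running statistic after seeing one batch."""
    return momentum * old_value + (1.0 - momentum) * batch_value

# Hypothetical running statistics of a normalization layer.
state = {'moving_mean': 0.0, 'moving_variance': 1.0}

def collect_updates(state, batch_mean, batch_variance):
    """Return (name, new_value) pairs, playing the role of layer.updates."""
    return [
        ('moving_mean',
         moving_average_update(state['moving_mean'], batch_mean)),
        ('moving_variance',
         moving_average_update(state['moving_variance'], batch_variance)),
    ]

# One "training step": compute the updates, then apply them as assignments.
for name, new_value in collect_updates(state, batch_mean=2.0, batch_variance=4.0):
    state[name] = new_value

print(state)
```

If the assignments are never executed, the running statistics stay at their initial values, which is the failure mode that running the update ops at every training step avoids.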

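In addition, if you need to explicitly collect a layer's trainable weights, you can do so through layer.trainable_weights (or model.trainable_weights for a model), which is a list of TensorFlow Variable objects: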

from keras.layers import Dense

layer = Dense(32)  # instantiate a layer
y = layer(x)  # call it on a tensor, which builds its weights
print(layer.trainable_weights)  # list of TensorFlow Variables

With these pieces you can implement your own training routine on top of a TensorFlow optimizer.
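
As a minimal sketch of what such a training routine does with a list of trainable weights, the pure-Python loop below applies plain gradient descent to a toy linear model. Everything here (the `[w, b]` weight list, `loss_and_grads`, `sgd_step`, the learning rate) is an assumption made for illustration; a real routine would let the TensorFlow optimizer compute and apply the gradients.

```python
def sgd_step(weights, grads, learning_rate):
    """Apply one gradient descent update to every trainable weight."""
    return [w - learning_rate * g for w, g in zip(weights, grads)]

def loss_and_grads(weights, x, target):
    """Squared error of a linear model w * x + b, with its gradients."""
    w, b = weights
    error = (w * x + b) - target
    return error ** 2, [2.0 * error * x, 2.0 * error]

trainable_weights = [0.0, 0.0]  # hypothetical [w, b]
history = []
for _ in range(50):
    loss, grads = loss_and_grads(trainable_weights, x=1.0, target=3.0)
    history.append(loss)
    trainable_weights = sgd_step(trainable_weights, grads, learning_rate=0.1)

print(history[0], history[-1])
```

The recorded loss starts at 9.0 and shrinks towards zero as the two weights are updated together at each step.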

Using Keras models with TensorFlow

Converting a Keras Sequential model for use in a TensorFlow workflow

Suppose you already have a trained Keras model, such as VGG-16, and you want to use it in your TensorFlow workflow. How do you do that?

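First, note that if your pre-trained weights include convolutional layers trained with Theano, you need to convert (flip) the convolution kernels of those weights. This is because Theano and TensorFlow implement convolution differently: TensorFlow, like Caffe, actually computes a cross-correlation. See here for a detailed example.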
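
The difference between the two operations, and why flipping the kernel reconciles them, can be checked with a small pure-Python sketch. The three functions below are illustrative helpers written for this note (not Theano, TensorFlow or Keras calls) and only handle the "valid" 2-D case on nested lists.

```python
def correlate2d(image, kernel):
    """'Valid' 2-D cross-correlation: slide the kernel without flipping it."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def flip_kernel(kernel):
    """Rotate a kernel by 180 degrees (reverse rows and columns)."""
    return [row[::-1] for row in kernel[::-1]]

def convolve2d(image, kernel):
    """'Valid' 2-D true convolution, written directly from its definition."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[kh - 1 - a][kw - 1 - b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, -1]]

conv_out = convolve2d(image, kernel)
corr_flipped_out = correlate2d(image, flip_kernel(kernel))
corr_out = correlate2d(image, kernel)
print(conv_out, corr_flipped_out, corr_out)
```

Correlating with the flipped kernel reproduces the true convolution, while correlating with the unflipped kernel gives a different result for this asymmetric kernel. That mismatch is what the kernel conversion step is meant to remove.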

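Suppose you start from the following Keras model and want to modify it so that it takes a specific TensorFlow tensor, my_input_tensor, as its input. This tensor could be a data feeder op, for instance, or the output of another TensorFlow model.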

from keras.models import Sequential

# this is our initial Keras model
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(10, activation='softmax'))

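After instantiating the model, you only need to call set_input on the first layer to change its input, and then build the rest of the model on top of it: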

# this is our modified Keras model
model = Sequential()
first_layer = Dense(32, activation='relu', input_dim=784)
first_layer.set_input(my_input_tensor)

# build the rest of the model as before
model.add(first_layer)
model.add(Dense(10, activation='softmax'))

At this stage, you can call model.load_weights(weights_file) to load your pre-trained weights.

Then you will probably want to collect the model's output tensor:

output_tensor = model.output

Calling a Keras model on a TensorFlow tensor

A Keras model behaves like a Keras layer, so it can be called on TensorFlow tensors:

from keras.models import Sequential

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(10, activation='softmax'))

# this works! 
x = tf.placeholder(tf.float32, shape=(None, 784))
y = model(x)

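Note that by calling a model you reuse both its architecture and its weights. When you call a model on a tensor, you create new ops on top of that tensor, and these ops reuse the TensorFlow Variable instances that already exist in the model.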

Multi-GPU and distributed training

Spreading a Keras model across multiple GPUs

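TensorFlow device scopes are fully compatible with Keras layers and models, so you can use them to assign specific parts of a graph to different GPUs. Here is a simple example: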

with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on GPU:0

with tf.device('/gpu:1'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on GPU:1

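Note that the variables created by the LSTM layers will not live on the GPU: regardless of where TensorFlow variables are created, they always live on the CPU, and TensorFlow implicitly handles the transfers between devices.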

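If you want to train several replicas of the same model on different GPUs while sharing the same weights across the replicas, first instantiate your model (or layers) under one device scope, then call that same model instance multiple times under different GPU device scopes, for example: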

with tf.device('/cpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 784))

    # shared model living on CPU:0
    # it won't actually be run during training; it acts as an op template
    # and as a repository for shared variables
    model = Sequential()
    model.add(Dense(32, activation='relu', input_dim=784))
    model.add(Dense(10, activation='softmax'))

# replica 0
with tf.device('/gpu:0'):
    output_0 = model(x)  # all ops in the replica will live on GPU:0

# replica 1
with tf.device('/gpu:1'):
    output_1 = model(x)  # all ops in the replica will live on GPU:1

# merge outputs on CPU
with tf.device('/cpu:0'):
    preds = 0.5 * (output_0 + output_1)

# we only run the `preds` tensor, so that only the two
# replicas on GPU get run (plus the merge op on CPU)
# `data` stands for your own input batch of shape (batch_size, 784)
output_value = sess.run([preds], feed_dict={x: data})

Distributed training

You can easily set up distributed training by registering with Keras a TensorFlow session that is linked to a cluster:

server = tf.train.Server.create_local_server()
sess = tf.Session(server.target)

from keras import backend as K
K.set_session(sess)

For details on configuring TensorFlow for distributed training, see the TensorFlow documentation on distributed execution.

Exporting a model with TensorFlow Serving

TensorFlow Serving is a tool developed by Google for deploying TensorFlow models in production environments.

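Any Keras model can be exported with TensorFlow Serving (as long as it has only one input and one output, which is a limitation of TF Serving), whether or not it was built as part of a TensorFlow workflow. In fact, you could even train your Keras model with Theano, then switch to the TensorFlow backend and export the model.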

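If your graph makes use of the Keras learning phase (different behavior during training and testing), the first thing to do is to hard-code the learning phase into the graph (as 0, i.e. test mode). This is done by 1) registering a constant learning phase with the Keras backend, and 2) rebuilding the model afterwards.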

Here is how this looks in practice:

from keras import backend as K

K.set_learning_phase(0)  # all new operations will be in test mode from now on

# serialize the model and get its weights, for quick re-building
config = previous_model.get_config()
weights = previous_model.get_weights()

# re-build a model where the learning phase is now hard-coded to 0
from keras.models import model_from_config
new_model = model_from_config(config)
new_model.set_weights(weights)

We can now export the model with TensorFlow Serving, following the instructions in the official tutorial:

from tensorflow_serving.session_bundle import exporter

export_path = ... # where to save the exported graph
export_version = ... # version number (integer)

saver = tf.train.Saver(sharded=True)
model_exporter = exporter.Exporter(saver)
signature = exporter.classification_signature(input_tensor=new_model.input,
                                              scores_tensor=new_model.output)
model_exporter.init(sess.graph.as_graph_def(),
                    default_graph_signature=signature)
model_exporter.export(export_path, tf.constant(export_version), sess)
