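TensorFlow Deep Learning, Part 27: tf.nn.conv1d
I. conv1d
In NLP, and sometimes in image processing, we may need a one-dimensional convolution (conv1d). A 1-D convolution can be viewed as a simplified 2-D convolution (conv2d). A 2-D convolution slides a window over a feature map along both the width and height directions, multiplying corresponding positions and summing the products. A 1-D convolution slides the window along a single direction only (call it width or height) and performs the same multiply-and-sum.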
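The sliding-window idea can be sketched in plain NumPy. This is a minimal illustration for a single-channel signal, not TensorFlow's implementation; the helper name `slide_1d` is made up for this example. As in TensorFlow, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np

def slide_1d(signal, kernel):
    """Slide `kernel` along `signal`; multiply element-wise and sum at each position."""
    signal = np.asarray(signal, dtype=np.float32)
    kernel = np.asarray(kernel, dtype=np.float32)
    # Number of positions where the whole window fits (no padding, stride 1).
    n_out = len(signal) - len(kernel) + 1
    return np.array([np.sum(signal[i:i + len(kernel)] * kernel)
                     for i in range(n_out)])

# Three windows: [1,2,3], [2,3,4], [3,4,5]; each evaluates to first - last = -2.
print(slide_1d([1, 2, 3, 4, 5], [1, 0, -1]))
```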
Below is the TensorFlow API docstring for tf.nn.conv1d:
Computes a 1-D convolution given 3-D input and filter tensors.
Given an input tensor of shape
[batch, in_width, in_channels]
if data_format is "NHWC", or
[batch, in_channels, in_width]
if data_format is "NCHW",
and a filter / kernel tensor of shape
[filter_width, in_channels, out_channels], this op reshapes
the arguments to pass them to conv2d to perform the equivalent
convolution operation.
Internally, this op reshapes the input tensors and invokes `tf.nn.conv2d`.
For example, if `data_format` does not start with "NC", a tensor of shape
[batch, in_width, in_channels]
is reshaped to
[batch, 1, in_width, in_channels],
and the filter is reshaped to
[1, filter_width, in_channels, out_channels].
The result is then reshaped back to
[batch, out_width, out_channels]
(where out_width is a function of the stride and padding as in conv2d) and
returned to the caller.
Args:
value: A 3D `Tensor`. Must be of type `float32` or `float64`.
filters: A 3D `Tensor`. Must have the same type as `input`.
stride: An `integer`. The number of entries by which
the filter is moved right at each step.
padding: 'SAME' or 'VALID'
use_cudnn_on_gpu: An optional `bool`. Defaults to `True`.
data_format: An optional `string` from `"NHWC", "NCHW"`. Defaults
to `"NHWC"`, the data is stored in the order of
[batch, in_width, in_channels]. The `"NCHW"` format stores
data as [batch, in_channels, in_width].
name: A name for the operation (optional).
Returns:
A `Tensor`. Has the same type as input.
Raises:
ValueError: if `data_format` is invalid.
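The reshaping described in the docstring above can be mimicked in NumPy to see why a 1-D convolution is just a 2-D convolution over an image of height 1. This is a sketch under simplifying assumptions (channels-last layout, stride 1, 'VALID' padding); `naive_conv2d_valid` and `conv1d_via_conv2d` are hypothetical helpers written for this illustration, not TensorFlow functions.

```python
import numpy as np

def naive_conv2d_valid(x, w):
    """Naive channels-last conv2d, stride 1, no padding.
    x: [batch, height, width, c_in], w: [kh, kw, c_in, c_out]."""
    b, h, wd, _ = x.shape
    kh, kw, _, c_out = w.shape
    out = np.zeros((b, h - kh + 1, wd - kw + 1, c_out), dtype=x.dtype)
    for i in range(h - kh + 1):
        for j in range(wd - kw + 1):
            patch = x[:, i:i + kh, j:j + kw, :]  # [batch, kh, kw, c_in]
            # Contract the window and channel axes against the filter.
            out[:, i, j, :] = np.tensordot(patch, w, axes=([1, 2, 3], [0, 1, 2]))
    return out

def conv1d_via_conv2d(value, filters):
    x4 = value[:, np.newaxis, :, :]    # [batch, 1, in_width, in_channels]
    w4 = filters[np.newaxis, :, :, :]  # [1, filter_width, in_channels, out_channels]
    y4 = naive_conv2d_valid(x4, w4)    # [batch, 1, out_width, out_channels]
    return y4[:, 0, :, :]              # [batch, out_width, out_channels]

value = np.random.rand(2, 10, 3).astype(np.float32)
filters = np.random.rand(4, 3, 5).astype(np.float32)
result = conv1d_via_conv2d(value, filters)
print(result.shape)  # (2, 7, 5): out_width = 10 - 4 + 1
```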
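II. Details
Meaning of the conv1d arguments (using the NHWC format as the example, i.e. the channel dimension comes last):
1. value: In the docstring, value has the shape [batch, in_width, in_channels]. batch is the sample dimension, i.e. the number of samples; in_width is the width dimension, i.e. the width of each sample; in_channels is the channel dimension, i.e. how many channels each sample has.
Equivalently, you can read the shape as [batch, rows, columns], treating each sample as a flattened 2-D array. This view makes the operation easier to picture.
2. filters: In the docstring, filters has the shape [filter_width, in_channels, out_channels]. Under the second reading of value, filter_width is the number of rows convolved with value at each step, and in_channels is the number of columns in value (it must match value's in_channels). out_channels is the number of output channels, which is simply the number of convolution kernels.
3. stride: An integer step size, i.e. how far the filter moves at each step. (TensorFlow describes this as movement to the right; under the rows-and-columns view it is movement downward.)
4. padding: Same as in conv2d, either 'SAME' or 'VALID'. 'VALID' adds no padding; 'SAME' zero-pads value along the width (row) dimension as needed so that, with stride 1, the output has the same width as the input.
5. name: A name for the operation. Optional.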
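How stride and padding together determine the output width can be written down as a small helper. The formulas follow TensorFlow's documented 'SAME'/'VALID' conventions as I understand them; the function `conv1d_out_width` is a hypothetical name introduced only for this sketch.

```python
import math

def conv1d_out_width(in_width, filter_width, stride, padding):
    """Output width of a 1-D convolution under TensorFlow-style padding rules."""
    if padding == 'VALID':
        # Only positions where the whole filter fits inside the input.
        return math.ceil((in_width - filter_width + 1) / stride)
    if padding == 'SAME':
        # Zero-padding is added so that every stride-th position is covered.
        return math.ceil(in_width / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

print(conv1d_out_width(10, 2, 1, 'VALID'))  # 9
print(conv1d_out_width(10, 2, 1, 'SAME'))   # 10
print(conv1d_out_width(10, 2, 2, 'VALID'))  # 5
```

With in_width = 10, filter_width = 2, stride 1 and 'VALID' padding, the filter fits at 9 positions, so out_width is 9.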
III. Code example:
A simple use of conv1d looks like this:
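import tensorflow as tf
import numpy as np
# Define the input a: 1 sample, width 10, 2 channels.
a = np.arange(1, 1 + 20, dtype=np.float32).reshape([1, 10, 2])
# The convolution kernel; there is a single kernel here (out_channels = 1).
kernel = np.arange(1, 1 + 4, dtype=np.float32).reshape([2, 2, 1])
# Apply the 1-D convolution with stride 1 and no padding.
conv1d = tf.nn.conv1d(a, kernel, 1, 'VALID')
# tf.Session is the TensorFlow 1.x graph-mode API.
with tf.Session() as sess:
    # Initialize variables (none are defined here, so this is a no-op).
    tf.global_variables_initializer().run()
    # Print the convolution result.
    print(sess.run(conv1d))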
The result is:
[[[ 30.]
  [ 50.]
  [ 70.]
  [ 90.]
  [110.]
  [130.]
  [150.]
  [170.]
  [190.]]]
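These numbers can be checked by hand. Each output is the sum of two consecutive rows of a multiplied element-wise by the 2x2 kernel; for the first window, 1*1 + 2*2 + 3*3 + 4*4 = 30. The NumPy sketch below repeats that calculation for every window without using TensorFlow; the variable `expected` is introduced just for this check.

```python
import numpy as np

a = np.arange(1, 21, dtype=np.float32).reshape(1, 10, 2)
kernel = np.arange(1, 5, dtype=np.float32).reshape(2, 2, 1)

filter_width = kernel.shape[0]
out_width = a.shape[1] - filter_width + 1  # 'VALID' padding, stride 1
expected = np.zeros((1, out_width, 1), dtype=np.float32)
for i in range(out_width):
    window = a[0, i:i + filter_width, :]  # two consecutive rows, shape [2, 2]
    expected[0, i, 0] = np.sum(window * kernel[:, :, 0])

# Values run from 30 to 190 in steps of 20, matching the TensorFlow output.
print(expected.ravel())
```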