When To Multiply Inside Your Neural Network?

Typical neural networks consist of linear combinations of input features and ReLU units built upon them, and nothing else. So is there ever a need to introduce explicit multiplications, either on the inputs or inside the network?

But first, let’s consider why you should not multiply inside your neural network. Suppose you have a bunch of features and want to construct arbitrary multiplicative terms. The straightforward thing would be to feed them into the network after applying log(). Multiplications turn into additions, job done! This is useful for other reasons too. When you multiply a bunch of numbers, the product typically has a wide variance, with either very large or very small values in its range. If you deal with the raw values, a large region of your range of interest will likely get squashed. Mapping the features to log space prevents such catastrophes, so that’s definitely the first option to consider before injecting multipliers into the network.
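As a minimal sketch of the log trick (the feature names here are hypothetical, and note that log() requires the features to be strictly positive):

import numpy as np

# Hypothetical, strictly positive features.
price = np.array([3.0, 120.0, 0.5])
quantity = np.array([2.0, 7.0, 40.0])

# Rather than feeding the wide-ranging product price * quantity,
# feed the logs: the network can recover the product as a plain sum,
# since log(price) + log(quantity) == log(price * quantity).
log_features = np.stack([np.log(price), np.log(quantity)], axis=1)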

Secondly, neural networks can approximate arbitrary functions, and of course that includes a multiplier. To see this, let’s train a single-hidden-layer neural network to learn multiplication. If you have TensorFlow installed, you can copy-paste the code below and run it.

''' Train a single hidden layer neural network to
model a * b, then print its RMSE on held-out data and
its predictions for x^2 over a range beyond the training data.
'''
import tensorflow as tf
import numpy as np

NUM_TRAIN_SAMPLES = 1000000
NUM_TEST_SAMPLES = 100000

feature_columns = [
    tf.feature_column.numeric_column(key='a'),
    tf.feature_column.numeric_column(key='b')]

# Training data: both inputs drawn uniformly from (0, 1).
a_train = np.random.rand(NUM_TRAIN_SAMPLES)
b_train = np.random.rand(NUM_TRAIN_SAMPLES)
train = tf.estimator.inputs.numpy_input_fn(
    x={'a': a_train, 'b': b_train},
    y=a_train * b_train,  # model a * b
    batch_size=128,
    num_epochs=None,  # repeat so training runs the full step count
    shuffle=True)

# Held-out test data from the same (0, 1) range.
a_test = np.random.rand(NUM_TEST_SAMPLES)
b_test = np.random.rand(NUM_TEST_SAMPLES)
test = tf.estimator.inputs.numpy_input_fn(
    x={'a': a_test, 'b': b_test},
    y=a_test * b_test,
    shuffle=False)

# Evaluation grid well beyond the training range: x in [0, 10),
# with both inputs equal, so the target is x^2.
range_test = np.arange(0.00, 10.0, 0.01)
ranget = tf.estimator.inputs.numpy_input_fn(
    x={'a': range_test, 'b': range_test},
    y=range_test * range_test,
    shuffle=False)

def estimate_error(num_hidden_units):
    model = tf.estimator.DNNRegressor(
        hidden_units=[num_hidden_units],
        feature_columns=feature_columns)
    model.train(input_fn=train, steps=100000)
    eval_result = model.evaluate(input_fn=test)
    rmse = eval_result["average_loss"] ** 0.5
    print('rmse=%f' % rmse)
    predictions = list(model.predict(input_fn=ranget))
    for ip, p in zip(range_test, predictions):
        v = p["predictions"][0]
        print('x=%f, x^2=%f, model=%f' % (ip, ip * ip, v))

if __name__ == "__main__":
    estimate_error(80)

With 80 ReLU units, we get to a root mean squared error near 0.003, which seems like a reasonable approximation of a multiplier. The code above also feeds numbers in the (0, 1) range to both inputs of the multiplier, essentially evaluating x², and compares that with the true value of x². Plotting the model’s predictions against the true curve shows the two nearly coincide: unsurprisingly, the model seems quite good at emulating multiplication.

So is the case for multiplication inside the network dead? Not quite. Note that the model has merely learnt to approximate the outputs for examples like those in the training data; it understands nothing about multiplication itself. In other words, what the model has learnt doesn’t generalize to true multiplication. To see why this matters, extend the same comparison to values beyond the (0, 1) range that the training data is restricted to (the range_test grid in the code above runs x out to 10), and the picture changes dramatically.

The neural network can masquerade as a multiplier, but the act breaks down quite dramatically as you move beyond the values seen during training. This matters whenever your training data doesn’t represent the entirety of the universe in which the model operates. For example, if your model is used in search ranking, your training data may be limited to results from the first page, whereas in reality the model scores a volume of documents 100x larger than what gets logged as training data.

So if you have a strong intuition that multiplication is the right way to model certain relationships, it may be better to enforce that explicitly in the network. The effect may not be immediately apparent from logged data split into train and test sets; an online test may be the best way to confirm your intuition. Best of luck!
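One way to do this, sketched below with the Keras API that ships with TensorFlow (the layer sizes and wiring are illustrative assumptions, not a prescription), is to compute the product of the inputs directly and feed it to the network alongside the raw features:

import tensorflow as tf

# Two scalar input features, as in the example above.
a = tf.keras.Input(shape=(1,), name='a')
b = tf.keras.Input(shape=(1,), name='b')

# Explicit multiplication: exact for any input range, nothing to learn.
product = tf.keras.layers.Multiply()([a, b])

# The rest of the network sees the raw features and their product.
features = tf.keras.layers.Concatenate()([a, b, product])
hidden = tf.keras.layers.Dense(16, activation='relu')(features)
output = tf.keras.layers.Dense(1)(hidden)

model = tf.keras.Model(inputs=[a, b], outputs=output)
model.compile(optimizer='adam', loss='mse')

Because the product is computed exactly rather than approximated by ReLU units, this wiring extrapolates correctly to inputs far outside the training range; the learned layers only need to model whatever structure remains on top of it.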
