TensorFlow in Practice Notes: tf.nn.nce_loss

阿新 • Published: 2019-02-01

First, let's look at the API of TensorFlow's nce_loss:

def nce_loss(weights, biases, inputs, labels, num_sampled, num_classes,
             num_true=1,
             sampled_values=None,
             remove_accidental_hits=False,
             partition_strategy="mod",
             name="nce_loss"):

Suppose the input fed to nce_loss is K-dimensional and there are N classes in total. Then:

weights.shape = (N, K)

biases.shape = (N,)

inputs.shape = (batch_size, K)

labels.shape = (batch_size, num_true)

num_true: the number of true (positive) classes per example

num_sampled: the number of negative classes to sample

num_classes = N

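sampled_values: the sampled negative classes. If it is None, nce_loss draws them itself with a candidate sampler. Which sampler that is gets covered below.

remove_accidental_hits: whether to discard a sampled negative that happens to coincide with one of the true classes.

partition_strategy: the strategy used to partition the lookup when embedding_lookup is run on weights in parallel ("mod" by default). The author notes that TF's embedding_lookup is implemented on the CPU, so lock contention during multithreaded lookups has to be considered.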
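To make remove_accidental_hits concrete, here is a minimal pure-Python sketch of the idea. The helper name mask_accidental_hits, the NEG_LARGE constant, and the choice to mask the logit (rather than drop the sample) are assumptions of this sketch, not code taken from TensorFlow.

```python
# Hypothetical helper, not part of TensorFlow: illustrates what
# "removing accidental hits" means for one batch.
NEG_LARGE = -1e30  # stand-in for "minus infinity"


def mask_accidental_hits(true_ids, sampled_ids, sampled_logits):
    """Return a copy of sampled_logits with accidental hits masked out.

    true_ids:       list of lists, shape (batch_size, num_true)
    sampled_ids:    list, shape (num_sampled,), shared by the whole batch
    sampled_logits: list of lists, shape (batch_size, num_sampled)
    """
    masked = []
    for row_true, row_logits in zip(true_ids, sampled_logits):
        positives = set(row_true)
        masked.append([
            # A sampled id that is also a true id for this row is an accidental hit.
            NEG_LARGE if sampled_id in positives else logit
            for sampled_id, logit in zip(sampled_ids, row_logits)
        ])
    return masked


true_ids = [[3], [7]]       # batch_size = 2, num_true = 1
sampled_ids = [1, 3, 5]     # num_sampled = 3
sampled_logits = [[0.2, 0.9, -0.4], [0.1, 0.3, 0.5]]
masked = mask_accidental_hits(true_ids, sampled_ids, sampled_logits)
```

In this toy batch only the first row is affected: class 3 is both its true label and a sampled negative, so that one logit is pushed to a very negative value and contributes essentially nothing to a sigmoid loss.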

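nce_loss is implemented in two steps:

_compute_sampled_logits: computes the outputs (logits) and labels for the true classes and for the sampled negative classes.

sigmoid_cross_entropy_with_logits: computes the loss between those outputs and labels with sigmoid cross entropy, which is then backpropagated. This turns the final problem into num_sampled + num_true binary classification problems, each scored with the cross-entropy loss, which is the loss commonly used for logistic regression. TF also provides softmax_cross_entropy_with_logits, which is different: it treats the classes as one mutually exclusive multi-class problem.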
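The two steps can be sketched in plain Python for a single example. This is a simplified illustration, not TF's code: the function names are made up for the sketch, it assumes num_true = 1, and it omits the sampling-probability correction that the real implementation applies to the logits.

```python
import math


def sigmoid_cross_entropy(logit, label):
    # Numerically stable form of
    # -label * log(sigmoid(x)) - (1 - label) * log(1 - sigmoid(x)).
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def nce_loss_one_example(x, weights, biases, true_id, sampled_ids):
    """x: K floats; weights: N rows of K floats; biases: N floats."""
    def logit(class_id):
        return sum(w * v for w, v in zip(weights[class_id], x)) + biases[class_id]

    # One binary problem for the true class (label 1) ...
    loss = sigmoid_cross_entropy(logit(true_id), 1.0)
    # ... plus one binary problem per sampled negative (label 0).
    for class_id in sampled_ids:
        loss += sigmoid_cross_entropy(logit(class_id), 0.0)
    return loss


x = [1.0, 0.0]                                    # K = 2
weights = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0]]   # N = 3
biases = [0.0, 0.0, 0.0]
loss = nce_loss_one_example(x, weights, biases, true_id=0, sampled_ids=[1, 2])
```

Only 1 + num_sampled rows of weights are touched per example, which is the point of NCE: the cost no longer scales with the full number of classes N.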

Now look at TF's word2vec implementation. The code that calls nce_loss is:

loss = tf.reduce_mean(
    tf.nn.nce_loss(weights=nce_weights,
                   biases=nce_biases,
                   labels=train_labels,
                   inputs=embed,
                   num_sampled=num_sampled,
                   num_classes=vocabulary_size))

Notice that sampled_values is not passed here, so where do the negative samples come from? Reading further into the nce_loss implementation, the code that handles sampled_values=None is:

if sampled_values is None:
  sampled_values = candidate_sampling_ops.log_uniform_candidate_sampler(
      true_classes=labels,
      num_true=num_true,
      num_sampled=num_sampled,
      unique=True,
      range_max=num_classes)

So by default it samples with log_uniform_candidate_sampler. How does log_uniform_candidate_sampler sample? Its implementation works as follows:

1. It samples an integer k from [0, range_max).

2、P(k) = (log(k + 2) - log(k + 1)) / log(range_max + 1)
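The formula can be checked numerically with a short standard-library sketch. The function names are illustrative, and the inverse-CDF sampler below draws with replacement, whereas the call above passes unique=True, so treat it as an approximation of the behavior rather than a reimplementation.

```python
import collections
import math
import random


def log_uniform_prob(k, range_max):
    """P(k) for k in [0, range_max), as given by the formula above."""
    return (math.log(k + 2) - math.log(k + 1)) / math.log(range_max + 1)


def sample_log_uniform(range_max, rng):
    """Draw one k by inverting the CDF, log(k + 2) / log(range_max + 1)."""
    u = rng.random()  # uniform in [0, 1)
    k = int(math.exp(u * math.log(range_max + 1))) - 1
    return min(k, range_max - 1)  # guard against floating-point rounding


range_max = 10
probs = [log_uniform_prob(k, range_max) for k in range(range_max)]

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_log_uniform(range_max, rng) for _ in range(10000)]
counts = collections.Counter(draws)
```

The probabilities telescope to log(range_max + 1) / log(range_max + 1) = 1, so the formula defines a proper distribution over [0, range_max), and the empirical counts in counts fall off as k grows.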

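As this shows, the larger k is, the lower its probability of being sampled. So do the class ids carry any meaning in TF's word2vec? Look at the following code: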

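import collections

vocabulary_size = 50000  # assumed value; the original snippet reads it from module scope

def build_dataset(words):
  # Slot 0 is reserved for unknown words; its count is filled in below.
  count = [['UNK', -1]]
  # Keep the (vocabulary_size - 1) most frequent words, most frequent first.
  count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
  dictionary = dict()
  for word, _ in count:
    # Ids follow frequency rank: more frequent words get smaller ids.
    dictionary[word] = len(dictionary)
  data = list()
  unk_count = 0
  for word in words:
    if word in dictionary:
      index = dictionary[word]
    else:
      index = 0  # dictionary['UNK']
      unk_count += 1
    data.append(index)
  count[0][1] = unk_count
  reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
  return data, count, dictionary, reverse_dictionary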
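A toy run makes the id assignment visible. The helper assign_ids below is a hypothetical, compact stand-in for the dictionary-building step of build_dataset, written only for this illustration.

```python
import collections


def assign_ids(words, vocabulary_size):
    # Hypothetical compact variant of the id-assignment step in build_dataset:
    # id 0 is 'UNK', then words in descending order of frequency.
    most_common = collections.Counter(words).most_common(vocabulary_size - 1)
    vocab = ['UNK'] + [word for word, _ in most_common]
    return {word: idx for idx, word in enumerate(vocab)}


words = "the cat sat on the mat the cat".split()
dictionary = assign_ids(words, vocabulary_size=4)
```

Here "the" occurs three times and "cat" twice, so they receive ids 1 and 2. The words that do not fit into the vocabulary would all be mapped to id 0 when the corpus is encoded.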

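As the code shows, in TF's word2vec implementation the more frequent a word is, the smaller its class id. Negative sampling in TF's word2vec therefore favors high-frequency words as negatives. In the original paper that proposed negative sampling, and in the original C implementation of word2vec, negatives are instead sampled in proportion to word frequency raised to the 0.75 power, which differs from TF's approach. The general idea is the same, though: the more popular a word, the more likely it is to be drawn as a negative.

Author: xlvector
Link: https://www.jianshu.com/p/fab82fa53e16
Source: Jianshu (簡書)
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source.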
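To make the difference between the two sampling schemes concrete, the sketch below compares the log-uniform distribution over ranks with a frequency^0.75 distribution. The word counts are synthetic: they assume a Zipf-like corpus in which the count at rank k is proportional to 1 / (k + 1), which is an assumption of this example and not data from the article.

```python
import math


def log_uniform_probs(n):
    """Log-uniform probabilities over ranks 0 .. n-1."""
    log_range = math.log(n + 1)
    return [(math.log(k + 2) - math.log(k + 1)) / log_range for k in range(n)]


def unigram_power_probs(counts, power=0.75):
    """Probabilities proportional to count ** power."""
    weights = [c ** power for c in counts]
    total = sum(weights)
    return [w / total for w in weights]


n = 1000
# Assumption: synthetic Zipf-like counts, rank k has count proportional to 1 / (k + 1).
counts = [1_000_000 / (k + 1) for k in range(n)]

p_log_uniform = log_uniform_probs(n)
p_power = unigram_power_probs(counts)
```

Under this assumption both distributions decrease with rank, but the 0.75 power is flatter: it gives the most frequent word less probability mass and the rarest word more than the log-uniform sampler does.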
