[Quantization][TensorFlow] Implementation Details + Key Code Snippets
阿新 · Published: 2019-02-04
// Quantization parameters, determining the mapping of quantized values
// to real values (i.e. determining how quantized values are mathematically
// interpreted).
//
// The correspondence is as follows:
//
// real_value = scale * (quantized_value - zero_point);
//
// In other words, zero_point designates which quantized value corresponds to
// the real 0 value, and scale designates the difference between the real values
// corresponding to consecutive quantized values differing by 1.
struct QuantizationParams {
  int32 zero_point = 0;
  double scale = 0.;
};
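To make the mapping concrete, here is a minimal standalone demo (my own sketch, not TFLite code; it uses int32_t/uint8_t from <cstdint> instead of TFLite's int32/uint8 typedefs). With scale = 0.1 and zero_point = 128, the quantized value 128 decodes to exactly 0.0, and each step of 1 in the quantized domain is worth one scale in the real domain:

#include <cstdint>
#include <cstdio>

struct QuantizationParams {
  int32_t zero_point = 0;
  double scale = 0.;
};

// Decode a quantized value back to a real value:
// real_value = scale * (quantized_value - zero_point).
double Dequantize(uint8_t quantized_value, const QuantizationParams& qp) {
  return qp.scale * (static_cast<int32_t>(quantized_value) - qp.zero_point);
}

int main() {
  const QuantizationParams qp{/*zero_point=*/128, /*scale=*/0.1};
  std::printf("%f %f %f\n",
              Dequantize(128, qp),   // 0.000000 (the real zero)
              Dequantize(129, qp),   // 0.100000 (one step = scale)
              Dequantize(0, qp));    // -12.800000
  return 0;
}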
inline void FullyConnected(const uint8* input_data, const Dims<4>& input_dims,
                           int32 input_offset, const uint8* filter_data,
                           const Dims<4>& filter_dims, int32 filter_offset,
                           const int32* bias_data, const Dims<4>& bias_dims,
                           int32 output_offset, int32 output_multiplier,
                           int output_shift, int32 output_activation_min,
                           int32 output_activation_max, uint8* output_data,
                           const Dims<4>& output_dims,
                           gemmlowp::GemmContext* gemm_context) {
  (void)gemm_context;  // only used in optimized code.
  TFLITE_DCHECK_LE(output_activation_min, output_activation_max);
  // TODO(benoitjacob): This really should be:
  //     const int batches = ArraySize(output_dims, 1);
  // but the current --variable_batch hack consists in overwriting the 3rd
  // dimension with the runtime batch size, as we don't keep track for each
  // array of which dimension is the batch dimension in it.
  const int batches = ArraySize(output_dims, 1) * ArraySize(output_dims, 2) *
                      ArraySize(output_dims, 3);
  const int output_depth = MatchingArraySize(filter_dims, 1, output_dims, 0);
  const int accum_depth = ArraySize(filter_dims, 0);
  TFLITE_DCHECK(IsPackedWithoutStrides(input_dims));
  TFLITE_DCHECK(IsPackedWithoutStrides(filter_dims));
  for (int b = 0; b < batches; ++b) {
    for (int out_c = 0; out_c < output_depth; ++out_c) {
      int32 acc = 0;
      for (int d = 0; d < accum_depth; ++d) {
        int32 input_val = input_data[b * accum_depth + d];
        int32 filter_val = filter_data[out_c * accum_depth + d];
        acc += (filter_val + filter_offset) * (input_val + input_offset);
      }
      if (bias_data) {
        acc += bias_data[Offset(bias_dims, out_c, 0, 0, 0)];
      }
      acc = MultiplyByQuantizedMultiplierSmallerThanOne(acc, output_multiplier,
                                                        output_shift);
      acc += output_offset;
      acc = std::max(acc, output_activation_min);
      acc = std::min(acc, output_activation_max);
      output_data[out_c + output_depth * b] = static_cast<uint8>(acc);
    }
  }
}
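The accumulator acc above is in the product scale input_scale * filter_scale (in this legacy signature, input_offset and filter_offset are the negated zero points, so val + offset computes val - zero_point). output_multiplier and output_shift encode the real rescaling factor M = input_scale * filter_scale / output_scale, which is less than 1 in the common case, as a Q31 fixed-point multiplier plus a right shift. The sketch below is a simplified, self-contained stand-in for that decomposition and for MultiplyByQuantizedMultiplierSmallerThanOne; the real TFLite helpers go through gemmlowp's saturating fixed-point routines, and plain 64-bit arithmetic is used here only to show the idea:

#include <cmath>
#include <cstdint>

// Decompose a real multiplier 0 < m < 1 into a Q31 fixed-point multiplier and
// a right shift, so that m ≈ quantized_multiplier * 2^-31 * 2^-right_shift.
void QuantizeMultiplierSmallerThanOne(double m, int32_t* quantized_multiplier,
                                      int* right_shift) {
  int exponent;
  const double q = std::frexp(m, &exponent);  // m = q * 2^exponent, q in [0.5, 1)
  int64_t q_fixed = static_cast<int64_t>(std::round(q * (1ll << 31)));
  if (q_fixed == (1ll << 31)) {  // q rounded up to exactly 1.0
    q_fixed /= 2;
    ++exponent;
  }
  *quantized_multiplier = static_cast<int32_t>(q_fixed);
  *right_shift = -exponent;  // exponent <= 0 because m < 1
}

// Simplified equivalent of MultiplyByQuantizedMultiplierSmallerThanOne:
// returns roughly acc * m using only integer arithmetic.
int32_t MultiplyByQuantizedMultiplierSmallerThanOne(int32_t acc,
                                                    int32_t quantized_multiplier,
                                                    int right_shift) {
  const int64_t prod = static_cast<int64_t>(acc) * quantized_multiplier;
  const int total_shift = 31 + right_shift;
  const int64_t rounding = int64_t{1} << (total_shift - 1);
  return static_cast<int32_t>((prod + rounding) >> total_shift);
}

With this decomposition the whole kernel stays in integer arithmetic; the float scales are folded into (output_multiplier, output_shift) once, at conversion time.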
const MinMax& GetOrComputeMinMax(Model* model, const string& array_name) {
  auto& array = model->GetArray(array_name);
  // Normally we should have a MinMax recorded on this Array,
  // so we just use it.
  if (array.minmax != nullptr) {
    return *array.minmax;
  }
  // We don't have a MinMax. That's bad news: we need
  // the graph to provide MinMax info for all arrays in order
  // for inference to reproduce faithfully the same quantization
  // error as the training process had.
  //
  // But we still want to support a fallback for constant arrays,
  // just using the plain min and max computed from array elements.
  // We should hopefully never rely on that in production, as that
  // will not give very good accuracy as that typically won't be
  // exactly what the training process used. But it will be useful
  // to allow easily trying out quantization even if the graph
  // lacks some minmax information.
  if (array.buffer != nullptr) {
    LOG(WARNING)
        << "Constant array " << array_name
        << " lacks MinMax information. To make up for that, we will now compute"
        << " the MinMax from actual array elements. That will result in"
        << " quantization parameters that probably do not match whichever "
           "arithmetic"
        << " was used during training, and thus will probably be a cause of "
           "poor"
        << " inference accuracy.";
    CHECK(array.buffer->type == ArrayDataType::kFloat);
    const auto& data = array.GetBuffer<ArrayDataType::kFloat>().data;
    // We always want [min, max] to contain 0.
    float min = 0.f;
    float max = 0.f;
    for (auto val : data) {
      min = std::min(min, val);
      max = std::max(max, val);
    }
    auto& minmax = array.GetOrCreateMinMax();
    minmax.min = min;
    minmax.max = max;
    return minmax;
  }
  LOG(FATAL) << "Array " << array_name
             << " does not have MinMax information, "
                "and is not a constant array. Cannot "
                "proceed with quantization.";
}
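Once a MinMax has been found (or computed from a constant buffer as above), it still has to be turned into the QuantizationParams defined earlier. Below is a minimal sketch of that step for the usual uint8 scheme; the function name and the exact nudging are illustrative rather than verbatim toco code, but the key property matches the struct's contract: the real value 0.0 must land exactly on an integer zero_point.

#include <algorithm>
#include <cmath>
#include <cstdint>

struct QuantizationParams {
  int32_t zero_point = 0;
  double scale = 0.;
};

// Illustrative sketch: derive uint8 quantization parameters from a MinMax.
QuantizationParams ChooseQuantizationParams(double rmin, double rmax) {
  // The real range must contain 0 (GetOrComputeMinMax's fallback already
  // guarantees this by initializing min/max to 0).
  rmin = std::min(rmin, 0.0);
  rmax = std::max(rmax, 0.0);
  if (rmax == rmin) rmax = rmin + 1e-6;  // avoid a degenerate zero-width range
  const double qmin = 0.0;
  const double qmax = 255.0;
  QuantizationParams qp;
  qp.scale = (rmax - rmin) / (qmax - qmin);
  // Nudge the zero point to the nearest integer inside [qmin, qmax], so that
  // real 0.0 is exactly representable.
  const double initial_zero_point = qmin - rmin / qp.scale;
  qp.zero_point = static_cast<int32_t>(
      std::round(std::min(qmax, std::max(qmin, initial_zero_point))));
  return qp;
}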
# Define quantize_v2 here in order to make name the second-to-last attribute,
# because round_mode was added later.
@tf_export("quantize_v2")
@deprecation.deprecated(
    "2017-10-25",
    "`tf.quantize_v2` is deprecated, please use `tf.quantize` instead.")
def quantize_v2(input,  # pylint: disable=redefined-builtin
                min_range,
                max_range,
                T,
                mode="MIN_COMBINED",
                name=None,
                round_mode="HALF_AWAY_FROM_ZERO"):
  return gen_array_ops.quantize_v2(input,
                                   min_range,
                                   max_range,
                                   T=T,
                                   mode=mode,
                                   name=name,
                                   round_mode=round_mode)
quantize_v2.__doc__ = """Please use `tf.quantize` instead."""
# We want to expose tf.quantize instead of tf.quantize_v2; we can deprecate
# tf.quantize_v2 in next version of TensorFlow.
@tf_export("quantize")
def quantize(input,  # pylint: disable=redefined-builtin
             min_range,
             max_range,
             T,
             mode="MIN_COMBINED",
             round_mode="HALF_AWAY_FROM_ZERO",
             name=None):
  return gen_array_ops.quantize_v2(
      input,
      min_range,
      max_range,
      T,
      mode=mode,
      round_mode=round_mode,
      name=name)
tf.quantize(
    input,
    min_range,
    max_range,
    T,
    mode='MIN_COMBINED',
    round_mode='HALF_AWAY_FROM_ZERO',
    name=None
)
out[i] = (in[i] - min_range) * range(T) / (max_range - min_range)
if T == qint8: out[i] -= (range(T) + 1) / 2.0
where range(T) = numeric_limits<T>::max() - numeric_limits<T>::min()
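A tiny worked example of the MIN_COMBINED formula, written here as plain C++ arithmetic rather than a TensorFlow call (my own illustration): with min_range = -1, max_range = 1 and quint8 (so range(T) = 255), the inputs -1.0, 0.0 and 1.0 map to 0, 128 and 255.

#include <cmath>
#include <cstdint>
#include <cstdio>

// MIN_COMBINED quantization of one float to quint8, following the formula
// above: out = (in - min_range) * 255 / (max_range - min_range).
uint8_t QuantizeMinCombined(float in, float min_range, float max_range) {
  const float range_t = 255.0f;  // range(quint8) = 255 - 0
  const float out = (in - min_range) * range_t / (max_range - min_range);
  return static_cast<uint8_t>(std::round(out));  // HALF_AWAY_FROM_ZERO rounding
}

int main() {
  std::printf("%d %d %d\n",
              static_cast<int>(QuantizeMinCombined(-1.f, -1.f, 1.f)),  // 0
              static_cast<int>(QuantizeMinCombined(0.f, -1.f, 1.f)),   // 128
              static_cast<int>(QuantizeMinCombined(1.f, -1.f, 1.f)));  // 255
  return 0;
}

The corresponding TensorFlow call would be tf.quantize(x, -1.0, 1.0, tf.quint8), which returns the quantized tensor together with the output min/max actually used.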