SenseGen: A Deep Learning Architecture for Synthetic Sensor Data Generation (Paper Notes)

阿新 • Published: 2018-12-13

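I. Paper Overview

SenseGen was published in 2017 at PerCom Workshops by the Networked & Embedded Systems Laboratory (NESL) at the University of California, Los Angeles (UCLA); the authors first posted it on arXiv. SenseGen borrows the idea of generative adversarial networks (GANs) to train a generator that produces realistic sensor data. However, the top layer of the generator is a mixture density network (MDN) built on a Gaussian mixture model (GMM), so the discriminator's error cannot be backpropagated to the generator, and the two networks have to be trained separately. SenseGen is therefore, in essence, a single generative model trained to synthesize realistic sensor data, not a GAN in the conventional sense.

Overall, a close reading turns up many gaps and weaknesses in the paper's details. Still, at a time when GANs were being applied widely to images, video, and text, SenseGen can fairly be said to have made an early start on exploring GAN ideas for sensor data synthesis.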

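II. Research Background

The motivation is user privacy in large-scale data analytics: synthetic sensor data produced by the SenseGen generative model stands in for users' real data. The synthetic data protects user privacy while keeping the same statistical properties as the real data, which preserves its usefulness for analysis.

Evaluating a generative model for synthetic sensor data is hard, for two main reasons: (1) there is no obvious standard for judging how good or how realistic generated sensor data is; (2) the model must not overfit the original sensor data, that is, its output must not be a simple memorization of that data.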

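III. Method

The goal of SenseGen is to train a generator network whose output is realistic and preserves the characteristics of the real data distribution.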

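(i) Generator network

The generator is built from 3 LSTM layers + 2 fully connected layers + 1 MDN layer.

The core is the final mixture density network (MDN) layer, which combines a Gaussian mixture distribution with a neural network.

The 72-dimensional output of the last fully connected layer is split evenly into three parts of 24 dimensions each:

1~24: the weight $\pi$ of each Gaussian in the mixture, so the mixture consists of 24 Gaussian components

25~48: the parameter $\mu$ of each Gaussian

49~72: the parameter $\sigma$ of each Gaussian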

The probability distribution of the output is: $pr(x_{t+1}|\pi_{t},\mu_{t},\sigma_{t})=\sum_{k=1}^{24}\pi_{t}^{k}(x_{1...t})\,\mathcal{N}(x_{t+1};\mu_{t}^{k}(x_{1...t}),\sigma_{t}^{k}(x_{1...t}))$

Predicted value: $x_{t+1}\sim pr(x_{t+1}|\pi_{t}(x_{1...t}),\mu_{t}(x_{1...t}),\sigma_{t}(x_{1...t}))$
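To make the split and the sampling step concrete, here is a minimal sketch in plain Python. It rests on assumptions not stated in the text above: the weights are normalized with a softmax and the standard deviations are made positive with an exponential (both common MDN conventions), the function names are hypothetical, and a random vector stands in for the real network output.

```python
import math
import random

K = 24  # number of mixture components, as in the text


def split_mdn_output(raw):
    """Split a 72-dim vector into (pi, mu, sigma), each of length K."""
    assert len(raw) == 3 * K
    logits, mu, log_sigma = raw[:K], raw[K:2 * K], raw[2 * K:]
    # Assumption: softmax makes the weights positive and sum to 1.
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    pi = [e / total for e in exps]
    # Assumption: exp keeps every standard deviation positive.
    sigma = [math.exp(v) for v in log_sigma]
    return pi, list(mu), sigma


def mixture_pdf(x, pi, mu, sigma):
    """Density of x under the Gaussian mixture."""
    return sum(
        p * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for p, m, s in zip(pi, mu, sigma)
    )


def sample_next(pi, mu, sigma, rng):
    """Draw x_{t+1}: pick a component by weight, then sample its Gaussian."""
    k = rng.choices(range(len(pi)), weights=pi, k=1)[0]
    return rng.gauss(mu[k], sigma[k])


rng = random.Random(0)
# Stand-in for the 72-dim output of the last fully connected layer.
raw = [rng.uniform(-1.0, 1.0) for _ in range(3 * K)]
pi, mu, sigma = split_mdn_output(raw)
x_next = sample_next(pi, mu, sigma, rng)
```

In a generation loop, `x_next` would be appended to the input sequence and the network queried again for the next set of mixture parameters.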

The generator network is trained with the RMSProp optimizer.
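For readers unfamiliar with RMSProp, the sketch below shows a single update step on a toy problem. The learning rate, decay, and epsilon values are illustrative assumptions; the notes above do not state the hyperparameters SenseGen used, and `rmsprop_step` is a hypothetical helper name.

```python
import math


def rmsprop_step(theta, grad, cache, lr=0.001, decay=0.9, eps=1e-8):
    """One RMSProp update; returns the new parameters and the new cache."""
    # Running average of squared gradients, one entry per parameter.
    new_cache = [decay * c + (1.0 - decay) * g * g for c, g in zip(cache, grad)]
    # Scale each gradient by the root of its running average.
    new_theta = [
        p - lr * g / (math.sqrt(c) + eps)
        for p, g, c in zip(theta, grad, new_cache)
    ]
    return new_theta, new_cache


# Toy objective f(theta) = theta^2, whose gradient is 2 * theta.
theta, cache = [1.0], [0.0]
for _ in range(200):
    grad = [2.0 * p for p in theta]
    theta, cache = rmsprop_step(theta, grad, cache, lr=0.01)
```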

Cost function: $\mathcal{F}^{\mathcal{G}}(\theta_{\mathcal{G}})=-\sum_{t=1}^{T}\log{(pr(x_{t+1}|\pi_{t}(x_{1...t}),\mu_{t}(x_{1...t}),\sigma_{t}(x_{1...t})))}$
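The cost above is a negative log-likelihood summed over time steps. The sketch below computes it for a toy sequence. It assumes the per-step mixture parameters are already available as `(pi, mu, sigma)` tuples and uses 2 components with made-up numbers instead of 24; the helper names are hypothetical, and the log-sum-exp trick is a numerical-stability choice of this sketch, not something the paper specifies.

```python
import math


def mixture_log_pdf(x, pi, mu, sigma):
    """log of sum_k pi_k * N(x; mu_k, sigma_k), via log-sum-exp for stability."""
    terms = [
        math.log(p) - math.log(s) - 0.5 * math.log(2 * math.pi)
        - 0.5 * ((x - m) / s) ** 2
        for p, m, s in zip(pi, mu, sigma)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def nll_cost(targets, params):
    """Sum of -log pr(x_{t+1} | pi_t, mu_t, sigma_t) over all time steps.

    targets[t] is the observed x_{t+1}; params[t] is the (pi, mu, sigma)
    tuple produced after reading x_1..x_t.
    """
    return -sum(mixture_log_pdf(x, *p) for x, p in zip(targets, params))


# Toy example: 2 components, the same parameters at every step (hypothetical).
params = [([0.5, 0.5], [0.0, 1.0], [1.0, 1.0])] * 3
good = nll_cost([0.0, 1.0, 0.5], params)  # targets near the component means
bad = nll_cost([5.0, -4.0, 6.0], params)  # targets far from both means
```

Targets that sit where the mixture puts its mass give a lower cost (`good`) than targets far from every component (`bad`), which is the signal that training minimizes.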

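(ii) Discriminator network

To assess how similar the generated sensor data is to real sensor data, a discriminator network is built to tell generated data from real data.

It consists of 1 LSTM layer + 1 fully connected layer, as shown in the figure above,

with 64 and 16 units respectively.

Activation function: sigmoid

The output is the probability that the input comes from the real data: $D(x_{test})=Pr(x_{test}\in \mathcal{X}_{true})$; that is, the target output is 1 for real inputs and 0 for synthetic inputs.

Each mini-batch has m samples, and each sample contains 400 sampling points.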

The discriminator network is trained with the cross-entropy loss: $\mathcal{L}^{D}(\theta_{\mathcal{D}})=-\sum_{i=1}^{m}\left[\log{\mathcal{D}(\mathcal{X}_{true}^{(i)})}+\log(1-\mathcal{D}(\mathcal{X}_{gen}^{(i)}))\right]$
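The cross-entropy loss can be sketched as below. The inputs are assumed to be the discriminator's sigmoid outputs for m real and m generated samples; the `eps` clipping is an addition of this sketch to avoid `log(0)`, and the function name is hypothetical.

```python
import math


def discriminator_loss(d_true, d_gen, eps=1e-12):
    """Cross-entropy over one mini-batch.

    d_true[i] = D(X_true^(i)) and d_gen[i] = D(X_gen^(i)) are sigmoid
    outputs in (0, 1). eps (an assumption of this sketch) guards log(0).
    """
    assert len(d_true) == len(d_gen)
    return -sum(
        math.log(max(t, eps)) + math.log(max(1.0 - g, eps))
        for t, g in zip(d_true, d_gen)
    )


# A discriminator that separates the two classes well...
confident = discriminator_loss([0.9, 0.95], [0.1, 0.05])
# ...versus one that outputs 0.5 for everything.
guessing = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

A discriminator stuck at 0.5 pays `2 * log(2)` per real/generated pair, so the closer the trained loss stays to that value, the harder the synthetic data is to tell apart from the real data.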

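IV. Experiments and Results

Figure: negative log-likelihood (NLL) cost of the generator network during training

Figure: accuracy of the discriminator network at telling real data from synthetic data

Figure: real accelerometer data compared with accelerometer data generated by SenseGen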

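V. Summary

SenseGen borrows the idea of GANs to train a generator that synthesizes realistic sensor data, but it does not truly use the adversarial idea: (1) the generator and the discriminator are trained separately, and the generator's training does not depend on backpropagating the discriminator's error; (2) the generator learns directly from the original sensor data rather than learning a mapping from a random noise distribution to the real data distribution. In addition, many descriptions in the paper do not match the publicly released source code, and a close reading reveals many gaps in its content. Even so, the overall idea is still worth borrowing for research on sensor data generation.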
