Summary of the paper "Fluency Boost Learning and Inference for Neural Grammatical Error Correction"

阿新 • Published: 2019-02-08

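Core idea

The core idea of this paper is simple: augment the training data effectively so that the model's inference results become more accurate. Concretely, the model's own n-best inference outputs are used to generate new training pairs.

This data augmentation step is the key part of the method.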

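Traditional approach

The obvious way to add training data is to artificially construct training pairs that contain errors. The steps are:

1. Pick a training pair (src, tgt) from the training set dataset.
2. Make plausible edits to the characters in src, yielding src'.
3. Pair the edited src' with tgt to form a new training pair.
4. Repeat the steps above a number of times to collect a set of new training pairs, dataset'.
5. Train the model on dataset and dataset' together.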
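The steps above can be sketched roughly as follows. This is only an illustration: the function names `corrupt` and `augment`, the three edit operations, and the example sentence are assumptions of this sketch, and uniformly random character edits stand in for the "plausible edits" a real system would use.

```python
import random


def corrupt(src, rng, n_edits=1, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Apply up to n_edits random character edits (delete, substitute, swap) to src."""
    chars = list(src)
    for _ in range(n_edits):
        if len(chars) < 2:
            break  # too short to edit safely
        op = rng.choice(["delete", "substitute", "swap"])
        i = rng.randrange(len(chars) - 1)
        if op == "delete":
            del chars[i]
        elif op == "substitute":
            chars[i] = rng.choice(alphabet)
        else:  # swap two adjacent characters
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def augment(dataset, copies=2, seed=0):
    """Return the original pairs plus `copies` corrupted (src', tgt) pairs per original pair."""
    rng = random.Random(seed)
    augmented = list(dataset)
    for src, tgt in dataset:
        for _ in range(copies):
            augmented.append((corrupt(src, rng), tgt))
    return augmented


# Hypothetical one-pair training set.
dataset = [("She go to school .", "She goes to school .")]
dataset_all = augment(dataset, copies=2)
```

Note that a random edit can occasionally leave src unchanged (for example, swapping two identical characters), so a real pipeline would probably filter out such pairs.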

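The paper's approach

The paper takes a different, though equally natural, route:

1. For each src, run the model and collect multiple inference results (n-best).
2. Compute a fluency score for each result.
3. Keep every result whose score is lower than that of the correct sentence (the reference tgt).
4. Pair each selected result with the tgt sentence to form a new training pair, called a fluency boost sentence pair; these pairs are used in subsequent training.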
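A minimal sketch of the selection step, under stated assumptions: `fluency_boost_pairs` is a name made up for this example, the n-best list is hard-coded rather than produced by a real model, and the fluency scores are invented numbers. The sentence that sets the threshold is passed in explicitly; in this toy example it is both tgt and the top n-best result.

```python
def fluency_boost_pairs(candidates, tgt, fluency):
    """Pair with tgt every candidate that is strictly less fluent than tgt."""
    threshold = fluency(tgt)
    return [(cand, tgt) for cand in candidates if fluency(cand) < threshold]


# Invented fluency scores standing in for a real language-model scorer.
toy_scores = {
    "She goes to school .": 0.30,
    "She go to school .": 0.18,
    "She going to school .": 0.15,
}

# Hypothetical n-best output of the correction model for one src.
n_best = ["She goes to school .", "She go to school .", "She going to school ."]
tgt = "She goes to school ."

pairs = fluency_boost_pairs(n_best, tgt, toy_scores.get)
```

Here `pairs` holds the two less fluent candidates, each paired with tgt; the candidate identical to tgt is dropped because its score is not below the threshold.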

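Compared with traditional data augmentation, this approach has one clear advantage:

The model's own outputs reflect the current state of the model, so feeding them back corrects the model more effectively.

For this reason, I would expect a model trained this way to outperform one trained with traditional data augmentation.
In addition, fluency boost supports multi-round, step-by-step correction at inference time: when several errors occur in a row, fixing words one round at a time makes the context clearer for the rest of the inference process.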
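The multi-round idea can be sketched as a loop that keeps re-correcting a sentence while its fluency still improves. Everything here is a toy assumption: `multi_round_correct`, `toy_correct` (which fixes one known error per call), and `toy_fluency` (which just counts known-bad tokens) are placeholders for a real seq2seq model and a real language-model scorer.

```python
def multi_round_correct(sentence, correct, fluency, max_rounds=5):
    """Re-apply `correct` until the output stops becoming more fluent."""
    best = sentence
    for _ in range(max_rounds):
        candidate = correct(best)
        if fluency(candidate) <= fluency(best):
            break  # no further improvement, stop correcting
        best = candidate
    return best


# Toy stand-ins (assumptions for illustration only).
FIXES = [("go", "goes"), ("schol", "school")]
BAD_TOKENS = {wrong for wrong, _ in FIXES}


def toy_correct(sentence):
    """Fix at most one known error per call, mimicking a model that corrects partially."""
    tokens = sentence.split()
    for wrong, right in FIXES:
        if wrong in tokens:
            tokens[tokens.index(wrong)] = right
            return " ".join(tokens)
    return sentence


def toy_fluency(sentence):
    """Share of tokens that are not known errors."""
    tokens = sentence.split()
    return 1 - sum(t in BAD_TOKENS for t in tokens) / len(tokens)


result = multi_round_correct("She go to schol .", toy_correct, toy_fluency)
```

In this toy run the first round fixes "go", the second fixes "schol", and the third produces no improvement, so the loop stops.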

Key points

The paper has a few key points:

How is the fluency score computed?
Fluency boost learning comes in several variants.

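Computing the fluency score

The fluency score is simple to compute:

$$f(x) = \frac{1}{1 + H(x)}, \qquad H(x) = -\frac{1}{|x|}\sum_{i=1}^{|x|} \log P(x_i \mid x_{<i})$$

where x is a sentence, f(x) is its fluency score, and H(x) is the cross entropy of x, with P(x_i | x_{<i}) given by a language model and |x| the length of the sentence.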
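As a small worked example, assuming the paper's definition f(x) = 1 / (1 + H(x)), with H(x) the length-normalized negative log-likelihood of the sentence under a language model: the per-token log-probabilities below are made up, and the function names are chosen for this sketch.

```python
import math


def cross_entropy(token_log_probs):
    """H(x): average negative log-probability per token."""
    return -sum(token_log_probs) / len(token_log_probs)


def fluency_score(token_log_probs):
    """f(x) = 1 / (1 + H(x)); lies in (0, 1], higher means more fluent."""
    return 1.0 / (1.0 + cross_entropy(token_log_probs))


# Made-up language-model log-probabilities for a 4-token sentence,
# each token predicted with probability 0.5.
log_probs = [math.log(0.5)] * 4

h = cross_entropy(log_probs)   # equals ln 2
f = fluency_score(log_probs)   # equals 1 / (1 + ln 2)
```

A sentence the language model predicts with certainty has H(x) = 0 and therefore f(x) = 1; the less likely the sentence, the closer f(x) gets to 0.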

Types of fluency boost

Fluency boost learning has three variants:

Back-boost learning
Self-boost learning
Dual-boost learning

Back-boost borrows from back-translation in NMT: it converts a fluent sentence into one that contains errors. The paper provides pseudocode for this procedure.

Self-boost lets the model generate the candidates itself. The paper provides pseudocode for this as well.

Back-boost and self-boost generate disfluent sentences from different angles to improve the model; dual-boost combines the two. The paper again provides pseudocode.
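The following is not the paper's pseudocode, only a rough sketch of how the three variants differ in where their candidates come from. It assumes that candidates are kept when they are less fluent than tgt, and that dual-boost can be approximated by merging the pairs from both sources. The two "models" are stub functions returning hard-coded n-best lists, and the fluency scores are invented.

```python
def boost_pairs(candidates, tgt, fluency):
    """Keep candidates that are less fluent than tgt and pair them with tgt."""
    limit = fluency(tgt)
    return [(cand, tgt) for cand in candidates if fluency(cand) < limit]


def back_boost(tgt, error_generator, fluency):
    """Candidates come from an error-generation model run on the correct sentence."""
    return boost_pairs(error_generator(tgt), tgt, fluency)


def self_boost(src, tgt, corrector, fluency):
    """Candidates come from the correction model's own n-best outputs for src."""
    return boost_pairs(corrector(src), tgt, fluency)


def dual_boost(src, tgt, error_generator, corrector, fluency):
    """Merge both sources, dropping duplicate pairs while preserving order."""
    merged = back_boost(tgt, error_generator, fluency) + self_boost(src, tgt, corrector, fluency)
    return list(dict.fromkeys(merged))


# Stub models and invented scores (assumptions for illustration only).
src, tgt = "She go to schol .", "She goes to school ."
scores = {
    "She goes to school .": 0.30,
    "She go to school .": 0.18,
    "She goes to schol .": 0.12,
    "She go to schol .": 0.08,
}


def stub_error_generator(sentence):
    return ["She go to school .", "She goes to schol ."]


def stub_corrector(sentence):
    return ["She goes to school .", "She go to school .", "She go to schol ."]


back = back_boost(tgt, stub_error_generator, scores.get)
own = self_boost(src, tgt, stub_corrector, scores.get)
dual = dual_boost(src, tgt, stub_error_generator, stub_corrector, scores.get)
```

With these stubs, back-boost and self-boost each yield two pairs, and dual-boost yields three because one pair appears in both sources.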

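The paper also reports experimental comparisons; interested readers can find them in the paper.

I am not aware of an open-source implementation at the moment. Perhaps you could try implementing one yourself.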

Note:

The paper points out that GEC differs from NMT: the goal of GEC is to make a sentence more fluent without changing its original meaning.
