Clarifications re “Adversarial Review of Adversarial Learning of Nat Lang” post

Wow. That piece about the bad adversarial NLG paper really struck a nerve. It's been getting tons of attention, and some very positive comments. Thanks!

There are also a few points that people (especially, I think, younger researchers) raise, either on the web or in private, along the lines of this comment on reddit:

You could take Goodfellow's original GAN paper, and critique it in a similar way: It’s not as good as the state of the art, it’s only on toy datasets, etc. Yet that method has been called “the coolest idea in machine learning the last 20 years”. And it probably is.
Now, you shouldn’t overstate your results. But are they really doing that? The blog author has a gripe with the paper title, because it claims to generate “natural language”, when the language doesn’t seem natural. I took that to just mean that it tries to generate human language as opposed to, say, a programming language.
The author seems to find some kind of arrogance in the paper that I really don’t see from the examples given.

This triggered me to write a few clarifications.

First and foremost, I would like to reiterate that this particular paper was spectacularly bad in my view on many levels, but my broader criticism was on a trend, not on a single paper. Now, for some specific points:

My criticism is not about the paper not getting state-of-the-art results.

The focus on SOTA results is overrated in my view, especially in deep learning, where so many things are going on beyond the innovation described in each work. I don't need to see SOTA results; I want to see a convincing series of experiments showing that the proposed method does something relevant, new and interesting.

My criticism is not about the paper using a toy task, or a toy grammar.

It is OK to use toy tasks. It is often desirable to use a toy task. For example, I could imagine some very interesting research that uses even smaller grammars than the one used by the authors in their simplest task. The idea would be to construct a grammar that demonstrates some phenomena, and then correlate those phenomena with learnability, for example. But the toy task must be meaningful and relevant, and you have to explain why it is meaningful and relevant. And, I think it goes without saying, you should understand the toy task you are using. Here, the authors clearly had no idea what the grammar they were using was doing. Not only do they not distinguish lexical rules from non-terminal productions, they didn't even realize that the vast majority of the production rules in the grammar file were never used (a basic sanity check, of the kind sketched below, would have caught this).
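To make concrete what I mean by "understanding the toy grammar", here is a minimal sketch of the kind of sanity check I have in mind: separate the lexical rules from the non-terminal productions, and flag productions that can never fire because their left-hand side is unreachable from the start symbol. It uses NLTK and an invented toy grammar, not the grammar from the paper.

```python
# A minimal sketch of a grammar sanity check.
# Assumptions: NLTK is installed, and the toy CFG below is invented for
# illustration -- it is NOT the grammar used in the paper under discussion.
from nltk import CFG
from nltk.grammar import Nonterminal

grammar = CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the' | 'a'
N -> 'dog' | 'cat'
V -> 'chased' | 'saw'
X -> Y
Y -> 'orphan'
""")

# Lexical rules rewrite a category into words; non-terminal productions
# define structure. A paper built on a grammar should at least report this split.
lexical = [p for p in grammar.productions() if p.is_lexical()]
structural = [p for p in grammar.productions() if p.is_nonlexical()]
print(len(lexical), "lexical rules,", len(structural), "non-terminal productions")

# Which non-terminals can actually be derived from the start symbol?
reachable = {grammar.start()}
changed = True
while changed:
    changed = False
    for p in grammar.productions():
        if p.lhs() in reachable:
            for sym in p.rhs():
                if isinstance(sym, Nonterminal) and sym not in reachable:
                    reachable.add(sym)
                    changed = True

# Productions whose left-hand side is unreachable can never be used.
dead = [p for p in grammar.productions() if p.lhs() not in reachable]
print("rules that can never fire:", dead)
```

On this invented grammar the check reports the X and Y rules as dead weight; run against a real grammar file, the same kind of check tells you which part of the grammar your experiments actually exercise.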

My criticism is not about the paper “not solving natural language generation”.

Of course the paper did not solve natural language generation. That's exactly the point: no single paper can “solve” NLG (what does that even mean?), just as no single biology paper will solve cancer. But the paper should be clear about the scope of the work it is actually doing. In the title, in the abstract, in the text.

(Another point on “natural language”: the reddit comment above says “I took that to mean it tries to generate human language as opposed to, say, a programming language”. That's the problem. The paper claims to generate human language, but it is not evaluated on human language; it is evaluated only on very narrow fragments of human language, which are waaay more similar to a very simple, stripped-down programming language without semantics than they are to human languages. The paper also does not evaluate on any property that relates to the language being “natural” or “human”. This makes the paper very misleading in its description of what it is doing. It misled that reader on reddit. It likely misled many others as well. I tend to believe the authors did this out of ignorance rather than malice. This is precisely where the arrogance comes in: working in a field that you do not understand, while not realizing that you do not understand it, or even that it is a complex field that needs understanding, and making broad, unsubstantiated and misleading claims as a result.)

My criticism is not about the paper being incremental.

This is very much related to the point above. I don't have a problem with incremental papers. Most papers are incremental. That's how progress is made, in small, incremental steps. (It is true that there's also a trend in deep learning, fueled by arxiv-mania, of slicing things a bit too thin, pushing out papers for minuscule increments. Let's put that aside for the current discussion.) Incrementality is perfectly fine, but you have to clearly define your contribution, position it w.r.t. existing work, and precisely state (and evaluate) your increment.

Combining the points above: if the paper had only the simple CFG experiments, but was titled (and written to support) something like “An Adversarial Training Method for Discrete Sequences that can Recover Short Context-Free Fragments”, and had a discussion of the rules in the CFG and the kinds of structures they capture, followed by a proper, convincing evaluation, and a statement that the sentence set could very easily be learned by an RNN but not by any previous GAN-based method, yet the current GAN captures it, and that this is a first step in something that could at some point lead to NLG — this would actually be a solid paper that I'd happily accept to a conference. (Not necessarily an NLP conference; this depends on other factors as well, for example the form of the CFG they were using and the classes of structures it captures.)
