Reinforcement Learning with Long Short-Term Memory

阿新 • Published: 2022-04-15

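Notice: the original paper is the one named in the title. If this post infringes any rights, please contact the author and it will be taken down.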

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 14, VOLS 1 AND 2, (2002): 1475-1482

Abstract

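This paper presents reinforcement learning with a Long Short-Term Memory recurrent neural network: RL-LSTM. Model-free RL-LSTM using Advantage(λ) learning and directed exploration can solve non-Markovian tasks with long-term dependencies between relevant events. This is demonstrated in a T-maze task, as well as in a difficult variation of the pole balancing task.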

1 Introduction

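Reinforcement learning (RL) is a way of learning how to behave based on delayed reward signals [12]. Among the more important challenges for RL are tasks in which part of the state of the environment is hidden from the agent. Such tasks are called non-Markovian tasks or Partially Observable Markov Decision Processes. Many real-world tasks have this hidden-state problem. For example, in a navigation task, different positions in the environment may look the same, yet one and the same action may lead to different next states or rewards. Hidden state therefore makes RL more realistic. It also makes it harder, because the agent now needs to learn not only the mapping from environmental states to actions; for optimal performance it usually also has to determine which environmental state it is in.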
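The hidden-state problem described above can be made concrete with a toy example. The sketch below is a hypothetical, simplified T-shaped maze written for illustration only; the class name `ToyTMaze`, the corridor length, the observation strings, and the reward values are all assumptions and are not taken from the paper. It shows two episodes whose observations are identical after the first step, yet the same action at the junction yields different rewards, so a policy that looks only at the current observation cannot be optimal.

```python
from dataclasses import dataclass

CORRIDOR_LENGTH = 5  # hypothetical corridor length, chosen for illustration


@dataclass
class ToyTMaze:
    """Toy partially observable maze (an assumption, not the paper's setup)."""
    goal_is_north: bool
    position: int = 0

    def observe(self) -> str:
        # The goal direction is visible only at the very first position.
        if self.position == 0:
            return "signal_north" if self.goal_is_north else "signal_south"
        if self.position < CORRIDOR_LENGTH:
            return "corridor"  # every corridor cell looks the same
        return "junction"

    def step(self, action: str) -> float:
        """Apply an action and return a (hypothetical) reward."""
        if self.position < CORRIDOR_LENGTH:
            if action == "east":
                self.position += 1
            return 0.0
        # At the junction the reward depends on the hidden goal direction.
        correct = "north" if self.goal_is_north else "south"
        return 1.0 if action == correct else -1.0


def run(goal_is_north: bool, junction_action: str):
    """Walk east to the junction, then take junction_action there."""
    env = ToyTMaze(goal_is_north)
    observations = [env.observe()]
    while env.observe() != "junction":
        env.step("east")
        observations.append(env.observe())
    reward = env.step(junction_action)
    return observations, reward


obs_north, reward_north = run(goal_is_north=True, junction_action="north")
obs_south, reward_south = run(goal_is_north=False, junction_action="north")

# After the first observation, the two episodes are indistinguishable...
print(obs_north[1:] == obs_south[1:])
# ...but the same junction action earns different rewards.
print(reward_north, reward_south)
```

To act correctly at the junction, the agent has to carry the first observation forward across all the identical corridor steps, which is the kind of long-term dependency a memory mechanism is meant to handle.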

Long-term dependencies.

2 LSTM

Memory cells.

Activation updates.

3 RL-LSTM
