Offline Evaluation of Online Reinforcement Learning Algorithms

阿新 • Published: 2021-10-17

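Publication: 2016 (AAAI 2016)
Key points: Offline evaluation usually assesses a trained, fixed policy. This paper instead asks whether a learning algorithm itself is any good, evaluated in the offline setting. The rough idea is to build an evaluator from the collected data and then let the RL algorithm interact with that evaluator; the interaction is both the policy-update process and the evaluation process. The paper proposes three algorithms. The first samples actions directly, interacts with the evaluator, and updates. The second uses rejection sampling to correct the estimate and updates the policy with the accepted samples. The third performs rejection sampling over whole episodes rather than over individual samples.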
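To make the rejection-sampling idea concrete, here is a minimal sketch of the generic technique in a single-state (bandit-like) setting. It is not the paper's algorithm: the function names, the two-action setup, and the probabilities are all assumptions chosen for illustration. A logged action is kept with probability target(a) / (M * behavior(a)), where M bounds the ratio over all actions, so the kept actions are distributed according to the target policy even though they were collected by the behavior policy.

```python
import random


def rejection_sample(logged, target_probs, behavior_probs, rng):
    """Filter logged (action, reward) pairs so that the kept actions follow
    target_probs, although they were drawn from behavior_probs.

    Hypothetical helper for illustration; assumes behavior_probs[a] > 0
    for every action a.
    """
    # M is the smallest constant with target(a) <= M * behavior(a) for all a.
    m = max(target_probs[a] / behavior_probs[a] for a in target_probs)
    accepted = []
    for action, reward in logged:
        accept_prob = target_probs[action] / (m * behavior_probs[action])
        if rng.random() < accept_prob:
            accepted.append((action, reward))
    return accepted


def demo_fraction(seed=0, n=20000):
    """Return the fraction of accepted samples that use action 0."""
    rng = random.Random(seed)
    behavior = {0: 0.5, 1: 0.5}   # assumed logging policy: uniform
    target = {0: 0.9, 1: 0.1}     # assumed policy the learner wants to follow
    # Simulated log: actions drawn from the behavior policy, dummy rewards.
    logged = [(0 if rng.random() < behavior[0] else 1, 0.0) for _ in range(n)]
    kept = rejection_sample(logged, target, behavior, rng)
    return sum(1 for a, _ in kept if a == 0) / len(kept)


if __name__ == "__main__":
    print(f"fraction of action 0 among accepted samples: {demo_fraction():.3f}")
```

In this toy setup the accepted samples should contain action 0 roughly 90% of the time, matching the assumed target policy. The cost is that rejected samples are discarded, which is one reason the choice of granularity (single samples versus whole episodes) matters.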
Summary:

This setting is fairly far from my own work. I could not quite follow what the paper is doing, nor see where its contribution lies.
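Questions: I do not really understand the point of this paper or where this kind of evaluation could be applied. Also, is the experimental section comparing which evaluation method is more accurate? It does not seem to say which RL algorithm the comparison was run on, only that the baseline for the evaluation is a model-based approach. I could not figure it out, so I will leave it there.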
