Repost

DeepMind's "Deep Reinforcement Learning in Large Discrete Action Spaces"

Paper: Deep Reinforcement Learning in Large Discrete Action Spaces

Authors: G Dulac-Arnold, R Evans, H v Hasselt, P Sunehag, T Lillicrap, J Hunt, T Mann, T Weber, T Degris, B Coppin

Link: http://arxiv.org/abs/1512.07679

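DeepMind, fresh from the recent AlphaGo publicity, posted the second version of "Deep Reinforcement Learning in Large Discrete Action Spaces" on the 4th of this month. It looks like they really are applying RL to recommender systems.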

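That said, the paper's main novelty appears to be only the introduction of action embeddings. How the embeddings are actually built is not spelled out; my guess is that it relies on something like word2vec. The paper proposes the Wolpertinger policy network shown in the figure and trains it with Deep Deterministic Policy Gradient (DDPG). The experiments at the end show no major breakthrough. For now it is still more of a trick, and I look forward to bigger advances in the future.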
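To make the role of the action embedding concrete, here is a minimal sketch of Wolpertinger-style action selection as I understand it from the paper: the actor outputs a continuous "proto-action" in embedding space, the k nearest discrete actions are looked up, and the critic picks the one with the highest Q-value. The function name `wolpertinger_select`, the toy embeddings, and the lambda actor/critic are all hypothetical stand-ins for illustration; in the paper both actor and critic are neural networks trained with DDPG, and a real system with many actions would likely use an approximate nearest-neighbor index instead of a linear scan.

```python
import heapq
import math


def wolpertinger_select(state, action_embeddings, actor, critic, k=5):
    """Pick a discrete action index via proto-action + k-NN + critic refinement.

    All names here are illustrative assumptions, not the authors' code.
    """
    # 1. The actor maps the state to a continuous proto-action.
    proto_action = actor(state)

    # 2. Find the k discrete actions whose embeddings are closest
    #    (Euclidean distance) to the proto-action. Linear scan for clarity.
    nearest = heapq.nsmallest(
        k,
        range(len(action_embeddings)),
        key=lambda i: math.dist(action_embeddings[i], proto_action),
    )

    # 3. The critic scores each candidate; return the highest-valued one.
    return max(nearest, key=lambda i: critic(state, action_embeddings[i]))


# Toy setup: four actions embedded in 2-D (hypothetical values).
embeddings = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
toy_actor = lambda state: (0.9, 0.1)            # always proposes a point near action 1
toy_critic = lambda state, action: -action[0]   # prefers a small first coordinate

# With k=2 the candidates are actions 1 and 0; the critic prefers action 0,
# so the final choice differs from the plain nearest neighbor.
chosen = wolpertinger_select(None, embeddings, toy_actor, toy_critic, k=2)
```

With `k=1` the same call falls back to the plain nearest neighbor (action 1), which shows what the critic refinement step adds: it lets the policy correct for proto-actions that land near a poorly valued discrete action.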


Original: https://www.52ml.net/17118.html