So This Is How AlphaGo Works: A Detailed Guide to Multi-Agent Reinforcement Learning (Part 10)
[10] Foerster J, Farquhar G, Afouras T, et al. Counterfactual Multi-Agent Policy Gradients[J]. arXiv: Artificial Intelligence, 2017.
[11] Sunehag P, Lever G, Gruslys A, et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning[J]. arXiv: Artificial Intelligence, 2017.
[12] Rashid T, Samvelyan M, De Witt C S, et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning[J]. arXiv: Learning, 2018.
[13] OpenAI Five, OpenAI, https://blog.openai.com/openai-five/, 2018.
[14] Vinyals, O., Babuschkin, I., Czarnecki, W.M. et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575, 350–354 (2019).
[15] P. Long, T. Fan, X. Liao, W. Liu, H. Zhang and J. Pan, "Towards Optimally Decentralized Multi-Robot Collision Avoidance via Deep Reinforcement Learning," 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, 2018, pp. 6252-6259, doi: 10.1109/ICRA.2018.8461113.
[16] Y. F. Chen, M. Everett, M. Liu and J. P. How, "Socially aware motion planning with deep reinforcement learning," 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, 2017, pp. 1343-1350, doi: 10.1109/IROS.2017.8202312.
[17] Hernandez-Leal P, Kartal B, Taylor M E. A survey and critique of multiagent deep reinforcement learning[J]. Autonomous Agents & Multi Agent Systems, 2019(2).
