Citation: Yi Xie, Zhongyi Liu, Zhao Liu, Yijun Gu. Cooperative Multi-Agent Reinforcement Learning with Constraint-Reduced DCOP[J]. Journal of Beijing Institute of Technology, 2017, 26(4): 525-533. doi:10.15918/j.jbit1004-0579.201726.0412
[1] Guestrin C, Lagoudakis M, Parr R. Coordinated reinforcement learning[C]//Proc of International Conference on Machine Learning, 2002.
[2] Watkins C J C H. Learning from delayed rewards[D]. Cambridge: Cambridge University, 1989.
[3] Zhang C, Lesser V. Coordinating multiagent reinforcement learning with limited communication[C]//Proc of International Conference on Autonomous Agents and Multiagent Systems, 2013.
[4] Zhang C, Abdallah S, Lesser V. Integrating organizational control into multiagent learning[C]//Proc of International Conference on Autonomous Agents and Multiagent Systems, 2009.
[5] Kok J R, Vlassis N. Collaborative multiagent reinforcement learning by payoff propagation[J]. Journal of Machine Learning Research, 2006, 7: 1789-1828.
[6] Shi J, Malik J. Normalized cuts and image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(8): 888-905.
[7] Zafarani R, Abbasi M A, Liu H. Social media mining[M]. Cambridge: Cambridge University Press, 2014.
[8] Chechetka A, Sycara K. No-commitment branch and bound search for distributed constraint optimization[C]//Proc of International Conference on Autonomous Agents and Multiagent Systems, 2006.
[9] Petcu A, Faltings B. DPOP: A scalable method for multiagent constraint optimization[C]//Proc of International Joint Conference on Artificial Intelligence, 2005.
[10] Mailler R, Lesser V. Solving distributed constraint optimization problems using cooperative mediation[C]//Proc of International Conference on Autonomous Agents and Multiagent Systems, 2004.
[11] Lesser V, Ortiz C L, Tambe M. Distributed sensor networks: a multiagent perspective[M]. Dordrecht: Kluwer Academic Publishers, 2003.
[12] Zha H, He X, Ding C, et al. Spectral relaxation for k-means clustering[C]//Proc of Neural Information Processing Systems, 2001.
[13] Dechter R. Bucket elimination: a unifying framework for reasoning[J]. Artificial Intelligence, 1999, 114: 41-85.
[14] Kumar A, Zilberstein S. Scalable multiagent planning using probabilistic inference[C]//Proc of International Joint Conference on Artificial Intelligence, 2011.
[15] Sutton R S, Barto A G. Reinforcement learning: an introduction[M]. Boston: Massachusetts Institute of Technology Press, 1998.