Citation: | TONG Liang, LU Ji-lian. Multi-Agent Reinforcement Learning Algorithm Based on Action Prediction[J]. JOURNAL OF BEIJING INSTITUTE OF TECHNOLOGY, 2006, 15(2): 133-137. |
[1] |
Sutton R S. Learning to predict by the methods of temporal difference[J]. Machine Learning, 1988, 7(3): 9-44.
|
[2] |
Littman M L. Markov games as a framework for multi-agent reinforcement learning[Z]. 11th International Conference on Machine Learning, San Francisco, CA, 1994.
|
[3] |
Hu Junling, Wellman M P. Multiagent reinforcement learning: Theoretical framework and an algorithm [Z]. Fifteenth International Conf on Machine Learning, Wisconsin, 1998.
|
[4] |
Claus C, Boutilier C. The dynamics of reinforcement learning in cooperative multiagent systems [A]. Menlo Park. Proceedings of the Fifteenth National Conference on Artificial Intelligence [C]. [s. l.]: AAAI Press, 1998. 746-752.
|
[5] |
Watkins C, Dayan P. Q-learning [J]. Machine Learning, 1992, 8(4): 279-292.
|
[6] |
Sutton R S. Temporal credit assignment in reinforcement learning [D]. Amherst, MA: University of Massachusetts, 1984.
|
[7] |
Oh C H, Nakashima T, Ishibuchi H. Initialization of Q-values by fuzzy rules for accelerating Q-learning [Z]. International Joint Conference on Neural Networks, Anchorage, Alaska, 1998.
|