Citation: Hu Guanghua, Wu Cangpu. Incremental Multi-Step R-Learning [J]. Journal of Beijing Institute of Technology, 1999, 8(3): 245-250.
[1] Schwartz A. A reinforcement learning method for maximizing undiscounted rewards. In: Saitta L, ed. Proceedings of the Tenth International Conference on Machine Learning. Amherst: Morgan Kaufmann, 1993. 298-305
[2] Mahadevan S. Average reward reinforcement learning: foundations, algorithms and empirical results. Machine Learning, 1996, 22: 159-195
[3] Tadepalli P, Ok D. Model-based average reward reinforcement learning. Artificial Intelligence, 1998, 100: 177-224
[4] Peng J, Williams R J. Incremental multi-step Q-learning. Machine Learning, 1996, 22: 283-290
[5] Bertsekas D P. Dynamic programming: Deterministic and stochastic methods. Englewood Cliffs: Prentice Hall, 1987
[6] Cichosz P, Mulawka J J. Fast and efficient reinforcement learning with truncated temporal differences. In: Prieditis A, Russell S, ed. Proceedings of the Twelfth International Conference on Machine Learning. San Francisco: Morgan Kaufmann, 1995. 99-107