
Journal of Applied Sciences

Year: 2002 | Volume: 2 | Issue: 3 | Page No.: 408-415
DOI: 10.3923/jas.2002.408.415
Automatic Tuning of Q-learning Algorithms Parameters
M.R. Meybodi and S. Hodjat

Abstract: This paper describes a general approach for automatically tuning the parameters of reinforcement learning algorithms. In this approach, a reinforcement learning agent's parameters are tuned by other, simpler reinforcement learning algorithms. We illustrate the approach by tuning one of the parameters of a Q-learning and statistical clustering algorithm, and describe the results of tuning this parameter through some simple examples. Comparing an algorithm with an automatically tuned parameter against the same algorithm with fixed parameters shows that the former is generally more flexible and performs better in most cases.
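The paper's specific tuning scheme is in the full text; as an illustration of the general idea in the abstract, the sketch below lets a simple two-action learning automaton (linear reward-inaction update) select the learning rate of a Q-learning agent on a toy chain task. The candidate rates, the episode-return reward signal for the automaton, and the environment are all hypothetical choices for this sketch, not taken from the paper.

```python
import random

random.seed(0)

# Toy chain MDP: states 0..4, actions 0 (left) / 1 (right), reward 1 on reaching state 4.
N_STATES, GOAL = 5, 4

def step(s, a):
    s2 = min(GOAL, s + 1) if a == 1 else max(0, s - 1)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

# Learning automaton choosing between two candidate learning rates for the Q-learner.
alphas = [0.05, 0.5]   # hypothetical candidate values
probs = [0.5, 0.5]     # automaton's action probabilities
LA_STEP = 0.05         # automaton's own (fixed) update rate

Q = [[0.0, 0.0] for _ in range(N_STATES)]
eps, gamma = 0.1, 0.9
prev_return = 0.0

for episode in range(300):
    # The automaton picks which learning rate the Q-learner uses this episode.
    i = 0 if random.random() < probs[0] else 1
    alpha = alphas[i]
    s, ret, done = 0, 0.0, False
    for _ in range(50):
        # epsilon-greedy action selection
        a = random.randrange(2) if random.random() < eps else (0 if Q[s][0] > Q[s][1] else 1)
        s2, r, done = step(s, a)
        # standard Q-learning update with the automaton-chosen alpha
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        ret += r
        s = s2
        if done:
            break
    # Reward-inaction: reinforce the chosen rate only if the episode did not get worse.
    if ret >= prev_return:
        probs[i] += LA_STEP * (1.0 - probs[i])
        probs[1 - i] = 1.0 - probs[i]
    prev_return = ret

print(probs)
```

Over episodes, probability mass drifts toward whichever candidate rate tends to yield non-decreasing returns, which is the hierarchical pattern the abstract describes: a simple learner tuning a parameter of a more complex one.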


How to cite this article
M.R. Meybodi and S. Hodjat, 2002. Automatic Tuning of Q-learning Algorithms Parameters. Journal of Applied Sciences, 2: 408-415.

Keywords: Reinforcement learning, hierarchical learning, parameter tuning, Q-learning, learning automata and statistical clustering

REFERENCES

  • Dayan, P. and G.E. Hinton, 1993. Feudal Reinforcement Learning. In: Advances in Neural Information Processing Systems 5, Hanson, S.J., J.D. Cowan and C.L. Giles (Eds.). Morgan Kaufmann, San Mateo, CA.


  • Harmon, M.E. and S.S. Harmon, 1996. Reinforcement learning: A tutorial. http://www.nbu.bg/cogs/events/2000/Readings/Petrov/rltutorial.pdf.


  • Hodjat, S. and M.R. Meybodi, 1996. Fine tuning of Q-learning automata (in Farsi). Proceedings of the 2nd Annual Conference of Computer Society of Iran, (ACCSI'96), Tehran, Iran, pp: 209-220.


  • Hodjat, S., 1997. An artificial lab for creating and comparing learning algorithms. M.Sc. Thesis, Department of Computer Engineering, Amirkabir University, Tehran, Iran.


  • Kaelbling, L.P., M.L. Littman and A.W. Moore, 1996. Reinforcement learning: A survey. J. Artificial Intell. Res., 4: 237-285.


  • Krinskii, V.I., 1964. Asymptotically optimal automaton with exponential speed of convergence. Biofizika, 9: 484-487.


  • Krylov, V.Y., 1964. One stochastic automaton which is asymptotically optimal in random medium. Automation and Remote Control, 24: 1114-1116.


  • Mahadevan, S. and J. Connell, 1991. Scaling reinforcement learning to robotics by exploiting the subsumption architecture. Proceedings of the 8th International Workshop on Machine Learning, (IWML'91), Morgan Kaufmann, pp: 328-332.


  • Mahadevan, S. and J. Connell, 1991. Automatic programming of behavior-based robots using reinforcement learning. Proceedings of Artificial Intelligence, (AI'91), Pittsburgh, PA, pp: 311-365.


  • Meybodi, M.R. and S. Lakshmivarahan, 1982. ε-optimality of a general class of absorbing barrier learning algorithms. Inform. Sci., 28: 1-20.


  • Narendra, K.S. and M.A.L. Thathachar, 1989. Learning Automata: An Introduction. Prentice Hall, Englewood Cliffs, NJ., USA.


  • Schalkoff, R., 1991. Pattern Recognition. Wiley International, New Jersey, USA.


  • Tsetlin, M.L., 1962. On the behavior of finite automata in random media. Automation and Remote Control, 22: 1345-1354.


  • Watkins, C.J.C.H., 1989. Learning from delayed rewards. Ph.D. Thesis, King's College, Cambridge, England.

© Science Alert. All Rights Reserved