Journal of Applied Sciences
ISSN: 1812-5654, 1812-5662
Publisher: Asian Network for Scientific Information
DOI: 10.3923/jas.2009.2056.2066
Authors: A. Akramizadeh, A. Afshar, Mohammad-B. Menhaj
Year: 2009; Volume 9, Issue 11

Abstract: In this study, Q-learning is extended to multiagent systems in which an action-selection ranking is imposed on several self-interested agents. Learning is treated as a sequence of situations, each modeled as an extensive form game with perfect information. Each agent chooses its actions within the subgames determined by the higher-level agents' decisions, based on its own preferences as affected by the preferences of the lower-level agents. These modified Q-values, called associative Q-values, estimate the utilities attainable over a subgame with respect to the lower-level agents' game preferences. Extensive form games can accommodate a kind of social convention, which helps in dealing with multiple equilibrium points and reduces computational complexity relative to normal form games. The resulting process, called an extensive Markov game, is proved to be a kind of generalized Markov decision process. A comprehensive review of the related concepts and definitions previously developed for normal form games is also provided, along with analytical discussions of convergence and the computation space. A numerical example illustrates the proposed method.
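The ranked action selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' associative Q-value update. It only shows generic backward induction over a perfect-information game tree, in which each higher-level agent evaluates its actions by anticipating the choices of the agents ranked below it. The function name `backward_induction`, the joint-action Q-table layout, and the numeric values are all assumptions made for illustration.

```python
def backward_induction(q_tables, n_actions):
    """Pick a joint action for ranked agents by backward induction.

    Hypothetical layout: q_tables[i] maps a joint action tuple
    (a_0, ..., a_{n-1}) to agent i's estimated value. Agent 0 moves
    first (highest rank); agent n-1 moves last (lowest rank).
    """
    n_agents = len(q_tables)

    def solve(prefix):
        level = len(prefix)
        if level == n_agents:
            # All agents have acted: the prefix is a full joint action.
            return prefix
        best_joint, best_value = None, float("-inf")
        for action in range(n_actions):
            # Resolve the subgame that follows this action, i.e. how the
            # lower-ranked agents would respond to it.
            joint = solve(prefix + (action,))
            value = q_tables[level][joint]
            if value > best_value:  # ties go to the lowest action index
                best_joint, best_value = joint, value
        return best_joint

    return solve(())


# Illustrative (made-up) values for two agents with two actions each.
q0 = {(0, 0): 3, (0, 1): 1, (1, 0): 0, (1, 1): 2}  # higher-level agent
q1 = {(0, 0): 0, (0, 1): 1, (1, 0): 0, (1, 1): 2}  # lower-level agent

joint = backward_induction([q0, q1], n_actions=2)
print(joint)
```

In this toy case the higher-level agent does not choose action 0 in the hope of reaching its best entry `(0, 0)`, because the lower-level agent would answer action 0 with action 1. Anticipating that response, it settles on the joint action `(1, 1)`.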