Title:

Equity and power in a cooperative trial-and-error game

General solution concepts in cooperative game theory are static, e.g., the core, the Shapley value and the Nash bargaining solution. Dynamic implementation procedures have been proposed in order to support these static solution concepts. This thesis studies an N-dimensional Markov chain motivated by a dynamic interactive trial-and-error learning model. The state space of the Markov chain is based on a cooperative game (v, N) whose characteristic function v is superadditive and monotone, with conditions on v ensuring non-emptiness of the core. Agents repeatedly bargain over a cooperative surplus by submitting demands for their shares. In each round the payable coalition is chosen: the feasible coalition with the maximum sum of demands. Players in the payable coalition receive their demands as payoffs; the other players receive no payoff. Players adjust their demands according to the following rule: in an efficient state (where the demands of all players sum to the total surplus, 1), one player is chosen uniformly at random and increases his demand by ε. When demands sum to 1 + ε, one player not in the payable coalition is chosen, with probability proportional to the size of her demand, to reduce her demand. An individual’s demand update in the learning model is based solely on the observation of his last payoff. Individual updates are in the tradition of reinforcement learning, aspiration adaptation, and fictitious play. Selten (1972) found empirical evidence for an inherent equity principle in many outcomes of experimental cooperative bargaining games. By construction, the dynamic learning model presented in this thesis also has an inherent equity principle. The model is a simple modification (and the limit process) of a model introduced by Nax (2010). To our knowledge, this thesis presents the first general results for such a dynamic learning model for general 3-player games and all interesting cases of 4-player games.
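The two-phase demand dynamic described above can be sketched in a few lines of Python. This is an illustrative simulation only: the function names, the representation of v as a dictionary of coalition values, the integer grid (demands stored as multiples of ε to keep arithmetic exact), and the arbitrary tie-breaking in the choice of the payable coalition are our own assumptions, not the thesis's notation.

```python
import itertools
import random

def payable_coalition(demands, v):
    """Return a feasible coalition (members' demands sum to at most v(S))
    with maximum demand sum; empty if no coalition is feasible.
    Demands and v are given in integer multiples of the grid size eps."""
    n = len(demands)
    best, best_sum = frozenset(), 0
    for r in range(1, n + 1):
        for S in itertools.combinations(range(n), r):
            s = sum(demands[i] for i in S)
            if s <= v.get(frozenset(S), 0) and s > best_sum:
                best, best_sum = frozenset(S), s
    return best

def step(demands, v):
    """One transition between efficient states: a uniformly random player
    raises her demand by eps (one grid unit); then a player outside the
    payable coalition, chosen with probability proportional to her demand,
    lowers her demand by eps."""
    n = len(demands)
    demands[random.randrange(n)] += 1          # phase 1: raise by eps
    P = payable_coalition(demands, v)
    outside = [j for j in range(n) if j not in P]
    weights = [demands[j] for j in outside]    # proportional-to-demand rule
    j = random.choices(outside, weights=weights)[0]
    demands[j] -= 1                            # phase 2: reduce by eps
    return demands
```

Iterating `step` from an efficient state (demands summing to the total surplus on the ε-grid) produces the two-step biased random walk between efficient states analysed in the thesis; since the payable coalition's demand sum is at most v(N), which is below the inflated total 1 + ε, the players outside it always hold positive total demand, so the reduction step is well defined.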
The transition probabilities of the Markov process studied in this thesis are the transition probabilities between efficient states of the described trial-and-error process, obtained by combining the two steps from an efficient state to a state with demand sum 1 + ε and back. The process is a biased random walk on the simplex of efficient states, on which the polytope formed by the grid of core points is the subset of particular interest. For general N-player games we introduce a coalition structure that exhibits an asymmetry of power between its members: the asymmetric coalition set. We believe the concept of an asymmetric coalition set to be both novel and relevant to the study of dynamic learning models with incremental demand updates for general cooperative games. Along a face of the core polytope generated by an asymmetric coalition set, the asymmetric face, the bias of the process is determined by the interplay between two dynamics: the inherent equity bias, which “drags” the process towards equity, and the asymmetric power, which “drags” the process away from equity. If the core polytope does not contain an asymmetric face, the equity bias of the random walk determines the expected movement along the faces of the polytope. The process can only leave the core polytope from a state on an asymmetric face. We also study a special Markov chain in dimension N derived from the N-player bargaining game, in which no coalitional constraints are present. There the bias of the random walk is determined solely by the inherent equity principle: the random walk drifts towards equity, and the equilibrium distribution is concentrated around the equal split, the most equitable allocation. For N = 3, no asymmetric coalition set exists. We show that the set of recurrent states of the Markov chain is the “core polygon” formed by the grid points in the core. The cooperative outcome co is the unique vector in the core with smallest L2-distance from the equal split.
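For the unconstrained N-player bargaining game, the equity drift can be made explicit. The following is a sketch under our reading of the update rule: with no payable coalition, the reducing player is chosen among all players with probability proportional to her demand.

```latex
% Expected one-step change of player i's demand in an efficient state d
% (with \sum_i d_i = 1): player j raises with probability 1/N, after
% which player i reduces with probability (d_i + eps*1{j=i})/(1+eps).
\[
\mathbb{E}[\Delta d_i \mid d]
  = \varepsilon\left(\frac{1}{N}
    - \sum_{j=1}^{N}\frac{1}{N}\,
      \frac{d_i + \varepsilon\,\mathbf{1}\{j=i\}}{1+\varepsilon}\right)
  = \frac{\varepsilon}{1+\varepsilon}\left(\frac{1}{N} - d_i\right).
\]
```

The drift is positive for d_i < 1/N and negative for d_i > 1/N, which is precisely the bias towards the equal split described above.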
At every state of the core polygon outside a small ball around co, the random walk moves, in expectation over one time step, towards co. The equilibrium distribution of the Markov chain is concentrated around the vector co. For 3-player games this vector equals the egalitarian allocation, a concept developed by Dutta and Ray (1989). For N ≥ 4, games (v, N) can contain an asymmetric coalition set. For N = 4 the only possible asymmetric coalition set is formed by two distinct two-player coalitions. We give three example games (v, 4) with combinatorially isomorphic cores. Each of the example games has an asymmetric edge in its core. Along the asymmetric edge the inherent equity bias creates a drift “down” the asymmetric edge, and the asymmetric power creates a drift “up” the asymmetric edge. In the three example games the asymmetric power is extreme, zero, and moderate, respectively: the equilibrium distribution of the process is concentrated at the “upper” endpoint, at the “lower” endpoint (which is co), or around a demand vector in the interior of the asymmetric edge. Furthermore, we present simulation results indicating that the concept of asymmetric power generalizes to other dynamic learning processes. Coupling is a powerful and elegant probabilistic technique with which one can often calculate tight bounds on the speed of convergence of a Markov chain to its equilibrium. We believe this technique to be novel to the study of dynamic stochastic learning processes in evolutionary game theory and hence present a general introduction to it. We use coupling arguments to show rapid mixing of the cooperative game process for the N-player bargaining game and for general 3-player games.
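To illustrate the flavour of a coupling argument with a standard textbook example (not the thesis's coupling construction; all names below are our own): two copies of a lazy simple random walk on {0, …, n}, driven by the same randomness, preserve their order, and their gap shrinks whenever one of them is clamped at a boundary, so they eventually coalesce.

```python
import random

def coupling_time(n, x0, y0, max_steps=1_000_000, seed=0):
    """Monotone coupling of two lazy simple random walks on {0, ..., n}:
    both chains receive the same increment proposal each step, so their
    gap never grows and shrinks by one whenever exactly one chain is
    clamped at a boundary. Returns the first time the chains meet."""
    rng = random.Random(seed)
    x, y = x0, y0
    for t in range(1, max_steps + 1):
        u = rng.random()
        dx = -1 if u < 1/3 else (0 if u < 2/3 else 1)  # shared randomness
        x = min(n, max(0, x + dx))
        y = min(n, max(0, y + dx))
        if x == y:
            return t          # from now on the two chains move together
    return None
```

By the coupling inequality, the total variation distance between the time-t distributions started from x0 and from y0 is at most the probability that the coupling time τ exceeds t; tail bounds on τ therefore translate into mixing-time bounds.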
