# The Master Equation and the Convergence Problem in Mean Field Games

In Mean Field Theory (MFT) one studies models for decision making in a large population of $N$ interacting agents. Each agent is assumed to have only a marginal impact, so that the behaviour of one (average) agent is determined by its own state and by a distribution that describes the time-varying states of all the others. The states of the agents are described by a stochastic differential equation (SDE) according to which the system evolves in time towards some equilibrium state. The solution of this optimal control problem is described by a continuous Hamilton-Jacobi-Bellman equation, expressing necessary and sufficient conditions for the optimality of a value function that at every instant determines the optimal behaviour of an agent, depending on the state of the system. The mathematical disciplines involved are thus mainly control theory and (stochastic) differential equations. Mean Field Games (MFGs) actually assume a continuous set of infinitesimal agents rather than a large set of discrete players. Since the framework was properly formulated around 2006-2007, the subject has boomed. These models were originally introduced because of their applications in economics, but many other applications in engineering and informatics emerged later, and the topic became an area of applied mathematics that deserves analysis in its own right, independent of the concrete application.

In a finite Nash differential game (NDG), the state variables are also functions of continuous time, but there is a discrete set of $N$ players. For $N$ not too large, such $N$-Nash systems have been investigated and their Nash equilibria are well understood. To find an equilibrium state, one has a set of coupled differential equations that describes the time evolution of the individual value functions of the $N$ agents. The value function of an agent evolves in time depending on the time-varying states of all the agents in the system, and this defines the instantaneous behaviour of the agent. Given the value functions of all the agents at some horizon time $T$, the system has to be solved backward in time. Once these value functions for the individual players are known, one can compute their individual trajectories forward in time, where some individual noise is added for every player as well as some global common noise for the whole system. This is how an $N$-Nash differential game is solved. One would hope that the equilibrium of an $N$-Nash system tends to an equilibrium for the corresponding MFG with a continuum of players as $N$ tends to infinity. The purpose of this book is to prove that under appropriate conditions, in some sense, this is true for a large class of MFGs.
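The forward-in-time step described above can be illustrated with a small Euler-Maruyama simulation. This is only a minimal sketch under loudly stated assumptions, not the method of the book: the noise scalings $\sqrt{2}$ and $\sqrt{2\beta}$ follow the trajectory equation (1.3) quoted further on in this review, the Hamiltonian is assumed to be $H(x,p)=|p|^2/2$ (so that $D_pH(x,p)=p$), and `toy_grad_v` is a hypothetical stand-in for the gradients of the value functions, which in reality would come from solving the backward system first. The names `simulate_players` and `toy_grad_v` are illustrative.

```python
import math
import random


def simulate_players(x0, grad_v, T=1.0, n_steps=200, beta=0.5, seed=0):
    """Euler-Maruyama simulation of N players with states in R^d.

    x0     : list of N initial states, each a list of d floats
    grad_v : callable (t, X) -> list of N lists of d floats; row i stands in
             for the gradient of player i's value function in its own state.
             HYPOTHETICAL: a real one comes from the backward Nash system.
    beta   : intensity of the common noise (beta = 0 switches it off)
    ASSUMPTION: H(x, p) = |p|^2 / 2, hence D_p H(x, p) = p.
    Returns the list of n_steps + 1 snapshots of all N states.
    """
    rng = random.Random(seed)
    X = [list(x) for x in x0]
    N, d = len(X), len(X[0])
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    path = [[row[:] for row in X]]
    for k in range(n_steps):
        G = grad_v(k * dt, X)
        # One common Brownian increment, shared by every player.
        dW = [rng.gauss(0.0, sqrt_dt) for _ in range(d)]
        new_X = []
        for i in range(N):
            # Independent Brownian increment for player i.
            dB = [rng.gauss(0.0, sqrt_dt) for _ in range(d)]
            new_X.append([
                X[i][c]
                - G[i][c] * dt                       # drift -D_p H = -grad
                + math.sqrt(2.0) * dB[c]             # individual noise
                + math.sqrt(2.0 * beta) * dW[c]      # common noise
                for c in range(d)
            ])
        X = new_X
        path.append([row[:] for row in X])
    return path


def toy_grad_v(t, X):
    """Hypothetical feedback: each player is pulled toward the empirical mean."""
    N, d = len(X), len(X[0])
    mean = [sum(x[c] for x in X) / N for c in range(d)]
    return [[x[c] - mean[c] for c in range(d)] for x in X]


if __name__ == "__main__":
    start = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
    path = simulate_players(start, toy_grad_v, n_steps=100)
    print(len(path), len(path[0]), len(path[0][0]))  # 101 3 2
```

Because the toy feedback depends only on the positions relative to the empirical mean, switching the common noise on (with the same seed) shifts all players by the same random displacement, which makes the different roles of the two noise terms visible.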

Instead of analysing the complicated Nash system with $N$ really large, the authors use as their main approach the *Master Equation* (ME), an MFT concept imported from systems in physics and chemistry. The ME describes the expected asymptotic behaviour and thus avoids the complexity of the huge Nash game. This reduces the system of infinitely many differential equations to just one stochastic differential equation whose solution is a trajectory for some average value function $U$. This $U$ depends on time $t$ and the state of the system. According to MFT the state is split into the state $x$ (a $d$-dimensional vector) of the individual agent (because of symmetry it does not matter which one) and a distribution $m$ characterizing some average state of all the other players. It is then proved that under appropriate conditions the value function $v$ of an individual player (any player) of an $N$-player differential Nash system converges as $N\to\infty$ to the equilibrium solution provided by the Master Equation at a rate of $1/N$. Also the state trajectory $X$ of any player converges in a probabilistic sense like $1/N^{\frac{1}{d+8}}$ to the associated asymptotic trajectory (a solution of the McKean-Vlasov SDEs using the $U$ previously obtained).
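Written out, the two rates take roughly the following form. This is only a schematic restatement in the notation of this review: the empirical measure $m^{N,i}_{\boldsymbol{x}}$ of the other players, the limit trajectories $Y_{i,t}$, and the constant $C$ (independent of $N$) are notation assumed here, and the exact norms and hypotheses are those of the theorems in the book.

\[
\bigl|v^{N,i}(t,\boldsymbol{x})-U\bigl(t,x_i,m^{N,i}_{\boldsymbol{x}}\bigr)\bigr|\le \frac{C}{N},
\qquad
m^{N,i}_{\boldsymbol{x}}=\frac{1}{N-1}\sum_{j\ne i}\delta_{x_j},
\]

\[
\mathbb{E}\Bigl[\sup_{t\in[0,T]}\bigl|X_{i,t}-Y_{i,t}\bigr|\Bigr]\le \frac{C}{N^{\frac{1}{d+8}}}.
\]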

This project started with the intention of writing a paper, but by making it somewhat self-contained and because of the generality of the result, it grew into a book. Here "self-contained" needs to be understood as "for someone familiar with Nash systems", since some knowledge of the subject is silently assumed. If this knowledge is not present, then some extra reading of the cited references will be required to understand the details. For example, the first chapter is a rather extensive introduction to the problem and the concepts used. It gives a summary of the results to be proved and surveys how this is structured in subsequent chapters, with some guidelines on what to read depending on the knowledge and the interest of the reader. Equations (1.2) and (1.3) on page 4 describe a Nash system as

\begin{eqnarray*}
&&-\partial_t v^{N,i}(t,\boldsymbol{x})-\sum_{j=1}^N\Delta_{x_j} v^{N,i}(t,\boldsymbol{x})-\beta \sum_{j,k=1}^N \mathrm{Tr} D^2_{x_j,x_k} v^{N,i}(t,\boldsymbol{x})\\
&&+H(x_i,D_{x_i}v^{N,i}(t,\boldsymbol{x}))+\sum_{j\ne i} D_p H(x_j,D_{x_j}v^{N,j}(t,\boldsymbol{x}))\cdot D_{x_j} v^{N,i}(t,\boldsymbol{x})=F^{N,i}(\boldsymbol{x}),~~~~(t,\boldsymbol{x})\in[0,T]\times (\mathbb{R}^d)^N,\\
&&v^{N,i}(T,\boldsymbol{x})=G^{N,i}(\boldsymbol{x}),~~~\boldsymbol{x}\in(\mathbb{R}^d)^N,~~~i\in\{1,\ldots,N\},
\end{eqnarray*}

and the individual trajectories of the agents by

\[
dX_{i,t}=-D_p H(X_{i,t},D_{x_i}v^{N,i}(t,\boldsymbol{X}_t))dt+\sqrt{2} dB_t^i+\sqrt{2\beta}dW_t,~~t\in[0,T],~~i\in\{1,\ldots,N\}.
\]

It is explained that the first system is the Nash system under consideration, with $v^{N,i}$ the unknown value functions depending on $\boldsymbol{x}=(x_1,\ldots,x_N)$, $H$ the Hamiltonian, $\beta\ge0$ a parameter, and $T\ge0$ the time horizon. The second describes the optimal trajectories $X_{i,t}$ of the states of the players, with $B_t^i$ the individual noise and $W_t$ the common noise (both are Brownian motions). This is about all the explanation given, so a reader unfamiliar with the subject will already have some difficulty on page 4. Although the statement is introduced more fully in chapter 2, and there is an appendix explaining derivatives with respect to random variables, there is no further explanation of the notation $D_p$ or $\Delta_{x_i}$, nor an expression for the Hamiltonian (except that it is related to the cost that agent $i$ has to pay). However, if one is familiar with the basics, then the introduction of the main results and the proofs are well explained. For the MFG system and the Master Equation, similar expressions are introduced, except that the state is split up as $\boldsymbol{x}=(x,m)$, where $x$ is the state of an (infinitesimal) individual and $m$ the distribution of the states of all the others. Since $m=m(t)$ is time-varying, its evolution over time has to be monitored separately as well. Thus we get a similar but somewhat different set of equations for the MFG and the ME. Since these require derivatives with respect to the distribution $m$, the appendix is needed to explain this concept in more detail. The ME is an essential element in this book, and although it is known in other fields, it can be slightly different as it is applied here. So care has been taken in the introductory chapter to explain it in the present setting, also on an intuitive basis. This helps in assimilating the subsequent chapters.
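For orientation, with $\beta=0$ (no common noise) the MFG system and the Master Equation are commonly written as below. This is a sketch based on the standard form found in the MFG literature, not a quotation from the book: the initial time $t_0$, the initial distribution $m_0$, and the symbol $D_mU$ for the derivative of $U$ with respect to the measure are notation assumed here, and the integrals are taken over the state space. The MFG system couples a backward equation for the value function $u$ with a forward equation for the distribution $m$:

\begin{eqnarray*}
&&-\partial_t u-\Delta u+H(x,Du)=F(x,m(t)),\\
&&\partial_t m-\Delta m-\mathrm{div}\bigl(m\,D_pH(x,Du)\bigr)=0,\\
&&u(T,x)=G(x,m(T)),~~~m(t_0)=m_0,
\end{eqnarray*}

while the Master Equation is a single equation for $U=U(t,x,m)$:

\begin{eqnarray*}
&&-\partial_t U-\Delta_x U+H(x,D_xU)-\int \mathrm{div}_y\bigl[D_mU\bigr](t,x,m,y)\,dm(y)\\
&&+\int D_mU(t,x,m,y)\cdot D_pH\bigl(y,D_xU(t,y,m)\bigr)\,dm(y)=F(x,m),\\
&&U(T,x,m)=G(x,m).
\end{eqnarray*}

In this formulation the two are linked by $U(t_0,x,m_0)=u(t_0,x)$, where $(u,m)$ solves the MFG system started from $m_0$ at time $t_0$.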

The analysis holds under several restrictions that are carefully explained with many links to the literature. For example, boundary problems for $t\in[0,T]$ are avoided by assuming periodic (in time) solutions; $F$ and $G$ satisfy some monotonicity condition; and $H$, $F$, and $G$ are supposed to be smooth enough. Some of these restrictions can be removed or generalized, but they are maintained here mainly for simplicity. Although it is not explained, the current approach is potentially useful for numerical implementation. The full proof in chapter 3 of the existence of the equilibrium for the MFG assumes the first order case ($\beta=0$ and thus no common noise). The second order system ($\beta>0$ with common noise) is further explored in chapter 4 and the ME in chapter 5, with the eventual convergence proof in chapter 6.

This book appears as a volume in the *Annals of Mathematics Studies*. It is a major contribution to the state of the art in MFGs and a must-read for researchers in the field. It seems that several preliminary versions of this text were previously made available on the Web. There are few typos, which is an achievement for a book with that many formulas. Even with that many formulas and technicalities, the book is still quite readable, because the authors use the book format (and not a more compact paper format) to explain all their steps carefully. Because of its structured approach, it could be used as a textbook for an advanced course on the subject.

**Submitted by Adhemar Bultheel | 21 / Oct / 2019**