The Power Game

Equilibrium Characterization and Full Result Map

This tab replaces the short existence statement with the actual equilibrium object and the result set from the earlier paper, translated into the discrete-time model.

Correction. The right characterization is not just "a stationary equilibrium exists." It is the Bellman-Nash fixed point together with the support equal-payoff conditions at every gap state. The earlier continuous-time threshold results are kept as local approximations, not as the exact global discrete equilibrium.

1. Exact Equilibrium Object

The state is the capability gap \(g\). Each firm chooses training \(x_i\). Training demand is \(T_i=x_i^2/2\), total training is \(T=(x_1^2+x_2^2)/2\), and the compute rental price is \(R(T)=\bar W/(\bar Q-T)\).

The post-training gap is

\[ \tilde g=(1-\mu)g+x_1-x_2, \qquad P(g'\mid g,x_1,x_2) \text{ maps } \tilde g \text{ back to } G. \]
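The text does not pin down how \(P\) maps \(\tilde g\) back to \(G\). One possible rule, shown here only as a sketch, clamps \(\tilde g\) to the grid range and splits probability between the two neighbouring nodes so that the expected gap equals \(\tilde g\). The function names and the interpolation rule are assumptions, not part of the model statement.

```python
from bisect import bisect_right


def post_training_gap(g, x1, x2, mu):
    """Gap before projection onto the grid: (1 - mu) * g + x1 - x2."""
    return (1.0 - mu) * g + x1 - x2


def project_to_grid(g_tilde, grid):
    """Return {grid_index: probability} for a possibly off-grid gap.

    Assumed rule (not from the source): clamp to the grid range, then split
    mass between the two neighbouring nodes so the mean equals g_tilde.
    `grid` must be sorted in ascending order.
    """
    if g_tilde <= grid[0]:
        return {0: 1.0}
    if g_tilde >= grid[-1]:
        return {len(grid) - 1: 1.0}
    hi = bisect_right(grid, g_tilde)  # first node strictly above g_tilde
    lo = hi - 1
    weight_hi = (g_tilde - grid[lo]) / (grid[hi] - grid[lo])
    if weight_hi == 0.0:  # g_tilde sits exactly on a node
        return {lo: 1.0}
    return {lo: 1.0 - weight_hi, hi: weight_hi}
```

With `grid = [-1.0, 0.0, 1.0]`, a post-training gap of 0.25 puts mass 0.75 on the parity node and 0.25 on the node above it. A nearest-node rule would be an equally valid choice; the interpolation version simply keeps the expected gap unchanged.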

The frontier payoff from the post-training gap is

\[ \pi(g)= \omega\left[ s\left(L(ag)-\frac12\right) +(1-s)\left[2\sinh(ag/2)\right]^+ \right], \qquad L(z)=\frac{1}{1+e^{-z}}. \]

Stage payoffs are

\[ u_1(g,x_1,x_2)=\pi(\tilde g)-R(T)\frac{x_1^2}{2}, \qquad u_2(g,x_1,x_2)=\pi(-\tilde g)-R(T)\frac{x_2^2}{2}. \]
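The primitives above translate directly into code. The sketch below evaluates the rental price, the frontier payoff, and both stage payoffs for one action profile. Function and argument names are illustrative, and any parameter values passed in are placeholders rather than calibrated numbers.

```python
import math


def logistic(z):
    """L(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def rental_price(T, W_bar, Q_bar):
    """R(T) = W_bar / (Q_bar - T); only defined below capacity Q_bar."""
    if not T < Q_bar:
        raise ValueError("total training must stay below capacity Q_bar")
    return W_bar / (Q_bar - T)


def frontier_payoff(g, omega, s, a):
    """pi(g) = omega * [ s (L(a g) - 1/2) + (1 - s) [2 sinh(a g / 2)]^+ ]."""
    logistic_term = logistic(a * g) - 0.5
    kinked_term = max(2.0 * math.sinh(a * g / 2.0), 0.0)
    return omega * (s * logistic_term + (1.0 - s) * kinked_term)


def stage_payoffs(g, x1, x2, *, mu, omega, s, a, W_bar, Q_bar):
    """Return (u_1, u_2) at gap g for the action profile (x1, x2)."""
    g_tilde = (1.0 - mu) * g + x1 - x2
    T = (x1 ** 2 + x2 ** 2) / 2.0
    R = rental_price(T, W_bar, Q_bar)
    u1 = frontier_payoff(g_tilde, omega, s, a) - R * x1 ** 2 / 2.0
    u2 = frontier_payoff(-g_tilde, omega, s, a) - R * x2 ** 2 / 2.0
    return u1, u2
```

Swapping the firms' roles (negating the gap and exchanging actions) exchanges the two payoffs, which is the symmetry the computation exploits.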

2. Bellman-Nash Characterization

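A stationary Markov equilibrium is a pair of mixed rules \(g\mapsto\sigma_i(g)\) and values \(V_i(g)\). Given the rival's rule \(\sigma_j\) and the value function \(V_i\), the payoff to firm \(i\) from choosing a pure action \(x_i\) at state \(g\) is given below, where \(u_i(g,x_i,x_j)\) and \(P(g'\mid g,x_i,x_j)\) are evaluated at the action profile in which firm \(i\) plays \(x_i\) and firm \(j\) plays \(x_j\):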

\[ W_i(g,x_i;\sigma_j,V_i) = \sum_{x_j\in X}\sigma_j(x_j\mid g) \left[ u_i(g,x_i,x_j) +\beta\sum_{g'\in G}P(g'\mid g,x_i,x_j)V_i(g') \right]. \]

The exact equilibrium conditions are the support conditions:

\[ W_i(g,x_i;\sigma_j,V_i)=V_i(g) \quad\text{if}\quad \sigma_i(x_i\mid g)>0, \]

\[ W_i(g,x_i;\sigma_j,V_i)\le V_i(g) \quad\text{if}\quad \sigma_i(x_i\mid g)=0. \]

Values are the equilibrium expected payoffs:

\[ V_i(g)= \sum_{x_i\in X}\sigma_i(x_i\mid g) W_i(g,x_i;\sigma_j,V_i). \]

Symmetry lets the computation impose

\[ V_2(g)=V_1(-g), \qquad \sigma_2(x\mid g)=\sigma_1(x\mid -g). \]

This is the actual finite-state equilibrium characterization. A pure-strategy equilibrium is the special case where each support has one action; then each chosen action is an argmax of \(W_i\).
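The support conditions can be checked numerically for any candidate \((\sigma, V)\). The sketch below computes \(W_i\) and reports how far a candidate is from satisfying the conditions at one state. The table layout is an assumption made for illustration: `u[g][xi][xj]` is firm \(i\)'s stage payoff with its own action first, and `P[g][xi][xj]` is a dict from next-state index to probability.

```python
def action_value(g, xi, sigma_j, V, u, P, beta):
    """W_i(g, x_i; sigma_j, V_i) for own action index xi at state index g."""
    total = 0.0
    for xj, prob in enumerate(sigma_j[g]):
        continuation = sum(p * V[g_next] for g_next, p in P[g][xi][xj].items())
        total += prob * (u[g][xi][xj] + beta * continuation)
    return total


def support_residuals(g, sigma_i, sigma_j, V, u, P, beta):
    """Return (on_support, off_support) violations at state g.

    on_support:  largest |W_i - V_i| over actions played with positive weight.
    off_support: largest gain W_i - V_i over unplayed actions, floored at 0.
    Both are zero (up to rounding) at an exact equilibrium.
    """
    W = [action_value(g, xi, sigma_j, V, u, P, beta)
         for xi in range(len(sigma_i[g]))]
    on_support = max(abs(w - V[g]) for w, p in zip(W, sigma_i[g]) if p > 0)
    off_support = max((w - V[g] for w, p in zip(W, sigma_i[g]) if p == 0),
                      default=0.0)
    return on_support, max(off_support, 0.0)
```

In practice a tolerance replaces exact equality, and the same check is run for both firms at every gap state.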

3. Finite-Horizon Characterization

For \(H\) periods, set \(V_{i,0}(g)=0\). With \(h\) periods left, solve the finite matrix game with payoff

\[ U_{i,h}(g,x_1,x_2) = u_i(g,x_1,x_2) + \beta\sum_{g'\in G}P(g'\mid g,x_1,x_2)V_{i,h-1}(g'). \]

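A Nash equilibrium of this finite two-player game gives \(\sigma_{i,h}(g)\) and \(V_{i,h}(g)\); when the stage game has several equilibria, a selection rule fixes which one is carried back. This is exact backward induction, not an approximation.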
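The recursion can be sketched as follows. To stay short, this version looks only for pure-strategy equilibria of each stage game and takes the first one found, which is a simplifying assumption: the general case needs a mixed-strategy bimatrix solver (for example support enumeration), which is not shown. Table names and layout are illustrative, with `u1[g][a1][a2]` and `u2[g][a1][a2]` indexed by firm 1's action first.

```python
def pure_nash(A, B):
    """All pure Nash equilibria (row, col) of the bimatrix game (A, B)."""
    n, m = len(A), len(A[0])
    found = []
    for r in range(n):
        for c in range(m):
            row_best = all(A[r][c] >= A[k][c] for k in range(n))
            col_best = all(B[r][c] >= B[r][k] for k in range(m))
            if row_best and col_best:
                found.append((r, c))
    return found


def expected_value(V, dist):
    """Expectation of V over a {next_state: probability} dict."""
    return sum(p * V[g_next] for g_next, p in dist.items())


def backward_induction(u1, u2, P, beta, horizon):
    """Return (V1, V2, policies) with `horizon` periods left, starting from V = 0."""
    n_states = len(u1)
    V1, V2 = [0.0] * n_states, [0.0] * n_states
    policies = []  # policies[h - 1][g] = (a1, a2) with h periods left
    for _ in range(horizon):
        new_V1, new_V2, rule = [], [], []
        for g in range(n_states):
            n, m = len(u1[g]), len(u1[g][0])
            U1 = [[u1[g][i][j] + beta * expected_value(V1, P[g][i][j])
                   for j in range(m)] for i in range(n)]
            U2 = [[u2[g][i][j] + beta * expected_value(V2, P[g][i][j])
                   for j in range(m)] for i in range(n)]
            equilibria = pure_nash(U1, U2)
            if not equilibria:
                raise ValueError("no pure equilibrium; a mixed solver is needed")
            r, c = equilibria[0]  # assumed selection rule: first found
            rule.append((r, c))
            new_V1.append(U1[r][c])
            new_V2.append(U2[r][c])
        V1, V2 = new_V1, new_V2
        policies.append(rule)
    return V1, V2, policies
```

The selection rule matters whenever a stage game has more than one equilibrium, since different selections produce different continuation values one step earlier.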

4. Main Results Carried Over

Result	Characterization	Status in discrete time
Two faces	Household value saturates, while producer value is \(A(q)=Ke^{aq}\). Relative producer value is \(e^{ag}\).	Exact.
Induced power game	One-shot hazards give \(H(g)=L(\theta ag)\), \(P(g)=H(g)-1/2\), \(p_1=\theta a/4\), and \(p_3=-(\theta a)^3/48\).	Exact for one-shot frontier tasks.
Saturate and settle	If frontier stakes are absent, saturated consumer value removes the marginal racing motive; imitation closes the gap.	Exact once \(\omega=0\) and consumer marginal value is flat.
Dynamic equilibrium	The equilibrium is the Bellman-Nash support system above. It returns state-contingent training, compute demand, values, and transition dynamics.	Exact finite-state characterization.
Compute-market discipline	\(R(T)=\bar W/(\bar Q-T)\) and \(E_T=\bar WT/(\bar Q-T)\). Escalation shows up as compute rents when \(T\) approaches capacity.	Exact accounting.
Composition	Priority tasks are discouraging when behind; repeatable Bertrand tasks give comeback incentives and a kink at parity.	Exact payoff objects; smooth \(s^*=1/(1+\theta^3)\) is only the old local smooth projection.
Fragile parity	In continuous time, parity is locally stable when \(\Lambda(\omega)<\mu\), where \(\Lambda(\omega)=\Lambda_1\omega+\Lambda_2\omega^2\), and tips when the inequality reverses.	Appendix/local approximation to the discrete game.
Escalation	When the prize rides the frontier, the race intensifies even at parity; the continuous-time approximation acts like \(\beta_{\mathrm{eff}}=\beta+a/2\).	Discrete version enters through state-dependent frontier value and continuation values.
Access market	The static no-learning limit price is \(P_0(g)=e^{ag}\). With leakage, sell only when access rents cover the dynamic value lost to the rival.	Exact static benchmark plus dynamic threshold condition.
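
The hazard coefficients in the induced-power-game row can be checked from the Taylor expansion of the logistic function around zero. A short derivation, assuming \(p_1\) and \(p_3\) denote the coefficients on \(g\) and \(g^3\) in \(P(g)\):

\[ L(z)=\frac12+\frac{z}{4}-\frac{z^3}{48}+O(z^5) \quad\Longrightarrow\quad P(g)=L(\theta a g)-\frac12 =\frac{\theta a}{4}\,g-\frac{(\theta a)^3}{48}\,g^3+O(g^5). \]

Matching terms gives \(p_1=\theta a/4\) and \(p_3=-(\theta a)^3/48\), so the cubic term is negative and the hazard advantage is concave in the gap for a leader.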

5. Local Closed-Form Formulas From the Earlier Paper

These are the earlier paper's closed-form continuous-time/local formulas. In the revised paper they should be read as approximations to the finite-state discrete equilibrium, not as substitutes for solving the Bellman-Nash system.

\[ \Lambda(\omega)=\Lambda_1\omega+\Lambda_2\omega^2, \qquad \psi=\frac{h^2}{w}. \]

\[ \Lambda_1=\frac{2\psi\beta p_1}{\rho+\mu}, \qquad \Lambda_2= \frac{12\psi^2p_1|p_3|} {(\rho+\mu)(\rho+2\mu)(\rho+3\mu)}. \]

\[ \omega^* = \frac{-\Lambda_1+ \sqrt{\Lambda_1^2+4\Lambda_2\mu}} {2\Lambda_2}. \]

Parity is locally stable when \(\Lambda(\omega)<\mu\). It is locally unstable when \(\Lambda(\omega)>\mu\). With one-shot hazards, \(p_1=\theta a/4\) and \(|p_3|=(\theta a)^3/48\), so larger \(a\), \(\theta\), \(\beta\), or \(h^2/w\) makes parity more fragile; larger \(\mu\) or \(\rho\) stabilizes it.
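The threshold \(\omega^*\) is the positive root of \(\Lambda_2\omega^2+\Lambda_1\omega-\mu=0\). The sketch below evaluates the formulas and can be used to confirm the comparative statics numerically. Function names are illustrative, and any inputs passed in are placeholder values, not calibrated ones.

```python
import math


def hazard_coeffs(theta, a):
    """One-shot hazard coefficients: p_1 = theta a / 4, |p_3| = (theta a)^3 / 48."""
    return theta * a / 4.0, (theta * a) ** 3 / 48.0


def lambda_coeffs(psi, beta, p1, p3_abs, rho, mu):
    """Lambda_1 and Lambda_2 from the local continuous-time formulas."""
    lam1 = 2.0 * psi * beta * p1 / (rho + mu)
    lam2 = (12.0 * psi ** 2 * p1 * p3_abs
            / ((rho + mu) * (rho + 2.0 * mu) * (rho + 3.0 * mu)))
    return lam1, lam2


def omega_star(lam1, lam2, mu):
    """Positive root of lam2 * w^2 + lam1 * w - mu = 0."""
    return (-lam1 + math.sqrt(lam1 ** 2 + 4.0 * lam2 * mu)) / (2.0 * lam2)


def parity_is_locally_stable(omega, lam1, lam2, mu):
    """Local stability test: Lambda(omega) < mu."""
    return lam1 * omega + lam2 * omega ** 2 < mu
```

Raising \(\theta\) or \(a\) increases both \(\Lambda_1\) and \(\Lambda_2\), so the computed \(\omega^*\) falls, which is the sense in which parity becomes more fragile.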

\[ T^* = \frac{1}{2} \left( \frac{h\omega p_1}{w(\rho+\mu)} \right)^2. \]

This is the old interior cheap-race formula at a fixed compute price \(w\): lower \(w\) raises training and lowers the instability threshold. In the discrete compute-market version, \(w\) is replaced by the endogenous rental price \(R(T)\), so escalation appears as movement toward capacity and higher compute rents.
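Because \(T^*\) is proportional to \(w^{-2}\), the fixed-price formula is very sensitive to the compute price. A minimal sketch with an illustrative function name and placeholder inputs:

```python
def training_fixed_price(h, omega, p1, w, rho, mu):
    """T* = (1/2) * (h * omega * p1 / (w * (rho + mu)))**2 at a fixed price w."""
    return 0.5 * (h * omega * p1 / (w * (rho + mu))) ** 2
```

Halving \(w\) quadruples \(T^*\) under this formula. With an endogenous price \(R(T)\) that rises toward capacity, that response is damped, which is why the discrete model reports compute rents rather than relying on the closed form.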

\[ s^*=\frac{1}{1+\theta^3}. \]

This is the smooth composition threshold: priority-like value destabilizes; repeatable rental value stabilizes. Under homogeneous Bertrand repeatable services, the payoff is kinked at parity, so the discrete model computes the equilibrium directly rather than relying on this smooth threshold.

\[ P_0(g)=e^{ag}, \qquad \text{sell access iff } E(1-e^{-ag})\ge \ell(E)\Delta V(g). \]

The access result says the leader can extract the static quality gap, but it refuses access near parity when leakage is valuable to the rival.
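The sell-or-foreclose rule is a pointwise comparison, sketched below. Here \(E\), the leakage term \(\ell(E)\), and \(\Delta V(g)\) are treated as numbers supplied by the caller, since their construction is not given in this section. The function names are illustrative.

```python
import math


def limit_price(g, a):
    """Static no-learning limit price P_0(g) = exp(a g)."""
    return math.exp(a * g)


def sells_access(g, a, E, leakage, delta_V):
    """Sell iff E (1 - exp(-a g)) >= leakage * delta_V.

    `leakage` stands for l(E) and `delta_V` for Delta V(g); both are
    caller-supplied placeholders in this sketch.
    """
    return E * (1.0 - math.exp(-a * g)) >= leakage * delta_V
```

At parity the left-hand side is zero, so the leader forecloses whenever \(\ell(E)\Delta V(0)>0\); the rule can switch to selling only once the gap is large enough for access rents to cover the leakage cost.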

6. What the Equilibrium Returns

Policy functions: training distributions \(\sigma_i(g)\) at each gap.
Values: \(V_i(g)\), so we know the value of leading or lagging.
Compute demand: \(T(g)\) and rental price \(R(T(g))\).
Gap dynamics: whether parity is locally stable, contested, or tipping on the chosen grid.
Access rule: whether the leader sells or forecloses at each gap once leakage is specified.
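
Gap dynamics follow from combining the equilibrium policies with the transition kernel into a Markov chain on \(G\). A minimal sketch, assuming `sigma1[g]` and `sigma2[g]` are lists of action probabilities and `P[g][a1][a2]` is a dict from next-state index to probability (an illustrative layout, not one fixed by the text):

```python
def gap_transition_matrix(sigma1, sigma2, P, n_states):
    """M[g][g'] = sum over (a1, a2) of sigma1 * sigma2 * P(g' | g, a1, a2)."""
    M = [[0.0] * n_states for _ in range(n_states)]
    for g in range(n_states):
        for a1, p1 in enumerate(sigma1[g]):
            for a2, p2 in enumerate(sigma2[g]):
                for g_next, p in P[g][a1][a2].items():
                    M[g][g_next] += p1 * p2 * p
    return M


def step(dist, M):
    """Push a distribution over gap states forward by one period."""
    n = len(M)
    return [sum(dist[g] * M[g][g_next] for g in range(n)) for g_next in range(n)]
```

Iterating `step` from a point mass at parity shows whether mass stays near the parity node or drifts toward the edges of the grid, which is one way to read off stable, contested, or tipping behaviour.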

7. What Must Be Computed

The one-, two-, and three-period games can be solved exactly by backward induction.
The infinite-horizon game is solved exactly on the chosen finite grid, given an equilibrium-selection rule.
The old closed-form threshold is useful, but it is not the full global equilibrium of the discrete model.
The next computational step is solving the stationary support system above and reporting the resulting training, price, and gap dynamics.
