Prisoner's Dilemma

Why two rational actors end up harming each other even though cooperating would leave both better off.

Definition

The Prisoner’s Dilemma is a foundational model of non-cooperative game theory. Two actors independently choose between cooperation and defection. Defection is the strictly dominant strategy for both, so they end up at mutual defection, the game's only equilibrium, even though mutual cooperation would leave both better off.

Structure

The defining inequality is T > R > P > S, where T = temptation (defect while the other cooperates), R = reward for mutual cooperation, P = penalty for mutual defection, and S = the “sucker’s payoff” (cooperate while the other defects). Defecting yields the higher individual payoff whatever the other player does (T > R if the other cooperates, P > S if the other defects), so both rational players defect and receive P instead of the better R. It is a symmetric, non-zero-sum (variable-sum) game with a single Nash equilibrium, mutual defection, which is Pareto-inferior to mutual cooperation.
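The dominance argument can be checked mechanically. The following is a minimal sketch, assuming the illustrative payoffs T = 5, R = 3, P = 1, S = 0 (these numbers are not from the source; any values with T > R > P > S would do). The helper names are hypothetical. It tests whether defection strictly dominates cooperation and enumerates the pure-strategy Nash equilibria by brute force.

```python
from itertools import product

# Illustrative payoffs (assumed, not from the source); they satisfy T > R > P > S.
T, R, P, S = 5, 3, 1, 0

C, D = "cooperate", "defect"
ACTIONS = (C, D)

# PAYOFFS[(a1, a2)] = (payoff to player 1, payoff to player 2)
PAYOFFS = {
    (C, C): (R, R),
    (C, D): (S, T),
    (D, C): (T, S),
    (D, D): (P, P),
}


def payoff(player, own, other):
    """Payoff to `player` (0 or 1) when playing `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]


def strictly_dominates(player, a, b):
    """True if action `a` pays more than `b` against every opposing action."""
    return all(payoff(player, a, o) > payoff(player, b, o) for o in ACTIONS)


def pure_nash_equilibria():
    """Profiles where neither player gains by deviating unilaterally."""
    equilibria = []
    for a1, a2 in product(ACTIONS, repeat=2):
        best1 = all(payoff(0, a1, a2) >= payoff(0, alt, a2) for alt in ACTIONS)
        best2 = all(payoff(1, a2, a1) >= payoff(1, alt, a1) for alt in ACTIONS)
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria


if __name__ == "__main__":
    print("Defection dominates for player 1:", strictly_dominates(0, D, C))
    print("Defection dominates for player 2:", strictly_dominates(1, D, C))
    print("Pure Nash equilibria:", pure_nash_equilibria())
```

With these assumed numbers the only equilibrium found is mutual defection, which pays each player P = 1 rather than the R = 3 available under mutual cooperation.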

When it applies

It applies wherever short-term self-interest undermines long-term mutual stability: oligopoly price wars, arms races, environmental degradation, the failure of international treaties. The common feature is that each actor has an incentive to “cheat” even though everyone would be better off cooperating.

Leverage points

Change the payoff matrix or rules so cooperation becomes rational: iteration (a repeated game enables reciprocity and reputation-building, raising the cost of defection), external enforcement (regulation/penalties push the temptation payoff below the reward), and binding contracts or third-party arbitration that eliminate the dominant defection strategy.
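Two of these levers can be illustrated with a small simulation. This is a sketch under stated assumptions, not a definitive model: the payoffs T = 5, R = 3, P = 1, S = 0 are illustrative, the `fine` parameter is a hypothetical flat penalty charged on every defection, and tit-for-tat stands in for "reciprocity" in the repeated game.

```python
# Illustrative payoffs (assumed, not from the source); T > R > P > S.
T, R, P, S = 5, 3, 1, 0


def stage_payoff(own, other, fine=0):
    """One-round payoff to the player choosing `own` ("C" or "D").

    `fine` is a hypothetical penalty charged whenever the player defects.
    """
    base = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}[(own, other)]
    return base - fine if own == "D" else base


def cooperation_dominates(fine):
    """True if cooperating pays more than defecting against both opposing moves."""
    return all(stage_payoff("C", o, fine) > stage_payoff("D", o, fine) for o in "CD")


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Defect in every round, whatever the opponent did."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Total payoffs (a, b) for two strategies over a fixed number of rounds."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_b)  # each strategy sees only the other's past moves
        move_b = strategy_b(hist_a)
        total_a += stage_payoff(move_a, move_b)
        total_b += stage_payoff(move_b, move_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total_a, total_b


if __name__ == "__main__":
    print("Fine 0 makes cooperation dominant:", cooperation_dominates(0))
    print("Fine 3 makes cooperation dominant:", cooperation_dominates(3))
    print("TFT vs TFT, 10 rounds:", play(tit_for_tat, tit_for_tat, 10))
    print("TFT vs always-defect, 10 rounds:", play(tit_for_tat, always_defect, 10))
    print("Always-defect vs itself, 10 rounds:", play(always_defect, always_defect, 10))
```

With these numbers, a fine larger than T - R = 2 removes the temptation to defect against a cooperator, and because it also exceeds P - S = 1, cooperation becomes strictly dominant. In the repeated game, two reciprocating players earn far more than two unconditional defectors, while a defector facing tit-for-tat gains only in the first round. Note that the simulation uses a fixed number of rounds for simplicity; the reciprocity argument is usually made for games whose end is not known in advance.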

Examples

Two competitors cutting prices until neither has margin left; nations arming because neither trusts the other; fishing fleets overharvesting a shared stock. The N-player version of the dilemma is the structure behind the Tragedy of the Commons.
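The N-player case can be sketched with a linear public-goods game, used here as an assumed stand-in for the shared-stock example. The parameters (4 players, a multiplier of 2, one unit of endowment each) are illustrative and not from the source. Contributing corresponds to restraint; keeping the unit corresponds to overharvesting.

```python
def payoff(contributes, others_contributing, n_players=4, multiplier=2.0):
    """Payoff to one player in a linear public-goods game (illustrative parameters).

    Each player holds 1 unit. Contributed units are multiplied by `multiplier`
    and split equally among all players; a unit that is kept is kept in full.
    The dilemma requires 1 < multiplier < n_players.
    """
    total = others_contributing + (1 if contributes else 0)
    share = multiplier * total / n_players
    kept = 0 if contributes else 1
    return kept + share


if __name__ == "__main__":
    n = 4
    for k in range(n):  # k = number of OTHER players who contribute
        print(k, "others contribute:",
              "contribute ->", payoff(True, k),
              "| keep ->", payoff(False, k))
    print("Everyone contributes:", payoff(True, n - 1))
    print("No one contributes:  ", payoff(False, 0))
```

Under these assumptions, keeping the unit pays more than contributing no matter how many others contribute, yet universal contribution pays each player 2.0 against 1.0 when no one contributes. This mirrors the two-player inequality: individually dominant defection, collectively worse outcome.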

Payoff matrix

                       Player 2: Cooperate      Player 2: Defect
Player 1: Cooperate    (Reward, Reward)         (Sucker, Temptation)
Player 1: Defect       (Temptation, Sucker)     (Penalty, Penalty)

T > R > P > S — defection dominates; the equilibrium (Penalty, Penalty) is worse for both than (Reward, Reward).


Sources: von Neumann & Morgenstern (1944), Theory of Games and Economic Behavior