Chapter 5 introduced consumer theory through utility maximization and the Lagrangian. This chapter strips away the crutch of specific functional forms and builds the theory from axiomatic foundations. We ask: when can preferences be represented by a utility function? What properties must demand functions satisfy? And under what conditions does a system of competitive markets allocate resources efficiently?
The shift in method is from computation to proof. Part II solved optimization problems. Part III proves theorems — establishing which results are robust and which depend on special assumptions.
Prerequisites: Chapters 6–7. Mathematical prerequisites: real analysis basics (open/closed sets, continuity, fixed-point theorems), convex analysis, matrix algebra. See Appendix A.
Named literature: Mas-Colell, Whinston & Green (MWG); Debreu Theory of Value; Arrow & Debreu (1954); Varian Microeconomic Analysis.
The standard axioms:
Proof sketch (for monotone preferences). Fix a ray $\{te : t \geq 0\}$ where $e = (1,1,\ldots,1)$. For each $x$, completeness and continuity guarantee some $t(x) \geq 0$ with $x \sim t(x)e$, and monotonicity makes it unique. Set $u(x) = t(x)$. Transitivity ensures the representation is consistent; continuity ensures $u$ is continuous.
The utility function is ordinal — any monotonic transformation $v = g(u)$ with $g' > 0$ represents the same preferences. Cardinal properties (magnitudes of utility differences) are meaningless.
Consider lexicographic preferences on $\mathbb{R}^2_+$: $x \succ y$ if $x_1 > y_1$, or $x_1 = y_1$ and $x_2 > y_2$.
Completeness: Satisfied — for any $x, y$, either $x_1 > y_1$, $y_1 > x_1$, or $x_1 = y_1$ and we compare $x_2, y_2$.
Transitivity: Satisfied — if $x \succ y$ and $y \succ z$, then $x \succ z$ (follows from transitivity of $>$ on $\mathbb{R}$).
Continuity: Fails. Consider $y = (1, 1)$. The set $\{x : x \succ y\}$ includes $(1, 1.5)$ but not $(0.999, 100)$. The "at least as good" set is not closed: the bundles $(1 + 1/n, 0)$ are all strictly preferred to $y$, yet their limit $(1, 0)$ is strictly worse than $y$. There is a jump at $x_1 = 1$.
Consequence: No continuous utility function represents lexicographic preferences (in fact, no utility function of any kind does). This shows that continuity is essential for Debreu's utility representation theorem.
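The failure of continuity can be seen in a few lines. The following is a minimal sketch, relying on the fact that Python compares tuples lexicographically; the helper name `strictly_prefers` and the particular sequence are illustrative choices, not part of the text.

```python
# Lexicographic preferences on R^2_+: Python orders tuples the same way
# (first coordinate first, second coordinate only to break ties).
def strictly_prefers(x, y):
    """True if bundle x is lexicographically strictly preferred to bundle y."""
    return tuple(x) > tuple(y)

y = (1.0, 1.0)

# Every bundle in the sequence (1 + 1/n, 0) is strictly preferred to y ...
sequence = [(1.0 + 1.0 / n, 0.0) for n in range(1, 1001)]
all_better = all(strictly_prefers(x, y) for x in sequence)

# ... but the limit of that sequence, (1, 0), is strictly worse than y.
limit = (1.0, 0.0)
limit_worse = strictly_prefers(y, limit)

print(all_better, limit_worse)
```

A sequence that stays inside the upper contour set of $y$ converges to a point outside it, which is what "not closed" means here.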
Instead of assuming preferences, we can infer them from observed choices.
Formally: if $x$ is revealed preferred to $y$ ($xRy$: $x$ chosen at prices where $y$ was affordable), then $y$ is not revealed preferred to $x$.
SARP is necessary and sufficient for observed choices to be consistent with utility maximization (Afriat's theorem). WARP is necessary but not sufficient in general (though it is sufficient with two goods).
A consumer's choices at two price-income situations:
| Situation | Prices $(p_1, p_2)$ | Chosen bundle $(x_1, x_2)$ | Expenditure |
|---|---|---|---|
| A | (1, 2) | (4, 2) | 8 |
| B | (2, 1) | (2, 4) | 8 |
Check WARP: At prices A, could the consumer afford bundle B? $1(2) + 2(4) = 10 > 8$. No. At prices B, could the consumer afford bundle A? $2(4) + 1(2) = 10 > 8$. No. Neither bundle is revealed preferred to the other, so WARP is satisfied and the data are consistent with utility maximization.
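The same check can be automated. The sketch below is one possible way to test WARP and SARP on a list of observations, assuming each observation is a price tuple paired with a chosen bundle; the function names are hypothetical, and SARP is tested by taking the transitive closure of the direct revealed-preference relation.

```python
def dot(p, x):
    return sum(pi * xi for pi, xi in zip(p, x))

def direct_revealed_preference(prices, bundles):
    """R[i][j] is True when bundle j was affordable at observation i,
    i.e. bundle i is directly revealed preferred to bundle j."""
    n = len(prices)
    return [[dot(prices[i], bundles[j]) <= dot(prices[i], bundles[i])
             for j in range(n)] for i in range(n)]

def has_reversal(R, bundles):
    """True if two distinct bundles are each revealed preferred to the other."""
    n = len(bundles)
    return any(R[i][j] and R[j][i]
               for i in range(n) for j in range(n)
               if i != j and bundles[i] != bundles[j])

def satisfies_warp(prices, bundles):
    return not has_reversal(direct_revealed_preference(prices, bundles), bundles)

def satisfies_sarp(prices, bundles):
    R = direct_revealed_preference(prices, bundles)
    n = len(bundles)
    # Warshall's algorithm: extend R to indirect (chained) revealed preference.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return not has_reversal(R, bundles)

# Situations A and B from the table above.
prices = [(1, 2), (2, 1)]
bundles = [(4, 2), (2, 4)]
print(satisfies_warp(prices, bundles), satisfies_sarp(prices, bundles))
```

With only two observations WARP and SARP coincide; the closure step matters once three or more observations can form a cycle.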
Enter price vectors and chosen bundles for up to 6 observations. The checker will test WARP and SARP automatically.
Interactive 11.1. Enter price-bundle observations and test for revealed preference consistency. WARP checks direct pairwise reversals; SARP checks for cycles of any length. Violations are highlighted with explanations.
You now have the formal content of "rationality": completeness, transitivity, continuity — the axioms required for utility functions to exist — and WARP/SARP, which make rationality empirically testable from observed choices.
Rational choice is now precise: complete + transitive + continuous preferences guarantee a utility function exists (Debreu's representation theorem). WARP and SARP give empirical tests — if you chose bundle $A$ when $B$ was affordable, you should never choose $B$ when $A$ is affordable at those prices. Violations of SARP are violations of rationality, full stop. The entire apparatus of welfare economics — the welfare theorems you'll prove in §11.6–11.7, the duality framework in §11.3–11.4, mechanism design in Chapter 12 — requires these axioms. Without them, utility functions don't exist, consumer surplus is undefined, and "efficiency" loses its formal meaning.
Completeness is implausible for complex choices — people genuinely don't have well-defined preferences over all possible bundles (Sen 1997). Transitivity fails systematically: preference reversals between gambles (Grether & Plott 1979) are robust and replicable across decades of experiments. Context dependence — decoy effects, framing effects, anchoring — violates the independence of irrelevant alternatives that WARP requires. The Allais paradox shows that expected utility's independence axiom fails even among trained decision theorists. These aren't occasional lapses by confused subjects; they're systematic patterns that survive incentives, experience, and high stakes. If the axioms fail, the utility function doesn't exist, and welfare analysis — which depends on maximizing a well-defined objective — loses its foundations entirely.
The mainstream response is twofold. First, at the individual level, violations are real but the stakes in most laboratory experiments are trivial — people may satisfice over small gambles but optimize over consequential decisions (mortgages, career choices, firm strategy). Second, at the market level, competition and selection may eliminate irrational agents: Alchian (1950) and Friedman (1953) argued that firms behaving as if they maximize profits survive, regardless of their actual decision process. The "as if" defense says that even if individuals aren't literally maximizing utility, markets behave as if they were — because competitive pressure weeds out consistently irrational behavior. This defense is powerful but depends on the speed and completeness of market discipline.
The axioms are best understood as a benchmark, not a description of how people actually decide. They tell you what consistency requires, and deviations from them are informative — they point to specific psychological mechanisms (loss aversion, probability weighting, framing effects, present bias) that can themselves be modeled formally. The revealed preference framework is valuable precisely because it's testable: SARP doesn't ask whether people feel rational, it checks whether their choices are consistent. The question is whether the violations that laboratory experiments document survive the aggregation and competition of real markets.
The "as if" defense works only if markets discipline irrational behavior quickly. But does arbitrage actually eliminate biases, or can noise traders survive and move prices? Come back in Chapter 19 (§19.1–19.2, §19.8), where prospect theory provides a formal alternative to expected utility, and behavioral finance tests whether biases survive in the one market — financial markets — where you'd most expect them to be eliminated. The DSSW noise trader model and limits-to-arbitrage literature give the surprising answer.
Saint-Paul argues the internal logic of behavioral economics points toward hard paternalism, not the gentle kind. If people violate the axioms systematically, who decides what "better" means?
The viral slogan meets the First Welfare Theorem. Some fortunes are market failures; others are surplus creation. The word "every" is where the claim breaks.
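Chapter 5 solved the primal problem: maximize utility subject to a budget. The dual problem minimizes expenditure to achieve a target utility level (Eq. 11.1):
$$e(p, \bar{u}) = \min_{x} \; p \cdot x \quad \text{s.t.} \quad u(x) \geq \bar{u}$$
The solution is the Hicksian (compensated) demand $h(p, \bar{u})$, which Shephard's lemma recovers from the expenditure function (Eq. 11.2):
$$h_i(p, \bar{u}) = \frac{\partial e(p, \bar{u})}{\partial p_i}$$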
The indirect utility function $V(p, m)$ gives the maximum utility achievable at prices $p$ with income $m$:
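$$V(p, m) = \max_{x} \; u(x) \quad \text{s.t.} \quad p \cdot x \leq m$$
The key duality relationships (Eqs. 11.3–11.5):
$$e(p, V(p, m)) = m, \qquad V(p, e(p, \bar{u})) = \bar{u}, \qquad h(p, \bar{u}) = x(p, e(p, \bar{u}))$$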
Roy's identity provides a shortcut for deriving Marshallian demand from the indirect utility function (Eq. 11.6):
$$x_i(p, m) = -\frac{\partial V / \partial p_i}{\partial V / \partial m}$$
Intuition for Roy's identity: A small increase in $p_i$ lowers welfare (measured by $V$), so the numerator $\partial V/\partial p_i$ is negative. The size of that loss is the quantity of good $i$ the consumer buys ($x_i$) times the marginal utility of income ($\partial V/\partial m$): every unit purchased costs a little more, and each unit of income lost is worth $\partial V/\partial m$ in utility. Dividing by the marginal utility of income and flipping the sign therefore recovers the quantity of good $i$.
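Roy's identity and the first duality identity are easy to check numerically. The sketch below assumes Cobb-Douglas utility $u = x_1^a x_2^{1-a}$ and approximates the partial derivatives of $V$ with central finite differences; the function names, the step size, and the test point $(p_1, p_2, m) = (2, 1, 10)$ are illustrative assumptions.

```python
def indirect_utility(p1, p2, m, a=0.5):
    """V(p, m) for u = x1^a * x2^(1-a), using x1 = a*m/p1 and x2 = (1-a)*m/p2."""
    return (a * m / p1) ** a * ((1 - a) * m / p2) ** (1 - a)

def expenditure(p1, p2, u_bar, a=0.5):
    """e(p, u_bar): V is linear in m here, so invert V(p, m) = u_bar for m."""
    return u_bar / indirect_utility(p1, p2, 1.0, a)

def roy_demand_good1(p1, p2, m, a=0.5, h=1e-6):
    """x1 = -(dV/dp1) / (dV/dm), with central finite differences."""
    dV_dp1 = (indirect_utility(p1 + h, p2, m, a)
              - indirect_utility(p1 - h, p2, m, a)) / (2 * h)
    dV_dm = (indirect_utility(p1, p2, m + h, a)
             - indirect_utility(p1, p2, m - h, a)) / (2 * h)
    return -dV_dp1 / dV_dm

p1, p2, m, a = 2.0, 1.0, 10.0, 0.5
x1_roy = roy_demand_good1(p1, p2, m, a)
x1_closed_form = a * m / p1          # Marshallian demand for good 1
m_recovered = expenditure(p1, p2, indirect_utility(p1, p2, m, a), a)

print(x1_roy, x1_closed_form, m_recovered)
```

The Roy's-identity value should agree with $am/p_1$ up to finite-difference error, and `m_recovered` should return the original income, which is the identity $e(p, V(p, m)) = m$.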
CES utility: $u(x_1, x_2) = (x_1^\rho + x_2^\rho)^{1/\rho}$, $\rho < 1$, $\rho \neq 0$.
The expenditure function is: $e(p, \bar{u}) = \bar{u} \cdot (p_1^r + p_2^r)^{1/r}$ where $r = \rho/(\rho - 1)$.
Hicksian demand (Shephard's lemma): $h_i = \bar{u} \cdot p_i^{r-1} / (p_1^r + p_2^r)^{(r-1)/r}$.
As $\rho \to 0$ (elasticity of substitution $\sigma = 1/(1-\rho) \to 1$), this converges to the Cobb-Douglas case.
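As a sanity check on the CES formulas, the sketch below compares the stated Hicksian demand with a finite-difference derivative of the expenditure function (Shephard's lemma) and confirms that the Hicksian bundle reaches the target utility at exactly the minimized cost. The parameter values ($\rho = 0.5$, $p = (2, 3)$, $\bar{u} = 5$) are arbitrary illustrative choices.

```python
def ces_expenditure(p1, p2, u_bar, rho):
    r = rho / (rho - 1)
    return u_bar * (p1 ** r + p2 ** r) ** (1 / r)

def ces_hicksian(p1, p2, u_bar, rho):
    r = rho / (rho - 1)
    denom = (p1 ** r + p2 ** r) ** ((r - 1) / r)
    return (u_bar * p1 ** (r - 1) / denom, u_bar * p2 ** (r - 1) / denom)

def ces_utility(x1, x2, rho):
    return (x1 ** rho + x2 ** rho) ** (1 / rho)

p1, p2, u_bar, rho = 2.0, 3.0, 5.0, 0.5
h1, h2 = ces_hicksian(p1, p2, u_bar, rho)

# Shephard's lemma: h1 should equal the derivative of e with respect to p1.
step = 1e-6
de_dp1 = (ces_expenditure(p1 + step, p2, u_bar, rho)
          - ces_expenditure(p1 - step, p2, u_bar, rho)) / (2 * step)

utility_at_h = ces_utility(h1, h2, rho)   # should equal u_bar
cost_at_h = p1 * h1 + p2 * h2             # should equal e(p, u_bar)
min_cost = ces_expenditure(p1, p2, u_bar, rho)

print(h1, de_dp1, utility_at_h, cost_at_h, min_cost)
```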
Cobb-Douglas utility $u = x_1^{0.5} x_2^{0.5}$ with income $m = 10$. Slide $p_1$ to see how all three representations — budget-line tangency, Marshallian demand, and expenditure function — encode the same information.
Interactive 11.2. Three views of the same consumer. Left: indifference curve tangent to budget line (primal). Center: Marshallian demand for good 1 as a function of $p_1$. Right: expenditure function $e(p_1, p_2, \bar{u})$ needed to achieve the current utility level. All three encode the same preferences.
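The Slutsky equation from Chapter 5 (Eq. 6.7) generalizes to a matrix. Define the Slutsky (substitution) matrix with entries (Eq. 11.7):
$$S_{ij} = \frac{\partial h_i}{\partial p_j} = \frac{\partial x_i}{\partial p_j} + x_j \frac{\partial x_i}{\partial m}$$
If demand is generated by utility maximization, the Slutsky matrix must be symmetric ($S_{ij} = S_{ji}$) and negative semidefinite (so, in particular, every own-price term satisfies $S_{ii} \leq 0$).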
These are testable restrictions — if observed demand violates them, it cannot have been generated by a rational consumer maximizing a well-behaved utility function.
Cobb-Douglas demand: $x_1 = am/p_1$, $x_2 = (1-a)m/p_2$.
$S_{12} = \partial x_1/\partial p_2 + x_2 \cdot \partial x_1/\partial m = 0 + [(1-a)m/p_2] \cdot [a/p_1] = a(1-a)m/(p_1 p_2)$
$S_{21} = \partial x_2/\partial p_1 + x_1 \cdot \partial x_2/\partial m = 0 + [am/p_1] \cdot [(1-a)/p_2] = a(1-a)m/(p_1 p_2)$
$S_{12} = S_{21}$ ✓
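The symmetry check above can also be run numerically for the full matrix. This is a minimal sketch that builds $S_{ij} = \partial x_i/\partial p_j + x_j \, \partial x_i/\partial m$ from Cobb-Douglas Marshallian demand using central finite differences; the evaluation point ($a = 0.6$, $p = (2, 3)$, $m = 120$) and the helper names are assumptions made for illustration.

```python
def marshallian(p1, p2, m, a):
    """Cobb-Douglas Marshallian demand (x1, x2)."""
    return (a * m / p1, (1 - a) * m / p2)

def slutsky_matrix(p1, p2, m, a, h=1e-5):
    """2x2 Slutsky matrix from finite-difference derivatives of demand."""
    p = [p1, p2]
    x = marshallian(p1, p2, m, a)
    x_hi = marshallian(p1, p2, m + h, a)
    x_lo = marshallian(p1, p2, m - h, a)
    dx_dm = [(x_hi[i] - x_lo[i]) / (2 * h) for i in range(2)]
    S = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        p_hi, p_lo = list(p), list(p)
        p_hi[j] += h
        p_lo[j] -= h
        x_hi = marshallian(p_hi[0], p_hi[1], m, a)
        x_lo = marshallian(p_lo[0], p_lo[1], m, a)
        for i in range(2):
            dx_dp = (x_hi[i] - x_lo[i]) / (2 * h)
            S[i][j] = dx_dp + x[j] * dx_dm[i]   # price effect + income term
    return S

a, p1, p2, m = 0.6, 2.0, 3.0, 120.0
S = slutsky_matrix(p1, p2, m, a)
closed_form_S12 = a * (1 - a) * m / (p1 * p2)
determinant = S[0][0] * S[1][1] - S[0][1] * S[1][0]

print(S, closed_form_S12, determinant)
```

For a 2x2 symmetric matrix, nonpositive diagonal entries together with a nonnegative determinant are enough for negative semidefiniteness; here the determinant comes out at zero up to numerical error.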
Adjust the price of good 1 to see how Marshallian demand, Hicksian (compensated) demand, and the income effect respond. Uses Cobb-Douglas utility $u(x_1,x_2)=x_1^a x_2^{1-a}$ with $a=0.6$, $p_2=3$, $m=120$.
Figure 11.2. Left: Slutsky decomposition in commodity space. The original bundle (blue), compensated bundle (orange, on original indifference curve at new prices), and new bundle (green). The substitution effect moves from blue to orange; the income effect moves from orange to green. Right: Slutsky matrix entries $S_{11}$ and $S_{12}$ as $p_1$ varies, confirming negative semidefiniteness ($S_{11} \leq 0$) and symmetry.
Consider an economy with $I$ consumers and $L$ goods. Consumer $i$ has endowment $\omega_i \in \mathbb{R}^L_+$ and preferences $\succsim_i$.
At prices $p$, consumer $i$'s wealth is $m_i = p \cdot \omega_i$. She demands $x_i(p, m_i)$.
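Aggregate excess demand (Eq. 11.8):
$$z(p) = \sum_i x_i(p, m_i) - \sum_i \omega_i$$
Equilibrium requires $z(p^*) = 0$.
Walras' law: because every consumer spends her entire wealth, $p \cdot z(p) = 0$ at any price vector $p$. Implications: (1) If $L - 1$ markets clear, the $L$th clears automatically. (2) Demand is unchanged when all prices are scaled by the same factor, so only relative prices matter and we can normalize one price to 1 (the numeraire).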
Proof strategy (sketch). Normalize prices to the unit simplex $\Delta$. Define a price-adjustment map $f: \Delta \to \Delta$ that raises the price of goods in excess demand. By Brouwer's fixed-point theorem, $f$ has a fixed point $p^*$. At the fixed point, $z(p^*) = 0$ — all markets clear.
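One common choice for the price-adjustment map is $f_l(p) = \dfrac{p_l + \max(z_l(p), 0)}{1 + \sum_k \max(z_k(p), 0)}$. The sketch below implements that map for a hypothetical two-consumer Cobb-Douglas exchange economy (equal expenditure shares, endowments $(4, 0)$ and $(0, 4)$); both the economy and this specific form of $f$ are assumptions for illustration, since the text does not specify them.

```python
def excess_demand(p):
    """z(p) for a hypothetical 2-consumer, 2-good Cobb-Douglas economy."""
    shares = [(0.5, 0.5), (0.5, 0.5)]        # expenditure shares per consumer
    endowments = [(4.0, 0.0), (0.0, 4.0)]
    z = [-sum(w[l] for w in endowments) for l in range(2)]
    for alpha, w in zip(shares, endowments):
        wealth = p[0] * w[0] + p[1] * w[1]
        for l in range(2):
            z[l] += alpha[l] * wealth / p[l]  # Cobb-Douglas demand
    return z

def price_adjustment(p):
    """Raise prices of goods in excess demand, then renormalize to the simplex."""
    gains = [max(zl, 0.0) for zl in excess_demand(p)]
    total = 1.0 + sum(gains)
    return [(pl + g) / total for pl, g in zip(p, gains)]

p_star = [0.5, 0.5]
fixed = price_adjustment(p_star)      # equilibrium prices map to themselves

p_off = [0.25, 0.75]                  # good 1 is too cheap here
z_off = excess_demand(p_off)
p_next = price_adjustment(p_off)      # the map raises the price of good 1
walras = p_off[0] * z_off[0] + p_off[1] * z_off[1]

print(fixed, z_off, p_next, walras)
```

Brouwer's theorem guarantees only that a fixed point exists; iterating this map is not guaranteed to converge to it in general.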
For a 2-consumer, 2-good economy, the Edgeworth box provides a complete visualization. The box dimensions equal total endowments. Consumer 1's origin is at bottom-left, consumer 2's at top-right. Every point in the box is a feasible allocation.
Two consumers with Cobb-Douglas preferences. Drag the endowment point to explore how the Walrasian equilibrium, contract curve, and core change.
Figure 11.1 (Interactive). The Edgeworth box. The orange dot is the endowment. The green dot is the Walrasian equilibrium. The red curve is the contract curve (all Pareto-efficient allocations). The shaded core region shows allocations both consumers prefer to the endowment. The budget line passes through the endowment with slope $-p_x/p_y$.
Consumer 1: $u_1 = x_1^{1/2}y_1^{1/2}$, endowment $(4, 0)$. Consumer 2: $u_2 = x_2^{1/2}y_2^{1/2}$, endowment $(0, 4)$.
Market clearing gives $p_x = p_y$, and the equilibrium allocation is $x_1^* = y_1^* = 2$, $x_2^* = y_2^* = 2$.
Each consumer trades half their endowment for the other good, ending up with equal amounts of both goods.
Proof. We proceed by contradiction. Suppose the Walrasian equilibrium allocation $x^*$ at prices $p^*$ is not Pareto optimal. Then there exists a feasible allocation $x'$ with everyone at least as well off and someone strictly better off.
Step 1. For consumer $j$ who is strictly better off: since $x_j^*$ was utility-maximizing and $x_j'$ is strictly preferred, $x_j'$ must have been unaffordable: $p^* \cdot x_j' > p^* \cdot \omega_j$.
Step 2. For every consumer $i$: by local nonsatiation, $p^* \cdot x_i' \geq p^* \cdot \omega_i$.
Step 3. Summing: $\sum_i p^* \cdot x_i' > \sum_i p^* \cdot \omega_i$.
Step 4. But feasibility requires $\sum_i x_i' = \sum_i \omega_i$, giving $\sum_i p^* \cdot x_i' = \sum_i p^* \cdot \omega_i$. Contradiction. $\square$
The proof uses only local nonsatiation and budget exhaustion. It does not require convexity, differentiability, or any specific functional form. This generality is what makes the theorem powerful.
Interpretation. The First Welfare Theorem is the formal statement of Adam Smith's "invisible hand." Competitive markets produce an allocation that no rearrangement can improve upon without making someone worse off. But the assumptions (complete markets, price-taking, no externalities, no public goods, full information) define exactly when the invisible hand fails.
Consumer 1: $u_1 = x_1^{1/2}y_1^{1/2}$, endowment $(4, 0)$. Consumer 2: $u_2 = x_2^{1/2}y_2^{1/2}$, endowment $(0, 4)$.
From Example 11.4, the equilibrium is $x_1^* = y_1^* = x_2^* = y_2^* = 2$ at $p_x = p_y$.
Check Pareto optimality: At the equilibrium, $MRS_1 = y_1/x_1 = 1$ and $MRS_2 = y_2/x_2 = 1$. Since $MRS_1 = MRS_2 = p_x/p_y$, the indifference curves are tangent — the allocation is on the contract curve.
Verify no Pareto improvement: Any reallocation giving Consumer 1 more of good $x$ (say $x_1 = 3$) requires $x_2 = 1$. Then $u_1 = \sqrt{3 \cdot y_1}$ and $u_2 = \sqrt{1 \cdot y_2}$ with $y_1 + y_2 = 4$. For Consumer 1 to gain ($u_1 > \sqrt{4} = 2$), we need $3y_1 > 4$, so $y_1 > 4/3$, leaving $y_2 < 8/3$ and $u_2 < \sqrt{8/3} \approx 1.63 < 2 = u_2^*$. Consumer 2 is worse off, so this reallocation is not a Pareto improvement; the tangency condition above rules out all the others.
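The single reallocation checked above can be extended to a brute-force search over the whole Edgeworth box. The following sketch scans a grid of feasible allocations and collects any that make neither consumer worse off and at least one better off; the grid resolution and tolerance are arbitrary assumptions, and a grid search illustrates the result rather than proving it.

```python
def utilities(x1, y1, total_x=4.0, total_y=4.0):
    """Utilities of both consumers when Consumer 1 holds (x1, y1)."""
    u1 = (x1 * y1) ** 0.5
    u2 = ((total_x - x1) * (total_y - y1)) ** 0.5
    return u1, u2

def pareto_improvements(x1_ref, y1_ref, steps=80, tol=1e-9):
    """Grid allocations that Pareto-dominate the reference allocation."""
    u1_ref, u2_ref = utilities(x1_ref, y1_ref)
    found = []
    for i in range(steps + 1):
        for j in range(steps + 1):
            x1, y1 = 4.0 * i / steps, 4.0 * j / steps
            u1, u2 = utilities(x1, y1)
            nobody_worse = u1 >= u1_ref - tol and u2 >= u2_ref - tol
            someone_better = u1 > u1_ref + tol or u2 > u2_ref + tol
            if nobody_worse and someone_better:
                found.append((x1, y1))
    return found

at_equilibrium = pareto_improvements(2.0, 2.0)   # Consumer 1 holds (2, 2)
at_endowment = pareto_improvements(4.0, 0.0)     # Consumer 1 holds (4, 0)

print(len(at_equilibrium), len(at_endowment))
```

The search finds no improvements over the equilibrium allocation but many over the endowment, matching the empty and non-empty lens regions described in the text.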
The Walrasian equilibrium lies on the contract curve (Pareto efficient). Toggle "Pareto improvements?" to verify: at the equilibrium, the lens-shaped region where both consumers can gain is empty. At the endowment, it is not.
Interactive 11.3. Toggle between viewing the equilibrium (where no Pareto improvements exist) and the endowment (where the shaded lens shows mutually beneficial trades). The equilibrium's position on the contract curve proves efficiency visually.
Dan Riffle, AOC's former policy aide, turned the line "Every billionaire is a policy failure" into a social media mantra: shared millions of times, printed on T-shirts, chanted at rallies. The claim is stark: billionaires don't exist because they created extraordinary value. They exist because the system is broken (tax loopholes, monopoly power, rigged rules). The First Welfare Theorem you just proved gives you the tools to test this precisely: does extreme wealth concentration represent the market working correctly (and we just dislike the endowment), or the market failing (and efficiency is not achieved)?
Interpretation. The Second Welfare Theorem says efficiency and equity are separable problems. Society can choose any Pareto-efficient distribution through two steps: first, redistribute endowments with lump-sum transfers; second, let competitive markets operate.
The markets will then produce a competitive equilibrium that is both efficient (by the First Welfare Theorem) and achieves the desired distribution.
Why it matters for policy. Don't distort markets to achieve equity (that sacrifices efficiency). Instead, use lump-sum transfers to redistribute, then let markets work. The right-wing implication: let markets operate freely. The left-wing implication: redistribute as much as you want. Both can be achieved simultaneously — in theory.
Why it fails in practice. Lump-sum transfers require information about individuals' types that the government does not have. Real-world redistribution uses distortionary taxes (income, capital gains, wealth) that change incentives and create deadweight loss. This information problem is the subject of mechanism design (Chapter 12) and optimal taxation (Chapter 16).
In large economies, the set of core allocations (allocations that no coalition can improve upon) shrinks to the set of Walrasian equilibrium allocations. This is the core equivalence theorem — competitive equilibrium is the unique outcome that survives competition among all possible coalitions.
We model Maya's lemonade market as a 2-consumer, 2-good Edgeworth box exchange economy.
Setup: Maya and Alex. Two goods: lemonade ($L$) and cookies ($C$). Maya starts with 45 lemonade and 0 cookies. Alex starts with 0 lemonade and 40 cookies.
Preferences: $u_M = L_M^{0.5}C_M^{0.5}$, $u_A = L_A^{0.3}C_A^{0.7}$.
Market clearing gives $p_L/p_C = 8/15 \approx 0.533$.
Equilibrium: Maya: $(L_M, C_M) = (22.5, 12)$. Alex: $(L_A, C_A) = (22.5, 28)$.
By the First Welfare Theorem, this allocation is Pareto optimal.
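The equilibrium numbers can be reproduced exactly. The sketch below assumes cookies are the numeraire ($p_C = 1$), uses the Cobb-Douglas demand rule (spend share $\alpha$ of wealth on lemonade), and solves the lemonade market-clearing condition in closed form with exact fractions; the function name and argument layout are hypothetical.

```python
from fractions import Fraction

def cobb_douglas_exchange(alpha_m, alpha_a, lemonade_m, cookies_a):
    """Equilibrium when Maya owns all the lemonade and Alex all the cookies.

    alpha_m, alpha_a are the exponents on lemonade; cookies are the numeraire.
    Lemonade market clearing: alpha_m*lemonade_m + alpha_a*cookies_a/p = lemonade_m.
    """
    p = alpha_a * cookies_a / ((1 - alpha_m) * lemonade_m)
    maya = (alpha_m * lemonade_m, (1 - alpha_m) * p * lemonade_m)
    alex = (alpha_a * cookies_a / p, (1 - alpha_a) * cookies_a)
    return p, maya, alex

alpha_m, alpha_a = Fraction(1, 2), Fraction(3, 10)
p, maya, alex = cobb_douglas_exchange(alpha_m, alpha_a, Fraction(45), Fraction(40))

# Marginal rates of substitution (lemonade for cookies) at the equilibrium.
mrs_maya = alpha_m / (1 - alpha_m) * maya[1] / maya[0]
mrs_alex = alpha_a / (1 - alpha_a) * alex[1] / alex[0]

print(p, maya, alex, mrs_maya, mrs_alex)
```

Both marginal rates of substitution equal the price ratio, which is the tangency condition that places the allocation on the contract curve.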
Arrow-Debreu (1954): The Existence Proof. Kenneth Arrow and Gerard Debreu proved that a competitive equilibrium exists under weak assumptions (convex preferences, no externalities). Using Kakutani's fixed-point theorem, they showed that a set of prices exists clearing all markets simultaneously, formalizing Adam Smith's "invisible hand" nearly two centuries after The Wealth of Nations.
The mathematical achievement was remarkable: reducing the problem to showing that a certain correspondence (excess demand as a function of prices) satisfies the conditions for a fixed point. The result required only local nonsatiation and convexity — not differentiability or specific functional forms.
Debreu's Theory of Value (1959) distilled this framework into a rigorous axiomatic system, earning him the 1983 Nobel Prize. Arrow had already received the Nobel in 1972 for his broader contributions to general equilibrium and social choice. Their existence proof remains the mathematical foundation for welfare economics and the two welfare theorems proved in this chapter.
You now have the formal welfare theorems — the definitive statement of when and why competitive markets produce efficient outcomes, and the precise conditions under which any efficient outcome can be decentralized.
The First Welfare Theorem delivers the strongest possible efficiency result: if preferences are locally nonsatiated and markets are complete and competitive, then every Walrasian equilibrium is Pareto optimal. You saw the proof — it works by contradiction, exploiting the fact that any Pareto improvement would require someone to afford a bundle they couldn't at equilibrium prices. The Second Welfare Theorem completes the picture: under convexity, any Pareto optimal allocation can be achieved as a competitive equilibrium after appropriate lump-sum redistribution of endowments. Together, these theorems say that the market mechanism is both sufficient for efficiency (First WT) and flexible enough to achieve any efficient outcome society desires (Second WT). The price system simultaneously solves the information problem (no planner needed) and the coordination problem (all markets clear).
The conditions of the First Welfare Theorem are exacting, and every one of them fails in important real-world markets. Complete markets require a market for every good, every state of the world, every date; this fails massively (you cannot buy insurance against most life risks, futures markets are thin, contingent claims are incomplete). Price-taking fails in any market with significant firms (tech, pharma, airlines). No externalities fails for climate, pollution, network effects, and knowledge spillovers. Greenwald and Stiglitz (1986) proved the devastating result: whenever markets are incomplete (which is always), competitive equilibria are generically constrained-inefficient. That is, there exist interventions using only the same information and instruments available to markets that are Pareto improving. The theorem doesn't say markets are bad; it says the conditions for the First Welfare Theorem are a knife-edge that reality never hits.
The profession's relationship with the welfare theorems matured considerably after Greenwald-Stiglitz. The theorems are now understood not as claims that markets work, but as a diagnostic framework: they identify exactly which conditions must hold for efficiency, and deviations from those conditions point precisely to where intervention might help. The Second Welfare Theorem's promise — that you can separate efficiency from equity — is formally correct but practically hollow. Lump-sum transfers require the government to know each individual's type (ability, preferences, endowment) without distorting behavior. Any feasible transfer instrument (income tax, wealth tax, means-tested benefits) changes incentives and creates deadweight loss. This is the Mirrlees (1971) insight: optimal taxation is a constrained problem precisely because the Second Welfare Theorem's instrument doesn't exist.
The welfare theorems are the most important results in economics — not because they prove markets work, but because they identify exactly when and why markets work or fail. Understanding the theorems is prerequisite to intelligent intervention: every market failure is a specific violation of a specific condition. The First Welfare Theorem is a conditional claim, and the conditions rarely hold in full — but they hold approximately in enough settings to explain why markets coordinate as well as they do. The Second Welfare Theorem is theoretically beautiful and practically cruel: it tells you equity and efficiency are separable, then makes the separation instrument informationally infeasible. Real policy lives in the second-best world where every redistribution creates distortions.
If markets fail when the welfare theorem conditions aren't met, is there a systematic way to design better institutions? The welfare theorems tell you when markets work but not what to do when they don't. Come back in Chapter 12 (§12.1–12.5), where mechanism design asks exactly this question. The revelation principle, VCG mechanisms, and matching markets show that economic theory can engineer efficient outcomes — sometimes outperforming both unregulated markets and blunt government intervention. That's the final stop on this question.
Sanders' viral rallying cry meets Arrow's 1963 paper. The moral force is real, but declaring a right doesn't solve the allocation problem.
| Label | Equation | Description |
|---|---|---|
| Eq. 11.1 | $e(p, \bar{u}) = \min p \cdot x$ s.t. $u(x) \geq \bar{u}$ | Expenditure minimization |
| Eq. 11.2 | $h_i = \partial e / \partial p_i$ | Shephard's lemma |
| Eq. 11.3–11.4 | $e(p, V(p,m)) = m$; $V(p, e(p,\bar{u})) = \bar{u}$ | Duality identities |
| Eq. 11.5 | $h(p, \bar{u}) = x(p, e(p, \bar{u}))$ | Hicksian = Marshallian at compensated income |
| Eq. 11.6 | $x_i = -(\partial V/\partial p_i)/(\partial V/\partial m)$ | Roy's identity |
| Eq. 11.7 | $S_{ij} = \partial h_i/\partial p_j = \partial x_i/\partial p_j + x_j \partial x_i/\partial m$ | Slutsky matrix entry |
| Eq. 11.8 | $z(p) = \sum_i x_i(p) - \sum_i \omega_i$ | Aggregate excess demand |