Part I treated demand and supply curves as given. We drew them, shifted them, and measured the surplus they generated. But where do these curves come from? This chapter answers that question by deriving demand from the optimization problem of the consumer and supply from the optimization problem of the firm.
The shift in method is significant. Part I used algebra and geometry. This chapter introduces constrained optimization — maximizing an objective function subject to a constraint — using calculus and Lagrangian methods. The payoff is that demand and supply curves stop being assumptions and become consequences of deeper primitives: preferences, technology, and prices.
The chapter is long because it covers two parallel theories — consumer theory and producer theory — that mirror each other in structure. The consumer maximizes utility subject to a budget constraint; the firm minimizes cost subject to an output target (or maximizes profit subject to technology). Both lead to tangency conditions, and both generate the curves we took as given in Part I.
Prerequisites: Chapters 2 and 3. Mathematical prerequisites: multivariable calculus, constrained optimization (see Appendix A for review).
The consumer chooses among bundles of goods — combinations like "3 apples and 2 bananas" or "5 hours of leisure and \$100 of consumption." To model this choice, we need a way to represent the consumer's preferences — their ranking of different bundles.
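For preferences to be well-behaved enough to model mathematically, we require three axioms: (1) Completeness: any two bundles can be compared. (2) Transitivity: if $x$ is at least as good as $y$ and $y$ is at least as good as $z$, then $x$ is at least as good as $z$. (3) Continuity: small changes in a bundle do not cause jumps in its ranking.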
Under these conditions, a fundamental theorem guarantees the existence of a utility function $U(x_1, x_2)$, a real-valued function that assigns a number to each bundle such that $U(x) \geq U(y)$ if and only if the consumer weakly prefers bundle $x$ to bundle $y$.
Higher utility means more preferred. But the numbers themselves carry no meaning beyond ranking. Any monotonic transformation $V = g(U)$ (where $g$ is strictly increasing) represents the same preferences. This is what we mean by ordinal utility: only the ordering matters.
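The ordinal claim can be checked numerically. The sketch below is only an illustration: the utility function $U = \sqrt{x_1 x_2}$, the transformation $g(U) = \ln U$, and the list of bundles are assumptions chosen for the example, not part of the chapter's formal development.

```python
import math

def u(x1, x2):
    """Illustrative utility function: U = sqrt(x1 * x2)."""
    return math.sqrt(x1 * x2)

def v(x1, x2):
    """Strictly increasing transformation of u: V = ln(U)."""
    return math.log(u(x1, x2))

# Hypothetical bundles (x1, x2) with distinct utility levels.
bundles = [(1, 9), (2, 2), (4, 4), (3, 5), (8, 1)]

def ranking(utility):
    """Bundles sorted from least to most preferred under `utility`."""
    return sorted(bundles, key=lambda b: utility(*b))

# The utility numbers differ, but the ordering of bundles does not.
same_order = ranking(u) == ranking(v)
print(ranking(u))
print(same_order)
```

The two functions assign different numbers to every bundle, yet they sort the bundles identically, which is all that ordinal utility requires.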
Properties of indifference curves (given well-behaved preferences): (1) Downward-sloping: more of one good requires giving up some of the other. (2) Cannot cross: would violate transitivity. (3) Higher curves = higher utility. (4) Convex to the origin (if preferences are convex): mixtures are preferred to extremes.
Along an indifference curve, $dU = 0$: $MU_1\,dx_1 + MU_2\,dx_2 = 0$, so $MRS \equiv -dx_2/dx_1 = MU_1/MU_2$ (Eq. 5.1).
What this says: The MRS tells you your personal exchange rate between two goods. If your MRS is 3, you would give up 3 units of good 2 for 1 more unit of good 1 and feel equally happy. It equals the ratio of how much extra happiness each good gives you.
Why it matters: This is how economists measure "how much you want something" without using money. It captures trade-offs purely in terms of your own preferences, and it is the slope of the indifference curve at any point.
The MRS is the ratio of marginal utilities. Diminishing MRS: for convex preferences, the MRS decreases as the consumer moves down the indifference curve (more $x_1$, less $x_2$). Intuitively: the more lemonade you already have, the less you're willing to give up cookies for yet another cup.
| Name | $U(x_1, x_2)$ | MRS | Key feature |
|---|---|---|---|
| Cobb-Douglas | $x_1^a x_2^b$ | $(a/b)(x_2/x_1)$ | Constant budget shares |
| Perfect substitutes | $ax_1 + bx_2$ | $a/b$ (constant) | May buy only one good |
| Perfect complements | $\min(ax_1, bx_2)$ | Undefined at kink | Fixed consumption ratio |
| Quasilinear | $v(x_1) + x_2$ | $v'(x_1)$ | No income effect on $x_1$ |
| CES | $(x_1^\rho + x_2^\rho)^{1/\rho}$ | $(x_2/x_1)^{1-\rho}$ | Nests Cobb-Douglas ($\rho \to 0$), perfect substitutes ($\rho = 1$), and perfect complements ($\rho \to -\infty$) |
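The closed-form MRS entries in the table can be cross-checked against $MRS = MU_1/MU_2$ computed by finite differences. This is a rough numerical sketch: the parameter values ($a = 0.3$, $b = 0.7$, $\rho = 0.5$), the evaluation point, and the helper name `mrs_numeric` are assumptions made for illustration.

```python
def mrs_numeric(U, x1, x2, h=1e-6):
    """MRS = MU1 / MU2, with marginal utilities from central differences."""
    mu1 = (U(x1 + h, x2) - U(x1 - h, x2)) / (2 * h)
    mu2 = (U(x1, x2 + h) - U(x1, x2 - h)) / (2 * h)
    return mu1 / mu2

# Illustrative parameter values (assumptions, not from the text).
a, b, rho = 0.3, 0.7, 0.5

def cobb_douglas(x1, x2):
    return x1**a * x2**b

def ces(x1, x2):
    return (x1**rho + x2**rho) ** (1 / rho)

x1, x2 = 2.0, 5.0

cd_numeric = mrs_numeric(cobb_douglas, x1, x2)
cd_formula = (a / b) * (x2 / x1)          # table entry for Cobb-Douglas

ces_numeric = mrs_numeric(ces, x1, x2)
ces_formula = (x2 / x1) ** (1 - rho)      # table entry for CES

print(cd_numeric, cd_formula)
print(ces_numeric, ces_formula)
```

In each pair the numerical and closed-form values should agree to several decimal places.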
The slope $-p_1/p_2$ is the market rate of exchange: to buy one more unit of good 1 (costing $p_1$), the consumer must give up $p_1/p_2$ units of good 2.
Drag the sliders to change prices and income. Watch the budget line pivot and shift in real time.
Figure 5.0. The budget constraint shows all affordable bundles. Changing a price pivots the line around the other intercept; changing income shifts it parallel. The slope $-p_1/p_2$ is the market rate of exchange.
The Lagrange multiplier $\lambda$ is the marginal utility of income — the increase in maximum utility from an additional dollar of budget.
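First-order conditions: $\partial\mathcal{L}/\partial x_1 = MU_1 - \lambda p_1 = 0$, $\partial\mathcal{L}/\partial x_2 = MU_2 - \lambda p_2 = 0$, and $\partial\mathcal{L}/\partial\lambda = m - p_1 x_1 - p_2 x_2 = 0$ (Eq. 5.4).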
What this says: The consumer picks the best affordable bundle. The Lagrangian is the calculus machinery for solving this, but the result is simple: spend your budget so that the last dollar spent on each good gives you the same boost in happiness. If coffee gives you more happiness-per-dollar than tea, buy more coffee until the extra enjoyment per dollar is equalized.
Why it matters: This "equal bang for the buck" principle is the foundation of all demand theory. It explains why people diversify their spending rather than buying only one good, and it generates the demand curves from Chapter 2.
The consumer allocates spending so that the marginal utility per dollar is the same for both goods: $MU_1/p_1 = MU_2/p_2 = \lambda$. Dividing the first two conditions: $MU_1/MU_2 = p_1/p_2$, that is, $MRS = p_1/p_2$ (Eq. 5.5).
$U = x_1^{1/2} x_2^{1/2}$. Tangency: $x_2/x_1 = p_1/p_2$, so $x_2 = (p_1/p_2)x_1$.
Substituting into the budget constraint: $p_1 x_1 + p_2 (p_1/p_2) x_1 = 2p_1 x_1 = m$.
Marshallian demand: $x_1^* = m/(2p_1)$, $x_2^* = m/(2p_2)$.
The consumer spends exactly half her income on each good — the constant budget share property of Cobb-Douglas preferences.
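As a sanity check on the Cobb-Douglas result, the closed-form demand can be compared with a brute-force search along the budget line. This is a minimal sketch: the prices and income ($p_1 = 2$, $p_2 = 5$, $m = 100$) and the function names are assumptions chosen for illustration.

```python
def cobb_douglas_demand(p1, p2, m, a=0.5):
    """Marshallian demand for U = x1^a * x2^(1-a)."""
    return a * m / p1, (1 - a) * m / p2

def best_on_budget_line(p1, p2, m, a=0.5, steps=10_000):
    """Brute force: scan bundles on the budget line and keep the best one."""
    best = None
    for i in range(1, steps):
        x1 = (m / p1) * i / steps
        x2 = (m - p1 * x1) / p2          # spend the remaining income on good 2
        utility = x1**a * x2 ** (1 - a)
        if best is None or utility > best[0]:
            best = (utility, x1, x2)
    return best[1], best[2]

# Illustrative prices and income (assumptions).
p1, p2, m = 2.0, 5.0, 100.0

x1_star, x2_star = cobb_douglas_demand(p1, p2, m)
x1_grid, x2_grid = best_on_budget_line(p1, p2, m)

print(x1_star, x2_star)                      # closed form
print(x1_grid, x2_grid)                      # grid search
print(p1 * x1_star / m, p2 * x2_star / m)    # budget shares
```

The grid search lands on the same bundle as the formula, and both budget shares equal one half.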
This visualization shows the deep connection: as $p_1$ changes, the optimal bundle traces out the demand curve for good 1. The demand curve IS the set of optimal points at different prices.
Figure 5.1a. Budget line and indifference curves. The optimal bundle is at the tangency point.
Figure 5.1b. The demand curve for good 1, traced out by varying $p_1$.
$U = \ln(x_1) + x_2$. Tangency: $1/x_1 = p_1/p_2$, so $x_1^* = p_2/p_1$.
Budget: $x_2^* = m/p_2 - 1$.
Demand for $x_1$ depends only on the price ratio, not on income — the hallmark of quasilinear utility. There are no income effects on good 1.
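The absence of an income effect can be seen by evaluating the quasilinear demands at two income levels. The sketch below assumes illustrative prices and a hypothetical helper `quasilinear_demand`. It also handles one case the worked example leaves implicit: when $m < p_2$ the interior formula would give negative $x_2$, so the consumer spends all income on good 1.

```python
def quasilinear_demand(p1, p2, m):
    """Demand for U = ln(x1) + x2.

    Interior solution: x1 = p2/p1, x2 = m/p2 - 1.
    Corner case (an addition beyond the worked example): if m < p2 the
    interior x2 would be negative, so x2 = 0 and all income buys good 1.
    """
    if m < p2:
        return m / p1, 0.0
    return p2 / p1, m / p2 - 1

# Illustrative prices (assumptions).
p1, p2 = 2.0, 4.0

low_income = quasilinear_demand(p1, p2, 20.0)
high_income = quasilinear_demand(p1, p2, 40.0)
corner = quasilinear_demand(p1, p2, 2.0)

print(low_income)    # x1 unchanged between the two incomes
print(high_income)
print(corner)
```

Doubling income from 20 to 40 leaves $x_1$ unchanged and sends all the extra spending to good 2.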
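When the price of a good changes, two things happen simultaneously: relative prices change (the substitution effect) and the consumer's purchasing power changes (the income effect). The Slutsky equation (Eq. 5.7) separates the two: $\partial x_1/\partial p_1 = \partial x_1^h/\partial p_1 - x_1\,\partial x_1/\partial m$.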
What this says: When a price changes, two things happen simultaneously. First, the good becomes relatively more or less expensive compared to alternatives, so you substitute (the substitution effect — always pushes you away from the pricier good). Second, the price change makes you effectively richer or poorer, changing how much of everything you buy (the income effect). The Slutsky equation says: total response = substitution effect + income effect.
Why it matters: This decomposition explains why demand curves almost always slope downward (both effects reinforce for normal goods), and identifies the rare exception: Giffen goods, where the income effect is so strong it overwhelms substitution, making people buy more of something when its price rises.
| Good type | Substitution effect | Income effect | Total effect of price increase |
|---|---|---|---|
| Normal good | − (buy less) | − (poorer → buy less) | Unambiguously − |
| Inferior good | − (buy less) | + (poorer → buy more) | Usually − |
| Giffen good | − (buy less) | + (income effect dominates) | + (demand rises) |
Slide $p_1$ downward to see the price decrease decomposed into a substitution effect (movement along the original indifference curve) and an income effect (movement to a higher indifference curve).
Figure 5.2. Hicks decomposition of a price decrease. A = original bundle, B = compensated bundle (substitution effect), C = new bundle (income effect). The substitution effect moves along the original IC; the income effect shifts to a higher IC.
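The Hicks decomposition can be worked through numerically for a simple case. This sketch assumes Cobb-Douglas utility $U = \sqrt{x_1 x_2}$ and illustrative numbers ($m = 100$, $p_2 = 1$, $p_1$ falling from 4 to 1). The Hicksian demand $x_1^h = \bar{U}\sqrt{p_2/p_1}$ used below is the standard result for this utility function rather than something derived in the chapter.

```python
import math

def marshallian_x1(p1, m):
    """Marshallian demand for good 1 under U = sqrt(x1 * x2)."""
    return m / (2 * p1)

def hicksian_x1(p1, p2, u_bar):
    """Compensated demand for good 1: cheapest way to reach utility u_bar."""
    return u_bar * math.sqrt(p2 / p1)

# Illustrative numbers (assumptions): p1 falls from 4 to 1.
p1_old, p1_new, p2, m = 4.0, 1.0, 1.0, 100.0

# A: original bundle and its utility level.
x1_a = marshallian_x1(p1_old, m)
x2_a = m / (2 * p2)
u_a = math.sqrt(x1_a * x2_a)

# B: new prices, but income compensated to stay on the original IC.
x1_b = hicksian_x1(p1_new, p2, u_a)

# C: new prices with actual income.
x1_c = marshallian_x1(p1_new, m)

substitution_effect = x1_b - x1_a
income_effect = x1_c - x1_b
total_effect = x1_c - x1_a

print(x1_a, x1_b, x1_c)
print(substitution_effect, income_effect, total_effect)
```

With these numbers good 1 goes from 12.5 (A) to 25 (B) to 50 (C), so the substitution effect is 12.5 and the income effect is 25. Both are positive for a price decrease, as expected for a normal good.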
For Cobb-Douglas, the Engel curve is a straight line through the origin: $x_1 = am/p_1$, linear in $m$. The budget share is always $a$, regardless of income.
Adjust income with the slider to see how the optimal bundle shifts. The left panel shows budget lines and indifference curves; the right panel traces the Engel curve. Toggle between a normal good (Cobb-Douglas) and an inferior good (modified utility where demand bends back at high income).
Figure 5.4. Left: budget lines and indifference curves at different income levels. As income rises, the optimal bundle shifts outward along the income-consumption path. Right: the Engel curve plots quantity of good 1 (horizontal) against income (vertical). For a normal good (Cobb-Douglas), the Engel curve is linear. For an inferior good, it bends back at high income.
$Y = AK^\alpha L^{1-\alpha}$ (Eq. 5.8), where $A > 0$ is total factor productivity and $\alpha \in (0,1)$ is the output elasticity of capital.
Marginal products: $MP_K = \alpha Y/K$, $MP_L = (1-\alpha)Y/L$. Both are positive and diminishing.
What this says: The MRTS tells you how many units of capital you can replace with one more worker while keeping output the same. It is the production analog of the consumer's MRS. When you already have lots of capital relative to labor, one extra worker is very productive (high MRTS); when you have lots of workers already, each additional one adds less.
Why it matters: This ratio determines the shape of the isoquant (the production equivalent of an indifference curve) and drives the firm's input choice. The firm will keep substituting the cheaper input for the more expensive one until the trade-off rate matches the relative input prices.
| Type | Condition (for all $t > 1$) | Meaning |
|---|---|---|
| CRS | $f(tK,tL) = tY$ | Doubling inputs doubles output |
| IRS | $f(tK,tL) > tY$ | Doubling inputs more than doubles output |
| DRS | $f(tK,tL) < tY$ | Doubling inputs less than doubles output |
$Y = K^{0.3}L^{0.8}$: $f(tK,tL) = t^{1.1}Y$. Since $1.1 > 1$: increasing returns to scale.
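For a homogeneous production function, the returns-to-scale exponent can be recovered numerically by scaling both inputs and comparing outputs. This is a minimal sketch: the helper names `scale_exponent` and `classify` and the test point $(K, L) = (3, 5)$ are assumptions, and the method is only meaningful when $f$ is homogeneous.

```python
import math

def scale_exponent(f, K, L, t=2.0):
    """Solve f(tK, tL) = t^k * f(K, L) for k (valid for homogeneous f)."""
    return math.log(f(t * K, t * L) / f(K, L)) / math.log(t)

def classify(f, K=3.0, L=5.0, tol=1e-9):
    """Label returns to scale from the homogeneity degree at one test point."""
    k = scale_exponent(f, K, L)
    if abs(k - 1) < tol:
        return "CRS"
    return "IRS" if k > 1 else "DRS"

def example(K, L):
    """The production function from the worked example."""
    return K**0.3 * L**0.8

def balanced(K, L):
    """A constant-returns comparison case (illustrative)."""
    return K**0.5 * L**0.5

print(scale_exponent(example, 3.0, 5.0))   # close to 1.1
print(classify(example), classify(balanced))
```

The exponent comes out at about 1.1 for the worked example, matching the sum of the input exponents.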
The cost-minimizing condition (from the FOCs of the Lagrangian): $MRTS = MP_L/MP_K = w/r$ (Eq. 5.11), or equivalently $MP_L/w = MP_K/r$.
What this says: To produce at the lowest cost, the firm adjusts its mix of workers and machines until the "bang for the buck" is equal across inputs. If hiring one more worker adds more output per dollar than renting one more machine, hire the worker. Keep adjusting until the last dollar spent on labor and the last dollar spent on capital contribute equally to output.
Why it matters: This is the producer's version of the consumer's "equal marginal utility per dollar" rule. It explains why firms change their input mix when wages or interest rates change, and it generates the cost curves that underpin supply.
This perfectly parallels the consumer's $MRS = p_1/p_2$.
The firm chooses inputs to minimize cost. Adjust factor prices and watch the isocost line pivot and the optimal $K/L$ ratio change.
Figure 5.3. Cost minimization: the firm chooses the input mix where the isoquant ($\bar{Y} = 100$) is tangent to the lowest isocost line. The tangency condition is $MRTS = w/r$. When labor gets more expensive, the firm substitutes toward capital.
$Y = K^{0.5}L^{0.5}$, $w = 10$, $r = 20$. Produce $\bar{Y} = 100$.
$MRTS = K/L = w/r = 0.5$, so $K = 0.5L$.
$(0.5L)^{0.5} \cdot L^{0.5} = 100 \Rightarrow L^* = 141.4$, $K^* = 70.7$.
$TC = 10(141.4) + 20(70.7) = \\$2{,}828$. Since labor is cheaper, the firm uses more labor than capital.
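The same steps generalize to any Cobb-Douglas technology $Y = K^\alpha L^{1-\alpha}$: the tangency condition pins down $K/L$, and the output target pins down the scale. The sketch below reproduces the worked example. The function name `min_cost_inputs` is an assumption made for illustration.

```python
def min_cost_inputs(Y, w, r, alpha=0.5):
    """Cost-minimizing inputs for Y = K^alpha * L^(1 - alpha).

    Tangency: MRTS = ((1 - alpha) / alpha) * (K / L) = w / r.
    """
    k_per_l = (alpha / (1 - alpha)) * (w / r)   # optimal K/L ratio
    L = Y / k_per_l**alpha                      # from K^alpha * L^(1-alpha) = Y
    K = k_per_l * L
    total_cost = w * L + r * K
    return K, L, total_cost

K_star, L_star, tc = min_cost_inputs(Y=100.0, w=10.0, r=20.0)

print(round(L_star, 1), round(K_star, 1))   # 141.4 70.7
print(round(tc, 2))                         # 2828.43
print(K_star**0.5 * L_star**0.5)            # check: output target of 100
```

Labor and capital each account for half of total cost here, which is the cost-share counterpart of the constant budget shares seen on the consumer side.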
In the short run, at least one input is fixed (typically capital: $K = \bar{K}$). In the long run, all inputs are variable.
| Cost concept | Symbol | Definition |
|---|---|---|
| Fixed cost | $FC$ | Cost of fixed inputs ($r\bar{K}$) |
| Variable cost | $VC$ | Cost of variable inputs ($wL(Q)$) |
| Total cost | $TC$ | $FC + VC$ |
| Marginal cost | $MC$ | $dTC/dQ$ |
| Average total cost | $AC$ | $TC/Q$ |
| Average variable cost | $AVC$ | $VC/Q$ |
| Average fixed cost | $AFC$ | $FC/Q$ (always declining) |
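Key relationships: (1) $TC = FC + VC$, so $AC = AFC + AVC$. (2) $MC$ crosses $AVC$ and $AC$ at their minimum points. (3) When $MC < AC$, $AC$ is falling; when $MC > AC$, $AC$ is rising.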
The firm has $TC = 50 + 2Q + 0.05Q^2$. Adjust the market price to see the firm's profit-maximizing output and whether it earns profit or loss.
Figure 5.4. Short-run cost curves. The firm produces where $P = MC$ (on the rising portion). Green shading = profit; red shading = loss. Below the shutdown point ($AVC_{min}$), the firm produces nothing.
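The cost curves in the figure can be computed directly from $TC = 50 + 2Q + 0.05Q^2$. This sketch is an illustration, with function names chosen for the example. It finds the output where average cost is lowest, confirms that $MC = AC$ there, and reports the break-even and shutdown prices.

```python
import math

# Coefficients of TC = FC + b*Q + c*Q^2, taken from the figure's cost function.
FC, b, c = 50.0, 2.0, 0.05

def total_cost(q):
    return FC + b * q + c * q**2

def mc(q):
    """Marginal cost: dTC/dQ."""
    return b + 2 * c * q

def avc(q):
    """Average variable cost: VC/Q."""
    return b + c * q

def ac(q):
    """Average total cost: TC/Q (defined for q > 0)."""
    return total_cost(q) / q

def output_at_price(p):
    """P = MC output, or zero below the shutdown price (minimum AVC)."""
    shutdown_price = avc(0.0)
    return (p - b) / (2 * c) if p > shutdown_price else 0.0

q_min_ac = math.sqrt(FC / c)          # where dAC/dQ = 0
breakeven_price = mc(q_min_ac)        # MC = AC at the bottom of the AC curve

print(round(q_min_ac, 2), round(breakeven_price, 2))
print(avc(0.0))                       # shutdown price
q = output_at_price(7.0)
print(q, 7.0 * q - total_cost(q))     # output and profit at P = 7
```

At prices between the shutdown price and the break-even price, the firm produces at a loss that is still smaller than its fixed cost.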
In the long run, the firm can choose any level of capital. The long-run average cost (LRAC) curve is the envelope of all short-run AC curves — each corresponding to a different level of fixed capital.
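Why LRAC is typically U-shaped: at low output, economies of scale (specialization, spreading of indivisible inputs) push average cost down; at high output, diseconomies of scale (coordination and management costs) push it back up.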
The output level at the bottom of the LRAC is the minimum efficient scale (MES) — the smallest output at which LRAC is minimized.
Each short-run AC curve corresponds to a different capital level. Drag the slider to highlight a specific SRAC curve and see how it relates to the LRAC envelope.
Figure 5.5. The long-run AC curve (black) is the envelope of short-run AC curves. Each SRAC corresponds to a different factory size. The highlighted SRAC (bold) shows the current capital level. The firm can move along LRAC in the long run by adjusting capital.
First-order condition: $d\Pi/dQ = P - MC(Q) = 0$, so $P = MC$ (Eq. 5.13).
What this says: A competitive firm should keep producing as long as the price it receives for one more unit exceeds the cost of making that unit. Stop when they are equal. Producing beyond that point means each additional unit costs more to make than it earns.
Why it matters: This single rule — price equals marginal cost — is where supply curves come from. The firm's supply curve is literally its marginal cost curve. It connects the abstract calculus of profit maximization to the supply-and-demand diagrams from Chapter 2.
The profit-maximizing rule: produce where price equals marginal cost. The firm should keep producing as long as the revenue from one more unit ($P$) exceeds the cost ($MC$). The firm's supply curve is the portion of its MC curve above $AVC_{min}$.
Why $P = MC$ is the supply curve — the deep connection. In Chapter 2, we drew the supply curve as upward-sloping. Now we see where it comes from: it is the firm's marginal cost curve. The supply curve slopes upward because marginal cost is increasing — not because we assumed it, but because it follows from diminishing marginal returns.
$TC = 50 + 2Q + 0.5Q^2$. At $P = 12$: $P = MC$ gives $12 = 2 + Q$, so $Q^* = 10$.
$\Pi = 12(10) - [50 + 20 + 50] = 0$. Zero economic profit — the long-run competitive equilibrium.
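Inverting $P = MC$ turns the marginal cost curve into a supply function. The sketch below does this for $TC = 50 + 2Q + 0.5Q^2$, where $MC = 2 + Q$ and $AVC = 2 + 0.5Q$ has its minimum of 2 at $Q = 0$. The function names and the extra price points are assumptions added for illustration.

```python
def total_cost(q):
    """TC = 50 + 2Q + 0.5Q^2 from the worked example."""
    return 50 + 2 * q + 0.5 * q**2

def supply(p):
    """Invert P = MC = 2 + Q; produce nothing at or below min AVC = 2."""
    return max(p - 2.0, 0.0)

def profit(p):
    """Profit at the supply-curve output (fixed cost is paid even at Q = 0)."""
    q = supply(p)
    return p * q - total_cost(q)

for price in (1.0, 12.0, 15.0):
    print(price, supply(price), profit(price))
```

At $P = 12$ the firm supplies 10 units and earns zero profit, as in the example. At $P = 15$ it supplies 13 units and earns a positive profit, and at $P = 1$ it shuts down and loses only its fixed cost.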
A competitive firm has production function $Y = 10L^{0.5}$, faces wage $w = 20$ and output price $P = 8$.
Step 1 — Find the profit function. Revenue: $R = PY = 8 \times 10L^{0.5} = 80L^{0.5}$. Cost: $C = wL = 20L$. Profit: $\Pi = 80L^{0.5} - 20L$.
Step 2 — FOC. $d\Pi/dL = 40L^{-0.5} - 20 = 0 \implies L^{-0.5} = 0.5 \implies L^* = 4$.
Step 3 — Compute output and profit. $Y^* = 10(4)^{0.5} = 20$. Revenue = $8 \times 20 = 160$. Cost = $20 \times 4 = 80$. Profit = $160 - 80 = 80$.
Verify: $P \times MP_L = w$ at the optimum: $8 \times 10 \times 0.5 \times 4^{-0.5} = 8 \times 2.5 = 20 = w$. ✓
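The three steps above can be reproduced in a few lines. This is a sketch under the example's numbers: the closed form $L^* = (5P/w)^2$ comes from rearranging the FOC, and the function names are assumptions made for illustration.

```python
def profit(L, P=8.0, w=20.0):
    """Profit for Y = 10 * L^0.5 at output price P and wage w."""
    return P * 10 * L**0.5 - w * L

def optimal_labor(P=8.0, w=20.0):
    """Rearranged FOC: P * 5 * L^(-0.5) = w  =>  L = (5P / w)^2."""
    return (5 * P / w) ** 2

L_star = optimal_labor()
Y_star = 10 * L_star**0.5
value_of_marginal_product = 8.0 * 10 * 0.5 * L_star**-0.5

print(L_star, Y_star, profit(L_star))   # 4.0 20.0 80.0
print(value_of_marginal_product)        # equals the wage, 20.0

# Nearby labor choices earn less, consistent with a maximum at L = 4.
print(profit(3.9), profit(4.1))
```

Raising the wage in `optimal_labor` lowers $L^*$, which traces out the firm's downward-sloping labor demand.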
Cost structure: $FC = \\$20$/day (stand rental). Materials: $\\$1.50$/cup. Maya's labor: 10 cups/hour at opportunity cost $\\$15$/hr, so $\\$1.50$/cup.
$TC = 20 + 3Q$, $MC = 3$, $AVC = 3$, $AC = 20/Q + 3$.
From Chapter 2: $P^* = \\$2.75$. But $MC = \\$3.00 > P^*$. Maya should not operate. Every cup loses $\\$0.25$.
However, if we exclude her opportunity cost (accounting profit only), $AVC_{materials} = \\$1.50$, and $P = 2.75 > 1.50$. She earns $\\$16.25$/day in accounting profit but $-\\$13.75$/day in economic profit. The economist says: Maya, your time is worth $\\$120$/day at the bookstore.
| Label | Equation | Description |
|---|---|---|
| Eq. 5.1 | $MRS = MU_1/MU_2$ | Marginal rate of substitution |
| Eq. 5.2 | $\max U(x_1,x_2)$ s.t. $p_1 x_1 + p_2 x_2 = m$ | Consumer's problem |
| Eq. 5.3 | $\mathcal{L} = U + \lambda(m - p_1 x_1 - p_2 x_2)$ | Lagrangian |
| Eq. 5.4 | FOCs: $MU_i = \lambda p_i$; budget binds | First-order conditions |
| Eq. 5.5 | $MRS = p_1/p_2$ | Tangency condition |
| Eq. 5.6 | $x_i^* = a_i m / p_i$ | Cobb-Douglas Marshallian demand |
| Eq. 5.7 | $\partial x_1/\partial p_1 = \partial x_1^h/\partial p_1 - x_1 \partial x_1/\partial m$ | Slutsky equation |
| Eq. 5.8 | $Y = AK^\alpha L^{1-\alpha}$ | Cobb-Douglas production function |
| Eq. 5.9 | $MRTS = MP_L/MP_K$ | Marginal rate of technical substitution |
| Eq. 5.10 | $\min wL + rK$ s.t. $f(K,L) = \bar{Y}$ | Cost minimization problem |
| Eq. 5.11 | $MRTS = w/r$ | Cost-minimizing input ratio |
| Eq. 5.12 | $\max \Pi = PQ - TC(Q)$ | Profit maximization |
| Eq. 5.13 | $P = MC$ | Profit-maximizing output rule |