Part I treated demand and supply curves as given. We drew them, shifted them, and measured the surplus they generated. But where do these curves come from? This chapter answers that question by deriving demand from the optimization problem of the consumer and supply from the optimization problem of the firm.
The shift in method is significant. Part I used algebra and geometry. This chapter introduces constrained optimization — maximizing an objective function subject to a constraint — using calculus and Lagrangian methods. The payoff is that demand and supply curves stop being assumptions and become consequences of deeper primitives: preferences, technology, and prices.
The chapter is long because it covers two parallel theories — consumer theory and producer theory — that mirror each other in structure. The consumer maximizes utility subject to a budget constraint; the firm minimizes cost subject to an output target (or maximizes profit subject to technology). Both lead to tangency conditions, and both generate the curves we took as given in Part I.
Prerequisites: Chapters 2 and 3. Mathematical prerequisites: multivariable calculus, constrained optimization (see Appendix A for review).
The consumer chooses among bundles of goods — combinations like "3 apples and 2 bananas" or "5 hours of leisure and \$100 of consumption." To model this choice, we need a way to represent the consumer's preferences — their ranking of different bundles.
For preferences to be well-behaved enough to model mathematically, we require three axioms:
Under these conditions, a fundamental theorem guarantees the existence of a utility function $U(x_1, x_2)$, a real-valued function that assigns a number to each bundle such that, for any two bundles $x$ and $y$, $U(x) \geq U(y)$ if and only if $x$ is at least as good as $y$.
Higher utility means more preferred. But the numbers themselves carry no meaning beyond ranking. Any monotonic transformation $V = g(U)$ (where $g$ is strictly increasing) represents the same preferences. This is what we mean by ordinal utility: only the ordering matters.
Properties of indifference curves (given well-behaved preferences): (1) Downward-sloping: more of one good requires giving up some of the other. (2) Cannot cross: would violate transitivity. (3) Higher curves = higher utility. (4) Convex to the origin (if preferences are convex): mixtures are preferred to extremes.
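Along an indifference curve, $dU = 0$. Taking the total differential gives $MU_1\,dx_1 + MU_2\,dx_2 = 0$, so the marginal rate of substitution is $MRS = -dx_2/dx_1 = MU_1/MU_2$ (Eq. 5.1).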
What this says: The MRS tells you your personal exchange rate between two goods. If your MRS is 3, you would give up 3 units of good 2 for 1 more unit of good 1 and feel equally happy. It equals the ratio of how much extra happiness each good gives you.
Why it matters: This is how economists measure "how much you want something" without using money. It captures trade-offs purely in terms of your own preferences, and it is the slope of the indifference curve at any point.
What changes: As you consume more of good 1 and less of good 2, your willingness to trade shrinks — each additional unit of good 1 is less valuable when you already have a lot. This "diminishing MRS" gives indifference curves their bowed-in shape.
In Full Mode, Eq. 5.1 derives this formally from the total differential of the utility function.
The MRS is the ratio of marginal utilities. Diminishing MRS: for convex preferences, the MRS decreases as the consumer moves down the indifference curve (more $x_1$, less $x_2$). Intuitively: the more lemonade you already have, the fewer cookies you are willing to give up for yet another cup.
| Name | $U(x_1, x_2)$ | MRS | Key feature |
|---|---|---|---|
| Cobb-Douglas | $x_1^a x_2^b$ | $(a/b)(x_2/x_1)$ | Constant budget shares |
| Perfect substitutes | $ax_1 + bx_2$ | $a/b$ (constant) | May buy only one good |
| Perfect complements | $\min(ax_1, bx_2)$ | Undefined at kink | Fixed consumption ratio |
| Quasilinear | $v(x_1) + x_2$ | $v'(x_1)$ | No income effect on $x_1$ |
| CES | $(x_1^\rho + x_2^\rho)^{1/\rho}$ | $(x_2/x_1)^{1-\rho}$ | Nests Cobb-Douglas ($\rho \to 0$), perfect substitutes ($\rho = 1$), and perfect complements ($\rho \to -\infty$) |
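The MRS column can be checked numerically. The following is a minimal sketch, not part of the original text: it approximates $MU_1/MU_2$ with finite differences and compares the result with the closed-form entries for Cobb-Douglas and CES. The helper `mrs_numeric` and the parameter values are illustrative assumptions.

```python
def mrs_numeric(u, x1, x2, h=1e-6):
    """Approximate MRS = MU1 / MU2 using central finite differences."""
    mu1 = (u(x1 + h, x2) - u(x1 - h, x2)) / (2 * h)
    mu2 = (u(x1, x2 + h) - u(x1, x2 - h)) / (2 * h)
    return mu1 / mu2

# Illustrative parameter values (assumptions, not taken from the text)
a, b, rho = 0.3, 0.7, 0.5
x1, x2 = 4.0, 9.0

def cobb_douglas(q1, q2):
    return q1**a * q2**b

def ces(q1, q2):
    return (q1**rho + q2**rho) ** (1 / rho)

mrs_cd = mrs_numeric(cobb_douglas, x1, x2)
mrs_ces = mrs_numeric(ces, x1, x2)

print(mrs_cd, (a / b) * (x2 / x1))        # both approximately 0.964
print(mrs_ces, (x2 / x1) ** (1 - rho))    # both approximately 1.5
```

Agreement between the numerical and closed-form values at one bundle is only a spot check, but it catches sign and exponent mistakes quickly.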
The slope $-p_1/p_2$ is the market rate of exchange: to buy one more unit of good 1 (costing $p_1$), the consumer must give up $p_1/p_2$ units of good 2.
Drag the sliders to change prices and income. Watch the budget line pivot and shift in real time.
Figure 5.0. The budget constraint shows all affordable bundles. Changing a price pivots the line around the other intercept; changing income shifts it parallel. The slope $-p_1/p_2$ is the market rate of exchange.
What this says: The Lagrangian packages a constrained optimization problem (maximize utility subject to a budget constraint) into a single expression. Instead of juggling the objective and the constraint separately, we combine them into one function and optimize it as if it were unconstrained.
Why it matters: The demand curves and cost curves derived in this chapter come from solving a Lagrangian; it is the engine behind the entire chapter. The shadow price $\lambda$ tells you how much one more dollar of income would increase your maximized utility, at the margin.
What changes: When the budget constraint tightens (income falls), the Lagrange multiplier (the shadow price of the constraint) typically rises, meaning each additional dollar of income is worth more. This requires diminishing marginal utility of income and does not hold for every utility function. When prices change, the budget line pivots and the optimal bundle moves to a new tangency point.
In Full Mode, the Lagrangian expression derives this formally.
The Lagrange multiplier $\lambda$ is the marginal utility of income: the increase in maximum utility from an additional dollar of budget.
First-order conditions:
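Written out for the Lagrangian in Eq. 5.3, the conditions take the following form. This is a sketch consistent with Eq. 5.4 in the summary table, with $MU_i = \partial U/\partial x_i$:

$$
\mathcal{L} = U(x_1, x_2) + \lambda\,(m - p_1 x_1 - p_2 x_2)
$$

$$
\frac{\partial \mathcal{L}}{\partial x_1} = MU_1 - \lambda p_1 = 0, \qquad
\frac{\partial \mathcal{L}}{\partial x_2} = MU_2 - \lambda p_2 = 0, \qquad
\frac{\partial \mathcal{L}}{\partial \lambda} = m - p_1 x_1 - p_2 x_2 = 0
$$

The first two conditions say that each good's marginal utility equals $\lambda$ times its price; the third says the budget is exhausted.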
What this says: The consumer picks the best affordable bundle. The Lagrangian is the calculus machinery for solving this, but the result is simple: spend your budget so that the last dollar spent on each good gives you the same boost in happiness. If coffee gives you more happiness-per-dollar than tea, buy more coffee until the extra enjoyment per dollar is equalized.
Why it matters: This "equal bang for the buck" principle is the foundation of all demand theory. It explains why people diversify their spending rather than buying only one good, and it generates the demand curves from Chapter 2.
What changes: When prices change, the "bang for the buck" shifts. If good 1 gets cheaper, its happiness-per-dollar rises, so you buy more of it until the marginal enjoyment drops back to equality. When income rises, you can afford more of both goods, but the ratio stays the same for Cobb-Douglas preferences.
In Full Mode, Eqs. 5.2-5.4 derive the first-order conditions from the Lagrangian.
The consumer allocates spending so that the marginal utility per dollar is the same for both goods: $MU_1/p_1 = MU_2/p_2 = \lambda$. Dividing the first condition by the second eliminates $\lambda$: $MU_1/MU_2 = p_1/p_2$.
At the optimum, the consumer equalizes the happiness-per-dollar across all goods. This principle leads directly to the tangency condition $MRS = p_1/p_2$ (Eq. 5.5): the slope of the indifference curve equals the slope of the budget line.
$U = x_1^{1/2} x_2^{1/2}$. Tangency: $x_2/x_1 = p_1/p_2$, so $x_2 = (p_1/p_2)x_1$.
Substituting into the budget constraint: $p_1 x_1 + p_2 (p_1/p_2) x_1 = 2p_1 x_1 = m$.
Marshallian demand: $x_1^* = m/(2p_1)$, $x_2^* = m/(2p_2)$.
The consumer spends exactly half her income on each good — the constant budget share property of Cobb-Douglas preferences.
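The example can be reproduced with a few lines of code. This is a minimal sketch, assuming an interior solution; the function name `cobb_douglas_demand` and the prices and income are illustrative choices, not values from the text.

```python
def cobb_douglas_demand(a, b, p1, p2, m):
    """Marshallian demands for U = x1**a * x2**b (interior solution)."""
    share1 = a / (a + b)      # budget share of good 1
    share2 = b / (a + b)      # budget share of good 2
    return share1 * m / p1, share2 * m / p2

# Illustrative numbers (assumed); equal exponents as in the example above
p1, p2, m = 2.0, 5.0, 100.0
x1, x2 = cobb_douglas_demand(0.5, 0.5, p1, p2, m)

print(x1, x2)                 # 25.0 10.0
print(p1 * x1 + p2 * x2)      # 100.0, the budget is exhausted
print(x2 / x1, p1 / p2)       # 0.4 0.4, the MRS equals the price ratio
```

Dividing the exponents by their sum lets the same function handle utility functions whose exponents do not add up to one.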
What this says: With Cobb-Douglas preferences, the consumer always spends a fixed fraction of income on each good — regardless of prices. If the utility exponents are equal, she splits her budget 50/50. The demand for each good is simply income divided by twice its price.
Why it matters: This "constant budget share" result is the signature of Cobb-Douglas utility. It makes these preferences the workhorse model in economics: demand is easy to compute, and the income elasticity is always 1 (spending on each good rises proportionally with income).
What changes: When price doubles, quantity demanded halves (unit elastic demand). When income doubles, quantity demanded doubles. The budget share stays fixed no matter what — a strong and testable prediction.
In Full Mode, Example 6.1 derives the Marshallian demand step by step from the tangency condition.
This visualization shows the deep connection: as $p_1$ changes, the optimal bundle traces out the demand curve for good 1. The demand curve is the set of optimal points at different prices.
Figure 5.1a. Budget line and indifference curves. The optimal bundle is at the tangency point.
Figure 5.1b. The demand curve for good 1, traced out by varying $p_1$.
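$U = \ln(x_1) + x_2$. Tangency: $1/x_1 = p_1/p_2$, so $x_1^* = p_2/p_1$.
Budget: $p_1 x_1^* = p_2$, so $x_2^* = (m - p_2)/p_2 = m/p_2 - 1$. This interior solution requires $m > p_2$; otherwise the consumer spends all income on good 1.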
Demand for $x_1$ depends only on the price ratio, not on income — the hallmark of quasilinear utility. There are no income effects on good 1.
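A short sketch makes the absence of an income effect visible. The function `quasilinear_demand` is a hypothetical helper, and the corner case for low income ($m \le p_2$, where all income goes to good 1) is an addition that the example above does not discuss.

```python
def quasilinear_demand(p1, p2, m):
    """Demands for U = ln(x1) + x2, including the corner case x2 = 0."""
    if m > p2:
        return p2 / p1, m / p2 - 1.0   # interior: tangency holds
    return m / p1, 0.0                 # corner: all income goes to good 1

print(quasilinear_demand(2.0, 4.0, 20.0))   # (2.0, 4.0)
print(quasilinear_demand(2.0, 4.0, 40.0))   # (2.0, 9.0)
print(quasilinear_demand(2.0, 4.0, 3.0))    # (1.5, 0.0)
```

Doubling income from 20 to 40 leaves $x_1$ at 2 and sends every extra dollar to good 2, as long as the solution stays interior.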
What this says: With quasilinear preferences, the consumer has a "satiation point" for good 1 that depends only on relative prices. Any extra income goes entirely to good 2. This means income changes have zero effect on the demand for good 1.
Why it matters: Quasilinear utility isolates the substitution effect perfectly — since there is no income effect on good 1, the Slutsky decomposition simplifies dramatically. Economists use this as a benchmark to study pure substitution behavior.
What changes: When the price of good 1 rises, the consumer buys less of it (pure substitution). When income rises, all extra spending goes to good 2 — the Engel curve for good 1 is perfectly vertical.
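In Full Mode, Example 6.2 derives the demands from the tangency condition.
When the price of a good changes, two things happen simultaneously: relative prices change (the substitution effect) and the consumer's purchasing power changes (the income effect).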
What this says: When a price changes, two things happen simultaneously. First, the good becomes relatively more or less expensive compared to alternatives, so you substitute (the substitution effect — always pushes you away from the pricier good). Second, the price change makes you effectively richer or poorer, changing how much of everything you buy (the income effect). The Slutsky equation says: total response = substitution effect + income effect.
Why it matters: This decomposition explains why demand curves almost always slope downward (both effects reinforce for normal goods), and identifies the rare exception: Giffen goods, where the income effect is so strong it overwhelms substitution, making people buy more of something when its price rises.
What changes: When the good takes up a small share of the budget (like salt), the income effect is negligible and substitution dominates — the demand curve definitely slopes down. When the good takes up a large share of the budget AND is inferior (like a staple food for a very poor household), the income effect can be large enough to overwhelm substitution, potentially creating a Giffen good.
In Full Mode, Eq. 5.7 derives this decomposition formally.
| Good type | Substitution effect | Income effect | Total effect of price increase |
|---|---|---|---|
| Normal good | − (buy less) | − (poorer → buy less) | Unambiguously − |
| Inferior good | − (buy less) | + (poorer → buy more) | Usually − |
| Giffen good | − (buy less) | + (income effect dominates) | + (demand rises) |
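The decomposition can be computed explicitly for a normal good. The sketch below is an illustration under assumed numbers ($p_1$ falls from 4 to 1, with $p_2 = 1$ and $m = 100$) for $U = \sqrt{x_1 x_2}$. It uses the Hicks approach: find the income that keeps the consumer on the original indifference curve at the new prices, then compare bundles.

```python
from math import sqrt

def demand(p1, p2, m):
    """Marshallian demand for U = sqrt(x1 * x2)."""
    return m / (2 * p1), m / (2 * p2)

def utility(x1, x2):
    return sqrt(x1 * x2)

# Illustrative numbers (assumed): p1 falls from 4 to 1
p1_old, p1_new, p2, m = 4.0, 1.0, 1.0, 100.0

A = demand(p1_old, p2, m)            # original bundle
u0 = utility(*A)                     # original utility level
# Income that just reaches u0 at the new prices; for this utility the
# expenditure function is e(p1, p2, u) = 2 * u * sqrt(p1 * p2)
m_comp = 2 * u0 * sqrt(p1_new * p2)
B = demand(p1_new, p2, m_comp)       # compensated bundle
C = demand(p1_new, p2, m)            # new bundle

substitution = B[0] - A[0]           # movement along the original curve
income = C[0] - B[0]                 # movement to the higher curve
print(A, B, C)                       # (12.5, 50.0) (25.0, 25.0) (50.0, 50.0)
print(substitution, income)          # 12.5 25.0
```

Both effects raise the quantity of good 1 after the price cut, which is the first row of the table read in reverse (a price decrease instead of an increase).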
Slide $p_1$ downward to see the price decrease decomposed into a substitution effect (movement along the original indifference curve) and an income effect (movement to a higher indifference curve).
Figure 5.2. Hicks decomposition of a price decrease. A = original bundle, B = compensated bundle (substitution effect), C = new bundle (income effect). The substitution effect moves along the original IC; the income effect shifts to a higher IC.
For Cobb-Douglas utility $x_1^a x_2^b$ with exponents normalized so that $a + b = 1$, the Engel curve is a straight line through the origin: $x_1 = am/p_1$, linear in $m$. The budget share is always $a$, regardless of income.
Adjust income with the slider to see how the optimal bundle shifts. The left panel shows budget lines and indifference curves; the right panel traces the Engel curve. Toggle between a normal good (Cobb-Douglas) and an inferior good (modified utility where demand bends back at high income).
Figure 5.4. Left: budget lines and indifference curves at different income levels. As income rises, the optimal bundle shifts outward along the income-consumption path. Right: the Engel curve plots quantity of good 1 (horizontal) against income (vertical). For a normal good (Cobb-Douglas), the Engel curve is linear. For an inferior good, it bends back at high income.
The Cobb-Douglas production function is $Y = AK^\alpha L^{1-\alpha}$ (Eq. 5.8), where $A > 0$ is total factor productivity and $\alpha \in (0,1)$ is the output elasticity of capital.
Marginal products: $MP_K = \alpha Y/K$, $MP_L = (1-\alpha)Y/L$. Both are positive and diminishing.
What this says: The marginal product of each input tells you how much extra output you get from one more unit of that input, holding the other fixed. For Cobb-Douglas, each input's marginal product is proportional to its average product (total output divided by the amount of that input).
Why it matters: Diminishing marginal products are the engine behind upward-sloping marginal cost curves. Adding more workers to a fixed factory eventually yields less and less extra output per worker, which means each additional unit of output costs more to produce.
What changes: Doubling capital while holding labor fixed less than doubles output, and the marginal product of capital falls. But doubling both inputs together (with CRS) doubles output and leaves marginal products unchanged.
In Full Mode, the marginal products are derived by differentiating the Cobb-Douglas production function.
What this says: The MRTS tells you how many units of capital you can replace with one more worker while keeping output the same. It is the production analog of the consumer's MRS. When you already have lots of capital relative to labor, one extra worker is very productive (high MRTS); when you have lots of workers already, each additional one adds less.
Why it matters: This ratio determines the shape of the isoquant (the production equivalent of an indifference curve) and drives the firm's input choice. The firm will keep substituting the cheaper input for the more expensive one until the trade-off rate matches the relative input prices.
What changes: As the firm uses more labor relative to capital, each additional worker adds less output (diminishing marginal product), so the MRTS falls. This is why isoquants are bowed inward — the same logic as diminishing MRS for consumers.
In Full Mode, Eq. 5.9 derives the MRTS from the marginal products of the Cobb-Douglas production function.
| Type | Condition (for all $t > 1$) | Meaning |
|---|---|---|
| CRS | $f(tK,tL) = tY$ | Doubling inputs doubles output |
| IRS | $f(tK,tL) > tY$ | Doubling inputs more than doubles output |
| DRS | $f(tK,tL) < tY$ | Doubling inputs less than doubles output |
$Y = K^{0.3}L^{0.8}$: $f(tK,tL) = t^{1.1}Y$. Since $1.1 > 1$: increasing returns to scale.
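The scaling test is easy to run numerically. In this sketch the input levels and the scale factor are arbitrary illustrative choices; only the production function comes from the example.

```python
def f(K, L):
    """Production function from the example: Y = K**0.3 * L**0.8."""
    return K**0.3 * L**0.8

K, L, t = 5.0, 7.0, 2.0      # arbitrary inputs and scale factor (assumed)
ratio = f(t * K, t * L) / f(K, L)

print(ratio, t**1.1)         # both approximately 2.1435
```

Doubling both inputs multiplies output by about 2.14, more than 2, which is what increasing returns to scale means.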
What this says: To check returns to scale, ask: if I double all inputs, does output more than double, exactly double, or less than double? Add the exponents — if they sum to more than 1, doubling inputs more than doubles output (increasing returns).
Why it matters: Returns to scale determine market structure. With increasing returns, larger firms have lower unit costs, which tends toward natural monopoly. With constant returns, firm size is indeterminate — perfectly competitive markets are possible.
What changes: If the exponents sum to exactly 1 (like standard Cobb-Douglas with $\alpha + (1-\alpha) = 1$), we get constant returns. Larger exponent sums mean stronger scale economies; smaller sums mean scale diseconomies.
In Full Mode, Example 6.3 tests returns to scale by scaling all inputs by factor $t$.
The cost-minimizing condition (from the FOCs of the Lagrangian) is $MRTS = MP_L/MP_K = w/r$ (Eq. 5.11), or equivalently $MP_L/w = MP_K/r$.
What this says: To produce at the lowest cost, the firm adjusts its mix of workers and machines until the "bang for the buck" is equal across inputs. If hiring one more worker adds more output per dollar than renting one more machine, hire the worker. Keep adjusting until the last dollar spent on labor and the last dollar spent on capital contribute equally to output.
Why it matters: This is the producer's version of the consumer's "equal marginal utility per dollar" rule. It explains why firms change their input mix when wages or interest rates change, and it generates the cost curves that underpin supply.
What changes: When wages rise relative to the rental rate of capital, the firm substitutes toward capital (more machines, fewer workers). When interest rates rise, the firm substitutes toward labor. The firm always moves along the isoquant toward the relatively cheaper input.
In Full Mode, Eqs. 5.10-5.11 derive the cost-minimizing condition from the Lagrangian.
This perfectly parallels the consumer's $MRS = p_1/p_2$.
The firm chooses inputs to minimize cost. Adjust factor prices and watch the isocost line pivot and the optimal $K/L$ ratio change.
Figure 5.3. Cost minimization: the firm chooses the input mix where the isoquant ($\bar{Y} = 100$) is tangent to the lowest isocost line. The tangency condition is $MRTS = w/r$. When labor gets more expensive, the firm substitutes toward capital.
$Y = K^{0.5}L^{0.5}$, $w = 10$, $r = 20$. Produce $\bar{Y} = 100$.
$MRTS = K/L = w/r = 0.5$, so $K = 0.5L$.
$(0.5L)^{0.5} \cdot L^{0.5} = 100 \Rightarrow L^* = 141.4$, $K^* = 70.7$.
$TC = 10(141.4) + 20(70.7) = \$2{,}828$. Since labor is cheaper, the firm uses more labor than capital.
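The same steps can be sketched in code. The variable names are illustrative; the numbers are those of the example ($w = 10$, $r = 20$, $\bar{Y} = 100$, $Y = K^{0.5}L^{0.5}$).

```python
from math import sqrt

w, r, Y_bar = 10.0, 20.0, 100.0

# Tangency for Y = sqrt(K * L): K / L = w / r
k_per_l = w / r
# Output target: sqrt(k_per_l * L * L) = Y_bar, solved for L
L_star = Y_bar / sqrt(k_per_l)
K_star = k_per_l * L_star
total_cost = w * L_star + r * K_star

print(round(L_star, 1), round(K_star, 1), round(total_cost))   # 141.4 70.7 2828
```

With equal exponents the firm spends the same amount on each input (about 1,414 on labor and 1,414 on capital), the production-side counterpart of constant budget shares.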
What this says: When labor costs half as much as capital per unit, the firm uses twice as many workers as machines. The cheaper input gets used more intensively — the firm tilts its input mix toward whatever is the better deal.
Why it matters: This is why manufacturing moves to low-wage countries (labor is cheap relative to capital there) and why automation increases when wages rise (capital becomes relatively cheaper). The cost-minimizing input ratio responds directly to relative input prices.
What changes: If the wage doubled from \$10 to \$20, the firm would use equal amounts of labor and capital ($K/L = 1$ instead of $0.5$), and total cost would rise. The firm substitutes away from the input that got more expensive.
In Full Mode, Example 6.4 solves the cost minimization step by step.
In the short run, at least one input is fixed (typically capital: $K = \bar{K}$). In the long run, all inputs are variable.
| Cost concept | Symbol | Definition |
|---|---|---|
| Fixed cost | $FC$ | Cost of fixed inputs ($r\bar{K}$) |
| Variable cost | $VC$ | Cost of variable inputs ($wL(Q)$) |
| Total cost | $TC$ | $FC + VC$ |
| Marginal cost | $MC$ | $dTC/dQ$ |
| Average total cost | $AC$ | $TC/Q$ |
| Average variable cost | $AVC$ | $VC/Q$ |
| Average fixed cost | $AFC$ | $FC/Q$ (always declining) |
Key relationships:
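One relationship can be derived directly from the definitions in the table above. The following is a sketch, assuming $TC(Q)$ is differentiable and $Q > 0$:

$$
\frac{d\,AC}{dQ} = \frac{d}{dQ}\left(\frac{TC}{Q}\right) = \frac{Q \cdot MC - TC}{Q^2} = \frac{MC - AC}{Q}
$$

Average cost therefore falls whenever $MC < AC$, rises whenever $MC > AC$, and is flat where $MC = AC$, so the marginal cost curve crosses the average cost curve at its minimum. Replacing $TC$ with $VC$ gives the same result for $AVC$, because $dVC/dQ = MC$.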
What this says: A firm's costs break down simply. Fixed costs (rent, equipment) don't change with output. Variable costs (labor, materials) rise as you produce more. Marginal cost is the cost of making one more unit. Average cost is total cost spread across all units.
Why it matters: The shapes of these curves drive every supply decision. The U-shape of average cost comes from spreading fixed costs (pulls it down) battling diminishing returns (pushes it up). Marginal cost always crosses average cost at the bottom of the U — think of it like your GPA: a new grade above your average pulls it up, below pulls it down.
What changes: When fixed costs rise, the average cost curve shifts up but marginal cost is unchanged — the shutdown point stays the same but the break-even point rises. When variable costs rise (e.g., higher wages), both MC and AVC shift up, raising the shutdown price.
In Full Mode, the cost summary table shows the formal definitions and calculus notation.
The firm has $TC = 50 + 2Q + 0.05Q^2$. Adjust the market price to see the firm's profit-maximizing output and whether it earns profit or loss.
Figure 5.4. Short-run cost curves. The firm produces where $P = MC$ (on the rising portion). Green shading = profit; red shading = loss. Below the shutdown point ($AVC_{min}$), the firm produces nothing.
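The rule described in the caption can be sketched as a supply function for this cost curve. The function names and the three test prices are illustrative assumptions; $MC = 2 + 0.1Q$ and $AVC = 2 + 0.05Q$ follow from $TC = 50 + 2Q + 0.05Q^2$.

```python
def total_cost(q):
    return 50 + 2 * q + 0.05 * q**2

def supply(price):
    """Short-run supply: P = MC = 2 + 0.1 q, or zero at or below min AVC = 2."""
    if price <= 2:                 # AVC = 2 + 0.05 q is lowest at q = 0
        return 0.0
    return (price - 2) * 10        # invert P = 2 + 0.1 q

for price in (1.5, 4.0, 8.0):
    q = supply(price)
    # When q = 0 the firm shuts down and loses only its fixed cost of 50
    profit = price * q - total_cost(q)
    print(price, q, round(profit, 2))
# rows: (1.5, 0.0, -50.0), (4.0, 20.0, -30.0), (8.0, 60.0, 130.0)
```

At a price of 4 the firm loses 30 by operating, which is better than the loss of 50 from shutting down, so it keeps producing even though profit is negative.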
In the long run, the firm can choose any level of capital. The long-run average cost (LRAC) curve is the envelope of all short-run AC curves — each corresponding to a different level of fixed capital.
Why LRAC is typically U-shaped:
The output level at the bottom of the LRAC is the minimum efficient scale (MES) — the smallest output at which LRAC is minimized.
Each short-run AC curve corresponds to a different capital level. Drag the slider to highlight a specific SRAC curve and see how it relates to the LRAC envelope.
Figure 5.5. The long-run AC curve (black) is the envelope of short-run AC curves. Each SRAC corresponds to a different factory size. The highlighted SRAC (bold) shows the current capital level. The firm can move along LRAC in the long run by adjusting capital.
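The competitive firm takes the price $P$ as given and solves $\max_Q \Pi = PQ - TC(Q)$ (Eq. 5.12). First-order condition: $d\Pi/dQ = P - MC(Q) = 0$, so $P = MC$ (Eq. 5.13).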
What this says: A competitive firm should keep producing as long as the price it receives for one more unit exceeds the cost of making that unit. Stop when they are equal. Producing beyond that point means each additional unit costs more to make than it earns.
Why it matters: This single rule, price equals marginal cost, is where supply curves come from. The firm's supply curve is its marginal cost curve (the portion above minimum average variable cost). It connects the abstract calculus of profit maximization to the supply-and-demand diagrams from Chapter 2.
What changes: When the market price rises, the firm produces more (moves up its MC curve). When costs increase (MC shifts up), the firm produces less at any given price. If the price falls below the minimum of average variable cost, the firm shuts down entirely, because revenue would not cover even the variable cost of production.
In Full Mode, Eqs. 5.12-5.13 derive the profit-maximizing condition from the first-order condition.
The profit-maximizing rule: produce where price equals marginal cost. The firm should keep producing as long as the revenue from one more unit ($P$) exceeds the cost ($MC$). The firm's supply curve is the portion of its MC curve above $AVC_{min}$.
Why $P = MC$ is the supply curve — the deep connection. In Chapter 2, we drew the supply curve as upward-sloping. Now we see where it comes from: it is the firm's marginal cost curve. The supply curve slopes upward because marginal cost is increasing — not because we assumed it, but because it follows from diminishing marginal returns.
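$TC = 50 + 2Q + 0.5Q^2$, so $MC = 2 + Q$. At $P = 12$: $P = MC$ gives $12 = 2 + Q$, so $Q^* = 10$.
$\Pi = 12(10) - [50 + 20 + 50] = 0$. Zero economic profit: the price equals minimum average cost ($AC = 50/Q + 2 + 0.5Q$ is minimized at $Q = 10$, where $AC = 12$), which characterizes the long-run competitive equilibrium.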
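A quick numerical check confirms that the break-even output is also the output that minimizes average cost. The grid search below is a deliberately simple sketch; the grid itself is an arbitrary assumption.

```python
def total_cost(q):
    return 50 + 2 * q + 0.5 * q**2

def average_cost(q):
    return total_cost(q) / q

price = 12.0
q_star = price - 2                      # from P = MC = 2 + Q
profit = price * q_star - total_cost(q_star)

# Scan a grid of outputs to locate the minimum of average cost
grid = [q / 10 for q in range(1, 301)]  # 0.1, 0.2, ..., 30.0
q_min_ac = min(grid, key=average_cost)

print(q_star, profit)                   # 10.0 0.0
print(q_min_ac, average_cost(q_min_ac)) # 10.0 12.0
```

Because $P = MC = AC$ at $Q = 10$, the firm has no incentive to change output, and zero profit gives no outside firm an incentive to enter.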
What this says: At a price of \$12, the firm produces 10 units and exactly breaks even, earning zero economic profit. This is what long-run competitive equilibrium looks like: entry and exit drive the price to the point where firms earn just enough to cover all costs, including the opportunity cost of capital.
Why it matters: Zero economic profit does not mean the firm is failing — it means the firm earns a normal return on its investment. Positive economic profit attracts entry, pushing prices down. Negative economic profit triggers exit, pushing prices up. The market converges to zero economic profit.
What changes: If the price rose above \$12, the firm would produce more and earn positive profit, attracting new entrants. If the price fell below the break-even point, the firm would eventually exit.
In Full Mode, Example 6.5 solves the profit maximization numerically.
A competitive firm has production function $Y = 10L^{0.5}$, faces wage $w = 20$ and output price $P = 8$.
Step 1: Find the profit function. Revenue: $R = PY = 8 \times 10L^{0.5} = 80L^{0.5}$. Cost: $C = wL = 20L$. Profit: $\Pi = 80L^{0.5} - 20L$.
Step 2: FOC. $d\Pi/dL = 40L^{-0.5} - 20 = 0 \implies L^{-0.5} = 0.5 \implies L^* = 4$.
Step 3: Compute output and profit. $Y^* = 10(4)^{0.5} = 20$. Revenue $= 8 \times 20 = 160$. Cost $= 20 \times 4 = 80$. Profit $= 160 - 80 = 80$.
Verify: $P \times MP_L = w$ at the optimum: $8 \times 10 \times 0.5 \times 4^{-0.5} = 8 \times 2.5 = 20 = w$. ✓
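The three steps translate directly into code. This is a minimal sketch using the example's numbers; the closed form for `L_star` comes from solving $P \times MP_L = w$ with $MP_L = 5L^{-0.5}$.

```python
P, w = 8.0, 20.0

def output(L):
    return 10 * L**0.5

# FOC: P * MP_L = w with MP_L = 5 * L**-0.5, so L = (5 * P / w)**2
L_star = (5 * P / w) ** 2
Y_star = output(L_star)
profit = P * Y_star - w * L_star

print(L_star, Y_star, profit)   # 4.0 20.0 80.0
```

Re-running the sketch with `P = 10.0` gives `L_star = 6.25`, so a higher output price raises labor demand.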
What this says: The firm hires workers until the revenue generated by the last worker exactly equals the wage. Hiring one more worker beyond that point would cost more than the revenue they generate.
Why it matters: This is "P = MC" expressed in terms of the labor market: hire until the value of the marginal product equals the wage. It explains labor demand — firms hire more workers when the output price rises or when workers become more productive.
What changes: If the output price rose from \$8 to \$10, the firm would hire more workers (labor becomes more valuable). If wages rose, the firm would hire fewer workers. Diminishing returns mean each additional worker adds less revenue than the last.
In Full Mode, Example 6.6 derives the optimal labor choice from the first-order condition of the profit function.
Cost structure: $FC = \$20$/day (stand rental). Materials: $\$1.50$/cup. Maya's labor: 10 cups/hour at opportunity cost $\$15$/hr, so $\$1.50$/cup.
$TC = 20 + 3Q$, $MC = 3$, $AVC = 3$, $AC = 20/Q + 3$.
From Chapter 2: $P^* = \$2.75$. But $MC = \$3.00 > P^*$. Maya should not operate. Every cup loses $\$0.25$.
However, if we exclude her opportunity cost (accounting profit only), $AVC_{materials} = \$1.50$, and $P = 2.75 > 1.50$. She earns $\$16.25$/day in accounting profit but $-\$13.75$/day in economic profit. The economist says: Maya, your time is worth $\$120$/day at the bookstore.
| Label | Equation | Description |
|---|---|---|
| Eq. 5.1 | $MRS = MU_1/MU_2$ | Marginal rate of substitution |
| Eq. 5.2 | $\max U(x_1,x_2)$ s.t. $p_1 x_1 + p_2 x_2 = m$ | Consumer's problem |
| Eq. 5.3 | $\mathcal{L} = U + \lambda(m - p_1 x_1 - p_2 x_2)$ | Lagrangian |
| Eq. 5.4 | FOCs: $MU_i = \lambda p_i$; budget binds | First-order conditions |
| Eq. 5.5 | $MRS = p_1/p_2$ | Tangency condition |
| Eq. 5.6 | $x_i^* = a_i m / p_i$ | Cobb-Douglas Marshallian demand |
| Eq. 5.7 | $\partial x_1/\partial p_1 = \partial x_1^h/\partial p_1 - x_1 \partial x_1/\partial m$ | Slutsky equation |
| Eq. 5.8 | $Y = AK^\alpha L^{1-\alpha}$ | Cobb-Douglas production function |
| Eq. 5.9 | $MRTS = MP_L/MP_K$ | Marginal rate of technical substitution |
| Eq. 5.10 | $\min wL + rK$ s.t. $f(K,L) = \bar{Y}$ | Cost minimization problem |
| Eq. 5.11 | $MRTS = w/r$ | Cost-minimizing input ratio |
| Eq. 5.12 | $\max \Pi = PQ - TC(Q)$ | Profit maximization |
| Eq. 5.13 | $P = MC$ | Profit-maximizing output rule |