Chapter 5: Consumer and Producer Theory

Introduction

Part I treated demand and supply curves as given. We drew them, shifted them, and measured the surplus they generated. But where do these curves come from? This chapter answers that question by deriving demand from the optimization problem of the consumer and supply from the optimization problem of the firm.

The shift in method is significant. Part I used algebra and geometry. This chapter introduces constrained optimization — maximizing an objective function subject to a constraint — using calculus and Lagrangian methods. The payoff is that demand and supply curves stop being assumptions and become consequences of deeper primitives: preferences, technology, and prices.

The chapter is long because it covers two parallel theories — consumer theory and producer theory — that mirror each other in structure. The consumer maximizes utility subject to a budget constraint; the firm minimizes cost subject to an output target (or maximizes profit subject to technology). Both lead to tangency conditions, and both generate the curves we took as given in Part I.

By the end of this chapter, you will be able to:
  1. Set up and solve a consumer's utility maximization problem using the Lagrangian
  2. Derive Marshallian demand functions from utility maximization
  3. Decompose price changes into income and substitution effects (Slutsky equation)
  4. Set up and solve a firm's cost minimization and profit maximization problems
  5. Derive short-run and long-run cost curves from a production function
  6. Classify returns to scale

Prerequisites: Chapters 2 and 3. Mathematical prerequisites: multivariable calculus, constrained optimization (see Appendix A for review).

5.1 Preferences and Utility

The consumer chooses among bundles of goods — combinations like "3 apples and 2 bananas" or "5 hours of leisure and \$100 of consumption." To model this choice, we need a way to represent the consumer's preferences — their ranking of different bundles.

Preferences. A binary relation $\succsim$ on the set of bundles. We write $x \succsim y$ to mean "the consumer weakly prefers bundle $x$ to bundle $y$." Strict preference ($x \succ y$) means $x$ is strictly better. Indifference ($x \sim y$) means both are equally good.

For preferences to be well-behaved enough to model mathematically, we require three axioms:

Completeness. An axiom of rational preferences requiring that for any two bundles $x$ and $y$, the consumer can rank them: either $x \succsim y$, or $y \succsim x$, or both (indifference). The consumer is never "unable to decide."
Transitivity. An axiom of rational preferences requiring that if $x \succsim y$ and $y \succsim z$, then $x \succsim z$. Preferences contain no cycles — logical consistency is maintained.
Continuity. An axiom requiring that small changes in bundles produce small changes in preference ranking. There are no "jumps" — if bundle $x$ is preferred to $y$, bundles sufficiently close to $x$ are also preferred to $y$.
Utility function. A real-valued function $U(x_1, x_2)$ that assigns a number to each bundle such that higher numbers correspond to more preferred bundles. It exists when preferences satisfy completeness, transitivity, and continuity.
Ordinal utility. A utility representation in which only the ranking of bundles matters, not the magnitude of the utility numbers. Any monotonic transformation $V = g(U)$ (where $g$ is strictly increasing) represents the same preferences.

Under these conditions, a fundamental theorem guarantees the existence of a utility function $U(x_1, x_2)$ — a real-valued function that assigns a number to each bundle such that:

$$x \succsim y \iff U(x) \geq U(y)$$

Higher utility means more preferred. But the numbers themselves carry no meaning beyond ranking. Any monotonic transformation $V = g(U)$ (where $g$ is strictly increasing) represents the same preferences. This is what we mean by ordinal utility: only the ordering matters.
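To make the ordinal-utility point concrete, the sketch below ranks a handful of bundles under one utility function and under a strictly increasing transformation of it. The Cobb-Douglas form, the transformation $g(U) = \ln U + 5$, and the bundle list are illustrative assumptions, not taken from the text.

```python
import math

def utility(x1, x2):
    """Cobb-Douglas utility with equal exponents (an illustrative choice)."""
    return math.sqrt(x1 * x2)

def transformed(x1, x2):
    """g(U) = ln(U) + 5, a strictly increasing transformation of utility()."""
    return math.log(utility(x1, x2)) + 5

# Hypothetical bundles, chosen so that no two give the same utility
bundles = [(1, 8), (4, 4), (2, 5), (6, 1), (3, 3)]

def ranking(u):
    """Sort the bundles from least to most preferred under utility function u."""
    return sorted(bundles, key=lambda b: u(*b))

same_order = ranking(utility) == ranking(transformed)
print("Ranking under U:   ", ranking(utility))
print("Ranking under g(U):", ranking(transformed))
print("Same ranking:", same_order)
```

The utility numbers differ between the two functions, but the ordering of bundles does not, which is all an ordinal representation is meant to capture.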

Indifference Curves

Indifference curve. The set of all bundles yielding the same utility level: $\{(x_1, x_2) : U(x_1, x_2) = \bar{u}\}$.

Properties of indifference curves (given well-behaved preferences): (1) Downward-sloping: more of one good requires giving up some of the other. (2) Cannot cross: would violate transitivity. (3) Higher curves = higher utility. (4) Convex to the origin (if preferences are convex): mixtures are preferred to extremes.

Marginal Rate of Substitution

Marginal rate of substitution (MRS). The rate at which the consumer is willing to trade good 2 for good 1 while maintaining the same utility — the (negative of the) slope of the indifference curve.

Along an indifference curve, $dU = 0$:

$$MRS_{12} = -\frac{dx_2}{dx_1}\bigg|_{U = \bar{u}} = \frac{MU_1}{MU_2}$$ (Eq. 5.1)
Intuition

What this says: The MRS tells you your personal exchange rate between two goods. If your MRS is 3, you would give up 3 units of good 2 for 1 more unit of good 1 and feel equally happy. It equals the ratio of how much extra happiness each good gives you.

Why it matters: This is how economists measure "how much you want something" without using money. It captures trade-offs purely in terms of your own preferences, and it is the slope of the indifference curve at any point.

What changes: As you consume more of good 1 and less of good 2, your willingness to trade shrinks — each additional unit of good 1 is less valuable when you already have a lot. This "diminishing MRS" gives indifference curves their bowed-in shape.

In Full Mode, Eq. 5.1 derives this formally from the total differential of the utility function.

The MRS is the ratio of marginal utilities. Diminishing MRS: for convex preferences, the MRS decreases as the consumer moves down the indifference curve (more $x_1$, less $x_2$). Intuitively: the more lemonade you already have, the less you're willing to give up cookies for yet another cup.
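A small numerical check of Eq. 5.1 and of diminishing MRS, assuming Cobb-Douglas utility with equal exponents (an illustrative choice). Marginal utilities are approximated by central finite differences and compared with the closed form $(a/b)(x_2/x_1)$ at three points on the same indifference curve.

```python
def utility(x1, x2, a=0.5, b=0.5):
    """Cobb-Douglas utility U = x1^a * x2^b."""
    return x1**a * x2**b

def mrs_numeric(x1, x2, h=1e-6):
    """MRS = MU1 / MU2, with marginal utilities from central differences."""
    mu1 = (utility(x1 + h, x2) - utility(x1 - h, x2)) / (2 * h)
    mu2 = (utility(x1, x2 + h) - utility(x1, x2 - h)) / (2 * h)
    return mu1 / mu2

def mrs_formula(x1, x2, a=0.5, b=0.5):
    """Closed-form MRS for Cobb-Douglas: (a/b) * (x2/x1)."""
    return (a / b) * (x2 / x1)

# Three points on the indifference curve U = 4 (x1 * x2 = 16),
# moving toward more x1 and less x2
points = [(2.0, 8.0), (4.0, 4.0), (8.0, 2.0)]
mrs_values = [mrs_numeric(x1, x2) for x1, x2 in points]

for (x1, x2), value in zip(points, mrs_values):
    print(f"x1={x1}, x2={x2}: numeric MRS={value:.4f}, "
          f"formula={mrs_formula(x1, x2):.4f}")
```

The MRS falls as the consumer moves down the curve, which is the diminishing-MRS property described above.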

Common Utility Functions

| Name | $U(x_1, x_2)$ | MRS | Key feature |
|---|---|---|---|
| Cobb-Douglas | $x_1^a x_2^b$ | $(a/b)(x_2/x_1)$ | Constant budget shares |
| Perfect substitutes | $ax_1 + bx_2$ | $a/b$ (constant) | May buy only one good |
| Perfect complements | $\min(ax_1, bx_2)$ | Undefined at kink | Fixed consumption ratio |
| Quasilinear | $v(x_1) + x_2$ | $v'(x_1)$ | No income effect on $x_1$ |
| CES | $(x_1^\rho + x_2^\rho)^{1/\rho}$ | $(x_2/x_1)^{1-\rho}$ | Nests Cobb-Douglas, perfect substitutes, and perfect complements as special or limiting cases |

5.2 The Consumer's Problem

Budget constraint. The set of affordable bundles: $p_1 x_1 + p_2 x_2 \leq m$, where $p_i$ are prices and $m$ is income. The budget line has slope $-p_1/p_2$ and intercepts $m/p_1$ on the $x_1$-axis and $m/p_2$ on the $x_2$-axis.

The slope $-p_1/p_2$ is the market rate of exchange: to buy one more unit of good 1 (costing $p_1$), the consumer must give up $p_1/p_2$ units of good 2.
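The budget-line geometry can be sketched in a few lines. The function names and the numbers ($p_1 = 4$, $p_2 = 2$, $m = 120$) are hypothetical values chosen for illustration.

```python
def budget_line(p1, p2, m):
    """Intercepts and slope of the budget line p1*x1 + p2*x2 = m."""
    return {
        "x1_intercept": m / p1,   # all income spent on good 1
        "x2_intercept": m / p2,   # all income spent on good 2
        "slope": -p1 / p2,        # market rate of exchange
    }

def affordable(x1, x2, p1, p2, m):
    """True if the bundle lies in the budget set p1*x1 + p2*x2 <= m."""
    return p1 * x1 + p2 * x2 <= m

line = budget_line(p1=4.0, p2=2.0, m=120.0)
print(line)
print("Bundle (15, 30) affordable:", affordable(15, 30, 4.0, 2.0, 120.0))
print("Bundle (20, 30) affordable:", affordable(20, 30, 4.0, 2.0, 120.0))
```

Doubling income doubles both intercepts and leaves the slope unchanged, while changing only $p_1$ moves the $x_1$-intercept and leaves the $x_2$-intercept where it was.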

Interactive: Budget Constraint Explorer

Drag the sliders to change prices and income. Watch the budget line pivot and shift in real time.

Figure 5.0. The budget constraint shows all affordable bundles. Changing a price pivots the line around the other intercept; changing income shifts it parallel. The slope $-p_1/p_2$ is the market rate of exchange.

The Consumer's Problem

Utility maximization. The consumer's fundamental problem: choose the bundle of goods that yields the highest utility subject to the budget constraint. Formally: $\max U(x_1, x_2)$ subject to $p_1 x_1 + p_2 x_2 \leq m$.
$$\max_{x_1, x_2} \; U(x_1, x_2) \quad \text{subject to} \quad p_1 x_1 + p_2 x_2 = m$$ (Eq. 5.2)

The Lagrangian Method

Lagrangian. A mathematical technique for solving constrained optimization problems. The Lagrangian $\mathcal{L} = U(x_1, x_2) + \lambda(m - p_1 x_1 - p_2 x_2)$ converts a constrained problem into an unconstrained one by introducing a multiplier $\lambda$ that prices the constraint.
$$\mathcal{L} = U(x_1, x_2) + \lambda(m - p_1 x_1 - p_2 x_2)$$ (Eq. 5.3)
Intuition

What this says: The Lagrangian packages a constrained optimization problem (maximize utility subject to a budget constraint) into a single expression. Instead of juggling two separate conditions, the mathematician combines them into one function and optimizes freely.

Why it matters: Every consumer demand curve and every cost curve in microeconomics comes from solving a Lagrangian — it is the engine behind the entire chapter. The shadow price λ tells you exactly how much one more dollar of income would increase your utility.

What changes: When the budget constraint tightens (income falls), the Lagrange multiplier (shadow price of the constraint) rises, meaning each additional dollar of income is worth more. When prices change, the optimal bundle shifts along the budget line.

In Full Mode, the Lagrangian expression derives this formally.

The Lagrange multiplier $\lambda$ is the marginal utility of income — the increase in maximum utility from an additional dollar of budget.

First-order conditions:

$$MU_1 = \lambda p_1, \quad MU_2 = \lambda p_2, \quad p_1 x_1 + p_2 x_2 = m$$ (Eq. 5.4)
Intuition

What this says: The consumer picks the best affordable bundle. The Lagrangian is the calculus machinery for solving this, but the result is simple: spend your budget so that the last dollar spent on each good gives you the same boost in happiness. If coffee gives you more happiness-per-dollar than tea, buy more coffee until the extra enjoyment per dollar is equalized.

Why it matters: This "equal bang for the buck" principle is the foundation of all demand theory. It explains why people diversify their spending rather than buying only one good, and it generates the demand curves from Chapter 2.

What changes: When prices change, the "bang for the buck" shifts. If good 1 gets cheaper, its happiness-per-dollar rises, so you buy more of it until the marginal enjoyment drops back to equality. When income rises, you can afford more of both goods, but the ratio stays the same for Cobb-Douglas preferences.

In Full Mode, Eqs. 5.2-5.4 derive the first-order conditions from the Lagrangian.

The consumer allocates spending so that the marginal utility per dollar is the same for both goods: $MU_1/p_1 = MU_2/p_2 = \lambda$. In other words, at the optimum the last dollar spent on each good yields the same gain in utility. Dividing the first condition by the second eliminates $\lambda$ and gives the tangency condition:

$$MRS = \frac{MU_1}{MU_2} = \frac{p_1}{p_2}$$ (Eq. 5.5)
Tangency condition. At the consumer's optimum, the indifference curve is tangent to the budget line: $MRS = p_1/p_2$. The rate at which the consumer is willing to trade goods equals the rate at which the market allows her to trade.
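One way to see the tangency condition without calculus is to search along the budget line by brute force and then check that the best bundle satisfies $MRS = p_1/p_2$. The sketch below assumes $U = \sqrt{x_1 x_2}$ and hypothetical values $p_1 = 4$, $p_2 = 2$, $m = 120$. The grid search stands in for the Lagrangian solution and is not how the text derives the result.

```python
import math

def utility(x1, x2):
    """Cobb-Douglas utility with equal exponents (illustrative assumption)."""
    return math.sqrt(x1 * x2)

def best_bundle_on_budget(p1, p2, m, steps=100_000):
    """Grid search along the budget line for the utility-maximizing bundle."""
    best = None
    for i in range(1, steps):
        x1 = (m / p1) * i / steps      # candidate quantity of good 1
        x2 = (m - p1 * x1) / p2        # spend the rest on good 2
        u = utility(x1, x2)
        if best is None or u > best[2]:
            best = (x1, x2, u)
    return best

p1, p2, m = 4.0, 2.0, 120.0
x1, x2, u = best_bundle_on_budget(p1, p2, m)

mrs = x2 / x1                          # MU1/MU2 for U = sqrt(x1 * x2)
mu1 = 0.5 * math.sqrt(x2 / x1)
mu2 = 0.5 * math.sqrt(x1 / x2)

print(f"Best bundle: x1={x1:.3f}, x2={x2:.3f}")
print(f"MRS={mrs:.3f} vs price ratio={p1 / p2:.3f}")
print(f"MU1/p1={mu1 / p1:.5f}, MU2/p2={mu2 / p2:.5f}")
```

The common value of $MU_1/p_1$ and $MU_2/p_2$ at the optimum is the multiplier $\lambda$ from Eq. 5.4.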

Marshallian Demand

Marshallian (ordinary) demand. The optimal quantities as functions of prices and income: $x_i^*(p_1, p_2, m)$. These are the demand functions that underlie the demand curves of Chapter 2.
Example 6.1 — Cobb-Douglas Demand

$U = x_1^{1/2} x_2^{1/2}$. Tangency: $x_2/x_1 = p_1/p_2$, so $x_2 = (p_1/p_2)x_1$.

Substituting into the budget constraint: $p_1 x_1 + p_2 \cdot (p_1/p_2) x_1 = 2p_1 x_1 = m$.

Marshallian demand: $x_1^* = m/(2p_1)$, $x_2^* = m/(2p_2)$.

The consumer spends exactly half her income on each good — the constant budget share property of Cobb-Douglas preferences.

Intuition

What this says: With Cobb-Douglas preferences, the consumer always spends a fixed fraction of income on each good — regardless of prices. If the utility exponents are equal, she splits her budget 50/50. The demand for each good is simply income divided by twice its price.

Why it matters: This "constant budget share" result is the signature of Cobb-Douglas utility. It makes these preferences the workhorse model in economics: demand is easy to compute, and the income elasticity is always 1 (spending on each good rises proportionally with income).

What changes: When price doubles, quantity demanded halves (unit elastic demand). When income doubles, quantity demanded doubles. The budget share stays fixed no matter what — a strong and testable prediction.

In Full Mode, Example 6.1 derives the Marshallian demand step by step from the tangency condition.
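The Cobb-Douglas demands can be wrapped in a small function. The version below generalizes the example slightly to $U = x_1^a x_2^b$, for which the budget shares are $a/(a+b)$ and $b/(a+b)$. The function name and the numerical inputs are assumptions made for illustration.

```python
def cobb_douglas_demand(a, b, p1, p2, m):
    """Marshallian demands for U = x1^a * x2^b (interior solution)."""
    share1 = a / (a + b)     # fraction of income spent on good 1
    share2 = b / (a + b)     # fraction of income spent on good 2
    return share1 * m / p1, share2 * m / p2

# Equal exponents, as in the example: half of income goes to each good
x1, x2 = cobb_douglas_demand(a=0.5, b=0.5, p1=4.0, p2=2.0, m=120.0)
print(f"x1* = {x1}, x2* = {x2}")
print(f"Spending on good 1: {4.0 * x1}, on good 2: {2.0 * x2}")

# Doubling p1 halves x1 and leaves x2 unchanged
x1_new, x2_new = cobb_douglas_demand(a=0.5, b=0.5, p1=8.0, p2=2.0, m=120.0)
print(f"After p1 doubles: x1* = {x1_new}, x2* = {x2_new}")
```

Because demand for good 2 does not depend on $p_1$, the cross-price effect is zero for these preferences.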

Interactive: Utility Maximization and Demand Derivation

This visualization shows the deep connection: as $p_1$ changes, the optimal bundle traces out the demand curve for good 1. The demand curve IS the set of optimal points at different prices.

Figure 5.1a. Budget line and indifference curves. The optimal bundle is at the tangency point.

Figure 5.1b. The demand curve for good 1, traced out by varying $p_1$.

Example 6.2 — Quasilinear Utility

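$U = \ln(x_1) + x_2$. Tangency: $MRS = 1/x_1 = p_1/p_2$, so $x_1^* = p_2/p_1$.

Budget: $p_1 x_1^* + p_2 x_2 = m$ with $p_1 x_1^* = p_2$, so $x_2^* = m/p_2 - 1$ (an interior solution, which requires $m > p_2$).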

Demand for $x_1$ depends only on the price ratio, not on income — the hallmark of quasilinear utility. There are no income effects on good 1.

Intuition

What this says: With quasilinear preferences, the consumer has a "satiation point" for good 1 that depends only on relative prices. Any extra income goes entirely to good 2. This means income changes have zero effect on the demand for good 1.

Why it matters: Quasilinear utility isolates the substitution effect perfectly — since there is no income effect on good 1, the Slutsky decomposition simplifies dramatically. Economists use this as a benchmark to study pure substitution behavior.

What changes: When the price of good 1 rises, the consumer buys less of it (pure substitution). When income rises, all extra spending goes to good 2 — the Engel curve for good 1 is perfectly vertical.

In Full Mode, Example 6.2 derives the demands from the tangency condition.
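A minimal sketch of the quasilinear demands for $U = \ln(x_1) + x_2$. The corner case ($m \leq p_2$, where the interior formula would give a negative $x_2$) is not discussed in the example and is added here as an assumption about how the consumer behaves when income is too low for an interior solution.

```python
def quasilinear_demand(p1, p2, m):
    """Demands for U = ln(x1) + x2.

    Interior solution: x1 = p2/p1, x2 = m/p2 - 1 (needs m > p2).
    Corner (assumed handling): if m <= p2, all income goes to good 1.
    """
    if m <= p2:
        return m / p1, 0.0
    return p2 / p1, m / p2 - 1

low_income = quasilinear_demand(p1=2.0, p2=4.0, m=20.0)
high_income = quasilinear_demand(p1=2.0, p2=4.0, m=40.0)
print("m = 20:", low_income)
print("m = 40:", high_income)
```

Doubling income leaves $x_1^*$ where it was and sends all the extra spending to good 2, which is the zero income effect described above.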

5.3 Income and Substitution Effects

When the price of a good changes, two things happen simultaneously:

Substitution effect. The change in quantity demanded due solely to the change in relative prices, holding utility constant. The substitution effect is always negative: a price increase always reduces the compensated quantity demanded.
Income effect. The change in quantity demanded due to the change in real purchasing power caused by the price change. For normal goods, a price increase reduces real income and further reduces demand. For inferior goods, the income effect works in the opposite direction.
  1. Substitution effect: The good becomes relatively cheaper (or more expensive). The consumer substitutes toward the cheaper good. This effect is always negative.
  2. Income effect: The price change alters real purchasing power. A price decrease is like an income increase. For normal goods, this reinforces the substitution effect. For inferior goods, it works in the opposite direction.

The Slutsky Equation

Slutsky equation. The fundamental decomposition of the total effect of a price change into substitution and income effects: $\partial x_1/\partial p_1 = \partial x_1^h/\partial p_1 - x_1 \cdot \partial x_1/\partial m$. It shows that the demand response to a price change depends on how easily the consumer substitutes and how much the good matters in the budget.
$$\frac{\partial x_1}{\partial p_1} = \underbrace{\frac{\partial x_1^h}{\partial p_1}}_{\text{substitution (−)}} - \underbrace{x_1 \cdot \frac{\partial x_1}{\partial m}}_{\text{income (sign varies)}}$$ (Eq. 5.7)
Intuition

What this says: When a price changes, two things happen simultaneously. First, the good becomes relatively more or less expensive compared to alternatives, so you substitute (the substitution effect — always pushes you away from the pricier good). Second, the price change makes you effectively richer or poorer, changing how much of everything you buy (the income effect). The Slutsky equation says: total response = substitution effect + income effect.

Why it matters: This decomposition explains why demand curves almost always slope downward (both effects reinforce for normal goods), and identifies the rare exception: Giffen goods, where the income effect is so strong it overwhelms substitution, making people buy more of something when its price rises.

What changes: When the good takes up a small share of the budget (like salt), the income effect is negligible and substitution dominates — the demand curve definitely slopes down. When the good takes up a large share of the budget AND is inferior (like a staple food for a very poor household), the income effect can be large enough to overwhelm substitution, potentially creating a Giffen good.

In Full Mode, Eq. 5.7 derives this decomposition formally.
Normal good (consumer theory). A good for which demand increases when income rises ($\partial x/\partial m > 0$). For normal goods, the income effect reinforces the substitution effect, so the law of demand always holds.
Inferior good. A good for which demand decreases when income rises ($\partial x/\partial m < 0$). For inferior goods, the income effect opposes the substitution effect, but the substitution effect usually dominates.
Giffen good. An extreme inferior good for which the income effect is so large that it dominates the substitution effect, causing demand to increase when the price rises. Giffen goods violate the law of demand and are exceedingly rare in practice.

| Good type | Substitution effect | Income effect | Total effect of price increase |
|---|---|---|---|
| Normal good | − (buy less) | − (poorer → buy less) | Unambiguously − |
| Inferior good | − (buy less) | + (poorer → buy more) | Usually − |
| Giffen good | − (buy less) | + (income effect dominates) | + (demand rises) |
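The decomposition can be computed explicitly for a simple case. The sketch below assumes $U = \sqrt{x_1 x_2}$, for which Marshallian demand is $x_1 = m/(2p_1)$ and Hicksian (compensated) demand is $x_1^h = \bar{u}\sqrt{p_2/p_1}$. The price change (from 4 to 1) and the values $p_2 = 2$, $m = 120$ are hypothetical. The second function checks Eq. 5.7 numerically with finite differences.

```python
import math

def marshallian_x1(p1, p2, m):
    """Ordinary demand for good 1 when U = sqrt(x1 * x2)."""
    return m / (2 * p1)

def hicksian_x1(p1, p2, u):
    """Compensated demand for good 1 when U = sqrt(x1 * x2)."""
    return u * math.sqrt(p2 / p1)

def hicks_decomposition(p1_old, p1_new, p2, m):
    """Split the change in x1 into substitution and income effects."""
    x1_a = marshallian_x1(p1_old, p2, m)        # A: original bundle
    x2_a = m / (2 * p2)
    u_old = math.sqrt(x1_a * x2_a)              # original utility level
    x1_b = hicksian_x1(p1_new, p2, u_old)       # B: compensated bundle
    x1_c = marshallian_x1(p1_new, p2, m)        # C: new bundle
    return {"substitution": x1_b - x1_a,
            "income": x1_c - x1_b,
            "total": x1_c - x1_a}

def slutsky_sides(p1, p2, m, h=1e-5):
    """Left and right sides of the Slutsky equation via central differences."""
    x1 = marshallian_x1(p1, p2, m)
    u = math.sqrt(x1 * m / (2 * p2))
    total = (marshallian_x1(p1 + h, p2, m) - marshallian_x1(p1 - h, p2, m)) / (2 * h)
    subst = (hicksian_x1(p1 + h, p2, u) - hicksian_x1(p1 - h, p2, u)) / (2 * h)
    dx_dm = (marshallian_x1(p1, p2, m + h) - marshallian_x1(p1, p2, m - h)) / (2 * h)
    return total, subst - x1 * dx_dm

effects = hicks_decomposition(p1_old=4.0, p1_new=1.0, p2=2.0, m=120.0)
lhs, rhs = slutsky_sides(4.0, 2.0, 120.0)
print(effects)
print(f"Slutsky: dx1/dp1 = {lhs:.4f}, substitution + income = {rhs:.4f}")
```

For this normal good both effects point the same way: the price cut raises $x_1$ through substitution and again through higher real income.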

Interactive: Income and Substitution Effects (Hicks Decomposition)

Slide $p_1$ downward to see the price decrease decomposed into a substitution effect (movement along the original indifference curve) and an income effect (movement to a higher indifference curve).

Figure 5.2. Hicks decomposition of a price decrease. A = original bundle, B = compensated bundle (substitution effect), C = new bundle (income effect). The substitution effect moves along the original IC; the income effect shifts to a higher IC.

Engel Curves

Engel curve. The relationship between income and the quantity demanded of a good, holding prices constant. For normal goods, the Engel curve slopes upward. For inferior goods, it eventually slopes downward.

For Cobb-Douglas utility $U = x_1^a x_2^b$ with the exponents normalized so that $a + b = 1$, the Engel curve is a straight line through the origin: $x_1 = am/p_1$, linear in $m$. The budget share of good 1 is always $a$, regardless of income.
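As a quick illustration, the Cobb-Douglas Engel curve can be tabulated at a few income levels. The sketch assumes the exponents are normalized to sum to one so that $a$ is the budget share of good 1. The share $a = 0.5$, the price $p_1 = 4$, and the income grid are hypothetical.

```python
def engel_curve(a, p1, incomes):
    """(income, x1) pairs for Cobb-Douglas demand x1 = a * m / p1."""
    return [(m, a * m / p1) for m in incomes]

p1 = 4.0
points = engel_curve(a=0.5, p1=p1, incomes=[20, 60, 120, 200])

# Budget share of good 1 at each income level
shares = [p1 * x1 / m for m, x1 in points]

for (m, x1), share in zip(points, shares):
    print(f"m={m}: x1={x1}, budget share={share}")
```

Every point has the same ratio $x_1/m$, so the curve is a ray from the origin, and the budget share does not move with income.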

Interactive: Engel Curves

Adjust income with the slider to see how the optimal bundle shifts. The left panel shows budget lines and indifference curves; the right panel traces the Engel curve. Toggle between a normal good (Cobb-Douglas) and an inferior good (modified utility where demand bends back at high income).

Figure 5.4. Left: budget lines and indifference curves at different income levels. As income rises, the optimal bundle shifts outward along the income-consumption path. Right: the Engel curve plots quantity of good 1 (horizontal) against income (vertical). For a normal good (Cobb-Douglas), the Engel curve is linear. For an inferior good, it bends back at high income.

5.4 Production Functions

Production function. A mathematical relationship describing the maximum output obtainable from given inputs: $Y = f(K, L)$, where $K$ is capital and $L$ is labor.

Cobb-Douglas Production

$$Y = AK^\alpha L^{1-\alpha}$$ (Eq. 5.8)

where $A > 0$ is total factor productivity and $\alpha \in (0,1)$ is the output elasticity of capital.

Marginal products: $MP_K = \alpha Y/K$, $MP_L = (1-\alpha)Y/L$. Both are positive and diminishing.

Intuition

What this says: The marginal product of each input tells you how much extra output you get from one more unit of that input, holding the other fixed. For Cobb-Douglas, each input's marginal product is proportional to its average product (total output divided by the amount of that input).

Why it matters: Diminishing marginal products are the engine behind upward-sloping cost curves. Adding more workers to a fixed factory eventually yields less and less extra output per worker, which means each additional unit of output costs more to produce.

What changes: Doubling capital while holding labor fixed does NOT double output: output rises by only a factor of $2^\alpha < 2$, and the marginal product of capital falls. But doubling both inputs together (with CRS) doubles output and leaves marginal products unchanged.

In Full Mode, the marginal products are derived by differentiating the Cobb-Douglas production function.
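The marginal-product formulas can be checked numerically. The sketch below assumes $A = 1$ and $\alpha = 0.3$ (illustrative values) and compares $MP_K = \alpha Y/K$ with a finite-difference derivative, then checks the two scaling claims: more capital alone lowers $MP_K$, while scaling both inputs leaves it unchanged.

```python
def output(K, L, A=1.0, alpha=0.3):
    """Cobb-Douglas output Y = A * K^alpha * L^(1 - alpha)."""
    return A * K**alpha * L**(1 - alpha)

def mp_k(K, L, A=1.0, alpha=0.3):
    """Marginal product of capital: alpha * Y / K."""
    return alpha * output(K, L, A, alpha) / K

def mp_l(K, L, A=1.0, alpha=0.3):
    """Marginal product of labor: (1 - alpha) * Y / L."""
    return (1 - alpha) * output(K, L, A, alpha) / L

def numeric_mp_k(K, L, h=1e-6):
    """Central-difference approximation to dY/dK."""
    return (output(K + h, L) - output(K - h, L)) / (2 * h)

K, L = 10.0, 20.0
print(f"MP_K formula={mp_k(K, L):.6f}, numeric={numeric_mp_k(K, L):.6f}")
print(f"MP_K after doubling K only: {mp_k(2 * K, L):.6f}")
print(f"MP_K after doubling K and L: {mp_k(2 * K, 2 * L):.6f}")
print(f"Output ratio after doubling both: {output(2 * K, 2 * L) / output(K, L):.6f}")
```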

Isoquants and MRTS

Isoquant. The set of input combinations producing the same output: $\{(K, L) : f(K,L) = \bar{Y}\}$. Isoquants are the production analog of indifference curves.
Marginal rate of technical substitution (MRTS). The rate at which a firm can substitute one input for another while keeping output constant — the (negative of the) slope of the isoquant. $MRTS_{LK} = MP_L/MP_K$.
$$MRTS_{LK} = \frac{MP_L}{MP_K} = \frac{(1-\alpha)K}{\alpha L}$$ (Eq. 5.9)
Intuition

What this says: The MRTS tells you how many units of capital you can replace with one more worker while keeping output the same. It is the production analog of the consumer's MRS. When you already have lots of capital relative to labor, one extra worker is very productive (high MRTS); when you have lots of workers already, each additional one adds less.

Why it matters: This ratio determines the shape of the isoquant (the production equivalent of an indifference curve) and drives the firm's input choice. The firm will keep substituting the cheaper input for the more expensive one until the trade-off rate matches the relative input prices.

What changes: As the firm uses more labor relative to capital, each additional worker adds less output (diminishing marginal product), so the MRTS falls. This is why isoquants are bowed inward — the same logic as diminishing MRS for consumers.

In Full Mode, Eq. 5.9 derives the MRTS from the marginal products of the Cobb-Douglas production function.

Returns to Scale

Returns to scale. How output changes when all inputs are scaled by the same factor. Constant returns to scale (CRS): output scales proportionally. Increasing returns to scale (IRS): output more than scales proportionally (economies of scale). Decreasing returns to scale (DRS): output scales less than proportionally (diseconomies of scale).

| Type | Condition (for $t > 1$, with $Y = f(K,L)$) | Meaning |
|---|---|---|
| CRS | $f(tK,tL) = tY$ | Doubling inputs doubles output |
| IRS | $f(tK,tL) > tY$ | Doubling inputs more than doubles output |
| DRS | $f(tK,tL) < tY$ | Doubling inputs less than doubles output |

Example 6.3 — Returns to Scale

$Y = K^{0.3}L^{0.8}$: $f(tK,tL) = (tK)^{0.3}(tL)^{0.8} = t^{0.3+0.8}K^{0.3}L^{0.8} = t^{1.1}Y$. Since $1.1 > 1$: increasing returns to scale.

Intuition

What this says: To check returns to scale, ask: if I double all inputs, does output more than double, exactly double, or less than double? Add the exponents — if they sum to more than 1, doubling inputs more than doubles output (increasing returns).

Why it matters: Returns to scale determine market structure. With increasing returns, larger firms have lower unit costs, which tends toward natural monopoly. With constant returns, firm size is indeterminate — perfectly competitive markets are possible.

What changes: If the exponents sum to exactly 1 (like standard Cobb-Douglas with $\alpha + (1-\alpha) = 1$), we get constant returns. Larger exponent sums mean stronger scale economies; smaller sums mean scale diseconomies.

In Full Mode, Example 6.3 tests returns to scale by scaling all inputs by factor $t$.
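The scaling test can be sketched as a small classifier. It evaluates the production function at a single input bundle and a single scale factor, which is enough for homogeneous functions like Cobb-Douglas but is only a spot check for other technologies. The function name, test point, and tolerance are assumptions.

```python
def returns_to_scale(f, K=2.0, L=3.0, t=2.0, tol=1e-9):
    """Classify returns to scale by comparing f(tK, tL) with t * f(K, L).

    Checks one input bundle and one scale factor t > 1 only.
    """
    base = f(K, L)
    scaled = f(t * K, t * L)
    if abs(scaled - t * base) <= tol * max(1.0, abs(t * base)):
        return "CRS"
    return "IRS" if scaled > t * base else "DRS"

example = returns_to_scale(lambda K, L: K**0.3 * L**0.8)   # exponents sum to 1.1
standard = returns_to_scale(lambda K, L: K**0.5 * L**0.5)  # exponents sum to 1.0
small = returns_to_scale(lambda K, L: K**0.3 * L**0.5)     # exponents sum to 0.8

print("K^0.3 L^0.8:", example)
print("K^0.5 L^0.5:", standard)
print("K^0.3 L^0.5:", small)
```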

5.5 Cost Minimization

Cost minimization. The firm's problem of choosing the combination of inputs that produces a given output level at the lowest total cost: $\min wL + rK$ subject to $f(K,L) = \bar{Y}$.
$$\min_{K, L} \; wL + rK \quad \text{subject to} \quad f(K,L) = \bar{Y}$$ (Eq. 5.10)
Isocost line. All combinations of $K$ and $L$ that cost the same: $C = wL + rK$. Slope: $-w/r$.

The cost-minimizing condition (from the FOCs of the Lagrangian):

$$MRTS = \frac{MP_L}{MP_K} = \frac{w}{r}$$ (Eq. 5.11)
Intuition

What this says: To produce at the lowest cost, the firm adjusts its mix of workers and machines until the "bang for the buck" is equal across inputs. If hiring one more worker adds more output per dollar than renting one more machine, hire the worker. Keep adjusting until the last dollar spent on labor and the last dollar spent on capital contribute equally to output.

Why it matters: This is the producer's version of the consumer's "equal marginal utility per dollar" rule. It explains why firms change their input mix when wages or interest rates change, and it generates the cost curves that underpin supply.

What changes: When wages rise relative to the rental rate of capital, the firm substitutes toward capital (more machines, fewer workers). When interest rates rise, the firm substitutes toward labor. The firm always moves along the isoquant toward the relatively cheaper input.

In Full Mode, Eqs. 5.10-5.11 derive the cost-minimizing condition from the Lagrangian.

This perfectly parallels the consumer's $MRS = p_1/p_2$.

Interactive: Isoquant/Isocost Cost Minimization

The firm chooses inputs to minimize cost. Adjust factor prices and watch the isocost line pivot and the optimal $K/L$ ratio change.

Figure 5.3. Cost minimization: the firm chooses the input mix where the isoquant ($\bar{Y} = 100$) is tangent to the lowest isocost line. The tangency condition is $MRTS = w/r$. When labor gets more expensive, the firm substitutes toward capital.

Example 6.4 — Cost Minimization

$Y = K^{0.5}L^{0.5}$, $w = 10$, $r = 20$. Produce $\bar{Y} = 100$.

$MRTS = K/L = w/r = 0.5$, so $K = 0.5L$.

$(0.5L)^{0.5} \cdot L^{0.5} = 100 \Rightarrow L^* = 141.4$, $K^* = 70.7$.

$TC = 10(141.4) + 20(70.7) = \$2{,}828$. Since labor is cheaper, the firm uses more labor than capital.

Intuition

What this says: When labor costs half as much as capital per unit, the firm uses twice as many workers as machines. The cheaper input gets used more intensively — the firm tilts its input mix toward whatever is the better deal.

Why it matters: This is why manufacturing moves to low-wage countries (labor is cheap relative to capital there) and why automation increases when wages rise (capital becomes relatively cheaper). The cost-minimizing input ratio responds directly to relative input prices.

What changes: If the wage doubled from \$10 to \$20, the firm would use equal amounts of labor and capital (K/L = 1 instead of 0.5), and total cost would rise. The firm substitutes away from the input that got more expensive.

In Full Mode, Example 6.4 solves the cost minimization step by step.
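The steps of the cost-minimization example can be sketched as a function for the Cobb-Douglas technology $Y = AK^\alpha L^{1-\alpha}$. The tangency condition $MRTS = w/r$ pins down $K/L$, and the output target then pins down the levels. The function name and signature are illustrative assumptions.

```python
def cost_min_cobb_douglas(Y, w, r, alpha=0.5, A=1.0):
    """Cost-minimizing inputs for Y = A * K^alpha * L^(1 - alpha).

    Tangency: ((1 - alpha)/alpha) * (K/L) = w/r, so
    K/L = (alpha/(1 - alpha)) * (w/r).
    """
    k_per_l = (alpha / (1 - alpha)) * (w / r)   # optimal capital-labor ratio
    L = Y / (A * k_per_l**alpha)                # from the output constraint
    K = k_per_l * L
    return L, K, w * L + r * K

L_star, K_star, total_cost = cost_min_cobb_douglas(Y=100, w=10, r=20)
print(f"L* = {L_star:.1f}, K* = {K_star:.1f}, TC = {total_cost:.0f}")

# If the wage doubles to 20, labor and capital cost the same per unit
L_new, K_new, tc_new = cost_min_cobb_douglas(Y=100, w=20, r=20)
print(f"After wage doubles: L* = {L_new:.1f}, K* = {K_new:.1f}, TC = {tc_new:.0f}")
```

With $w = 10$ and $r = 20$, the two inputs each account for half of total cost, the production-side counterpart of constant budget shares.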

5.6 Cost Curves

Short Run vs. Long Run

In the short run, at least one input is fixed (typically capital: $K = \bar{K}$). In the long run, all inputs are variable.

Short-Run Cost Functions

Fixed cost (FC). The cost of inputs that cannot be adjusted in the short run (e.g., rent, equipment leases). Fixed costs do not change with the level of output.
Variable cost (VC). The cost of inputs that vary with the level of output (e.g., labor, raw materials). Variable cost rises as the firm produces more.
Marginal cost (MC). The additional cost of producing one more unit of output: $MC = dTC/dQ$. Marginal cost typically falls initially (increasing returns to the variable input), then rises (diminishing returns).
Average cost (AC). Total cost per unit of output: $AC = TC/Q = AFC + AVC$. The AC curve is U-shaped, reaching its minimum where $MC = AC$.
Average variable cost (AVC). Variable cost per unit of output: $AVC = VC/Q$. The AVC curve is also U-shaped. Its minimum is the shutdown point — the lowest price at which the firm is willing to produce in the short run.
Shutdown point. The output level (and corresponding price) at which price equals the minimum of average variable cost ($P = AVC_{min}$). Below this price, the firm loses more by producing than by shutting down entirely, because revenue fails to cover even variable costs.
Minimum efficient scale. The smallest output level at which long-run average cost reaches its minimum. Firms operating below this scale have higher unit costs and are at a competitive disadvantage.

| Cost concept | Symbol | Definition |
|---|---|---|
| Fixed cost | $FC$ | Cost of fixed inputs ($r\bar{K}$) |
| Variable cost | $VC$ | Cost of variable inputs ($wL(Q)$) |
| Total cost | $TC$ | $FC + VC$ |
| Marginal cost | $MC$ | $dTC/dQ$ |
| Average total cost | $AC$ | $TC/Q$ |
| Average variable cost | $AVC$ | $VC/Q$ |
| Average fixed cost | $AFC$ | $FC/Q$ (always declining) |

Key relationships:

  1. $AC = AVC + AFC$. Since $AFC$ always declines, $AC$ and $AVC$ converge at high output.
  2. MC intersects AC at AC's minimum. When $MC < AC$, producing one more unit pulls the average down. When $MC > AC$, it pulls the average up.
  3. The shutdown point is where $P = AVC_{min}$. Below this, the firm shuts down.
Intuition

What this says: A firm's costs break down simply. Fixed costs (rent, equipment) don't change with output. Variable costs (labor, materials) rise as you produce more. Marginal cost is the cost of making one more unit. Average cost is total cost spread across all units.

Why it matters: The shapes of these curves drive every supply decision. The U-shape of average cost comes from spreading fixed costs (pulls it down) battling diminishing returns (pushes it up). Marginal cost always crosses average cost at the bottom of the U — think of it like your GPA: a new grade above your average pulls it up, below pulls it down.

What changes: When fixed costs rise, the average cost curve shifts up but marginal cost is unchanged — the shutdown point stays the same but the break-even point rises. When variable costs rise (e.g., higher wages), both MC and AVC shift up, raising the shutdown price.

In Full Mode, the cost summary table shows the formal definitions and calculus notation.

Interactive: Cost Curves and Profit

The firm has $TC = 50 + 2Q + 0.05Q^2$. Adjust the market price to see the firm's profit-maximizing output and whether it earns profit or loss.

For example, at $P = 8$ the condition $P = MC = 2 + 0.1Q$ gives $Q^* = 60$, so $TR = 480$, $TC = 350$, and profit is 130.

Figure 5.4. Short-run cost curves. The firm produces where $P = MC$ (on the rising portion). Green shading = profit; red shading = loss. Below the shutdown point ($AVC_{min}$), the firm produces nothing.
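The cost-curve relationships can be verified for the cost function used in this figure, $TC = 50 + 2Q + 0.05Q^2$. The sketch below computes the curves, the price-taking firm's output rule, and the output at which average cost is minimized. The function names and the prices tried are illustrative assumptions.

```python
import math

FC = 50.0  # fixed cost: the constant term of TC

def total_cost(q):
    return FC + 2 * q + 0.05 * q**2

def marginal_cost(q):
    """dTC/dQ = 2 + 0.1 Q."""
    return 2 + 0.1 * q

def average_cost(q):
    return total_cost(q) / q

def average_variable_cost(q):
    return (total_cost(q) - FC) / q

def supply(price):
    """Output where P = MC, or zero at or below the shutdown price.

    AVC = 2 + 0.05 Q is increasing, so its minimum (approached as Q -> 0) is 2.
    """
    shutdown_price = 2.0
    if price <= shutdown_price:
        return 0.0
    return (price - 2) * 10   # solves price = 2 + 0.1 * Q

q_star = supply(8.0)
profit = 8.0 * q_star - total_cost(q_star)
q_min_ac = math.sqrt(FC / 0.05)   # from d(AC)/dQ = -FC/Q^2 + 0.05 = 0

print(f"At P = 8: Q* = {q_star}, profit = {profit}")
print(f"AC is minimized at Q = {q_min_ac:.2f}, where AC = {average_cost(q_min_ac):.3f} "
      f"and MC = {marginal_cost(q_min_ac):.3f}")
print(f"Output at P = 1.5: {supply(1.5)}")
```

Prices between the shutdown price and minimum AC (about 5.16 here) are the range in which the firm keeps producing in the short run even though it makes a loss.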

Long-Run Average Cost

In the long run, the firm can choose any level of capital. The long-run average cost (LRAC) curve is the envelope of all short-run AC curves — each corresponding to a different level of fixed capital.

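Why LRAC is typically U-shaped: at low output the firm enjoys economies of scale (increasing returns), so average cost falls as it expands; at high output diseconomies of scale (decreasing returns) set in, so average cost rises.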

The output level at the bottom of the LRAC is the minimum efficient scale (MES) — the smallest output at which LRAC is minimized.

Interactive: Short-Run vs. Long-Run Average Cost

Each short-run AC curve corresponds to a different capital level. Drag the slider to highlight a specific SRAC curve and see how it relates to the LRAC envelope.

Figure 5.5. The long-run AC curve (black) is the envelope of short-run AC curves. Each SRAC corresponds to a different factory size. The highlighted SRAC (bold) shows the current capital level. The firm can move along LRAC in the long run by adjusting capital.

5.7 Profit Maximization

Profit maximization. The firm's objective: choose output to maximize profit $\Pi = P \cdot Q - TC(Q)$. For a competitive firm (price-taker), the first-order condition yields $P = MC$ — produce where price equals marginal cost.
$$\max_Q \; \Pi = P \cdot Q - TC(Q)$$ (Eq. 5.12)

First-order condition:

$$P = MC(Q)$$ (Eq. 5.13)
Intuition

What this says: A competitive firm should keep producing as long as the price it receives for one more unit exceeds the cost of making that unit. Stop when they are equal. Producing beyond that point means each additional unit costs more to make than it earns.

Why it matters: This single rule — price equals marginal cost — is where supply curves come from. The firm's supply curve is literally its marginal cost curve. It connects the abstract calculus of profit maximization to the supply-and-demand diagrams from Chapter 2.

What changes: When the market price rises, the firm produces more (moves up its MC curve). When costs increase (MC shifts up), the firm produces less at any given price. If the price falls below the minimum of average variable cost, the firm shuts down entirely — producing would lose money on every unit.


The profit-maximizing rule: produce where price equals marginal cost. The firm should keep producing as long as the revenue from one more unit ($P$) exceeds the cost ($MC$). The firm's supply curve is the portion of its MC curve above $AVC_{min}$.

Why $P = MC$ is the supply curve — the deep connection. In Chapter 2, we drew the supply curve as upward-sloping. Now we see where it comes from: it is the firm's marginal cost curve. The supply curve slopes upward because marginal cost is increasing — not because we assumed it, but because it follows from diminishing marginal returns.
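
The link between marginal cost and supply can be sketched in code. The example assumes a quadratic cost function $TC = F + bQ + cQ^2$ with illustrative parameters $F = 50$, $b = 2$, $c = 0.5$ (an assumption made for illustration). Then $MC = b + 2cQ$, $AVC = b + cQ$ reaches its minimum $b$ at $Q = 0$, and inverting $P = MC$ gives the quantity supplied at each price.

```python
# Assumed quadratic cost TC = F + B*Q + C*Q**2; the parameter values are illustrative.
F, B, C = 50.0, 2.0, 0.5

def marginal_cost(q):
    return B + 2 * C * q

def supply(price):
    """Quantity supplied by a price-taking firm: invert P = MC above min AVC."""
    shutdown_price = B          # AVC = B + C*q is lowest at q = 0
    if price <= shutdown_price:
        return 0.0              # at or below the shutdown point the firm produces nothing
    return (price - B) / (2 * C)

# Trace out a few points on the supply curve.
for p in (1.0, 2.0, 6.0, 12.0):
    q = supply(p)
    print(p, q, marginal_cost(q) if q > 0 else None)
```

Whenever the firm produces, the marginal cost at the chosen quantity equals the price, so the supply schedule and the MC curve coincide above the shutdown price. Note that fixed cost `F` plays no role in the supply decision.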

Example 6.5 — Profit Maximization

$TC = 50 + 2Q + 0.5Q^2$, so $MC = 2 + Q$. At $P = 12$: $P = MC$ gives $12 = 2 + Q$, so $Q^* = 10$.

$\Pi = 12(10) - [50 + 20 + 50] = 0$. Zero economic profit — the long-run competitive equilibrium.
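
A quick numerical check of this example, as a sketch: it recomputes profit at $Q^* = 10$ and searches a grid of outputs for the minimum of average cost. The grid search and the helper names are illustrative choices, not part of the example itself.

```python
def total_cost(q):
    return 50 + 2 * q + 0.5 * q**2

def profit(price, q):
    return price * q - total_cost(q)

def average_cost(q):
    return total_cost(q) / q

price = 12
q_star = price - 2                      # P = MC with MC = 2 + Q

# Search outputs 0.1, 0.2, ..., 30.0 for the lowest average cost.
grid = [i / 10 for i in range(1, 301)]
q_min_ac = min(grid, key=average_cost)

print(q_star, profit(price, q_star))
print(q_min_ac, average_cost(q_min_ac))
```

The price of 12 coincides with the minimum of average cost, which is reached at the same output the firm chooses. That coincidence is what makes profit exactly zero here.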

Intuition

What this says: At a price of \$12, the firm produces 10 units and exactly breaks even, earning zero economic profit. This is what long-run competitive equilibrium looks like: entry and exit drive the price to the point where firms earn just enough to cover all costs, including the opportunity cost of capital.

Why it matters: Zero economic profit does not mean the firm is failing — it means the firm earns a normal return on its investment. Positive economic profit attracts entry, pushing prices down. Negative economic profit triggers exit, pushing prices up. The market converges to zero economic profit.

What changes: If the price rose above \$12, the firm would produce more and earn positive profit, attracting new entrants. If the price fell below the break-even point, the firm would eventually exit.

Example 6.6 — Profit Maximization from Production Function

A competitive firm has production function $Y = 10L^{0.5}$, faces wage $w = 20$ and output price $P = 8$.

Step 1: Find the profit function. Revenue: $R = PY = 8 \times 10L^{0.5} = 80L^{0.5}$. Cost: $C = wL = 20L$. Profit: $\Pi = 80L^{0.5} - 20L$.

Step 2: FOC. $d\Pi/dL = 40L^{-0.5} - 20 = 0 \implies L^{-0.5} = 0.5 \implies L^* = 4$.

Step 3: Compute output and profit. $Y^* = 10(4)^{0.5} = 20$. Revenue = $8 \times 20 = 160$. Cost = $20 \times 4 = 80$. Profit = $160 - 80 = 80$.

Verify: $P \times MP_L = w$ at the optimum: $8 \times 10 \times 0.5 \times 4^{-0.5} = 8 \times 2.5 = 20 = w$. ✓
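
The same numbers can be reproduced in a few lines. This sketch uses the closed-form solution of the first-order condition and compares it against a coarse grid search over labor; the grid and the function names are illustrative assumptions.

```python
P, W = 8.0, 20.0                        # output price and wage from the example

def output(labor):
    return 10 * labor**0.5

def profit(labor):
    return P * output(labor) - W * labor

# FOC: P * 5 * L**(-0.5) = W  =>  L = (5 * P / W)**2
l_star = (5 * P / W) ** 2

# Cross-check with a grid search over L = 0.01, 0.02, ..., 20.00.
grid = [i / 100 for i in range(1, 2001)]
l_grid = max(grid, key=profit)

print(l_star, output(l_star), profit(l_star))
print(l_grid)
```

The grid search lands on the same labor input as the first-order condition, which is a useful sanity check that the FOC identifies a maximum and not a minimum.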

Intuition

What this says: The firm hires workers until the revenue generated by the last worker exactly equals the wage. Hiring one more worker beyond that point would cost more than the revenue they generate.

Why it matters: This is "P = MC" expressed in terms of the labor market: hire until the value of the marginal product equals the wage. It explains labor demand — firms hire more workers when the output price rises or when workers become more productive.

What changes: If the output price rose from \$8 to \$10, the firm would hire more workers (labor becomes more valuable). If wages rose, the firm would hire fewer workers. Diminishing returns mean each additional worker adds less revenue than the last.


6.8 The Firm's Supply Curve

Thread Example: Maya's Enterprise

Maya's Lemonade Stand — The Full Cost Analysis

Cost structure: $FC = \$20$/day (stand rental). Materials: $\$1.50$/cup. Maya's labor: 10 cups/hour at opportunity cost $\$15$/hr, so $\$1.50$/cup.

$TC = 20 + 3Q$,   $MC = 3$,   $AVC = 3$,   $AC = 20/Q + 3$.

From Chapter 2: $P^* = \$2.75$. But $MC = \$3.00 > P^*$. Maya should not operate. Every cup loses $\$0.25$.

However, if we exclude her opportunity cost (accounting profit only), $AVC_{materials} = \$1.50$, and $P = 2.75 > 1.50$. She earns $\$16.25$/day in accounting profit but $-\$13.75$/day in economic profit. The economist says: Maya, your time is worth $\$120$/day at the bookstore.

Summary

Key Equations

| Label | Equation | Description |
|---|---|---|
| Eq. 5.1 | $MRS = MU_1/MU_2$ | Marginal rate of substitution |
| Eq. 5.2 | $\max U(x_1,x_2)$ s.t. $p_1 x_1 + p_2 x_2 = m$ | Consumer's problem |
| Eq. 5.3 | $\mathcal{L} = U + \lambda(m - p_1 x_1 - p_2 x_2)$ | Lagrangian |
| Eq. 5.4 | FOCs: $MU_i = \lambda p_i$; budget binds | First-order conditions |
| Eq. 5.5 | $MRS = p_1/p_2$ | Tangency condition |
| Eq. 5.6 | $x_i^* = a_i m / p_i$ | Cobb-Douglas Marshallian demand |
| Eq. 5.7 | $\partial x_1/\partial p_1 = \partial x_1^h/\partial p_1 - x_1 \partial x_1/\partial m$ | Slutsky equation |
| Eq. 5.8 | $Y = AK^\alpha L^{1-\alpha}$ | Cobb-Douglas production function |
| Eq. 5.9 | $MRTS = MP_L/MP_K$ | Marginal rate of technical substitution |
| Eq. 5.10 | $\min wL + rK$ s.t. $f(K,L) = \bar{Y}$ | Cost minimization problem |
| Eq. 5.11 | $MRTS = w/r$ | Cost-minimizing input ratio |
| Eq. 5.12 | $\max \Pi = PQ - TC(Q)$ | Profit maximization |
| Eq. 5.13 | $P = MC$ | Profit-maximizing output rule |

Exercises

Practice

  1. A consumer has utility $U = x_1^{1/3} x_2^{2/3}$, prices $p_1 = 4$, $p_2 = 2$, income $m = 120$. (a) Write the Lagrangian. (b) Derive the tangency condition. (c) Solve for the Marshallian demand for both goods. (d) Compute the optimal bundle and verify it satisfies the budget constraint.
  2. A consumer has quasilinear utility $U = 2\sqrt{x_1} + x_2$, $p_1 = 1$, $p_2 = 1$, $m = 10$. (a) Solve for optimal consumption. (b) What is the income elasticity of demand for $x_1$? (c) What happens to $x_1^*$ if income doubles?
  3. A firm has production function $Y = 4K^{0.5}L^{0.5}$, $w = 8$, $r = 2$. (a) Find the cost-minimizing input combination to produce $Y = 40$. (b) What is the total cost? (c) If $w$ doubled, how would the optimal $K/L$ ratio change?
  4. A competitive firm has $TC = 100 + 5Q + Q^2$. (a) Derive MC, AC, and AVC. (b) Find the shutdown point. (c) At $P = 25$, find profit-maximizing output and profit. (d) At $P = 5$, should the firm produce or shut down?
  5. Classify returns to scale: (a) $Y = 3K + 2L$, (b) $Y = K^{0.4}L^{0.4}$, (c) $Y = (KL)^{0.6}$, (d) $Y = \min(2K, 3L)$.

Apply

  1. For Cobb-Douglas utility $U = x_1^a x_2^{1-a}$, derive the Marshallian demands and show that the consumer always spends fraction $a$ on good 1. Then use $V = \ln U$ and show the same demands emerge. What does this confirm about ordinality?
  2. A price decrease for good 1 leads a consumer to buy less of good 1. (a) Is this irrational? (b) What type of good must this be? (c) What conditions are necessary? (d) Why are Giffen goods so rare?
  3. A firm can produce with Technology A ($TC_A = 100 + 2Q$) or Technology B ($TC_B = 10 + 5Q$). (a) For what output levels is each cheaper? (b) What does this imply about firm size and technology choice?
  4. Derive the short-run supply curve for a firm with $TC = 50 + Q^2/2$. Graph it, label the shutdown price, and shade profit at $P = 10$.
  5. Using $Y = K^{0.3}L^{0.7}$ with $w = 14$, $r = 6$: (a) Find the cost-minimizing $K/L$ ratio. (b) Derive $TC(Y)$. (c) What are the returns to scale?

Challenge

  1. Prove that for Cobb-Douglas utility $U = x_1^a x_2^{1-a}$, the indirect utility function is $V(p_1, p_2, m) = m \cdot (a/p_1)^a \cdot ((1-a)/p_2)^{1-a}$. Then verify Roy's identity: $x_1^* = -(\partial V/\partial p_1)/(\partial V/\partial m)$.
  2. Show that a profit-maximizing firm with Cobb-Douglas CRS production earns zero economic profit in long-run equilibrium. (Hint: Euler's theorem.) Why does IRS pose a problem for competitive markets?
  3. A consumer's demand for good 1 is $x_1 = m/p_1 - p_2$. (a) Is it homogeneous of degree zero? (b) Does it satisfy Slutsky symmetry? (c) Can it be generated by utility maximization?