Chapter 19: Behavioral and Experimental Economics

Introduction

Every model in this book has assumed rational agents — consumers who maximize expected utility, firms that minimize costs, traders with consistent time preferences and correct beliefs. These assumptions are powerful: they yield sharp predictions, clean welfare theorems, and elegant mathematics. But are they true?

This chapter confronts the evidence. Behavioral economics documents predictable, systematic deviations from the standard rational model. These are not random errors that wash out in aggregation — they are patterned biases that survive repetition, incentives, and even expertise.

We begin with the cracks in expected utility theory — the Allais and Ellsberg paradoxes — and build toward prospect theory, the leading descriptive alternative. We then examine intertemporal choice under present bias, social preferences that violate pure self-interest, bounded rationality and heuristics, experimental methodology, nudge theory, and behavioral finance. Throughout, the approach is formal: we write down utility functions, derive predictions, and test them against data.

By the end of this chapter, you will be able to:
  1. Identify the Allais and Ellsberg paradoxes as formal violations of expected utility axioms
  2. State prospect theory's value function and probability weighting function with their parametric forms
  3. Model present bias using the quasi-hyperbolic (beta-delta) framework and derive time inconsistency
  4. Compute equilibrium outcomes under Fehr-Schmidt inequality aversion preferences
  5. Distinguish satisficing from optimizing and characterize sparse maximization
  6. Evaluate experimental design choices — lab vs field, demand effects, replication concerns
  7. Apply nudge theory and choice architecture to policy design
  8. Explain why rational arbitrage fails to eliminate behavioral mispricing in financial markets

Prerequisites: Expected utility theory (Ch. 6), game theory (Ch. 7), consumer theory (Ch. 6/10), econometrics basics (Ch. 9), mechanism design familiarity (Ch. 11).

Named literature: Kahneman & Tversky (1979); Tversky & Kahneman (1992); Thaler (1980, 2015); Laibson (1997); Fehr & Schmidt (1999); Gabaix (2014); Shleifer & Vishny (1997); DeLong, Shleifer, Summers & Waldmann (1990).

19.1 Violations of Expected Utility

The Expected Utility Benchmark

Recall from Chapter 6 that under the axioms of completeness, transitivity, continuity, and the independence axiom, preferences over lotteries can be represented by expected utility:

Expected utility. A theory of decision under risk where an agent chooses the lottery $L$ that maximizes $EU(L) = \sum_{i=1}^{n} p_i \, u(x_i)$, where $p_i$ are objective probabilities, $x_i$ are outcomes, and $u(\cdot)$ is a Bernoulli utility function defined over final wealth. EU is the benchmark against which all behavioral deviations are measured.
$$EU(L) = \sum_{i=1}^{n} p_i \, u(x_i)$$ (Eq. 19.1)
Independence axiom. If lottery $A$ is preferred to $B$, then a mixture $pA + (1-p)C$ must be preferred to $pB + (1-p)C$ for any lottery $C$ and any probability $p \in (0,1)$. Mixing in a common component $C$ should not reverse your ranking of $A$ versus $B$. The Allais paradox demonstrates systematic violations of this axiom.

Independence is elegant and normatively appealing. It says your preference between two gambles should not be swayed by an irrelevant common component. But as Maurice Allais demonstrated in 1953, most human beings violate it consistently.

The Allais Paradox (1953)

Allais paradox. The empirical finding (Allais, 1953) that most people prefer a certain \$1 million over a risky gamble with higher expected value (the certainty effect), yet simultaneously prefer a riskier gamble when both options involve uncertainty. These joint preferences violate the independence axiom of expected utility theory.

Consider two pairs of lotteries:

Pair 1: Gamble 1A: \$1M with certainty. Gamble 1B: \$5M with prob 0.10, \$1M with prob 0.89, \$0 with prob 0.01.

Pair 2: Gamble 2A: \$1M with prob 0.11, \$0 with prob 0.89. Gamble 2B: \$5M with prob 0.10, \$0 with prob 0.90.

The modal pattern: most people choose 1A over 1B and 2B over 2A. This joint choice $\{1A, 2B\}$ violates the independence axiom.

Common consequence effect. A special case of the independence axiom violation in which preferences reverse when a common consequence shared by both options in a pair is altered. In the Allais paradox, the 0.89 probability component common to both gambles in each pair is the common consequence.

By independence, replacing the common consequence (the 0.89 chance of \$1M in Pair 1, of \$0 in Pair 2) should not change the ranking: if $1A \succ 1B$, then $2A \succ 2B$. The observed reversal (choosing 1A but 2B) reveals a certainty effect.

The Ellsberg Paradox (1961)

Ellsberg paradox. The empirical finding (Ellsberg, 1961) that people prefer gambles with known probabilities over gambles with unknown (ambiguous) probabilities, even when EU theory predicts indifference. This reveals ambiguity aversion.

Consider an urn with 30 red balls and 60 balls that are black or yellow in unknown proportions. Gamble A: win \$100 if red (prob 1/3, known). Gamble B: win \$100 if black (prob unknown). Most choose A.

But then: Gamble C: win \$100 if red or yellow. Gamble D: win \$100 if black or yellow. Most choose D. Under EU, $A \succ B$ requires $C \succ D$. The joint choice $\{A, D\}$ violates the Sure-Thing Principle.

Ambiguity aversion. The preference for known probabilities over unknown ones. An ambiguity-averse agent prefers a 50/50 gamble from a known urn over an equivalent gamble from an urn with unknown composition. This violates the Savage axioms underlying subjective expected utility.

These paradoxes reveal that the independence axiom fails descriptively. We need a theory that accommodates these violations.

Figure 19.3. Allais Paradox Detector. Select your preferred gamble in each pair, then check whether your choices violate the independence axiom.

Example 19.1 — Allais Paradox Calculation

Problem. Two lottery pairs. Assume CRRA utility $u(x) = x^{0.5}$ (with $x$ in millions). (a) Compute EU of each gamble. (b) Which does EU recommend? (c) Show {1A, 2B} violates independence.

Solution.

(a) $EU(1A) = 1.0 \times 1^{0.5} = 1.000$. $EU(1B) = 0.89(1) + 0.10(2.236) + 0.01(0) = 1.1136$. $EU(2A) = 0.11(1) = 0.11$. $EU(2B) = 0.10(2.236) = 0.2236$.

(b) EU recommends 1B (1.114 > 1.000) and 2B (0.224 > 0.110). EU-consistent pairs: {1A, 2A} or {1B, 2B}.

(c) $1A \succ 1B$ requires $0.11\, u(1) > 0.10\, u(5) + 0.01\, u(0)$. $2B \succ 2A$ requires $0.10\, u(5) + 0.01\, u(0) > 0.11\, u(1)$. These directly contradict: no $u(\cdot)$ satisfies both.
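A minimal script can verify these expected utilities (the function and variable names below are ours, not the text's):

```python
import math

def eu(lottery, u=math.sqrt):
    """Expected utility of a lottery given as (probability, outcome) pairs (Eq. 19.1)."""
    return sum(p * u(x) for p, x in lottery)

# Outcomes in millions; u(x) = x^0.5 as in Example 19.1.
g1a = [(1.00, 1)]
g1b = [(0.10, 5), (0.89, 1), (0.01, 0)]
g2a = [(0.11, 1), (0.89, 0)]
g2b = [(0.10, 5), (0.90, 0)]

for name, g in [("1A", g1a), ("1B", g1b), ("2A", g2a), ("2B", g2b)]:
    print(name, round(eu(g), 4))
# EU ranks 1B over 1A (1.1136 > 1.0) and 2B over 2A (0.2236 > 0.11),
# so the modal human pattern {1A, 2B} is EU-inconsistent.
```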


19.2 Prospect Theory

From EU to Prospect Theory

Kahneman and Tversky (1979) proposed prospect theory as a descriptive alternative, later refined as cumulative prospect theory (1992). It modifies EU in four ways: reference dependence, loss aversion, diminishing sensitivity, and probability weighting.

Prospect theory. A descriptive theory of decision under risk (Kahneman and Tversky, 1979) that replaces expected utility with four key modifications: reference dependence, loss aversion, diminishing sensitivity (concave for gains, convex for losses), and probability weighting.

The Value Function

The value function replaces $u(x)$ defined over final wealth with $v(x)$ defined over gains and losses relative to a reference point:

Value function (S-shaped, kinked). The prospect theory analog of the utility function, defined over deviations from a reference point rather than final wealth levels. It is concave for gains, convex for losses, and displays a kink at the reference point where the slope for losses exceeds the slope for gains by the loss aversion coefficient $\lambda$.
$$v(x) = \begin{cases} x^{\alpha} & \text{if } x \geq 0 \\ -\lambda(-x)^{\beta} & \text{if } x < 0 \end{cases}$$ (Eq. 19.2)

The parameters estimated by Tversky and Kahneman (1992) are $\alpha = \beta = 0.88$ and $\lambda = 2.25$.

Three properties: (1) Reference dependence — outcomes are coded as gains or losses relative to $r$. (2) Diminishing sensitivity — $\alpha, \beta < 1$ gives concavity for gains and convexity for losses. (3) Loss aversion — $\lambda > 1$ makes the value function steeper for losses.

Loss aversion. The empirical finding that losses loom larger than equivalent gains: $|v(-x)| > v(x)$ for $x > 0$. The loss aversion coefficient $\lambda \approx 2.25$ means losing \$100 feels about 2.25 times worse than gaining \$100 feels good.
Reference dependence. The principle that outcomes are evaluated as gains or losses relative to a reference point, not as final wealth states. The reference point is typically the status quo, but can be expectations, aspirations, or social comparisons.

Figure 19.1. Prospect theory value function. The S-shaped curve is concave for gains and convex for losses, with a steeper slope for losses (loss aversion). At $\alpha = \beta = \lambda = 1$ it collapses to linear (EU). Drag sliders to explore.

The Probability Weighting Function

Probability weighting function. The function $w(p)$ that transforms objective probabilities into subjective decision weights. It overweights small probabilities ($w(p) > p$ for small $p$), underweights moderate-to-large probabilities, and satisfies $w(0)=0$, $w(1)=1$.
$$w(p) = \dfrac{p^{\delta}}{(p^{\delta} + (1-p)^{\delta})^{1/\delta}}$$ (Eq. 19.3)

The Tversky-Kahneman (1992) parameter $\delta \approx 0.65$. When $\delta = 1$, $w(p) = p$ (EU). When $\delta < 1$, the function overweights small probabilities and underweights large ones. Crossover at $p \approx 0.37$.

Figure 19.2. Tversky-Kahneman (1992) probability weighting function. The inverse-S curve overweights small probabilities and underweights large ones. At $\delta = 1$ it collapses to the 45-degree line (EU). Drag the slider.

The Prospect Theory Valuation

$$V(L) = \sum_{i} w(p_i) \, v(x_i - r)$$ (Eq. 19.4)

Note: This is the original Prospect Theory formulation (Kahneman & Tversky, 1979), which applies decision weights to individual probabilities. Cumulative Prospect Theory (Tversky & Kahneman, 1992) applies decision weights to cumulative probabilities of ranked outcomes, resolving certain anomalies such as violations of stochastic dominance.

The Endowment Effect

Endowment effect. The tendency to value an item more highly once you own it than you would pay to acquire it. Follows from loss aversion: giving up an owned item is coded as a loss.

The Fourfold Pattern of Risk Attitudes

Fourfold pattern of risk attitudes. The combination of the S-shaped value function and probability weighting generates four distinct risk attitudes: risk seeking for small-probability gains, risk aversion for small-probability losses, risk aversion for high-probability gains, and risk seeking for high-probability losses.

The fourfold pattern: small $p$ + gains = risk seeking (lotteries); small $p$ + losses = risk aversion (insurance); large $p$ + gains = risk aversion (certainty effect); large $p$ + losses = risk seeking (desperate gambling).

Framing Effects and Mental Accounting

Framing effect. The phenomenon where the way a choice is presented (framed) affects decisions, even when the objective outcomes are identical.
Mental accounting. The cognitive process of organizing financial decisions into separate “accounts” rather than treating wealth as fungible.
Example 19.2 — Prospect Theory vs EU Valuation

Problem. A gamble offers $+\$1{,}000$ with prob 0.5 and $-\$800$ with prob 0.5. Reference point $r = 0$. (a) CE under EU with CRRA $u(x) = x^{0.5}$, $W = \$10{,}000$. (b) PT valuation with standard parameters. (c) Why does loss aversion reverse the evaluation?

Solution.

(a) $EU = 0.5(11{,}000)^{0.5} + 0.5(9{,}200)^{0.5} = 0.5(104.88) + 0.5(95.92) = 100.40$. $CE = 100.40^2 \approx 10{,}080$, so the CE change is $+80.2 > 0$. Agent accepts.

(b) $v(+1000) = 1000^{0.88} = 436.5$. $v(-800) = -2.25 \times 800^{0.88} = -2.25 \times 358.7 = -807.1$. With $w(0.5) \approx 0.439$: $V = 0.439(436.5) + 0.439(-807.1) = -162.6$. Agent rejects.

(c) Loss aversion ($\lambda = 2.25$) makes the \$800 loss weigh far more than the \$1,000 gain, flipping the evaluation.
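Example 19.2(b) follows directly from Eqs. 19.2-19.4; this sketch plugs in the Tversky-Kahneman (1992) parameter estimates:

```python
def v(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect theory value function (Eq. 19.2), x measured from the reference point."""
    return x ** alpha if x >= 0 else -lam * (-x) ** beta

def w(p, delta=0.65):
    """Probability weighting function (Eq. 19.3): inverse-S, w(0) = 0, w(1) = 1."""
    return p ** delta / (p ** delta + (1 - p) ** delta) ** (1 / delta)

def pt_value(lottery, r=0.0):
    """Original 1979 valuation (Eq. 19.4): weight each probability separately."""
    return sum(w(p) * v(x - r) for p, x in lottery)

gamble = [(0.5, 1000), (0.5, -800)]
print(round(pt_value(gamble), 1))   # about -162.6: the PT agent rejects
```

Setting `lam=1` and `delta=1` recovers a risk evaluation without loss aversion or probability weighting, which makes it easy to see which ingredient flips the sign.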


19.3 Intertemporal Choice and Present Bias

The Exponential Benchmark

Standard theory assumes exponential discounting with discount factor $\delta \in (0,1)$. The key property is time consistency: a plan made at $t=0$ remains optimal at every future date.

Hyperbolic and Quasi-Hyperbolic Discounting

Experimental evidence overwhelmingly rejects constant discounting. People exhibit declining impatience: the discount rate between today and tomorrow is much higher than between day 100 and day 101.

Hyperbolic discounting. A model of time preference in which the discount function takes the form $D(t) = (1+kt)^{-1}$ rather than the exponential $\delta^t$. Generates declining discount rates and time-inconsistent preferences.
Quasi-hyperbolic (beta-delta) discounting. A tractable model of present bias (Laibson, 1997) where the discount function is $\{1, \beta\delta, \beta\delta^2, \ldots\}$ with $\beta \leq 1$. When $\beta = 1$, reduces to exponential discounting.
$$U_0 = u(c_0) + \beta \sum_{t=1}^{T} \delta^t \, u(c_t)$$ (Eq. 19.5)

The quasi-hyperbolic discount factors are $\{1, \beta\delta, \beta\delta^2, \ldots\}$. The immediate period receives weight 1, but all future periods are additionally discounted by $\beta$. When $\beta < 1$, there is a discrete drop between “now” and “the future.”

Present bias. The tendency to give disproportionate weight to immediate payoffs relative to future payoffs, beyond what exponential discounting would imply. Captured by $\beta < 1$ in the beta-delta model.

Time Inconsistency

$$\text{planned at } t=0:\;\; u'(c_1) = \delta\, u'(c_2) \qquad \text{vs.} \qquad \text{chosen at } t=1:\;\; u'(c_1) = \beta\delta\, u'(c_2)$$ (Eq. 19.6)

At $t=0$, the FOC for $c_0$ versus $c_1$ is $u'(c_0) = \beta\delta u'(c_1)$, while the planned trade-off between $c_1$ and $c_2$ is $u'(c_1) = \delta u'(c_2)$ (between two future dates, $\beta$ divides out). At $t=1$, re-optimization gives $u'(c_1) = \beta\delta u'(c_2)$. The $\beta$ has shifted onto the $c_1$/$c_2$ margin: the plan is time-inconsistent.

Naive vs Sophisticated Agents

Naive agent. A present-biased agent who incorrectly believes their future selves will behave as exponential discounters ($\beta=1$ in the future). Perpetually postpones costly actions.
Sophisticated agent. A present-biased agent who correctly anticipates future present bias. Uses backward induction and may seek commitment devices.
Commitment device. Any mechanism an agent voluntarily adopts to restrict their own future choice set. Examples: illiquid retirement accounts, deadline commitments, automatic payroll deductions.

A naive agent procrastinates indefinitely. A sophisticated agent uses backward induction and may employ commitment devices.

Figure 19.4. Beta-delta discounting explorer. The naive agent perpetually delays; the sophisticated agent uses backward induction. At $\beta = 1$, all lines collapse (no present bias). Drag sliders.

Example 19.3 — Beta-Delta Procrastination

Problem. A student must complete a project. Cost today = 6 utils, benefit in 2 periods = 10 utils. $\beta = 0.7$, $\delta = 0.95$, 5 periods. (a) When does a naive agent act? (b) A sophisticated agent?

Solution.

(a) Naive: At each $t$, net of acting now $= -6 + 0.7 \times 0.95^2 \times 10 = -6 + 6.32 = +0.32$. Net of waiting one period (perceived) $= 0.7 \times 0.95 \times (-6) + 0.7 \times 0.95^3 \times 10 = -3.99 + 6.00 = +2.01$. Since $2.01 > 0.32$, the naive agent always delays, procrastinating until the deadline.

(b) Sophisticated: Backward induction. At $t = 2$ (last feasible), net $= +0.32 > 0$, so the $t=2$ self acts. At $t = 1$: net now $= +0.32$, net of waiting for $t=2$ to act $= +2.01 > 0.32$, so waits. At $t = 0$: same, waits. The sophisticated agent also acts at $t = 2$: with a hard deadline, both types delay to the last feasible period. Without a deadline, the naive agent would delay forever, while the sophisticate would still act.
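The backward-induction logic can be simulated; the `value` helper and the reading of "deadline" as last feasible action at $t = 2$ (so the benefit arrives within the 5-period horizon) are our interpretation of the setup:

```python
BETA, DELTA = 0.7, 0.95
COST, BENEFIT, LAST = 6.0, 10.0, 2   # acting at t pays BENEFIT two periods later; t=2 is last feasible

def value(s, t):
    """Value, from period s's perspective, of acting at period t >= s."""
    if t == s:
        return -COST + BETA * DELTA ** 2 * BENEFIT          # cost immediate, benefit future
    return BETA * (DELTA ** (t - s) * -COST + DELTA ** (t - s + 2) * BENEFIT)

# Naive agent: each period, compares acting now with (wrongly) planning to act next period.
t = 0
while t < LAST and value(t, t) < value(t, t + 1):
    t += 1
naive_acts = t

# Sophisticated agent: backward induction over what future selves will actually do.
plan = LAST if value(LAST, LAST) > 0 else None               # the t=2 self acts iff worthwhile
for s in range(LAST - 1, -1, -1):
    wait = value(s, plan) if plan is not None else 0.0
    if value(s, s) > wait:
        plan = s
sophisticated_acts = plan

print(naive_acts, sophisticated_acts)   # both delay to the last feasible period here
```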

Example 19.4 — Commitment Device Value

Problem. Agent with $\beta = 0.7$, $\delta = 0.95$, log utility, income $Y = 100$ over 3 periods. (a) Savings without commitment. (b) With commitment. (c) Welfare gain.

Solution.

(a) Without: $t=0$ allocates $c_0 = 100/(1+0.665+0.632) = 43.54$, leaving 56.46. At $t=1$ re-optimization: $c_1 = 56.46/1.665 = 33.91$, $c_2 = 22.55$.

(b) With: $c_1 = 0.665 \times 100/2.297 = 28.95$, $c_2 = 0.632 \times 100/2.297 = 27.51$.

(c) Without: $U = 3.774 + 2.344 + 1.967 = 8.085$. With: $U = 3.774 + 2.237 + 2.095 = 8.106$. Gain $= 0.020$ utils. The committed agent achieves a smoother consumption path.
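Example 19.4's numbers follow from the log-utility property that consumption is proportional to the discount weights (the variable names below are ours):

```python
import math

BETA, DELTA, Y = 0.7, 0.95, 100.0
w1, w2 = BETA * DELTA, BETA * DELTA ** 2      # t=0 weights on c1 and c2

# Committed plan: log utility => consume in proportion to the discount weights.
total = 1 + w1 + w2
c0 = Y / total
c1_commit, c2_commit = w1 * c0, w2 * c0

# No commitment: the t=1 self re-optimizes the remainder with its own beta-delta weights.
rest = Y - c0
c1_free = rest / (1 + BETA * DELTA)
c2_free = rest - c1_free

def welfare(c1, c2):
    """Lifetime utility from the t=0 self's perspective (Eq. 19.5 with log utility)."""
    return math.log(c0) + w1 * math.log(c1) + w2 * math.log(c2)

print(round(c0, 2), round(c1_free, 2), round(c2_free, 2))   # 43.54 33.91 22.55
print(round(c1_commit, 2), round(c2_commit, 2))             # 28.95 27.51
gain = welfare(c1_commit, c2_commit) - welfare(c1_free, c2_free)
print(round(gain, 3))                                       # small positive welfare gain
```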


19.4 Social Preferences

Beyond Self-Interest

Decades of experimental evidence show people systematically deviate from pure self-interest: rejecting unfair offers, giving to strangers, cooperating in one-shot games, and punishing free-riders.

Inequality aversion. A preference model in which agents dislike unequal outcomes — both when behind (envy) and when ahead (guilt). Fehr and Schmidt (1999) formalized this with parameters $\alpha$ for envy and $\beta$ for guilt.

The Fehr-Schmidt Utility Function

Fehr-Schmidt utility. A utility function that modifies self-interested payoffs by subtracting disutility from inequality: $U_i = x_i - \alpha_i \max(x_j - x_i, 0) - \beta_i \max(x_i - x_j, 0)$.
$$U_i(x) = x_i - \alpha_i \max(x_j - x_i, 0) - \beta_i \max(x_i - x_j, 0)$$ (Eq. 19.7)

The constraints $\alpha_i \geq \beta_i$ and $\beta_i < 1$ are empirically motivated: envy hurts more than guilt, and no one destroys money just to equalize.

The Ultimatum Game

In the ultimatum game, the minimum acceptable offer $s^*$ satisfies $s - \alpha_R(100-2s) \geq 0$, giving $s^* = 100\alpha_R / (1+2\alpha_R)$. For $\alpha_R = 2$: $s^* = 40$.

Figure 19.6. Fehr-Schmidt inequality aversion. Higher $\alpha$ (envy) raises the minimum acceptable offer. At $\alpha = \beta = 0$, standard theory: any positive offer is accepted. Drag sliders.

Figure 19.5. Ultimatum Game Simulator. Play as the proposer against different responder strategies. Track your earnings over rounds.


Dictator Games and Public Goods

In dictator games, the average allocation is 20-30% of the pie. In public goods games, contributions decay toward free-riding with repetition, but adding a punishment option sustains cooperation.

Example 19.5 — Fehr-Schmidt Ultimatum Game

Problem. \$100 ultimatum game. Proposer: $\alpha_P = 0.5$, $\beta_P = 0.3$. Responder: $\alpha_R = 2.0$, $\beta_R = 0.6$. (a) Min acceptable offer. (b) Optimal offer. (c) Compare to standard Nash.

Solution.

(a) $U_R = s - 2.0(100-2s) = 5s - 200 \geq 0 \Rightarrow s^* = 40$.

(b) $U_P = (100-s) - 0.3(100-2s) = 70 - 0.4s$, decreasing in $s$. Minimize $s$ subject to $s \geq 40$: optimal offer $s^* = 40$. $U_P = 54$, $U_R = 0$.

(c) Standard preferences ($\alpha = \beta = 0$): offer \$1 (the smallest positive amount), accepted. Fehr-Schmidt: offer \$40. Much closer to experimental modal offers of 40-50%.
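Example 19.5 can be checked by brute force over whole-dollar offers (a sketch; it assumes rejection leaves both players with zero):

```python
def fs_utility(own, other, alpha, beta):
    """Fehr-Schmidt utility (Eq. 19.7) for a two-player split."""
    return own - alpha * max(other - own, 0) - beta * max(own - other, 0)

PIE = 100
ALPHA_R, BETA_R = 2.0, 0.6     # responder
ALPHA_P, BETA_P = 0.5, 0.3     # proposer

# (a) Minimum acceptable offer: smallest s whose utility beats rejection (zero for both).
min_offer = next(s for s in range(PIE + 1)
                 if fs_utility(s, PIE - s, ALPHA_R, BETA_R) >= 0)

# (b) Proposer's best response among offers the responder accepts.
best = max(range(min_offer, PIE + 1),
           key=lambda s: fs_utility(PIE - s, s, ALPHA_P, BETA_P))

print(min_offer, best, fs_utility(PIE - best, best, ALPHA_P, BETA_P))  # 40 40 54.0
```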


19.5 Bounded Rationality and Heuristics

Simon's Satisficing

Herbert Simon (1955) argued that agents satisfice rather than optimize: searching until they find an acceptable option, then stopping.

Satisficing. A decision procedure (Simon, 1955) in which the agent sets an aspiration level and chooses the first option that meets it, rather than comparing all alternatives.
Bounded rationality. The recognition (Simon, 1955) that human decision-making is constrained by cognitive limitations — finite memory, limited attention, and computational costs.

Heuristics and Biases

Tversky and Kahneman (1974) identified three core heuristics: representativeness (judging probability by resemblance), availability (estimating frequency by ease of recall), and anchoring (adjusting insufficiently from an initial value).

Gabaix's Sparse Maximization

Sparse maximization. A model of bounded rationality (Gabaix, 2014) in which the agent maximizes utility minus the cost of attention. The agent allocates attention optimally, attending more to dimensions that matter.
$$\max_{\mathbf{c},\, \mathbf{m}} \, u(\mathbf{c}) - \theta \|\mathbf{m}\|_1 \quad \text{s.t. } \mathbf{p} \cdot \mathbf{c} \leq w$$ (Eq. 19.8)

Gabaix (2014) formalized bounded rationality as an optimization problem: agents maximize utility subject to attention cost $\theta$ per dimension. The agent perceives $\hat{p}_k = \bar{p}_k + m_k(p_k - \bar{p}_k)$.
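The perceived-price rule is easy to illustrate. The default prices, true prices, and attention vector below are invented for the illustration:

```python
p_default = [2.0, 5.0, 1.0]    # default prices the agent falls back on when inattentive
p_true    = [2.6, 5.0, 0.4]    # actual prices
m         = [1.0, 0.0, 0.3]    # attention to each dimension, m_k in [0, 1]

# Perceived price: p_hat_k = p_default_k + m_k * (p_true_k - p_default_k).
# m_k = 0 -> agent uses the default; m_k = 1 -> agent sees the true price.
p_hat = [round(pb + mk * (pt - pb), 2) for pb, pt, mk in zip(p_default, p_true, m)]
print(p_hat)   # [2.6, 5.0, 0.82]
```

A sparse agent sets $m_k = 0$ on dimensions where the stakes are too small to justify the attention cost $\theta$, and so behaves as if those prices never moved.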


19.6 Experimental Design and Methodology

Lab Experiments

Lab experiments feature real monetary incentives, randomization, and control. Strength: internal validity. Weakness: external validity.

Field Experiments

Field experiments embed manipulations in real-world settings: natural behavior, no awareness, large scale. Trade-off: less control for greater realism.

Methodological Challenges

Demand effects: subjects may alter behavior because they know they are observed or infer experimenter intent. The deception debate: economics has a strong norm against deception, unlike psychology.

The replication crisis: only 36% of psychology studies replicated (Open Science Collaboration, 2015); economics is higher (~60%) but still concerning. Pre-registration addresses p-hacking and publication bias.


19.7 Nudge Theory and Libertarian Paternalism

Choice Architecture

If choices depend on framing and defaults, then choice architecture — the way choices are presented — matters.

Choice architecture. The design of the environment in which people make decisions, including the order of options, default settings, information display, and physical arrangement.
Nudge. Any aspect of choice architecture that alters behavior predictably without forbidding options or significantly changing economic incentives (Thaler and Sunstein, 2008).
Libertarian paternalism. A philosophy preserving freedom of choice while steering choices toward welfare-improving outcomes through nudges rather than mandates.

Default Effects

Default effect. The disproportionate tendency to stick with the pre-selected option, even when switching is easy and costless.

The most powerful nudge is the default. Organ donation: 15-20% in opt-in countries, 85-99% in opt-out. Retirement enrollment jumps from ~50% to over 90% with opt-out.

$$P_{\text{enroll}} = \Phi\!\left(\frac{v - k \cdot (1-d)}{\sigma}\right)$$ (Eq. 19.9)

Under opt-in ($d=0$): $P = \Phi((v-k)/\sigma)$. Under opt-out ($d=1$): $P = \Phi(v/\sigma)$. The gap is largest when $v$ is positive but moderate and $k/\sigma$ is non-trivial.
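Eq. 19.9 is straightforward to evaluate with the standard normal CDF; the parameter values below are illustrative (they match the numbers used in the Maya thread example):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def enroll_prob(v, k, d, sigma):
    """Eq. 19.9: enrollment probability; d = 1 means enrollment is the default."""
    return phi((v - k * (1 - d)) / sigma)

v, k, sigma = 3.0, 2.0, 2.0   # illustrative values
print(round(enroll_prob(v, k, d=0, sigma=sigma), 2))  # opt-in:  0.69
print(round(enroll_prob(v, k, d=1, sigma=sigma), 2))  # opt-out: 0.93
```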

Figure 19.7. Default effect simulator. Higher switching costs widen the gap between opt-in and opt-out enrollment. At $k = 0$ the default does not matter. Drag the slider.

The EAST Framework

EAST framework. A practical guide for nudge design: make desired behaviors Easy, Attractive, Social, and Timely.

The EAST framework: Easy (reduce friction), Attractive (make salient), Social (leverage norms), Timely (prompt at receptive moments).

Sludge

Sludge. Friction deliberately or inadvertently added to a process that discourages desirable behavior. The opposite of a nudge.

Sludge is friction that discourages desirable behavior. Reducing sludge is often as effective as introducing new nudges.

Behavioral Welfare Economics

Bernheim and Rangel (2009): evaluate welfare based on choices free from behavioral distortions — when agents are well-informed, attentive, and undistorted.


Take

'Libertarian paternalism is just paternalism with better PR' — Gilles Saint-Paul, The Tyranny of Utility

When Thaler and Sunstein published Nudge in 2008, it seemed like a policy cheat code: redesign defaults and people save more, eat better, donate organs — all without restricting choice. Governments loved it. The UK created a "Nudge Unit," and Obama hired Sunstein as regulatory czar. But the backlash was fierce. Gilles Saint-Paul called it "the tyranny of utility" — technocrats deciding what's good for you while pretending to respect your freedom. Op-eds called nudging "manipulation by the state." Is libertarian paternalism a brilliant synthesis, or a contradiction in terms?

Advanced

19.8 Behavioral Finance

Market Efficiency and Its Challengers

The efficient market hypothesis holds that prices fully reflect all information. Behavioral finance challenges this: many traders are not rational, and rational arbitrageurs face limits.

Overconfidence and Excess Trading

Overconfidence generates excess trading. Barber and Odean (2000): the most active traders earned 6.5 percentage points less per year than the least active.

The Disposition Effect

Disposition effect. The tendency to sell winning assets too early and hold losing assets too long. Follows from prospect theory: gains are in the concave (risk-averse) region, losses in the convex (risk-seeking) region.

The reference point is the purchase price. Gains in the concave region (risk-averse, sell early); losses in the convex region (risk-seeking, hold).

Momentum and Reversal Anomalies

Past winners continue to outperform over 3-12 month horizons (momentum; Jegadeesh and Titman, 1993), while long-horizon past winners underperform over the subsequent 3-5 years (reversal; DeBondt and Thaler, 1985).

Limits to Arbitrage

Limits to arbitrage. The conditions under which rational arbitrageurs cannot fully eliminate mispricing: fundamental risk, noise trader risk, implementation costs, and agency problems.
Noise trader. An investor who trades on sentiment rather than fundamental analysis. Introduces unpredictable price distortions. In the DSSW model, noise traders can survive and even prosper.

Even rational traders may not correct mispricing: noise trader risk, implementation costs, and agency problems constrain them.

The DSSW Noise Trader Model

$$p_t = f_t + \frac{\gamma \, \rho_t \, \mu_t}{1+r}$$ (Eq. 19.10)

DeLong, Shleifer, Summers, and Waldmann (1990): higher $\mu$ pushes price from fundamentals; higher $\rho$ amplifies deviation; higher $\gamma$ (arbitrageur risk aversion) means less aggressive trading against mispricing, so the deviation increases.

The paradox: noise traders can earn higher expected returns by bearing the risk they themselves created.

Figure 19.8. DSSW noise trader model. Noise trader sentiment pushes prices away from fundamentals. Risk-averse arbitrageurs cannot fully correct the mispricing. Drag sliders.

Example 19.6 — Noise Trader Pricing

Problem. $f = 100$, $\rho = 0.30$, $\mu = 20$ (bullish), $r = 0.05$, $\gamma = 2$. (a) Compute equilibrium price. (b) Price deviation. (c) What if $\gamma = 0$?

Solution.

(a) $p = 100 + \frac{2 \times 0.30 \times 20}{1.05} = 100 + \frac{12}{1.05} = 100 + 11.43 = 111.43$.

(b) Deviation: $p - f = 11.43$. The asset is overpriced because noise traders push prices above fundamentals and risk-averse arbitrageurs don't fully counteract them.

(c) With $\gamma = 0$: $p = 100 + 0 = 100$. Risk-neutral arbitrageurs trade aggressively enough to eliminate mispricing entirely. The key DSSW insight: it is arbitrageur risk aversion ($\gamma > 0$) that allows noise-trader-driven deviations to persist.
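Example 19.6 in code (a direct transcription of Eq. 19.10):

```python
def dssw_price(f, gamma, rho, mu, r):
    """Stylized DSSW equilibrium price (Eq. 19.10)."""
    return f + gamma * rho * mu / (1 + r)

p = dssw_price(f=100, gamma=2, rho=0.30, mu=20, r=0.05)
print(round(p, 2), round(p - 100, 2))                       # 111.43 11.43
print(dssw_price(f=100, gamma=0, rho=0.30, mu=20, r=0.05))  # 100.0: gamma = 0 kills the wedge
```

Varying one argument at a time reproduces the comparative statics in the text: raising $\mu$, $\rho$, or $\gamma$ widens the deviation from fundamentals.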


Big Question #4

Are people rational?

You now have prospect theory, present bias, social preferences, bounded rationality, and the DSSW noise trader model. This is the final stop — the question gets its resolution.

What the model says

The behavioral case is now fully assembled. Prospect theory (Kahneman & Tversky 1979) provides a formal, testable alternative to expected utility: people evaluate outcomes relative to a reference point, are loss averse ($\lambda \approx 2.25$), and overweight small probabilities. Present bias ($\beta\delta$ discounting, Laibson 1997) explains procrastination, undersaving, and time inconsistency — people discount the immediate future far more heavily than the distant future. Social preferences (Fehr-Schmidt inequality aversion) explain cooperation and punishment in settings where pure self-interest predicts defection. Bounded rationality (Gabaix sparse maximization) formalizes the idea that attention is scarce and people optimize over a simplified model of the world. These are not isolated anecdotes — they are systematic, replicable, and survive high stakes. The violations of the rationality axioms documented at Stop 2 (Chapter 11) now have formal alternative models that fit the data better than expected utility theory does.

The strongest counter

Two powerful counterarguments survive the behavioral onslaught. First, ecological rationality (Gigerenzer): heuristics aren't biases — they're efficient adaptations to real-world environments with limited time and information. "Fast and frugal" heuristics often outperform full optimization in realistic settings with noisy data. The lab results that document "biases" may be artifacts of artificial environments that strip away the ecological context in which human cognition evolved to perform well. If the environment is uncertain enough, ignoring information can be optimal, not irrational. Second, market discipline: even if individuals are biased, competitive markets may aggregate away individual errors. Firms run by irrational managers get outcompeted. Consumers who systematically overpay get educated by experience. The "as if" defense — markets behave as if agents are rational, regardless of what happens inside their heads — remains a serious position, particularly for competitive product markets where entry is easy and feedback is fast.

How the mainstream responded

Behavioral finance provided the critical test — and the "as if" defense failed in the one market where it should have been strongest. The DSSW noise trader model (1990) showed that irrational traders can survive and move prices because arbitrage is risky and limited. Shleifer and Vishny (1997) established the "limits to arbitrage": even sophisticated arbitrageurs face short-selling costs, margin calls, and career risk — they can't fully correct mispricings caused by noise traders. The equity premium puzzle, excess volatility, and momentum anomalies all persist despite decades of sophisticated arbitrage. If biases survive in financial markets — where information travels fastest, stakes are highest, and the smartest capital competes — the "as if" defense cannot be a general principle. The mainstream absorbed behavioral economics not by rejecting rational choice but by enriching it: prospect theory is now standard in finance, $\beta\delta$ preferences are standard in macro, and mechanism design increasingly incorporates behavioral agents.

The judgment (at this level)

People are not fully rational in the way the axioms require — the evidence is overwhelming and no longer seriously contested. The more important question is whether it matters for aggregate outcomes, and the answer is domain-specific. In financial markets: yes, biases survive and move prices, because limits to arbitrage are real and persistent. In consumer markets: sometimes — defaults and framing have large, durable effects on retirement saving, organ donation, and energy use. In competitive product markets: less clear — competition, entry, and experience may discipline many biases over time. The honest resolution is that "are people rational?" was the wrong question all along. Rationality is not binary. The right question is: when does irrationality matter for aggregate outcomes, and when does the market machinery wash it out? The answer depends on the specific market, the specific bias, and the specific institutional context. Behavioral economics didn't overthrow rational choice — it drew the map of where rational choice works, where it breaks, and what to use instead.

What you can't resolve yet

This is the final stop on BQ #4. The arc ran from the rationality assumption as a modeling tool (Ch 1), through its formalization and testable axioms (Ch 11), to the full behavioral challenge and its market test (here). The hardest unresolved question is about policy: if people are biased, should the government correct their choices? Nudge theory says yes, gently — libertarian paternalism. But the premise (systematic irrationality) may undermine the conclusion (people can be trusted to opt out). The "who nudges the nudgers?" problem has no clean answer — government regulators are themselves subject to the same biases they seek to correct. And the frontier keeps moving: neuroeconomics, computational models of bounded rationality, and machine learning approaches to preference estimation are reshaping what "rationality" even means in the 21st century.

Related Takes

“Libertarian paternalism is just paternalism with better PR” — Gilles Saint-Paul, The Tyranny of Utility

Nudges work. But who decides what "better" means, and where does intervention stop? The internal logic of behavioral economics points toward hard paternalism, not the gentle kind.

"Every billionaire is a policy failure" — viral slogan, popularized by Dan Riffle / AOC's office

Riffle popularized the slogan in 2019. The welfare theorems say competitive equilibria are efficient — but many billionaire fortunes arise from market power, not competition. Behavioral economics adds another layer: fairness norms shape what distributions people tolerate.

Big Question #4

Are people rational?

BQ #4 reaches its verdict — biases are real, but do they survive markets? Prospect theory, present bias, and noise traders give the answer: rationality is a spectrum, and the question was wrong all along. It's not "are they rational" but "when does irrationality matter for aggregate outcomes?"


Thread Example: Maya's Enterprise

Maya bundled a free cookie with every lemonade purchase as a summer promotion. Sales rose modestly — up 8%. When she removed the free cookie (returning to the original offer), the customer backlash was disproportionate: complaints, negative reviews, lost regulars. Sales fell 15% — below the pre-promotion baseline.

Prospect theory analysis. During the promotion, customers' reference point shifted from “lemonade” to “lemonade + cookie.” The gain from adding the cookie was $v(+\text{cookie}) = (\text{cookie\_value})^{0.88}$. But the loss from removing it is $v(-\text{cookie}) = -2.25 \times (\text{cookie\_value})^{0.88}$. The perceived loss is 2.25× the original gain. The promotion was a one-way ratchet: easy to give, painful to take away.
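The ratchet can be checked numerically. A minimal sketch, assuming the chapter's standard parameters ($\alpha = \beta = 0.88$, $\lambda = 2.25$) and a hypothetical dollar value for the cookie:

```python
LAMBDA = 2.25  # loss aversion (Tversky-Kahneman estimate)
ALPHA = 0.88   # curvature exponent, same for gains and losses here

def value(x):
    """Prospect theory value function, Eq. 19.2, reference point 0."""
    return x**ALPHA if x >= 0 else -LAMBDA * (-x)**ALPHA

cookie = 1.50  # hypothetical dollar value of the free cookie
gain = value(cookie)     # felt gain when the cookie was added
loss = value(-cookie)    # felt loss when it was removed
print(abs(loss) / gain)  # 2.25: the ratio equals lambda whenever alpha = beta
```

Because the gain and loss exponents coincide, the 2.25 ratio holds for any cookie value — the asymmetry is baked into $\lambda$, not the stakes.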

Maya designs a nudge experiment. For her loyalty program, Maya tests two enrollment designs in a field experiment. Treatment A (opt-in, $d = 0$): customers can sign up at the counter. Treatment B (opt-out, $d = 1$): every customer automatically gets a card and can opt out. Using Eq. 19.9 with $v = 3$, $\sigma = 2$, $k = 2$: opt-in $P = \Phi((3-2)/2) = \Phi(0.5) \approx 0.69$; opt-out $P = \Phi(3/2) = \Phi(1.5) \approx 0.93$. Maya's field experiment confirms the prediction, and she switches to opt-out for the full rollout.
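The two enrollment probabilities follow directly from Eq. 19.9. A minimal sketch (the helper name `enroll_prob` is ours, not the chapter's):

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF, written via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def enroll_prob(v, k, sigma, d):
    """Eq. 19.9: P = Phi((v - k(1 - d)) / sigma); d = 1 means opt-out default."""
    return Phi((v - k * (1 - d)) / sigma)

v, sigma, k = 3.0, 2.0, 2.0                     # Maya's parameters
print(round(enroll_prob(v, k, sigma, d=0), 2))  # 0.69 (opt-in)
print(round(enroll_prob(v, k, sigma, d=1), 2))  # 0.93 (opt-out)
```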

Historical Lens

Kahneman and Tversky (1979). “Prospect Theory: An Analysis of Decision under Risk” is one of the most cited papers in economics. Published in Econometrica, it formalized experimental findings into a coherent mathematical framework. Kahneman received the Nobel Prize in 2002; Tversky had passed away in 1996.

Maurice Allais (1953). The French economist presented his paradox directly to Leonard Savage. Legend has it Savage himself fell into the Allais pattern. Allais received the Nobel Prize in 1988.

Richard Thaler (2017 Nobel). Thaler's “Anomalies” column systematically catalogued behavioral deviations. His 2008 book Nudge (with Sunstein) brought behavioral insights to policy, leading to “nudge units” worldwide.

David Laibson (1997). “Golden Eggs and Hyperbolic Discounting” formalized the beta-delta model and explained why people simultaneously hold credit card debt at 18% interest and illiquid savings at 5%.

Shleifer and Vishny (1997). “The Limits of Arbitrage” showed why rational traders cannot eliminate mispricing when they manage other people's money and face capital constraints.

Summary

  1. Expected utility violations. The Allais paradox (certainty effect) and Ellsberg paradox (ambiguity aversion) demonstrate that EU axioms fail descriptively.
  2. Prospect theory. Kahneman and Tversky's alternative features reference dependence, loss aversion ($\lambda \approx 2.25$), diminishing sensitivity, and probability weighting. The fourfold pattern explains simultaneous lottery-ticket buying and insurance purchasing.
  3. Present bias. The quasi-hyperbolic model ($\beta < 1$) captures disproportionate weight on immediate payoffs, generating time inconsistency, procrastination, and demand for commitment devices.
  4. Social preferences. Fehr-Schmidt inequality aversion explains rejections of unfair offers, positive giving in dictator games, and conditional cooperation.
  5. Bounded rationality. Heuristics (representativeness, availability, anchoring) produce systematic biases. Gabaix's sparse maximization formalizes bounded rationality as optimal attention allocation.
  6. Experimental methodology. Lab experiments offer internal validity; field experiments offer external validity. The replication crisis has driven pre-registration and more rigorous standards.
  7. Nudge theory. Choice architecture is inevitable; libertarian paternalism uses defaults, framing, and simplification to improve welfare without restricting choice. The EAST framework operationalizes this.
  8. Behavioral finance. Overconfidence drives excess trading. The disposition effect follows from prospect theory. Limits to arbitrage (Shleifer-Vishny) and noise trader risk (DSSW) explain why mispricing persists.
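Summary point 3 can be made concrete with a two-line computation: a sketch of a classic preference reversal under the beta-delta model, using illustrative parameters ($\beta = 0.6$, $\delta = 0.95$) rather than any estimate from the chapter.

```python
beta, delta = 0.6, 0.95  # illustrative present-bias parameters

def pdv(x, t):
    """Eq. 19.5 applied to a single payoff x at delay t (beta-delta discounting)."""
    return x if t == 0 else beta * delta**t * x

# Viewed from today, $110 at t = 11 beats $100 at t = 10 ...
print(pdv(110, 11) > pdv(100, 10))  # True
# ... but once day 10 arrives, the immediate $100 wins: time inconsistency.
print(pdv(100, 0) > pdv(110, 1))    # True
```

An exponential discounter ($\beta = 1$) would rank the two payoffs identically at both dates; the reversal is driven entirely by $\beta < 1$.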

Key Equations

| Label | Equation | Description |
| --- | --- | --- |
| Eq. 19.1 | $EU(L) = \sum_i p_i\, u(x_i)$ | Expected utility |
| Eq. 19.2 | $v(x) = x^\alpha$ for gains; $v(x) = -\lambda(-x)^\beta$ for losses | Prospect theory value function |
| Eq. 19.3 | $w(p) = p^\delta / \left(p^\delta + (1-p)^\delta\right)^{1/\delta}$ | Tversky-Kahneman probability weighting |
| Eq. 19.4 | $V(L) = \sum_i w(p_i)\, v(x_i - r)$ | Prospect theory valuation |
| Eq. 19.5 | $U_0 = u(c_0) + \beta \sum_{t \geq 1} \delta^t u(c_t)$ | Quasi-hyperbolic discounting |
| Eq. 19.6 | $\beta\delta\, u'(c_1) = u'(c_0) \neq \delta\, u'(c_1)$ | Time inconsistency |
| Eq. 19.7 | $U_i = x_i - \alpha_i \max(x_j - x_i, 0) - \beta_i \max(x_i - x_j, 0)$ | Fehr-Schmidt inequality aversion |
| Eq. 19.8 | $\max_c\, u(c) - \theta\|m\|_1$ s.t. $p \cdot c \leq w$ | Gabaix sparse maximization |
| Eq. 19.9 | $P_{\text{enroll}} = \Phi\big((v - k(1-d))/\sigma\big)$ | Default-sensitive enrollment |
| Eq. 19.10 | $p_t = f_t + \gamma \rho_t \mu_t / (1+r)$ | DSSW noise trader pricing |
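Eqs. 19.2–19.4 chain together into a single valuation routine. A minimal sketch with the chapter's standard parameters ($\alpha = \beta = 0.88$, $\lambda = 2.25$, $\delta = 0.65$), which may be a useful starting point for the problems below:

```python
def w(p, d=0.65):
    """Eq. 19.3: Tversky-Kahneman probability weighting, delta = 0.65."""
    return p**d / (p**d + (1 - p)**d) ** (1 / d)

def v(x, alpha=0.88, beta=0.88, lam=2.25):
    """Eq. 19.2: value function around a reference point of zero."""
    return x**alpha if x >= 0 else -lam * (-x)**beta

def pt_value(lottery, r=0.0):
    """Eq. 19.4: V(L) = sum_i w(p_i) v(x_i - r); lottery is [(p, x), ...]."""
    return sum(w(p) * v(x - r) for p, x in lottery)

# Overweighting of small probabilities drives lottery-ticket demand:
print(w(0.01) > 0.01)  # True: a 1% chance is weighted like roughly 5%
```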

Practice

  1. A lottery pays $+\$500$ with probability 0.6 and $-\$300$ with probability 0.4. Compute the valuation under (a) expected utility with $u(x) = \ln(W+x)$, $W = 10{,}000$, and (b) prospect theory with $\alpha = \beta = 0.88$, $\lambda = 2.25$, $\delta = 0.65$. Does the agent accept or reject under each model?
  2. Prove algebraically that the choice pattern $\{1A, 2B\}$ in the Allais paradox violates the independence axiom. Write the expected utility of each gamble in terms of a general $u(\cdot)$ and show no $u$ can rationalize both preferences.
  3. An agent has $\beta = 0.8$, $\delta = 0.90$. Compare the present-discounted value of receiving \$100 at $t = 3$ under (a) beta-delta and (b) exponential discounting. By what percentage does present bias reduce the perceived value?
  4. In a \$200 ultimatum game, the responder has $\alpha_R = 1.5$, $\beta_R = 0.4$. Compute the minimum acceptable offer $s^*$. What fraction of the total is this?

Apply

  1. Using prospect theory, explain why the same person buys lottery tickets and insurance (the fourfold pattern). Compute subjective valuations of: (a) a \$5 lottery ticket with 1-in-10,000 chance of \$50,000, and (b) \$200/year insurance against 1-in-10,000 chance of losing \$500,000. Use standard parameters.
  2. A present-biased student ($\beta = 0.6$, $\delta = 0.95$) has three homework assignments due on days 1, 2, 3. Each costs 4 utils and yields 8 utils. Using backward induction for the sophisticated agent, determine the work schedule. Compare to the naive agent.
  3. Compute opt-in vs opt-out enrollment rates using Eq. 19.9 with $v=2$, $\sigma=3$, $k=4$. What is the enrollment gap? If there are 5,000 employees, how many additional enrollments from switching to opt-out?
  4. In the DSSW model with $f = 50$, $\rho = 0.20$, $\mu = 10$, $r = 0.04$: compute equilibrium price for $\gamma = 1$, $\gamma = 3$, $\gamma = 10$. What happens to the deviation as risk aversion increases?

Challenge

  1. Derive the disposition effect using prospect theory. An investor bought at $P_0 = 50$. The stock is at $P_1 = 70$ (gain) or $P_1 = 30$ (loss). Each can rise or fall \$10 with equal probability. Compute utility of selling vs holding for each. Show the investor sells the winner and holds the loser.
  2. A planner considers mandatory commitment (illiquid savings locking 20% of income). Population: 60% have $\beta = 1$, 40% have $\beta = 0.6$. All have $\delta = 0.95$, log utility. (a) Welfare change for each type. (b) When does the mandate reduce aggregate welfare? (c) Why might opt-in be superior?
  3. Apply Gabaix sparse maximization to $n=3$ goods with $p_1 = 10$, $p_2 = 10.50$, $p_3 = 50$, default $\bar{p} = 10$, $\theta = 0.5$, Cobb-Douglas utility, wealth $w = 100$. (a) Which dimensions get attention? (b) Demand under full vs sparse attention. (c) Why does the consumer overspend on good 3?
  4. Show that DSSW noise traders can earn higher expected returns than arbitrageurs. With mean misperception $\mu > 0$ and variance $\sigma_\mu^2$: (a) derive expected excess return; (b) show the condition for noise traders to outperform; (c) explain the paradox.