This final chapter brings together the book's threads — micro, macro, institutions, and empirics — to address the most consequential question in economics: why are some countries rich and others poor, and what can be done about it?
Development economics is not "applied growth theory." It deals with coordination failures, institutional traps, human capital deficits, and political economy that standard models abstract away. It also features the most dramatic methodological revolution in modern economics: the rise of randomized controlled trials as a tool for evaluating interventions — and, more recently, the counter-revolution in structural estimation that seeks to push beyond what any single experiment can tell us.
This chapter synthesizes the entire textbook. Growth theory (Ch 13) provides the framework. Institutions (Ch 18) provide the deep determinants. Econometrics (Ch 10) provides the identification tools — instrumental variables, regression discontinuity, and the logic of causal inference. Behavioral insights (Ch 19) inform the design of development interventions.
Prerequisites: Ch 10 (Econometrics Foundations — IV, regression), Ch 13 (Growth Theory — Solow model, steady states), Ch 18 (Institutional Economics — AJR, extractive/inclusive), Ch 19 (Behavioral Economics — nudges, RCTs).
The richest countries — Norway, Switzerland, the United States — have GDP per capita above \$60,000 (PPP). The poorest — Burundi, South Sudan, the Central African Republic — have GDP per capita below \$500. A factor of over 100 separates the richest from the poorest, and this gap has widened dramatically over two centuries. In 1800, the richest-to-poorest ratio was approximately 5:1. By 2000, it exceeded 100:1. This "Great Divergence" is the central fact that development economics must explain.
Long-run income data reveal several patterns (the Penn World Table covers 1950 onward; earlier figures rest on historical reconstructions such as the Maddison estimates). In the early 19th century, the distribution was approximately unimodal: nearly all countries were poor. The Industrial Revolution created a divergence that accelerated through the 20th century. By the 1970s–1980s, the distribution had become distinctly bimodal, the "twin peaks" of Quah (1996). Since 2000, rapid growth in China and India has partially filled the gap, though Sub-Saharan Africa remains largely at the lower peak.
| Kaldor Facts (Ch 13) | Development Facts (This Chapter) |
|---|---|
| Constant capital-output ratio | Rising capital-output ratio during industrialization |
| Constant labor share | Falling labor share in agriculture, rising in industry then services |
| Constant growth rate of output per worker | Highly variable growth; episodes of acceleration and stagnation |
| Balanced growth path | Structural transformation; unbalanced, sector-shifting growth |
The Solow model (Ch 13) captures the Kaldor facts well. It does not capture the development facts — it has one sector, one type of labor, and smooth convergence. Development economics requires models with multiple sectors, heterogeneous labor, and the possibility of traps.
Figure 20.3. Global income distribution over time (stylized). Slide through decades to see the evolution from unimodal (1800) to twin peaks (1970s) to partial convergence (2000s). Use the slider or play button.
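The modern sector uses capital and labor in a Cobb-Douglas production function: $Y_M = A_M K_M^\alpha L_M^{1-\alpha}$ (Eq. 20.1).
The subsistence sector features surplus labor: $Y_S = A_S \min(L_S, \bar{L})$ (Eq. 20.2), so workers beyond $\bar{L}$ add nothing to subsistence output.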
The modern sector hires workers as long as $MPL_M > \bar{w}$. During the surplus labor phase, the modern sector faces a perfectly elastic labor supply at wage $\bar{w}$. Profits ($\Pi_M = Y_M - \bar{w}L_M$) are reinvested, creating a virtuous cycle: capital accumulation raises $MPL_M$, absorbing more workers, generating more profits.
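The reinvestment loop described above can be sketched numerically. This is a minimal illustration, not a calibrated model: the parameter values ($A_M = 1$, starting capital of 2, full reinvestment of profits, no depreciation) and the function name `lewis_path` are assumptions chosen only to make the surplus-labor phase and the turning point visible.

```python
def lewis_path(K0=2.0, A=1.0, alpha=0.4, w_bar=1.0,
               L_total=10.0, L_bar=4.0, periods=40):
    """Iterate the Lewis reinvestment loop; returns (t, K, L_M, wage) tuples."""
    L_cap = L_total - L_bar          # modern employment once surplus labor is gone
    K, path = K0, []
    for t in range(periods):
        # Labor demand from MPL_M = w_bar: (1-alpha)*A*K^alpha*L^(-alpha) = w_bar
        L_demand = K * ((1 - alpha) * A / w_bar) ** (1 / alpha)
        L_M = min(L_demand, L_cap)
        Y_M = A * K ** alpha * L_M ** (1 - alpha)
        mpl = (1 - alpha) * A * K ** alpha * L_M ** (-alpha)
        wage = max(w_bar, mpl)       # flat at w_bar until the turning point
        profit = Y_M - wage * L_M
        path.append((t, K, L_M, wage))
        K += profit                  # assumption: all profit reinvested, no depreciation
    return path


path = lewis_path()
turning_point = next(t for t, K, L_M, wage in path if L_M >= 10.0 - 4.0)
for t, K, L_M, wage in path[::5]:
    print(f"t={t:2d}  K={K:10.2f}  L_M={L_M:5.2f}  wage={wage:6.2f}")
print("Surplus labor exhausted in period", turning_point)
```

In this sketch the wage stays at $\bar{w}$ while modern employment grows in proportion to capital. Once employment hits $L - \bar{L}$, further capital accumulation shows up as rising wages instead.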
Why it matters: A poor country has a bottomless pool of farm workers whose extra output is essentially nil — pull one off the land and nothing is lost. So the modern factory sector can hire as many as it wants at a flat subsistence wage, reinvest the profits, and grow by absorbing workers rather than bidding up pay. That free ride lasts until the pool runs dry — the Lewis turning point — after which extra workers come only by raising wages, and growth has to shift to making each worker more productive. China between 1980 and 2010 is the textbook case: hundreds of millions moved from fields to coastal factories, with wages staying flat until they finally surged around 2010–2015. The slider figure below lets you watch the modern sector swell and the turning point arrive.
China is the most dramatic modern illustration. Between 1980 and 2010, China transferred hundreds of millions of workers from rural agriculture to urban manufacturing while sustaining growth of roughly 10% per year. Economists debate whether China crossed its Lewis turning point around 2010–2015, evidenced by rapidly rising wages in coastal manufacturing zones.
Figure 20.2. Lewis dual-economy model. Left: modern sector MPL curve and subsistence wage. Right: output by sector. Increase capital to absorb labor; watch for the Lewis turning point. Drag the sliders to explore.
The Kaelani Republic has 10 million workers. Currently, 7 million work in subsistence with surplus labor of 3 million ($\bar{L} = 4$ million). Modern sector: $A_M = 2$, $K_M = 100$, $\alpha = 0.4$.
(a) Current modern output ($L_M = 3$M): $Y_M^{\text{before}} = 2 \times 100^{0.4} \times 3^{0.6} \approx 24.40$. After reallocating 1M workers ($L_M = 4$M): $Y_M^{\text{after}} = 2 \times 100^{0.4} \times 4^{0.6} \approx 28.99$. Output gain = 4.59 units (18.8% increase), with zero subsistence loss since transferred workers were surplus.
(b) At the turning point, $L_M = L - \bar{L} = 6$M. Setting $MPL_M = (1-\alpha) A_M K_M^{\alpha} L_M^{-\alpha} = \bar{w} = 1$ and solving for capital gives $K_M^* = 6 \times (1/1.2)^{2.5} \approx 3.80$, a low threshold reflecting the abundance of surplus labor and modest subsistence wage.
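The arithmetic in parts (a) and (b) can be checked with a few lines of Python. This is a sketch using the parameter values stated in the example; the helper name `modern_output` is introduced here for illustration only.

```python
A_M, K_M, alpha = 2.0, 100.0, 0.4
w_bar = 1.0
L_total, L_bar = 10.0, 4.0           # millions of workers


def modern_output(K, L, A=A_M, a=alpha):
    """Cobb-Douglas modern-sector output, Eq. 20.1."""
    return A * K ** a * L ** (1 - a)


# (a) Output before and after moving 1M surplus workers into the modern sector
y_before = modern_output(K_M, 3.0)
y_after = modern_output(K_M, 4.0)
gain = y_after - y_before
pct_gain = 100 * gain / y_before

# (b) Capital at which MPL_M = w_bar when L_M = L_total - L_bar
L_turn = L_total - L_bar
K_star = L_turn * (w_bar / ((1 - alpha) * A_M)) ** (1 / alpha)

print(f"Y before = {y_before:.2f}, Y after = {y_after:.2f}")
print(f"gain = {gain:.2f} ({pct_gain:.1f}%)")
print(f"K* at turning point = {K_star:.2f}")
```

The closed form for `K_star` comes from rearranging $(1-\alpha) A_M K^{\alpha} L^{-\alpha} = \bar{w}$ for $K$.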
The standard Solow model features a concave production function guaranteeing a unique stable steady state. Poverty traps require an S-shaped (locally convex) production function creating multiple crossings between $sf(k)$ and $(n+\delta)k$.
Figure 20.1. Poverty trap diagram. The S-shaped $sf(k)$ curve crosses the $(n+\delta)k$ line at up to three points. Drag the initial-condition dot to see convergence to the low trap or high equilibrium, and adjust the saving rate and curvature with the sliders.
Why it matters: When the production function bends the wrong way at low capital — each early dollar adds little, but past some threshold it pays off — an economy can have two resting points: a poverty trap and a prosperous equilibrium, with an unstable tipping point between them. The reason no single factory modernizes on its own is that modernizing only pays once enough other firms have done it too: a steel mill needs customers with money, who need jobs at other modern firms. Everyone waiting on everyone else is a coordination failure, and it locks the economy at the low point. A coordinated “big push” — investing across many sectors at once — jumps the whole economy over the tipping point together. Drag the starting-capital dot in the figure below across the unstable threshold and watch the economy fall toward the trap or climb to prosperity.
In the Murphy, Shleifer, and Vishny (MSV) model, the profit to a firm in sector $i$ from adopting modern technology is $\pi_i(n) = \alpha (n/N) L - F$ (Eq. 20.5). Here $n$ is the number of other sectors that have industrialized, $N$ is the total number of sectors, $L$ is the labor force, $F$ is the fixed cost of adopting modern technology, and $\alpha$ scales how strongly demand translates into profit (it is not the capital share of Eq. 20.1). The key feature: $\pi_i$ is increasing in $n$, because demand spillovers from industrialized sectors raise profits for each firm that modernizes. When $n = 0$, $\pi_i(0) = -F < 0$: no firm wants to industrialize alone. When $n = N$, $\pi_i(N) = \alpha L - F > 0$ if $L$ is large enough. The MSV model thus generates two Nash equilibria: no industrialization ($n = 0$, the poverty trap) and full industrialization ($n = N$, the developed equilibrium). A government can serve as the coordinating mechanism by subsidizing simultaneous investment across sectors, pushing the economy from the low to the high equilibrium.
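The two-equilibrium logic can be made concrete with the profit function $\pi_i(n) = \alpha (n/N) L - F$ from Eq. 20.5. The sketch below uses hypothetical parameter values ($N = 10$, $L = 100$, $F = 22$, $\alpha = 0.5$) and hypothetical helper names. It checks the two symmetric outcomes and finds the smallest number of industrialized sectors at which modernizing becomes profitable.

```python
def msv_profit(n, N, L, F, alpha):
    """Profit from modernizing when n of N sectors have industrialized (Eq. 20.5)."""
    return alpha * (n / N) * L - F


def symmetric_equilibria(N, L, F, alpha):
    """Return which of the two symmetric outcomes is self-sustaining."""
    found = []
    if msv_profit(0, N, L, F, alpha) < 0:      # nobody gains by moving first
        found.append("no industrialization")
    if msv_profit(N, N, L, F, alpha) > 0:      # modernizing pays once all have moved
        found.append("full industrialization")
    return found


def critical_mass(N, L, F, alpha):
    """Smallest n at which modernizing earns a positive profit (None if never)."""
    return next((n for n in range(N + 1)
                 if msv_profit(n, N, L, F, alpha) > 0), None)


# Hypothetical parameters, chosen only for illustration
N, L, F, alpha = 10, 100.0, 22.0, 0.5
print(symmetric_equilibria(N, L, F, alpha))
print("critical mass:", critical_mass(N, L, F, alpha))
```

With these numbers a coordinated push has to bring at least five sectors across together. Below that threshold each firm that modernizes loses money.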
Not all poor countries are trapped. Kraay and McKenzie (2014) find limited evidence for poverty traps at the household level. At the country level, persistent underdevelopment in parts of Sub-Saharan Africa is more consistent with trap dynamics, particularly when combined with institutional failure and conflict.
Given $f(k) = k^2/(1+k^2)$ (S-shaped), $s = 0.20$, $n+\delta = 0.10$. Setting $sf(k) = (n+\delta)k$ and solving yields $k = 0$ and $k = 1$ (repeated root — the trap is on the verge of existence).
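For a richer example, $f(k) = k^{3}/(1+k^{3})$ with $s = 0.25$, $n+\delta = 0.10$ reduces the steady-state condition to $k^3 - 2.5k^2 + 1 = 0$ for $k > 0$, alongside the root $k = 0$. This yields three non-negative steady states: $k_L^* = 0$ (poverty trap), $k_U \approx 0.76$ (unstable threshold), $k_H^* \approx 2.31$ (high equilibrium). Writing $g(k) = sf(k) - (n+\delta)k$ for the right-hand side of Eq. 20.4, $k_U$ lies in the locally convex region of $f$, where $sf(k)$ cuts the $(n+\delta)k$ line from below, so $g'(k_U) > 0$ and the steady state is unstable. Starting from $k = 0$, the big push requires injecting $\Delta k \approx 0.76$ per worker.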
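The steady states of the cubic example can be located numerically and classified by the sign of $g'(k)$, where $g(k) = sf(k) - (n+\delta)k$. The following is a minimal sketch using plain bisection. The bracketing intervals and helper names are choices made here for illustration.

```python
def f(k, p=3):
    """S-shaped production function k^p / (1 + k^p)."""
    return k ** p / (1 + k ** p)


def g(k, s=0.25, dep=0.10, p=3):
    """Net capital accumulation: s*f(k) - (n + delta)*k."""
    return s * f(k, p) - dep * k


def bisect(func, lo, hi, tol=1e-10):
    """Find a root of func in [lo, hi], assuming a sign change on the interval."""
    f_lo = func(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def slope(k, h=1e-6):
    """Central-difference estimate of g'(k)."""
    return (g(k + h) - g(k - h)) / (2 * h)


k_U = bisect(g, 0.5, 1.0)     # unstable threshold
k_H = bisect(g, 2.0, 3.0)     # high steady state
for name, k in [("k_L", 0.0), ("k_U", k_U), ("k_H", k_H)]:
    kind = "unstable" if slope(k) > 0 else "stable"
    print(f"{name} = {k:.3f}  g'(k) = {slope(k):+.3f}  ({kind})")
```

A negative slope means small deviations die out, while a positive slope means they grow. That is why an economy starting just below $k_U$ slides back to the trap.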
The fundamental challenge is endogeneity: rich countries can afford better institutions. AJR (2001) proposed an IV strategy using settler mortality. The first-stage coefficient $\beta$ is negative and highly significant (F-statistic > 20). The 2SLS estimate $\hat{\delta} \approx 0.94$ exceeds OLS ($\approx 0.52$) — consistent with attenuation bias from measurement error.
Why it matters: You can’t prove institutions cause wealth just by noticing that rich countries have good institutions. Rich countries can afford good institutions, so the arrow might run the other way. Acemoglu, Johnson and Robinson found a natural experiment: where European colonizers faced deadly disease they couldn’t settle, so they set up purely extractive states; where they survived, they built the inclusive institutions they knew. The identifying assumption (the exclusion restriction) is that those centuries-old death rates affect a country’s income today only through the institutions they shaped; if it holds, settler mortality isolates the causal channel. The instrument estimate comes out larger than the raw correlation not by magic but because mismeasured institutions blur the simple comparison. The scatter figure below lets you switch between settler mortality, latitude, and rule-of-law on the horizontal axis and watch how tightly each tracks income.
Natural experiments reinforce the institutions hypothesis: North vs. South Korea, East vs. West Germany, pre- vs. post-reform China, and Botswana vs. its neighbors all illustrate how institutional divergence drives income divergence.
Figure 20.4. Institutions vs. geography scatter. Toggle the x-axis variable to compare settler mortality, latitude, and rule of law as predictors of income. Use the dropdown to switch views.
Results: First-stage F = 22.9, $\hat{\beta} = -0.61$, 2SLS $\hat{\delta} = 0.94$ (SE = 0.16), OLS = 0.52. (a) A one-unit increase in institutional quality causes a 0.94 log-point increase in GDP/capita. Moving from the 25th percentile (score 5) to the 75th (score 8) predicts a $3 \times 0.94 = 2.82$ log-point increase, roughly $e^{2.82} \approx 16.8$x.
(b) Exclusion restriction threats: settler mortality may proxy for current disease environment (directly reducing productivity); Europeans may have invested differently in infrastructure beyond institutions. (c) IV > OLS likely due to attenuation bias: if the reliability ratio is ~0.55, then $0.52/0.55 \approx 0.945$, in line with the 2SLS estimate of 0.94.
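The attenuation argument in (c) can be illustrated with a small simulation on synthetic data. This is a sketch, not a replication of AJR: the data-generating process, the error variances, and the function names are all assumptions. Measurement error is the only problem built in, so OLS should shrink toward $0.94 \times 0.55 \approx 0.52$ while the IV ratio recovers roughly 0.94.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, delta=0.94, first_stage=-0.61, reliability=0.55, seed=0):
    """Return (OLS slope, IV slope) when institutions are measured with noise."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]                   # instrument (synthetic)
    x_true = [first_stage * zi + rng.gauss(0, 1) for zi in z]
    var_true = first_stage ** 2 + 1.0                         # variance of x_true
    noise_sd = (var_true * (1 - reliability) / reliability) ** 0.5
    x_obs = [xi + rng.gauss(0, noise_sd) for xi in x_true]    # mismeasured regressor
    y = [delta * xi + rng.gauss(0, 0.5) for xi in x_true]     # outcome uses true x
    ols = cov(x_obs, y) / cov(x_obs, x_obs)
    iv = cov(z, y) / cov(z, x_obs)                            # Wald / simple IV ratio
    return ols, iv


ols, iv = simulate()
print(f"OLS = {ols:.3f}, IV = {iv:.3f}")
```

The instrument is uncorrelated with the measurement noise by construction, which is what lets the IV ratio undo the attenuation. Real data would also have to satisfy the exclusion restriction discussed in (b).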
| Income Group | Average Return ($\hat{\rho}$) |
|---|---|
| Low-income countries | 10.5% |
| Lower-middle-income | 8.7% |
| Upper-middle-income | 7.2% |
| High-income countries | 5.4% |
Why it matters: Each extra year of school raises a worker’s pay by a roughly constant percentage — and that percentage is larger where educated workers are scarce. So the return is around 10–14% in poor countries and only 5–7% in rich ones, simply because scarcity commands a premium. Health is human capital in the same way: a child who is dewormed, fed, and free of chronic disease learns more in school and earns more as an adult, with returns that rival schooling — which is why a few dollars of deworming can be one of the most cost-effective things a development budget buys. The figure below lets you slide schooling years and the return rate to trace out the wage profile.
Bleakley (2007) exploited geographic variation in hookworm prevalence to show a 17% income increase per SD reduction. Miguel & Kremer (2004) found deworming reduced school absenteeism by 25% with large spillovers — approximately \$3.50 per additional year of attendance, among the most cost-effective development interventions known.
Figure 20.5. Mincer equation explorer. Adjust schooling years and returns to see how the log-wage profile shifts. The dashed line shows the premium from 4 additional years. Drag the sliders to explore.
Country A (low-income): $\hat{\rho} = 0.10$, $\hat{\beta}_1 = 0.03$, $\hat{\beta}_2 = -0.0005$. Country B (high-income): $\hat{\rho} = 0.05$, $\hat{\beta}_1 = 0.05$, $\hat{\beta}_2 = -0.0008$. The education premium for 4 extra years: Country A = $e^{0.40}-1 = 49.2\%$; Country B = $e^{0.20}-1 = 22.1\%$.
Wage peaks at $\text{Exp}^* = \beta_1 / (2|\beta_2|)$: Country A at 30 years, Country B at 31.25 years. Returns differ due to scarcity, ability bias, credit constraints, school quality, and signaling vs. human capital effects.
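The premium and peak-experience calculations for the two countries follow directly from the Mincer equation (Eq. 20.7). A short sketch, using the coefficient values given above and helper names introduced here for illustration:

```python
import math


def schooling_premium(rho, extra_years):
    """Proportional wage gain from extra_years of schooling: exp(rho * years) - 1."""
    return math.exp(rho * extra_years) - 1


def peak_experience(beta1, beta2):
    """Experience level where the quadratic log-wage profile peaks (beta2 < 0)."""
    return beta1 / (2 * abs(beta2))


countries = {
    "A (low-income)": {"rho": 0.10, "beta1": 0.03, "beta2": -0.0005},
    "B (high-income)": {"rho": 0.05, "beta1": 0.05, "beta2": -0.0008},
}
for name, c in countries.items():
    premium = 100 * schooling_premium(c["rho"], 4)
    peak = peak_experience(c["beta1"], c["beta2"])
    print(f"Country {name}: 4-year premium = {premium:.1f}%, wage peaks at {peak:.2f} years")
```

The premium uses the exact exponential and not the approximation $4\rho$. The gap between the two matters at returns this large: 49.2% versus 40% for Country A.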
Banerjee, Duflo, and Kremer received the 2019 Nobel Prize for their experimental approach to alleviating global poverty. Key findings: cash transfers work and do not reduce effort; microfinance is not transformative; deworming is extraordinarily cost-effective. The RCT revolution's greatest contribution was replacing prior beliefs with evidence.
Why it matters: Flip a coin to decide who gets a program and who doesn’t, and the two groups end up identical in expectation on everything — rich and poor, motivated and not. So any difference you see afterward must be the program’s doing; you don’t need a model of human behavior to believe it. That credibility is what won Banerjee, Duflo and Kremer the 2019 Nobel. The catch is sample size: with too few people, a real effect hides inside ordinary noise. The power formula just turns that worry into a number — how many people (or villages) you must enroll to be confident of spotting an effect of a given size. The figure below lets you dial the effect size, variability, and cluster design and watch the required sample swing.
| Intervention | Finding | Study |
|---|---|---|
| Deworming | 25% reduction in absenteeism; large spillovers | Miguel & Kremer (2004) |
| Bed nets | Free distribution yields much higher adoption than cost-sharing | Cohen & Dupas (2010) |
| Microfinance | Modest effects on business income; no transformative poverty reduction | Banerjee et al. (2015) |
| Cash transfers (UCT) | Recipients invest productively; effects persist | GiveDirectly (Haushofer & Shapiro 2016) |
| Cash transfers (CCT, Progresa) | +8pp school enrollment, improved nutrition | Schultz (2004) |
| Teacher incentives | Incentive pay raises test scores; design details matter | Muralidharan & Sundararaman (2011) |
Figure 20.6. RCT power calculator. See how effect size, variance, significance level, and clustering affect the required sample size. The dashed line marks 80% power. Drag the sliders to explore.
Kaelani's ministry expects a \$30/month income effect ($\sigma = 120$). At $\alpha = 0.05$, 80% power: $N = 2 \times 120^2 \times (1.96+0.84)^2 / 30^2 \approx 251$ per arm. With cluster randomization (2,500 households per arm in 42 villages of about 60 households each, ICC = 0.04): design effect $= 1 + (60-1) \times 0.04 = 3.36$, effective sample $= 2{,}500/3.36 \approx 744$, well above 251.
If budget allows only 1,500 per arm: effective sample $\approx 446$. MDE $= \sqrt{2 \times 14400 \times 7.84 / 446} \approx$ \$22.50/month — smaller than the expected \$30 effect, so the study remains adequately powered.
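The power arithmetic above can be reproduced in a few lines. This sketch hard-codes the rounded critical values used in the text (1.96 and 0.84), so it matches the worked numbers and not what exact normal quantiles would give. The function names are illustrative.

```python
import math

Z_ALPHA = 1.96   # two-sided 5% significance, rounded as in the text
Z_BETA = 0.84    # 80% power, rounded as in the text


def sample_size_per_arm(sigma, tau):
    """Eq. 20.10: minimum N per arm to detect effect tau."""
    return 2 * sigma ** 2 * (Z_ALPHA + Z_BETA) ** 2 / tau ** 2


def design_effect(cluster_size, icc):
    """Variance inflation from randomizing clusters and not individuals."""
    return 1 + (cluster_size - 1) * icc


def minimum_detectable_effect(sigma, n_effective):
    """Invert Eq. 20.10 for tau at a given effective sample per arm."""
    return math.sqrt(2 * sigma ** 2 * (Z_ALPHA + Z_BETA) ** 2 / n_effective)


n_min = math.ceil(sample_size_per_arm(sigma=120, tau=30))
deff = design_effect(cluster_size=60, icc=0.04)
n_eff = 1500 / deff                      # budget-constrained scenario
mde = minimum_detectable_effect(120, n_eff)
print(f"minimum N per arm = {n_min}, design effect = {deff:.2f}")
print(f"effective sample = {n_eff:.0f}, MDE = {mde:.2f} per month")
```

Because the design effect divides the nominal sample, adding villages raises power faster than adding households within existing villages.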
Zambian economist Dambisa Moyo's Dead Aid and her TED talk made the incendiary case: over \$1 trillion in aid to Africa hadn't just failed — it had "created dependency, fueled corruption, and killed African entrepreneurship." Bill Gates publicly called the book "evil." Jeffrey Sachs accused Moyo of advocating policies that would "lead to the deaths of millions." Moyo fired back that Sachs's own Millennium Villages Project was the real failure. The debate went nuclear. But who was actually right about the evidence?
Todd and Wolpin (2006) validated a structural model against the Progresa RCT, then used it to simulate untested counterfactuals. Attanasio et al. (2012) showed the CCT worked primarily by reducing opportunity costs of schooling rather than relaxing budget constraints, a mechanism-based understanding that enables transportability.
The resolution combines structural and reduced-form approaches. RCTs provide credible causal estimates; structural models provide frameworks for generalization. The ideal workflow: use an RCT to identify parameters, feed them into a structural model, validate against experimental data, then extrapolate with honest uncertainty bounds.
Figure 20.8. Structural vs. reduced-form comparison. The left panel shows the original RCT estimate; the right shows predictions for a new site. As contexts diverge, the structural model adjusts honestly while naive extrapolation stays falsely precise. Use the toggle to switch scenarios.
Miguel & Kremer found 25% absenteeism reduction in Kenya; a replication in India found ~3pp (not significant). Key structural differences: helminth prevalence 75% (Kenya) vs. 20–30% (India); different school quality and access; different opportunity costs of child labor; smaller spillover effects.
A structural model of schooling with health inputs, calibrated to Kenya, predicts 7pp. Recalibrated with Indian parameters: 2–3pp — consistent with the replication. The model "knows what it doesn't know": it adjusts predictions and widens confidence intervals rather than falsely extrapolating.
The new structural economics (Lin) argues governments should identify industries consistent with latent comparative advantage. Rodrik extends this to green industrial policy: the clean energy transition requires coordinated public investment because carbon externalities are underpriced and learning-by-doing spillovers are not internalized.
The debate between conditional and unconditional cash transfers (UCTs) is central to contemporary policy. GiveDirectly's programs show UCTs work well — recipients invest productively and effects persist. Conditionality may matter when behavioral biases prevent optimal investment (connecting to Ch 19), but may be unnecessary when households already want to invest in children's human capital.
Figure 20.7. Cash transfer RCT simulator. Adjust transfer amount, duration, and conditionality to see how treatment effects vary across outcomes. Significance stars appear when the CI excludes zero. Drag the sliders to explore.
The colonial era (pre-1945) created the institutional foundations. The post-independence era (1945–1980) was dominated by big push thinking. The Washington Consensus (1980–2000) promoted markets. The RCT revolution (2000–2019) shifted focus to micro-level evidence. The post-2015 era synthesizes: big questions need structural thinking; specific policy questions need experimental evidence.
Kaelani implements a CCT: \$50/month to 2,500 randomly selected rural households, conditional on 80%+ school attendance, for 18 months. Control group: 2,500 households. Power calculation (Eq. 20.10): with $\sigma = 120$ and effective sample = 744 per arm (after cluster adjustment), the MDE is \$17/month at 80% power. The expected \$30–35 effect is well above this threshold.
Cluster randomization (42 treatment + 42 control villages, ICC = 0.04, cluster size 60) yields design effect = 3.36. Effective sample = 744 per arm, above the 251 minimum that Eq. 20.10 gives for a \$30 effect. Pre-registered outcomes: consumption, enrollment, dietary diversity, savings.
Results after 18 months: Monthly consumption +\$32 (p < 0.01), school enrollment +8pp (p = 0.01), dietary diversity +0.4 SD (p < 0.01), savings +\$15 (p = 0.02), adult labor supply −2 hrs/wk (p = 0.27, not significant). Compliance was 94%, and there is no statistically significant evidence that the transfer reduced labor supply. Of the \$50 transfer, \$32 shows up as consumption and \$15 as savings, so these two outcomes account for about \$47 of each transfer.
Institutional analysis (Ch 18): The CCT builds state capacity — payment systems, monitoring infrastructure, bureaucratic accountability. The school attendance condition works because Kaelani invested in school construction during its 2005 reform. Without schools, conditionality is meaningless.
External validity (Sec 20.7): The Talani Republic wants to replicate. Reduced-form: naive extrapolation ignores Talani's weaker institutions and different demographics. Structural model: predicts +5pp enrollment (vs. Kaelani's +8pp) and +\$28 consumption (vs. \$32), with 90% interval [+1pp, +9pp] for enrollment. The Deaton critique applies: RCTs answer "did it work here?" but not "will it work there?"
The textbook's threads converge: Kaelani's development depends on institutions (Ch 18), growth fundamentals (Ch 13), macroeconomic stability (Chs 14–16), behavioral insights (Ch 19), and evidence-based evaluation (this chapter).
| Label | Equation | Description |
|---|---|---|
| Eq. 20.1 | $Y_M = A_M K_M^\alpha L_M^{1-\alpha}$ | Modern sector Cobb-Douglas production |
| Eq. 20.2 | $Y_S = A_S \min(L_S, \bar{L})$ | Subsistence sector with surplus labor |
| Eq. 20.3 | Lewis turning point: $MPL_S = \bar{w} \Rightarrow L_S^* = \bar{L}$ | Surplus labor exhaustion threshold |
| Eq. 20.4 | $\dot{k} = sf(k) - (n+\delta)k$, $f$ S-shaped | Capital accumulation with poverty trap |
| Eq. 20.5 | $\pi_i(n) = \alpha(n/N)L - F$ | MSV: industrialization profit (increasing in $n$) |
| Eq. 20.6 | $\text{Inst}_i = \alpha + \beta\ln(\text{settler mort}_i) + \mathbf{X}_i'\gamma + \varepsilon_i$ | AJR IV first stage |
| Eq. 20.7 | $\ln w_i = \alpha + \rho S_i + \beta_1 \text{Exp}_i + \beta_2 \text{Exp}_i^2 + u_i$ | Mincer wage equation |
| Eq. 20.8 | $Y = A(H)K^\alpha(hL)^{1-\alpha}$, $h = e^{\phi S + \psi\text{Health}}$ | Augmented production (health + education) |
| Eq. 20.9 | $\hat{\tau}_{ATE} = \bar{Y}_T - \bar{Y}_C$ | ATE estimator under randomization |
| Eq. 20.10 | $N = 2\sigma^2(z_{\alpha/2}+z_\beta)^2 / \tau^2$ | Minimum sample size for power $1-\beta$ |
Lewis (1954); Rosenstein-Rodan (1943); Murphy, Shleifer & Vishny (1989); Acemoglu, Johnson & Robinson (2001); Nunn (2008); Mincer (1974); Bleakley (2007); Miguel & Kremer (2004); Banerjee, Duflo & Kremer (Nobel 2019); Todd & Wolpin (2006); Attanasio, Meghir & Santiago (2012); Deaton (2010); Allcott (2015); Lin (2012); Rodrik (2004).