ABM & College Admissions: Literature Context
Comprehensive literature review for calibrating and grounding the college-sim agent-based model.
1. Gale-Shapley & Matching Market Theory
1.1 Gale & Shapley (1962) — Foundational Paper
Citation: Gale, D. & Shapley, L.S. (1962). "College Admissions and the Stability of Marriage." The American Mathematical Monthly, 69(1), 9–15.
Key concepts:
- Two-sided matching: students have preferences over colleges, colleges have preferences over students
- Deferred acceptance (DA) algorithm: one side proposes, the other tentatively accepts or rejects; proposals cascade until stable
- Stability: no student-college pair both prefer each other over their current match
- The student-proposing DA yields the student-optimal stable matching; the college-proposing DA yields the college-optimal one
- Result: a stable matching always exists in the college admissions problem
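The deferred acceptance procedure described above can be sketched as follows. This is a minimal illustration, assuming strict preference lists and fixed capacities; the function name `deferred_acceptance` and the toy students and colleges are hypothetical and are not part of college-sim.

```python
from collections import deque


def deferred_acceptance(student_prefs, college_prefs, capacity):
    """Student-proposing deferred acceptance with college capacities."""
    # rank[c][s] = position of student s on college c's list (lower is better)
    rank = {c: {s: i for i, s in enumerate(prefs)}
            for c, prefs in college_prefs.items()}
    next_choice = {s: 0 for s in student_prefs}   # next college each student will try
    held = {c: [] for c in college_prefs}         # tentative acceptances
    free = deque(student_prefs)
    while free:
        s = free.popleft()
        prefs = student_prefs[s]
        if next_choice[s] >= len(prefs):
            continue                              # list exhausted: s stays unmatched
        c = prefs[next_choice[s]]
        next_choice[s] += 1
        if s not in rank[c]:
            free.append(s)                        # c finds s unacceptable
            continue
        held[c].append(s)
        if len(held[c]) > capacity[c]:
            held[c].sort(key=lambda x: rank[c][x])
            free.append(held[c].pop())            # reject the least-preferred student
    return {c: sorted(students) for c, students in held.items()}


# Toy data (hypothetical): two one-seat colleges, three students.
student_prefs = {"ann": ["X", "Y"], "bob": ["X", "Y"], "cat": ["Y", "X"]}
college_prefs = {"X": ["bob", "ann", "cat"], "Y": ["ann", "cat", "bob"]}
matching = deferred_acceptance(student_prefs, college_prefs, {"X": 1, "Y": 1})
print(matching)
```

In this toy market `cat` ends up unmatched: both colleges hold a student they rank higher, so no blocking pair exists and the outcome is stable.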
Relevance to college-sim:
- Our simulation uses a sequential round structure (ED → EA → RD) rather than DA, reflecting real-world institutional design
- Real college admissions are NOT a pure stable matching: colleges use holistic review (noisy signals), students have incomplete information, and binding ED creates strategic asymmetries
- The Gale-Shapley framework is the theoretical benchmark against which to understand deviations
1.2 Roth (2008) — DA History, Theory, Practice
Citation: Roth, A.E. (2008). "Deferred Acceptance Algorithms: History, Theory, Practice, and Open Questions." International Journal of Game Theory, 36(3), 537–569.
Key insights:
- DA underpins real matching markets: NRMP (medical residencies), NYC/Boston school choice
- Three properties a market needs: thickness (enough participants), congestion management (handle the volume), safety (incentive-compatible, so truthful reporting is optimal)
- Student-proposing DA is strategy-proof for students but not for colleges
- In practice, colleges' capacity constraints and strategic behavior mean pure DA doesn't describe elite admissions
Relevance: Our model captures congestion (application volume limits), thickness (20 high schools × 55 colleges), but uses stochastic holistic review rather than strict preference orderings.
1.3 Abdulkadiroğlu & Sönmez (2003) — School Choice as Mechanism Design
Citation: Abdulkadiroğlu, A. & Sönmez, T. (2003). "School Choice: A Mechanism Design Approach." American Economic Review, 93(3), 729–747.
Key contributions:
- Formalized K–12 school choice as a matching problem
- Showed that the Boston mechanism (immediate acceptance) is manipulable: families have an incentive to misrepresent preferences
- Proposed two alternatives: student-proposing DA (stable, strategy-proof for students) and top trading cycles (Pareto efficient, strategy-proof)
- Stability vs. efficiency tradeoff: no mechanism can guarantee both stability and Pareto efficiency for students, because a stable matching can be Pareto-dominated (Roth 1982)
Relevance: The stability-efficiency tradeoff directly applies to our simulation. ED/binding commitments trade off student welfare (can't compare offers) for institutional yield certainty. Our model could be extended to test alternative mechanisms.
1.4 Stability vs. Efficiency in College Admissions
- No mechanism guarantees both stability and Pareto efficiency for students (fundamental impossibility result)
- In simulations, DA is efficient in "thick" markets where every student gets placed somewhere, but manipulation incentives appear in markets with unmatched students
- Real college admissions add complications: financial aid packages, legacy preferences, athletic recruitment, and multiple rounds — all deviations from pure DA
2. Empirical Foundations for ABM Calibration
2.1 Chetty, Deming & Friedman (2023) — "Diversifying Society's Leaders?"
Citation: Chetty, R., Deming, D.J. & Friedman, J.N. (2023). "Diversifying Society's Leaders? The Determinants and Causal Effects of Admission to Highly Selective Private Colleges." NBER Working Paper 31492.
Dataset: 2.4 million students × 139 colleges, tax records linked to admissions data (Opportunity Insights), 2010–2015 cohorts.
Key quantitative findings:
| Finding | Statistic |
|---|---|
| Top-1% income kids vs. middle-class (same SAT) at Ivy+ | 2× more likely to attend |
| Source: higher admit rates (same scores) | 2/3 of the gap |
| Source: differential application/matriculation | 1/3 of the gap |
| Legacy admission advantage | 5–6× higher admit rate (same credentials) |
| Share of advantage from legacy | 46% |
| Share from athletic recruitment | 24% |
| Share from non-academic ratings (essays, recs) | 30% |
| Causal effect of Ivy+ on reaching top 1% earnings | +50% vs. flagship public |
| Causal effect on elite grad school | ~2× |
| Causal effect on prestigious firm employment | ~3× |
Critical finding for simulation calibration:
- The three preference factors (legacy, athlete, non-academic) are uncorrelated or negatively correlated with post-college outcomes
- Academic credentials (SAT/ACT) are highly predictive of post-college success
- This validates our model's use of academic index as the primary signal, with hooks as admission multipliers that don't reflect academic quality
Relevance to college-sim: Directly informs hook multipliers (legacy 5–6×, athlete preference ~24% of advantage), income-SAT correlation, and yield differences by income bracket. Our chetty_yield_by_college.json is derived from this dataset.
2.2 Arcidiacono, Kinsler & Ransom (2019/2022) — Legacy and Athlete Preferences at Harvard
Citation: Arcidiacono, P., Kinsler, J. & Ransom, T. (2022). "Legacy and Athlete Preferences at Harvard." Journal of Labor Economics, 40(1). (NBER WP 26316, 2019.)
Key quantitative findings (Harvard Classes of 2014–2019):
| Category | Statistic |
|---|---|
| White admits who are ALDC | 43% |
| Non-white admits who are ALDC | <16% each group |
| Athlete admit rate | 86% |
| Non-ALDC admit rate | <5.5% |
| White ALDC who'd be rejected without preference | ~75% |
| Asian-American avg SAT advantage over white | +24.9 points |
| Hypothetical Asian-American share (academics only) | 43% |
ALDC = Athletes, Legacies, Dean's interest list, Children of faculty/staff
Relevance to college-sim: These are the most granular hook multiplier estimates available. Our model's hook system (athlete 3.5×, donor 4×, legacy 2.5×, first-gen 1.4×) should be validated against these empirical rates. The 86% athlete admit rate at Harvard implies an enormous multiplier relative to the <5.5% non-ALDC rate: the raw ratio is at least 86 / 5.5 ≈ 15.6×, though controlling for academic quality reduces this.
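The arithmetic behind that raw ratio, and the difference between a probability-scale and an odds-scale multiplier, can be checked with a short sketch. It assumes hooks could be applied on the odds scale; whether college-sim does this is not stated here, and the function names are hypothetical.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def apply_odds_multiplier(base_prob, multiplier):
    """Scale the odds of admission, which keeps the result below 1."""
    boosted = odds(base_prob) * multiplier
    return boosted / (1.0 + boosted)


# Figures quoted above: 86% athlete admit rate vs. a 5.5% non-ALDC rate.
raw_ratio = 0.86 / 0.055                   # probability-scale ratio
implied_odds_ratio = odds(0.86) / odds(0.055)

print(round(raw_ratio, 1))                 # probability-scale multiplier
print(round(implied_odds_ratio, 1))        # odds-scale multiplier
# A probability-scale multiplier can exceed 1 for strong applicants
# (0.5 * 3.5 = 1.75); the odds-scale version cannot.
print(round(apply_odds_multiplier(0.5, 3.5), 3))
```

The same pair of admit rates corresponds to very different multipliers depending on the scale, so any comparison between the model's hook multipliers and these empirical rates should state which scale is meant.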
2.3 Avery & Levin (2010) — Early Admissions Signaling
Citation: Avery, C. & Levin, J. (2010). "Early Admissions at Selective Colleges." American Economic Review, 100(5), 2125–56.
Key findings:
- ED/EA provides a signaling mechanism: students demonstrate genuine interest
- ED advantage: 20–30 percentage points higher admit rate, equivalent to ~100 SAT points
- Colleges value ED because it reduces uncertainty about yield
- Strategic asymmetry: wealthier students can "afford" to commit early (less need for financial aid comparison)
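Relevance: Supports our model's ED multiplier system. The 20–30 pp advantage is broadly consistent with our empirical ED multiplier data (e.g., Dartmouth 3.5×, Columbia 3.4×), though the two are on different scales: a multiplier converts to a percentage-point gain only once a baseline admit rate is fixed.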
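A small worked conversion between the two scales, as a sketch. It assumes the ED multiplier is applied directly to the regular-decision admit rate, which may not be how college-sim applies it; the function names and the baseline rates are hypothetical.

```python
def ed_gain_pp(rd_rate, multiplier):
    """Percentage-point gain if the ED admit rate is multiplier x the RD rate."""
    ed_rate = min(1.0, rd_rate * multiplier)   # cap at 100%
    return (ed_rate - rd_rate) * 100.0


def rd_rate_for_gain(gain_pp, multiplier):
    """Baseline RD rate at which a given multiplier yields a given pp gain."""
    return (gain_pp / 100.0) / (multiplier - 1.0)


# For a 3.5x multiplier, which baseline rates give a 20-30 pp gain?
low = rd_rate_for_gain(20, 3.5)
high = rd_rate_for_gain(30, 3.5)
print(round(low, 3), round(high, 3))
print(round(ed_gain_pp(0.05, 3.5), 1))   # gain at a hypothetical 5% baseline
```

Under this assumption, a 3.5× multiplier reproduces a 20–30 pp advantage only for baseline rates of roughly 8–12%; at lower baselines the implied percentage-point gain is smaller.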
2.4 Dale & Krueger (2002, 2014) — Returns to College Selectivity
Citation: Dale, S.B. & Krueger, A.B. (2002). "Estimating the Payoff to Attending a More Selective College." Quarterly Journal of Economics, 117(4), 1491–1527.
Key findings:
- After controlling for where students applied (revealed ambition), attending a more selective college has zero average earnings premium
- Exception: Low-income students benefit significantly (~8% earnings increase per 200-point SAT increase in college average)
- Implication: selection bias explains most of the apparent selectivity premium
Relevance: Challenges simple prestige-maximizing utility functions in ABMs. Our model's student utility function should perhaps weight financial fit more heavily for low-income agents, and prestige less.
2.5 Hoxby & Avery (2013) — The Missing "One-Offs"
Citation: Hoxby, C. & Avery, C. (2013). "The Missing 'One-Offs': The Hidden Supply of High-Achieving, Low-Income Students." Brookings Papers on Economic Activity, Spring 2013, 1–65.
Key findings:
- Most high-achieving low-income students never apply to selective colleges
- They apply to resource-poor local institutions that would actually cost MORE (after financial aid)
- Two types: "achievement-typical" (apply like high-income peers) and "income-typical" (apply only locally)
- Income-typical students are geographically dispersed and not in feeder school networks
- Standard recruiting (campus visits, college fairs) misses them entirely
Relevance to college-sim: Our model should capture differential application behavior by income/school type. The archetype-based application count system partially handles this. Feeder-school students apply broadly; isolated students under-apply. This is a key mechanism driving stratification in the Reardon et al. ABM.
2.6 Avery, Glickman, Hoxby & Metrick (2013) — Revealed Preference Rankings
Citation: Avery, C., Glickman, M.E., Hoxby, C. & Metrick, A. (2013). "A Revealed Preference Ranking of U.S. Colleges and Universities." Quarterly Journal of Economics, 128(1), 425–467.
Key insights:
- Constructed college rankings from 3,240 students' actual enrollment choices (which offer they accepted)
- Uses tournament-style statistical model (Elo-like)
- Rankings align roughly with selectivity but diverge from U.S. News in interesting ways
- Provides empirical student utility ordering, useful for calibrating our prestige weights
Relevance: Could inform the prestige ranking and utility calculation in buildCollegeLists().
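To make the tournament idea concrete, here is a minimal sketch that treats each enrollment decision as a "match" the chosen college wins over a declined one. It uses a plain Elo update, which is a stand-in and not the statistical model estimated in the paper; the college labels, `k`, the base rating, and the number of passes are all hypothetical.

```python
def elo_rankings(choices, k=16.0, base=1500.0, passes=20):
    """Rank colleges from (chosen, declined) pairs using repeated Elo updates."""
    ratings = {}
    for _ in range(passes):
        for chosen, declined in choices:
            r_win = ratings.setdefault(chosen, base)
            r_lose = ratings.setdefault(declined, base)
            # Expected probability that the chosen college "wins" the matchup
            expected = 1.0 / (1.0 + 10 ** ((r_lose - r_win) / 400.0))
            delta = k * (1.0 - expected)
            ratings[chosen] = r_win + delta
            ratings[declined] = r_lose - delta
    return sorted(ratings, key=ratings.get, reverse=True)


# Hypothetical admits' decisions: (college enrolled at, college turned down)
choices = [("A", "B"), ("A", "C"), ("B", "C")] * 2
ranking = elo_rankings(choices)
print(ranking)
```

A ranking produced this way could serve as one input to prestige weights, but the ordering depends on `k` and on the order of the observations, so it is better suited to illustration than to calibration.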
2.7 CommonApp Annual Data (2024–2025 Season)
Source: Common Application End-of-Season Report, 2024–2025.
| Metric | Value |
|---|---|
| Total applicants | ~1.5 million |
| Member institutions | 1,097 |
| Applications per applicant | 6.80 (up 2% from 6.64) |
| YoY applicant growth | +5% |
| Fastest-growing demographics | Latinx (+15%), Black (+12%) |
| Top state by applicant count | Texas (overtook NY, CA) |
| International applicants | -1% (first decline since 2019–20) |
Relevance: Our model uses 6.8 apps/student as the baseline — this exactly matches the 2024–25 CommonApp data. The growth trends inform archetype distribution calibration.
2.8 College Board SAT Validity Research
Source: College Board (2024). "SAT Score Relationships with College GPA." (111,899 students, 4-year tracking.)
Key findings:
- SAT adds 15% more predictive power over HSGPA alone for first-year college GPA
- For STEM: SAT adds 38% more predictive power than HSGPA alone
- SAT predicts cumulative GPA across all 4 years, not just freshman year
- Predictive validity holds across demographic subgroups
Relevance: Validates using SAT + GPA as the academic index in our admission scoring model. The higher STEM prediction aligns with potential major-specific modeling.
2.9 ACT College Readiness Benchmarks
Source: ACT Research & Policy (2017). "What Are the ACT College Readiness Benchmarks?"
- Benchmarks represent a 50% probability of earning a B or higher, or roughly a 75% probability of earning a C or higher, in the corresponding college courses
- 84% of students meeting all four benchmarks graduate within 6 years
- Hierarchical logistic models used for institution-specific predictions
Relevance: Provides external validation for our academic index → admission probability mapping.
3. Agent-Based Models of College Admissions (2010–2025)
3.1 Reardon, Kasman, Klasik & Baker (2016) — The Key ABM Paper
Citation: Reardon, S.F., Kasman, M., Klasik, D. & Baker, R. (2016). "Agent-Based Simulation Models of the College Sorting Process." Journal of Artificial Societies and Social Simulation (JASSS), 19(1), 8.
This is the most directly relevant paper to our project.
Model Architecture
| Component | Detail |
|---|---|
| Student agents | 8,000 per simulation run |
| College agents | 40 institutions |
| Stages per year | Application → Admission → Enrollment |
| Simulation duration | 30 years to equilibrium |
| Runs per condition | 100 (stochastic noise mitigation) |
Agent Decision Rules
Students:
- Characterized by "resources" (SES) and "caliber" (academic quality)
- Resource-caliber correlation: r = 0.3 (from ELS:2002)
- Perceived caliber: C*_s = C_s + c_s + e_s (true + enhancement + noise)
- Quality reliability: 0.7 + 0.1 × resources (information advantage for wealthy)
- Application portfolio: maximize E[utility] = P(admission) × utility(college quality)
- Caliber distribution: N(1000, 200), matching College Board data
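The student rules above can be sketched in a few lines. Several choices here are assumptions of this sketch rather than details confirmed by the paper: resources are treated as a standardized score, 200 is read as a standard deviation, enhancement c_s is set to zero, and the reliability rule is used to size the noise term via the classical definition reliability = true variance / observed variance. All function and variable names are hypothetical.

```python
import math
import random


def make_student(rng, r=0.3, mean=1000.0, sd=200.0):
    """Draw one student as (resources, true caliber, perceived caliber)."""
    resources = rng.gauss(0.0, 1.0)            # standardized SES (assumed scale)
    shock = rng.gauss(0.0, 1.0)
    # Mix the two normals so corr(resources, caliber) equals r
    z_caliber = r * resources + math.sqrt(1.0 - r * r) * shock
    caliber = mean + sd * z_caliber
    # Reliability rule from the text, clamped so the noise SD stays finite
    reliability = min(0.95, max(0.05, 0.7 + 0.1 * resources))
    noise_sd = sd * math.sqrt((1.0 - reliability) / reliability)
    perceived = caliber + rng.gauss(0.0, noise_sd)   # enhancement c_s = 0 here
    return resources, caliber, perceived


def pearson(xs, ys):
    """Sample Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(0)
students = [make_student(rng) for _ in range(20000)]
corr = pearson([s[0] for s in students], [s[1] for s in students])
print(round(corr, 2))   # should land near the target r = 0.3
```

Because reliability rises with resources, higher-resource students receive less noisy perceived calibers, which is one way to express the information advantage described above.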
Colleges:
- Rank applicants by perceived caliber
- Admit top-ranked to fill expected enrollment (based on historical yield)
- Yield rates: initial = 0.2 + 0.06 × quality_percentile
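The rank-cutoff rule can be illustrated as below. This is a sketch under stated assumptions: the expected yield is passed in directly rather than computed, since the scale of quality_percentile is not specified here, and the function name and toy applicant pool are hypothetical.

```python
import math


def admit_by_rank(applicants, seats, expected_yield):
    """Admit the top-ranked applicants needed to fill `seats` at the expected yield.

    applicants: list of (student_id, perceived_caliber) pairs.
    """
    # Offers needed so that offers * yield covers the seats
    n_offers = min(len(applicants), math.ceil(seats / expected_yield))
    ranked = sorted(applicants, key=lambda a: a[1], reverse=True)
    return [student_id for student_id, _ in ranked[:n_offers]]


# Toy pool (hypothetical): 2 seats and a 50% expected yield means 4 offers.
pool = [("s1", 1210), ("s2", 980), ("s3", 1340), ("s4", 1105),
        ("s5", 1270), ("s6", 890)]
admits = admit_by_rank(pool, seats=2, expected_yield=0.5)
print(admits)
```

This contrasts with a logistic admission model, where every applicant has some probability of admission instead of facing a hard cutoff.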
Five SES → College Sorting Mechanisms
| Mechanism | Effect Size | Description |
|---|---|---|
| Resource-caliber correlation | Dominant (reduces 90-10 gap from ~50% to ~20% when removed) | SES linked to academic quality |
| Application enhancement | 3–6 pp | Test prep, essay coaching boost perceived credentials |
| Information quality | 2–5 pp | Wealthy students know college quality and own caliber better |
| Application volume | 1–2 pp | More apps for wealthier students |
| Utility preferences | Negligible | Differential valuation of prestige |
Key Results
- 90th percentile resources: ~93% college enrollment
- 10th percentile resources: ~55% college enrollment
- 90th percentile students ~20× more likely at top-10% colleges vs. 10th percentile
- Model output matches IPEDS data on applications/admissions/yield by tier
- Latin Hypercube sampling used for sensitivity analysis (10 combos per 5D space)
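For readers unfamiliar with Latin Hypercube sampling, the idea is that each parameter's range is cut into as many equal strata as there are samples, and every stratum is used exactly once per dimension. The sketch below uses only the standard library; the unit-cube bounds are placeholders, not the parameter ranges used in the paper.

```python
import random


def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples points; bounds is a list of (low, high) per dimension."""
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of n equal-width strata...
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # ...then shuffle so strata are paired at random across dimensions
        rng.shuffle(points)
        columns.append([low + p * (high - low) for p in points])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# 10 parameter combinations in a 5-dimensional space, as in the text.
rng = random.Random(42)
samples = latin_hypercube(10, [(0.0, 1.0)] * 5, rng)
print(len(samples), len(samples[0]))
```

Compared with a full grid, this covers each parameter's range evenly with far fewer runs, which matters when each combination requires many stochastic replications.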
Methodological Notes
- Fast algorithm (Appendix D): recursive portfolio selection avoids combinatorial explosion
- Equilibrium emergence: yield rates and application patterns stabilize after ~15–20 simulated years
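Since the paper's code is not reproduced here, the following is only a stand-in for portfolio selection: a greedy marginal-improvement heuristic, not the Appendix D algorithm. It assumes admission decisions are independent, the student enrolls at the highest-utility college that admits them, and applications are free up to a cap. The names and the three toy colleges are hypothetical.

```python
def expected_utility(portfolio):
    """Expected utility of a list of (admit_prob, utility) applications."""
    total, p_nothing_better = 0.0, 1.0
    # Walk colleges from most to least preferred
    for prob, utility in sorted(portfolio, key=lambda c: c[1], reverse=True):
        total += p_nothing_better * prob * utility
        p_nothing_better *= (1.0 - prob)
    return total


def greedy_portfolio(colleges, max_apps):
    """Add the college with the largest marginal gain until the cap is hit."""
    chosen, remaining = [], list(colleges)
    while remaining and len(chosen) < max_apps:
        best = max(remaining, key=lambda c: expected_utility(chosen + [c]))
        if expected_utility(chosen + [best]) <= expected_utility(chosen):
            break                      # no remaining college adds value
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical reach / match / safety options as (admit_prob, utility)
colleges = [(0.1, 100), (0.5, 60), (0.9, 30)]
portfolio = greedy_portfolio(colleges, max_apps=2)
print(portfolio, expected_utility(portfolio))
```

Each greedy step evaluates only the remaining colleges rather than every possible subset, which is the kind of saving a fast portfolio algorithm is after. In this toy case the match and safety schools are chosen over the reach school.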
Relevance to college-sim: Our model shares the same three-stage structure but adds: (1) multiple admission rounds (ED/EA/RD), (2) hook multipliers, (3) logistic admission model instead of rank-cutoff, (4) real college data instead of synthetic quality distributions. We should validate that our output patterns match theirs for comparable parameter settings.
3.2 Assayed & Maheshwari (2023) — Review of ABMs for University Admissions
Citation: Assayed, S.K. & Maheshwari, P. (2023). "A Review of Agent-based Simulation for University Students Admission." Computer Science & Engineering: An International Journal (CSEIJ), 13(2).
Survey findings:
- Reviewed ABMs deployed by international admission offices
- Models classified by: educational attainment level and university selection behaviors
- Common platforms: NetLogo (dominant), some Python/Mesa
- Parameters across models: GPA, test scores, family income, geographic proximity
- Gap: most models focus on K–12 or single-country systems; few model elite U.S. admissions specifically
3.3 Assayed & Al-Sayed (2025) — Student Behaviors Survey
Citation: Assayed, S.K. & Al-Sayed, S. (2025). "Student Behaviors in College Admissions: A Survey of Agent-Based Models." Int. J. Emerging Multidisciplinaries: CS & AI.
- Explores ABM techniques for secondary education pathways and admissions
- Focuses on equitable practices and complex decision-making
- Reviews behavioral models including peer effects and information asymmetries
3.4 Daemen & Leoni (2025) — Netherlands Tertiary Education ABM
Citation: Daemen & Leoni (2025). "Simulating Tertiary Educational Decision Dynamics: An Agent-Based Model for the Netherlands." Journal of Economic Interaction and Coordination.
Key features:
- Models economic motivations (wages, financial constraints) + sociological/psychological (peer effects, personality, geography)
- Evaluates policy impacts: student grants vs. loans on enrollment by SES
- Counter-intuitive finding: greater parental emphasis on achievement doesn't consistently raise district achievement
- Different institutional context (Netherlands) but similar agent architecture
3.5 Sirolly (2023) — Toy Model of College Admissions
Citation: Sirolly, A. (2023). "A Toy Model of College Admissions." Blog post.
Model setup:
- 50 colleges × 100 capacity, 5,000 applicants
- Applicant ability: W_i ~ N(0, 1²), noisy signal W̃_i ~ N(W_i, 0.1²)
- Utility: u_i(k) = I_k^(-β) + γ(K-k)
- Belief shrinkage: P_α(admit | W_i) = (1-α)P(...) + α × I_k (weight on public signal)
Application inflation mechanism:
1. Applicants become pessimistic (weight public admit rates over private info)
2. Apply to more colleges as hedging behavior
3. Colleges see lower admit rates
4. Public signal becomes more pessimistic → repeat
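The four-step loop above can be simulated with a deliberately crude sketch. Only the shrinkage formula comes from the model setup; everything else is an assumption made for illustration: every applicant has the same private admit probability, applies to just enough colleges to reach a 95% chance of at least one offer, and total offers are fixed at 5,000 (50 colleges × 100, treating capacity as offers). All names are hypothetical.

```python
import math


def shrunk_belief(private_prob, public_rate, alpha):
    """Blend the private estimate with the public admit rate (weight alpha)."""
    return (1.0 - alpha) * private_prob + alpha * public_rate


def apps_needed(belief, target=0.95):
    """Smallest n with 1 - (1 - belief)**n >= target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - belief))


def simulate_spiral(rounds=6, applicants=5000, total_offers=5000,
                    private_prob=0.5, alpha=0.5, public_rate=0.5):
    """Return a list of (apps per student, resulting public admit rate)."""
    history = []
    for _ in range(rounds):
        belief = shrunk_belief(private_prob, public_rate, alpha)
        n_apps = apps_needed(belief)                  # hedge against pessimism
        public_rate = total_offers / (applicants * n_apps)
        history.append((n_apps, public_rate))
    return history


history = simulate_spiral()
for n_apps, rate in history:
    print(n_apps, round(rate, 3))
```

Under these assumptions, applications per student rise and the public admit rate falls until the loop settles. A larger `alpha` puts more weight on the public signal and pushes the fixed point toward more applications.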
Relevance: Captures the application volume spiral that drives real-world trends. Our model's archetype-based application count implicitly models this, but could be extended with dynamic belief updating.
4. Structural/Equilibrium Models (Non-ABM but Relevant)
4.1 Epple, Romano & Sieg (2006) — Equilibrium in Higher Education Markets
Citation: Epple, D., Romano, R. & Sieg, H. (2006). "Admission, Tuition, and Financial Aid Policies in the Market for Higher Education." Econometrica, 74(4), 885–928.
Model features:
- Equilibrium model predicting: student sorting, financial aid, educational expenditures, outcomes
- Strict quality hierarchy emerges endogenously
- Higher-ranked colleges: need-based aid (can attract top students)
- Lower-ranked colleges: merit-based aid (must compete for good students)
Relevance: Provides theoretical backing for our college tier system and could inform financial aid modeling extensions.
4.2 Chao Fu (2014) — Equilibrium in the College Market
Citation: Fu, C. (2014). "Equilibrium Tuition, Applications, Admissions, and Enrollment in the College Market." Journal of Political Economy, 122(2), 225–281.
Key features:
- Structural model: students with heterogeneous abilities/preferences, application costs, uncertainty
- Colleges: observe noisy signals, set tuition + admissions cutoffs
- Estimated on NLSY97 data
- Joint equilibrium: tuition, apps, admissions, enrollment all endogenous
Relevance: Our model treats tuition/financial aid as exogenous (via net cost data). Fu's framework shows how these could be endogenized in future extensions.
5. Policy Simulation Work
5.1 CEPA / Reardon et al. — SES-Based Affirmative Action Simulation
Citation: Reardon, S.F., Baker, R. & Kasman, M. (2017). "Can Socioeconomic Status Substitute for Race in Affirmative Action College Admissions Policies?" CEPA Working Paper 15-04.
Key findings:
- Neither SES-based affirmative action nor race-targeted recruiting alone matches diversity of race-based affirmative action
- Combined SES + race-targeted recruiting can achieve comparable diversity
- Three policy levers with largest effects:
  1. Reducing credential enhancement inequality (test prep gap)
  2. Improving information quality for low-resource students
  3. Subsidizing application volume for low-income students
Relevance: Our model could run these policy counterfactuals. The three mechanisms map directly to parameters in our student generation and application decision logic.
5.2 SFFA v. Harvard — Simulation Evidence in Litigation
Key simulation results from trial evidence:
- Simulation D (removing race + ALDC preferences): African-American representation drops from 14% → 5%
- Without race-conscious admissions: African-American admits fall ~7pp, Hispanic ~4pp
- Asian-American admits increase ~3pp, white admits increase 6–8pp
- These simulations used Harvard's own admissions model with parameter modifications
Relevance: Demonstrates the real-world stakes of counterfactual admissions simulations, even though these were built on Harvard's own admissions model rather than an ABM. Our simulation could replicate these counterfactuals.
6. Available ABM Code & Platforms
6.1 NetLogo Models
| Model | Source | Focus |
|---|---|---|
| School_Choice_ABM | NetLogo Community Models Library | Chilean school choice with information signals |
| Medical College Admission (Jordan) | Assayed & Maheshwari, NetLogo 6.3 | Income + GPA → medical school |
| Matching mechanisms comparison | COMSES.net (codebase 4407) | Serial dictatorship, Boston, Chinese Parallel |
| School choice with information asymmetries | Academia.edu / ResearchGate | Santiago schools, income-based info gaps |
6.2 Python / Mesa
- Mesa 3 (2025): Modern Python ABM framework, could be used for a Python port of college-sim
- No publicly available Mesa model specifically for U.S. elite college admissions found
- Our JS-based simulation is unique in combining: real college data, multiple admission rounds, hook multipliers, and D3 visualization in a single self-contained file
6.3 Reardon et al. Code
- The JASSS 2016 paper references code but it does not appear to be publicly available in a standard repository
- Their fast portfolio optimization algorithm (Appendix D) is described in sufficient detail to reimplement
7. Key Parameters for Simulation Calibration — Cross-Study Summary
| Parameter | Value | Source |
|---|---|---|
| Resource-caliber (SES-SAT) correlation | r = 0.3 | Reardon et al. 2016 / ELS:2002 |
| Avg applications per student | 6.8 | CommonApp 2024–25 |
| SAT income gap (bottom vs. top quintile) | ~206 points | College Board / our sat_by_income.json |
| Legacy admit advantage | 5–6× (same credentials) | Chetty et al. 2023 |
| Athlete admit rate (Harvard) | 86% vs. <5.5% for non-ALDC applicants | Arcidiacono et al. 2022 |
| ALDC share of white admits (Harvard) | 43% | Arcidiacono et al. 2022 |
| ED admit advantage | +20–30 pp (~100 SAT equiv.) | Avery & Levin 2010 |
| Top-1% → Ivy+ attendance (same SAT) | 2× vs. middle class | Chetty et al. 2023 |
| Admit rate → perception → app volume feedback | α ∈ [0,1] shrinkage | Sirolly 2023 |
| Information quality advantage (SES) | 0.7 + 0.1 × resources | Reardon et al. 2016 |
| College quality hierarchy | Strict ordering emerges | Epple et al. 2006 |
8. Gaps in Literature & Opportunities for College-Sim
- No existing ABM combines real institutional data with multiple admission rounds and hook multipliers — our model fills this gap
- Post-SFFA simulation: most ABMs predate the 2023 ruling; our model can simulate race-neutral alternatives
- Application volume dynamics: the feedback loop (lower rates → more apps → lower rates) is described theoretically but rarely modeled with real college parameters
- Financial aid as a strategic variable: Epple et al. model it theoretically but no ABM integrates Chetty net-cost-by-income data
- Waitlist dynamics: rarely modeled in ABMs despite being a real mechanism; our model includes waitlist processing
- Geographic/feeder school networks: Hoxby & Avery's "missing one-offs" suggest network effects in application behavior that could be added to our high school archetypes