Student Welfare Optimization in College Matching




Student-Optimal Deferred Acceptance in Practice

Theoretical Foundation

The Gale-Shapley Deferred Acceptance (DA) algorithm (1962) produces a student-optimal stable matching when students propose: each student receives their most-preferred partner consistent with stability. The key properties:

  • Stability: No student-college pair mutually prefers each other over their assigned match

  • Strategy-proofness: Truthful preference reporting is a dominant strategy for the proposing side (students)

  • Optimality within stability: The student-proposing DA yields the best possible stable matching for students; every student weakly prefers it to every other stable matching

  • Lattice structure: The set of stable matchings forms a lattice, with student-optimal and college-optimal matchings at opposite extremes
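
The student-proposing procedure described above can be sketched in a few lines. This is a minimal illustration, not the code of any deployed match: the function name `deferred_acceptance` and the dictionary-based inputs are assumptions made for the example, and preferences are assumed to be strict.

```python
from collections import deque

def deferred_acceptance(student_prefs, college_prefs, capacity):
    """Student-proposing deferred acceptance (many-to-one).

    student_prefs: {student: [colleges, most preferred first]}
    college_prefs: {college: [acceptable students, most preferred first]}
    capacity:      {college: number of seats}
    Returns {student: college or None}.
    """
    rank = {c: {s: i for i, s in enumerate(prefs)}
            for c, prefs in college_prefs.items()}
    next_choice = {s: 0 for s in student_prefs}  # index of next college to try
    held = {c: [] for c in college_prefs}        # tentative admits per college
    free = deque(student_prefs)

    while free:
        s = free.popleft()
        prefs = student_prefs[s]
        if next_choice[s] >= len(prefs):
            continue  # list exhausted: s stays unmatched
        c = prefs[next_choice[s]]
        next_choice[s] += 1
        if s not in rank[c]:
            free.append(s)  # c finds s unacceptable; s tries the next college
            continue
        held[c].append(s)
        if len(held[c]) > capacity[c]:
            # Over capacity: keep the best, reject the least-preferred admit.
            held[c].sort(key=rank[c].__getitem__)
            free.append(held[c].pop())

    match = {s: None for s in student_prefs}
    for c, students in held.items():
        for s in students:
            match[s] = c
    return match
```

Acceptances are only tentative until the loop ends, which is what distinguishes DA from immediate acceptance. With students ann and bob both ranking X over Y, cat ranking Y over X, X ranking bob > ann > cat, Y ranking ann > cat > bob, and one seat each, the sketch assigns ann to Y and bob to X and leaves cat unmatched.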

Empirical Implementations

NYC High School Match (2003)

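  • Replaced an uncoordinated system in which roughly 30,000 students a year were administratively assigned to schools they had not listed

  • Adopted student-proposing DA with single tiebreaking

  • Cut the number of students assigned to a school they had not listed from roughly 30,000 to about 3,000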

  • Abdulkadiroglu, Pathak, and Roth found that simulations with field data favor single tiebreaking (breaking ties the same way at every school) for efficiency
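
To make the tiebreaking distinction concrete, the sketch below turns coarse priority classes into strict orderings using either one lottery shared by all schools (single tiebreaking) or a fresh lottery per school (multiple tiebreaking). The function name `break_ties` and the input layout are assumptions for illustration, not the NYC implementation.

```python
import random

def break_ties(priority_class, students, rng, single=True):
    """Convert coarse priority classes into strict priority lists.

    priority_class: {school: {student: class}}, lower class = higher priority.
    single=True  -> one lottery number per student, reused at every school.
    single=False -> an independent lottery draw at each school.
    """
    shared_lottery = {s: rng.random() for s in students}
    strict = {}
    for school, classes in priority_class.items():
        lottery = shared_lottery if single else {s: rng.random() for s in students}
        # Sort by priority class first, then by lottery number within a class.
        strict[school] = sorted(students, key=lambda s: (classes[s], lottery[s]))
    return strict
```

The resulting strict lists can be passed to any DA implementation as the schools' rankings. Under single tiebreaking, two schools that place all students in the same class end up with identical orderings.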

Boston School Choice (2005)

  • Boston School Committee replaced the "Boston mechanism" (immediate acceptance) with DA

  • Under the old Boston mechanism, sophisticated parents strategically misrepresented preferences while unsophisticated parents (disproportionately low-income and minority) reported truthfully and were penalized

  • The switch to strategy-proof DA eliminated the "gaming advantage" of informed families

  • Abdulkadiroglu, Pathak, Roth, and Sonmez documented both sophisticated and unsophisticated strategic behavior, establishing fairness as a rationale for strategy-proof mechanisms

NRMP Medical Residency Match

  • Roth (1984) showed that NRMP had independently converged on a DA-equivalent algorithm

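  • The match has run as a centralized clearinghouse since 1952, with periodic refinements (most notably the 1998 redesign as the applicant-proposing Roth-Peranson algorithm, which also accommodates couples)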

Known Limitations

  1. Not Pareto efficient: The DA outcome can be Pareto-dominated from the students' side, meaning some students could be made better off without making any student worse off. Abdulkadiroglu, Pathak, and Roth showed that the inefficiency "can potentially be severe," and empirical findings from the NYC match corroborated this
  2. Proposer advantage: Students get the student-optimal stable matching, but this can still be far from their first choices at highly selective institutions
  3. No mechanism is both stable and efficient: For some preference profiles every stable matching is Pareto-dominated for students, so stability and Pareto efficiency cannot both be guaranteed (Roth, 1982). Any Pareto improvement over the student-optimal stable matching then creates justified envy for some other student
  4. Tiebreaking matters: When colleges are indifferent among students, different tiebreaking rules lead to different matchings with different welfare properties
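
A three-student example makes limitations 1 and 3 concrete. In the sketch below the DA outcome is worked out by hand for the stated preferences (it is not computed by the snippet), and the helper names are assumptions for illustration. Swapping the seats of i1 and i2 makes both better off, but then i3 has justified envy at s1.

```python
def rank_of(student, college, student_prefs):
    """Position of college in the student's list; unlisted/None ranks last."""
    prefs = student_prefs[student]
    return prefs.index(college) if college in prefs else len(prefs)

def pareto_dominates(match_a, match_b, student_prefs):
    """True if no student is worse off in match_a and at least one is better off."""
    ranks = [(rank_of(s, match_a.get(s), student_prefs),
              rank_of(s, match_b.get(s), student_prefs)) for s in student_prefs]
    return all(a <= b for a, b in ranks) and any(a < b for a, b in ranks)

def has_justified_envy(match, student_prefs, college_prefs):
    """One seat per college: does some student prefer a college that ranks
    them above the student currently holding its seat?"""
    holder = {c: s for s, c in match.items()}
    for s, prefs in student_prefs.items():
        for c in prefs[:rank_of(s, match.get(s), student_prefs)]:
            if c in holder and college_prefs[c].index(s) < college_prefs[c].index(holder[c]):
                return True
    return False

student_prefs = {"i1": ["s2", "s1", "s3"],
                 "i2": ["s1", "s2", "s3"],
                 "i3": ["s1", "s2", "s3"]}
college_prefs = {"s1": ["i1", "i3", "i2"],
                 "s2": ["i2", "i1", "i3"],
                 "s3": ["i1", "i2", "i3"]}

da_outcome = {"i1": "s1", "i2": "s2", "i3": "s3"}  # student-proposing DA, by hand
swapped = {"i1": "s2", "i2": "s1", "i3": "s3"}     # i1 and i2 trade seats

print(pareto_dominates(swapped, da_outcome, student_prefs))
print(has_justified_envy(da_outcome, student_prefs, college_prefs))
print(has_justified_envy(swapped, student_prefs, college_prefs))
```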

Relevance to College Admissions

U.S. college admissions does not use DA. Instead, it operates as a decentralized market with:

  • Students applying to multiple colleges simultaneously

  • Colleges making independent admission decisions

  • Multiple rounds (ED, EA, RD) creating a sequential matching structure

  • No centralized clearinghouse

This decentralized structure introduces information frictions, strategic complexity, and welfare losses that a centralized DA mechanism would partially address.


Alternative Mechanisms for Student Welfare

Mechanism Comparison

| Mechanism | Strategy-Proof | Stable | Pareto Efficient | Used Where |
|---|---|---|---|---|
| Student-proposing DA | Yes (for students) | Yes | No | NYC schools, Boston, NRMP |
| College-proposing DA | No (for students) | Yes | No | Theoretical |
| Top Trading Cycles (TTC) | Yes | No | Yes | Theoretical; some kidney exchange variants |
| Boston/Immediate Acceptance | No | No | No | Pre-2005 Boston, China (variants) |
| Serial Dictatorship | Yes | N/A | Yes | Simple assignment problems |
| Decentralized (current U.S.) | N/A | No | No | U.S. college admissions |

Top Trading Cycles (TTC)

  • Pareto efficient and strategy-proof for students

  • Students can form "trading cycles" to swap assignments, leading to efficiency gains over DA

  • Not stable: can produce justified envy (a student prefers another school that would prefer them)

  • Abdulkadiroglu and Sonmez (2003) proposed TTC for school choice; it was considered but not adopted in Boston or NYC due to perceived fairness concerns about justified envy

  • When priority structures satisfy both strong acyclicity and Kesten-acyclicity, TTC and the Boston mechanism produce equivalent outcomes
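
The trading-cycle idea can be sketched as follows for school choice with capacities. This is an illustrative version under stated assumptions: the name `top_trading_cycles` is hypothetical, preferences are strict, and every school's priority list is assumed to contain every student.

```python
def top_trading_cycles(student_prefs, priorities, capacity):
    """TTC: students point at their favorite school with a free seat, schools
    point at their highest-priority remaining student, and students in a
    cycle receive the school they point at."""
    seats = dict(capacity)
    remaining = set(student_prefs)
    match = {}
    while remaining:
        wants = {}
        for s in sorted(remaining):
            target = next((c for c in student_prefs[s] if seats.get(c, 0) > 0), None)
            if target is None:
                match[s] = None        # no listed school has a seat left
                remaining.discard(s)
            else:
                wants[s] = target
        if not wants:
            break
        # Each pointed-at school points at its top remaining student.
        top = {c: next(s for s in priorities[c] if s in remaining)
               for c in set(wants.values())}
        # Follow pointers until a student repeats; that closes a cycle.
        path, seen, s = [], {}, next(iter(wants))
        while s not in seen:
            seen[s] = len(path)
            path.append(s)
            s = top[wants[s]]
        for t in path[seen[s]:]:
            match[t] = wants[t]
            seats[wants[t]] -= 1
            remaining.discard(t)
    return match
```

The sketch clears one cycle per pass, which does not change the final assignment. As an example, take students i1 (s2 > s1 > s3), i2 and i3 (both s1 > s2 > s3), one seat per school, and priorities s1: i1 > i3 > i2, s2: i2 > i1 > i3. The result gives i1 and i2 their first choices and sends i3 to s3, even though s1 ranks i3 above i2: the justified envy noted above.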

Boston/Immediate Acceptance Mechanism

  • Students rank schools; in each round, schools permanently accept top applicants up to capacity

  • Not strategy-proof: Parents must strategically rank "realistic" choices first, not true preferences

  • Sophisticated families game the system; unsophisticated families are harmed

  • Research on China's parallel college admissions (a Boston mechanism variant) found significant gender, rural-urban, and ethnic gaps in mismatching explained by risk aversion and information disadvantage

  • Some theoretical work suggests the Boston mechanism may produce higher aggregate welfare when all agents are fully strategic, but this assumption fails empirically
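
A small sketch shows why immediate acceptance rewards strategic ranking. The function name `immediate_acceptance` and the three-student example are assumptions for illustration. In round k each unassigned student applies to the k-th school on their submitted list, and acceptances are final.

```python
def immediate_acceptance(submitted_prefs, priorities, capacity):
    """Boston / immediate acceptance: permanent admits in each round."""
    seats = dict(capacity)
    match = {s: None for s in submitted_prefs}
    unassigned = set(submitted_prefs)
    rounds = max((len(p) for p in submitted_prefs.values()), default=0)
    for k in range(rounds):
        applicants = {}
        for s in unassigned:
            if k < len(submitted_prefs[s]):
                applicants.setdefault(submitted_prefs[s][k], []).append(s)
        for c, group in applicants.items():
            rank = {s: i for i, s in enumerate(priorities[c])}
            group.sort(key=rank.__getitem__)
            for s in group[:seats[c]]:   # admit up to the remaining seats
                match[s] = c
                unassigned.discard(s)
            seats[c] -= min(seats[c], len(group))
    return match

priorities = {"X": ["bob", "ann", "cat"],
              "Y": ["ann", "cat", "bob"],
              "Z": ["ann", "bob", "cat"]}
capacity = {"X": 1, "Y": 1, "Z": 1}
truthful = {"ann": ["X", "Y", "Z"], "bob": ["X", "Y", "Z"], "cat": ["Y", "X", "Z"]}
strategic = dict(truthful, ann=["Y", "X", "Z"])  # ann ranks her true 2nd choice first

print(immediate_acceptance(truthful, priorities, capacity)["ann"])
print(immediate_acceptance(strategic, priorities, capacity)["ann"])
```

Reporting truthfully, ann loses X to bob in round one, finds Y already filled in round two, and ends at her third choice Z. By ranking Y first she secures her true second choice, which is the kind of "realistic first choice" gaming described above.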

Consistent Pareto Improvement over DA

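  • Theoretical work (Erdil and Ergin, 2008; Tang and Yu, 2014) proposes mechanisms whose outcomes Pareto-improve on student-proposing DA. The gains come at a cost: Abdulkadiroglu, Pathak, and Roth showed that no strategy-proof mechanism Pareto-dominates DA with tiebreaking, so these mechanisms give up strategy-proofness

  • Erdil and Ergin's approach involves finding "stable improvement cycles" -- groups of students who can swap assignments while maintaining stability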

  • Practical significance: even small efficiency gains can matter at scale

Implications for Simulation

The decentralized U.S. college admissions market is none of these mechanisms -- it lacks strategy-proofness, stability, and efficiency. This creates space for modeling:

  • How much welfare is lost vs. a centralized DA mechanism?

  • How does information asymmetry compound these losses?

  • Which students bear disproportionate welfare costs?


Existing ABM Simulations of College Admissions

Reardon, Kasman, Klasik, and Baker (2016) -- Stanford CEPA

"Agent-Based Simulation Models of the College Sorting Process" Published in Journal of Artificial Societies and Social Simulation (JASSS), Vol. 19, Issue 1.

Model Architecture:

  • 8,000 students, 40 colleges, 150 seats per college (6,000 seats in total, enough for 75% of students)

  • Two student attributes: "resources" (socioeconomic capital) and "caliber" (academic achievement), bivariate normal with correlation 0.3

  • One college attribute: "quality" (running average of enrolled student caliber)

  • Three-stage annual cycle: application, admission, enrollment

Key Parameters:

| Parameter | Value | Source |
|---|---|---|
| Resource-caliber correlation | 0.3 | ELS:2002 |
| Quality reliability | 0.7 + 0.1 x resources | Plausible estimate |
| Caliber enhancement | +0.1 x resources | Test prep literature |
| Application count | 4 + 0.5 x resources | ELS:2002 |
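
One way to turn these parameters into a synthetic student population is sketched below. It is not the Reardon et al. code: both attributes are standardized to mean 0 and standard deviation 1, the application count is rounded and floored at 1, and the names `make_students` and `pearson` are hypothetical. Those choices are assumptions of this sketch.

```python
import math
import random

def make_students(n=8000, rho=0.3, seed=0):
    """Draw (resources, caliber) from a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    students = []
    for _ in range(n):
        resources = rng.gauss(0.0, 1.0)
        # Mixing an independent draw with resources yields correlation rho.
        caliber = rho * resources + math.sqrt(1 - rho ** 2) * rng.gauss(0.0, 1.0)
        students.append({
            "resources": resources,
            "caliber": caliber,
            # Caliber as colleges see it, after the +0.1 x resources enhancement.
            "apparent_caliber": caliber + 0.1 * resources,
            # 4 + 0.5 x resources, rounded and floored at 1 (sketch assumption).
            "n_applications": max(1, round(4 + 0.5 * resources)),
        })
    return students

def pearson(xs, ys):
    """Sample Pearson correlation, used to sanity-check the draw."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

students = make_students()
r = pearson([s["resources"] for s in students], [s["caliber"] for s in students])
print(f"sample correlation: {r:.3f}")
```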

Information Model:

  • Students observe college quality with noise; noise decreases with resources

  • Students observe their own caliber with some error

  • Information quality = 0.7 + 0.1 x resources (wealthy students have better information)
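
The notes give information quality as a reliability value but not how it maps to noise. The sketch below adopts the classical test-theory reading, reliability = true variance / (true variance + noise variance). That reading is an assumption here, as are the clamping bounds and the function names.

```python
import math
import random

def reliability(resources, base=0.7, slope=0.1, lo=0.05, hi=0.95):
    """Information quality 0.7 + 0.1 x resources, clamped to (0, 1).
    The clamp bounds are placeholders, not values from the literature."""
    return min(hi, max(lo, base + slope * resources))

def noise_sd(resources, true_sd=1.0):
    """Noise SD implied by reliability r: true_sd * sqrt((1 - r) / r)."""
    r = reliability(resources)
    return true_sd * math.sqrt((1 - r) / r)

def perceived_quality(true_quality, resources, rng, true_sd=1.0):
    """A student's noisy view of a college's quality."""
    return true_quality + rng.gauss(0.0, noise_sd(resources, true_sd))

rng = random.Random(0)
for res in (-1.0, 0.0, 1.0):
    print(res, round(noise_sd(res), 3), round(perceived_quality(0.0, res, rng), 3))
```

Under this reading a student one standard deviation above the mean in resources sees college quality with noticeably less noise than one a standard deviation below.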

Admission Model:

  • Colleges rank by observed caliber and admit based on expected yield

  • Yield estimated from 3-year running average

  • Colleges adjust admission volume to fill seats
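
The admission step can be sketched as "offers = seats / expected yield", with expected yield taken from the last three years. The function names, the default yield for a college with no history, and the floor on expected yield are assumptions for illustration.

```python
import math

def offers_to_make(seats, yield_history, window=3, default_yield=0.5):
    """Offers needed to fill `seats` given the running-average yield."""
    recent = yield_history[-window:]
    expected = sum(recent) / len(recent) if recent else default_yield
    expected = max(expected, 0.01)  # guard against a zero-yield history
    return math.ceil(seats / expected)

def admit(applicants, seats, yield_history):
    """applicants: list of (name, observed_caliber). Admit the top-ranked ones."""
    n_offers = offers_to_make(seats, yield_history)
    ranked = sorted(applicants, key=lambda a: a[1], reverse=True)
    return [name for name, _ in ranked[:n_offers]]

print(offers_to_make(150, [0.5, 0.5, 0.5]))
print(admit([("a", 1.2), ("b", 0.3), ("c", 2.0)], seats=1, yield_history=[0.5]))
```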

Key Findings:

  1. Resource-caliber correlation is the dominant driver of sorting inequality (eliminating it reduced the 90th-10th percentile gap from 20x to 4x)
  2. Information disparities, application enhancement, application count inequality, and utility preferences each produce modest individual effects but collectively create "non-trivial" stratification
  3. Model reached equilibrium by year 10-20
  4. Validated against IPEDS 2010-2011 data: selectivity and yield patterns matched real institutional data

Relevance: This is the closest published model to the college-sim project architecture. Key differences from our simulator: Reardon et al. use continuous distributions rather than archetype-based student generation, and a simpler two-attribute student model.

Assayed and Maheshwari (2023) -- Jordan Medical Colleges

"Agent-Based Simulation for University Students Admission: Medical Colleges in Jordan Universities"

  • Built in NetLogo v6.3

  • Two agents: high school students, medical colleges

  • Parameters: family income, high school GPA

  • Focused on seat allocation fairness

  • Found that high-ranking universities consistently set high GPA cutoffs

  • Simulated both partially centralized (each university sets cutoffs) and fully centralized (central authority allocates) scenarios

Assayed and Al-Sayed (2025) -- Survey Paper

"Student Behaviors in College Admissions: A Survey of Agent-Based Models" Published in International Journal of Emerging Multidisciplinaries.

  • Comprehensive survey of ABM approaches to college admissions

  • Identified common patterns: two agent types (students, colleges), three-stage matching (application, admission, enrollment)

  • Highlighted how family resources impact application strategy and outcomes

  • Emphasized the role of ABM in studying fairness and equity

Sirolly (2023) -- Toy Model

"A Toy Model of College Admissions"

  • 50 colleges, 100 seats each

  • Students modeled with normally distributed ability W ~ N(0,1)

  • Noisy signals sent to colleges

  • Utility function: u_i(k) = I_k^(-beta) + gamma(K - k)

  • Students solve portfolio optimization: maximize expected utility minus application costs

  • Found application volume concentrates at selective colleges; information cascades amplify competitive pressure
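
The portfolio problem can be illustrated with a brute-force sketch. It does not use the toy model's utility function above. It assumes each college has a fixed utility and an independent admission probability, and that the student enrolls at the best college that admits them. All names and numbers are hypothetical.

```python
from itertools import combinations

def expected_value(portfolio, utility, p_admit):
    """Expected utility of enrolling at the best college that says yes."""
    ev, p_no_better_offer = 0.0, 1.0
    for c in sorted(portfolio, key=utility.get, reverse=True):
        ev += p_no_better_offer * p_admit[c] * utility[c]
        p_no_better_offer *= 1 - p_admit[c]
    return ev

def best_portfolio(utility, p_admit, cost_per_app):
    """Try every subset; fine for a handful of colleges, exponential in general."""
    colleges = list(utility)
    best, best_net = (), 0.0  # applying nowhere yields zero
    for k in range(1, len(colleges) + 1):
        for combo in combinations(colleges, k):
            net = expected_value(combo, utility, p_admit) - cost_per_app * k
            if net > best_net:
                best, best_net = combo, net
    return set(best), best_net

utility = {"reach": 10, "target": 6, "safety": 3}
p_admit = {"reach": 0.1, "target": 0.5, "safety": 0.95}
print(best_portfolio(utility, p_admit, cost_per_app=0.2))
print(best_portfolio(utility, p_admit, cost_per_app=1.0))
```

With cheap applications the student applies everywhere; when each application costs 1.0 the reach school drops out of the portfolio, since its expected contribution no longer covers the cost.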

Other Notable Models

  • Reardon et al. (2015) extended the base ABM to study affirmative action policy effects, simulating race-based and socioeconomic-based policies

  • Matching Impacts of School Admission Mechanisms (ResearchGate, 2016): compared DA, Boston, and TTC mechanisms using agent-based simulation, measuring mismatch and welfare outcomes

  • Lee et al. (2023, Cornell): used learned admission-prediction models as replacement for standardized tests; calibration-focused approach


Undermatching / Mismatch Literature

Hoxby and Avery (2012) -- The Foundational Paper

"The Missing 'One-Offs': The Hidden Supply of High-Achieving, Low-Income Students" NBER Working Paper 18586.

Key Findings:

  • 25,000-35,000 low-income students annually have SAT/ACT scores and GPAs in the top 10% nationally

  • The vast majority do not apply to any selective college, despite being admissible

  • These students are geographically dispersed ("one-offs") in small towns, not concentrated in urban areas where selective colleges recruit

  • Selective institutions would often cost them less than non-selective alternatives due to generous financial aid

  • High schools serving these students have overworked counselors unfamiliar with selective admissions

Student Typology:

  • "Achievement-typical" low-income students: application behavior mirrors high-income peers with similar achievement (only 8% of high-achieving low-income students)

  • "Income-typical" low-income students: application behavior mirrors other low-income students regardless of achievement (the vast majority, ~92%)

Hoxby and Turner (2013) -- The ECO Intervention

"Expanding College Opportunities for High-Achieving, Low-Income Students"

Intervention Design:

  • Low-cost information packet sent to 39,682 high-achieving, low-income students (2010-2012)

  • Included: application guidance, financial aid information, fee waivers, college resource/graduation data

  • Cost: approximately $6 per student

Results:

  • Treated students were 46% more likely to enroll at peer-quality institutions matching their abilities

  • Institutions attended had graduation rates 15.1% higher on average

  • Instructional spending was 21.5% higher at enrolled institutions

  • Benefit-to-cost ratio was "extremely high, even under the most conservative assumptions"

  • Impact was 275x greater than equivalent spending on in-person counseling

Implication: Information intervention alone dramatically reduces undermatching. The problem is primarily informational, not financial or academic.

Dillon and Smith (2013) -- Determinants of Mismatch (NBER Working Paper 19286)

Key Findings:

  • Mismatch is driven primarily by student application and enrollment decisions, not college admission decisions

  • Most mismatched students either never applied to well-matched schools or were accepted but chose differently

  • Financial constraints, information access, and public college options all affect mismatch probability

  • Better-informed students are less likely to be mismatched; students from lower socioeconomic backgrounds tend to have less information and are therefore more likely to undermatch

Lincove and Cortes (2016) -- Automatic Admissions

"Match or Mismatch? Automatic Admissions and College Preferences of Low- and High-Income Students" NBER Working Paper 22559.

  • Studied Texas top 10% automatic admissions policy

  • Low-income students still undermatch even with guaranteed admission

  • Preferences, not access, drive much of the remaining mismatch

Bastedo and Flaster (2014) -- Methodological Critique

"Conceptual and Methodological Problems in Research on College Undermatch"

  • Challenged assumptions in undermatching research

  • Argued that definitions of "match" are often arbitrary

  • Questioned whether attending a more selective institution is always welfare-improving

  • Important caveat for simulation design: how we define "optimal match" matters

Mizala et al. (2026) -- International Evidence

"Bright but Poor: Undermatching in the Access to Postsecondary Education" American Educational Research Journal.

  • Extended undermatching analysis to international contexts

  • Confirmed that socioeconomic status is a persistent predictor of undermatching across different educational systems

Welfare Consequences of Undermatching

Empirical evidence on outcomes:

  1. Graduation rates: Students who undermatch graduate at lower rates than peers at better-matched institutions
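  2. Earnings: Attending a more selective institution is associated with higher lifetime earnings. Dale and Krueger (2014) find that much of this premium disappears once the colleges a student applied to are controlled for, but that it persists for low-income and minority students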
  3. Graduate school access: Selective college attendance increases probability of graduate/professional school enrollment
  4. Network effects: Peer quality, alumni networks, and institutional resources compound over careers

Key Parameters for Simulation

Based on the literature, these are the critical parameters for modeling student welfare in a college admissions simulation:

Student-Side Parameters

| Parameter | Literature Value | Source |
|---|---|---|
| Resource-caliber correlation | 0.3 | Reardon et al. (ELS:2002) |
| Information quality (low-resource) | 0.7 base | Reardon et al. |
| Information quality (high-resource) | 0.7 + 0.1 x resources | Reardon et al. |
| Application count (low-resource) | 4 applications | Reardon et al. (ELS:2002) |
| Application count (high-resource) | 4 + 0.5 x resources (up to ~7) | Reardon et al. (ELS:2002) |
| Caliber enhancement from resources | +0.1 x resources | Test prep literature |
| Undermatching rate (low-income, high-achieving) | ~92% income-typical behavior | Hoxby & Avery (2012) |
| Information intervention effect | +46% peer enrollment | Hoxby & Turner (2013) |

College-Side Parameters

| Parameter | Literature Value | Source |
|---|---|---|
| Yield estimation window | 3-year running average | Reardon et al. |
| Admission volume adjustment | Based on prior year fill rate | Reardon et al. |
| Quality metric | Weighted average enrolled caliber | Reardon et al. |
| ED yield boost | Binding commitment ~90%+ yield | Common knowledge |

System-Level Parameters

| Parameter | Description | Typical Range |
|---|---|---|
| Stability | % of matched pairs with no blocking pair | 85-95% in decentralized markets |
| Pareto efficiency | % of students who could improve without harming others | DA achieves ~85-90% of optimal |
| Undermatching rate | % of students at institutions below their caliber | 20-40% depending on definition |
| Strategic behavior prevalence | % of students who misrepresent preferences | 10-30% under non-strategy-proof mechanisms |
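
The stability measure in the table can be computed by enumerating blocking pairs. The sketch below is one possible operationalization, reporting the share of students who are not part of any blocking pair. The function names are hypothetical, and every enrolled student is assumed to appear on the enrolling college's ranking.

```python
def blocking_pairs(match, student_prefs, college_prefs, capacity):
    """(student, college) pairs where the student prefers the college to their
    match and the college has a free seat or prefers them to a current enrollee."""
    rank = {c: {s: i for i, s in enumerate(p)} for c, p in college_prefs.items()}
    enrolled = {c: [] for c in college_prefs}
    for s, c in match.items():
        if c is not None:
            enrolled[c].append(s)
    pairs = []
    for s, prefs in student_prefs.items():
        current = match.get(s)
        for c in prefs:
            if c == current:
                break  # colleges after this one are less preferred
            if s not in rank[c]:
                continue  # the college finds this student unacceptable
            has_seat = len(enrolled[c]) < capacity[c]
            displaces = any(rank[c][s] < rank[c][t] for t in enrolled[c])
            if has_seat or displaces:
                pairs.append((s, c))
    return pairs

def stability_share(match, student_prefs, college_prefs, capacity):
    """Share of students not involved in any blocking pair."""
    blocked = {s for s, _ in blocking_pairs(match, student_prefs, college_prefs, capacity)}
    return 1 - len(blocked) / len(match)
```

A value of 1.0 means the matching is stable; a decentralized outcome would be expected to score lower.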

Information Asymmetry Parameters

  1. Student knowledge of own caliber: How accurately students assess their competitiveness (signal noise)
  2. Student knowledge of college quality: How well students perceive fit and resources (correlated with SES)
  3. College knowledge of student quality: Admissions offices observe noisy signals (GPA, SAT, essays) of true ability
  4. Strategic sophistication: Proportion of students who optimize application portfolios (higher in high-SES)

Recommendations for College Simulator

1. Add Information Asymmetry Layer

The current simulator uses deterministic scoring. The literature strongly suggests adding:

  • Student perception noise: Students should have imperfect knowledge of their admission probability at each college, with noise inversely correlated with socioeconomic status

  • Application portfolio optimization: Students should choose where to apply based on perceived probability x perceived utility, not perfect knowledge

  • Counselor quality: High school counselor quality (varying by school type) should influence which colleges students consider

Implementation suggestion: Add a perceptionNoise parameter to each student archetype. Elite prep school students get low noise (0.05-0.1); rural/under-resourced students get high noise (0.3-0.5). This single parameter captures much of the Reardon et al. information asymmetry finding.
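
A minimal sketch of the suggested parameter, assuming the noise value is the standard deviation of an additive Gaussian error on the student's true admission probability. That interpretation, the archetype keys, and the function names are assumptions; only the numeric ranges come from the suggestion above.

```python
import random

# Ranges from the suggestion above; archetype keys are hypothetical.
PERCEPTION_NOISE = {
    "elite_prep": (0.05, 0.10),
    "rural_under_resourced": (0.30, 0.50),
}

def draw_noise(archetype, rng):
    """Pick a per-student noise level uniformly within the archetype's range."""
    lo, hi = PERCEPTION_NOISE[archetype]
    return rng.uniform(lo, hi)

def perceived_probability(true_p, noise, rng):
    """True admission probability plus Gaussian error, clamped to [0, 1]."""
    return min(1.0, max(0.0, true_p + rng.gauss(0.0, noise)))

rng = random.Random(42)
for archetype in PERCEPTION_NOISE:
    noise = draw_noise(archetype, rng)
    guesses = [perceived_probability(0.30, noise, rng) for _ in range(5)]
    print(archetype, [round(g, 2) for g in guesses])
```

Students would then build application lists from the perceived values rather than the true ones, so high-noise students misjudge reaches and safeties more often.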

2. Model Undermatching Explicitly

Based on Hoxby and Avery:

  • Income-typical behavior: 92% of high-achieving low-income students should exhibit application patterns matching their income cohort, not their achievement cohort

  • Achievement-typical behavior: Only 8% of such students apply like high-achieving high-income peers

  • Geographic isolation: Students at rural or under-resourced high schools should have shorter college consideration lists biased toward local/state options

Implementation suggestion: When generating application lists for students from under-resourced high schools, apply a "consideration set filter" that removes colleges the student has never heard of (probability based on distance, marketing reach, and school counselor quality).
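
One way to sketch the filter is a "noisy-OR" awareness probability: a student knows a college if it is close, or its marketing reaches them, or their counselor mentions it. The functional form, the 300 km decay scale, and all names below are placeholders chosen for illustration, not calibrated values.

```python
import math
import random

def awareness_probability(distance_km, marketing_reach, counselor_quality,
                          decay_km=300.0):
    """Probability the student has heard of the college.
    marketing_reach and counselor_quality are assumed to lie in [0, 1]."""
    local = math.exp(-distance_km / decay_km)  # familiarity decays with distance
    return 1 - (1 - local) * (1 - marketing_reach) * (1 - counselor_quality)

def consideration_set(colleges, counselor_quality, rng):
    """Keep only the colleges the student turns out to be aware of."""
    return [c["name"] for c in colleges
            if rng.random() < awareness_probability(
                c["distance_km"], c["marketing_reach"], counselor_quality)]

colleges = [
    {"name": "State U", "distance_km": 40, "marketing_reach": 0.6},
    {"name": "Selective College", "distance_km": 1800, "marketing_reach": 0.2},
]
print(consideration_set(colleges, counselor_quality=0.1, rng=random.Random(0)))
```

Application lists are then generated only from the returned set, so distant selective colleges drop out for students with weak counseling.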

3. Track Welfare Metrics

Add post-simulation welfare analysis:

  • Match quality: For each student, compute the gap between their enrolled college's tier and their "optimal" placement based on academic index

  • Undermatching rate: Percentage of students enrolled at colleges 1+ tiers below their academic qualification

  • Overmatching rate: Percentage enrolled 1+ tiers above (these students face academic mismatch risk)

  • Welfare by demographic: Break down match quality by student archetype, high school type, hook status

  • Counterfactual DA comparison: Run the same student population through a centralized DA mechanism and compare aggregate welfare
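
The first four metrics can be computed in one pass over the enrolled students. The sketch assumes tiers are integers with 1 as the most selective, so a positive gap means undermatch. The field names and the function name `welfare_summary` are hypothetical.

```python
from collections import defaultdict

def welfare_summary(students):
    """students: dicts with 'enrolled_tier', 'optimal_tier' (1 = most selective)
    and 'archetype'. Students with enrolled_tier None are skipped."""
    enrolled = [s for s in students if s["enrolled_tier"] is not None]
    if not enrolled:
        return {}
    gaps = [s["enrolled_tier"] - s["optimal_tier"] for s in enrolled]
    by_group = defaultdict(list)
    for s, gap in zip(enrolled, gaps):
        by_group[s["archetype"]].append(gap)
    n = len(enrolled)
    return {
        "undermatch_rate": sum(g >= 1 for g in gaps) / n,  # 1+ tiers below
        "overmatch_rate": sum(g <= -1 for g in gaps) / n,  # 1+ tiers above
        "mean_gap_by_archetype": {k: sum(v) / len(v) for k, v in by_group.items()},
    }

sample = [
    {"archetype": "rural", "enrolled_tier": 3, "optimal_tier": 1},
    {"archetype": "rural", "enrolled_tier": 2, "optimal_tier": 2},
    {"archetype": "prep", "enrolled_tier": 1, "optimal_tier": 2},
    {"archetype": "prep", "enrolled_tier": 1, "optimal_tier": 1},
]
print(welfare_summary(sample))
```

The same grouping can be repeated with high school type or hook status as the key.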

4. Implement Yield Management Feedback

Colleges should adjust behavior over simulation runs:

  • Track acceptance rate vs. target enrollment

  • Adjust number of offers based on historical yield

  • This creates the dynamic feedback loop that Reardon et al. found drives equilibrium convergence (10-20 iterations)
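
The feedback loop for a single college can be sketched as below. Here the true yield is held fixed and enrollment is deterministic, which is a simplification: in the full simulation yield would emerge from student choices and shift from year to year. The function name and defaults are assumptions.

```python
import math

def simulate_yield_feedback(seats, true_yield, years, initial_guess=0.5, window=3):
    """Return the fill rate (enrolled / seats) for each simulated year."""
    yield_history, fill_rates = [], []
    for _ in range(years):
        recent = yield_history[-window:]
        expected = sum(recent) / len(recent) if recent else initial_guess
        offers = math.ceil(seats / expected)
        enrolled = round(offers * true_yield)  # stand-in for student decisions
        yield_history.append(enrolled / offers)
        fill_rates.append(enrolled / seats)
    return fill_rates

print(simulate_yield_feedback(seats=150, true_yield=0.25, years=5))
```

A college that starts out overestimating its yield under-fills in the first year, then corrects once observed yield enters the running average.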

5. Model Strategic Behavior Heterogeneity

Not all students are equally strategic:

  • Sophisticated applicants (high-SES, well-counseled): optimize application portfolios, use ED strategically, apply to safety/target/reach spread

  • Naive applicants (low-SES, poorly counseled): apply to too few schools, skip safeties, miss ED advantages, or apply only to local/familiar options

  • The Boston mechanism literature shows this heterogeneity causes the most welfare damage under non-strategy-proof mechanisms

6. Consider Adding a DA Benchmark Mode

For research validity, implement an optional mode where:

  • All students submit truthful preference rankings

  • All colleges submit preference rankings

  • A centralized DA algorithm produces the student-optimal stable matching

  • Compare this benchmark to the decentralized simulation outcome

This would allow measuring the "price of decentralization" in student welfare terms.
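
One simple way to express that price is the difference in mean rank of the assigned college under students' true preferences. Mean rank is only an ordinal proxy for welfare, and scoring unmatched students one rank below their last listed choice is an assumption of this sketch, as are the function names.

```python
def mean_rank(match, true_prefs):
    """Average rank of each student's assignment (1 = first choice).
    Unmatched or unlisted assignments count as len(prefs) + 1."""
    total = 0
    for s, prefs in true_prefs.items():
        c = match.get(s)
        total += (prefs.index(c) + 1) if c in prefs else (len(prefs) + 1)
    return total / len(true_prefs)

def price_of_decentralization(decentralized, benchmark, true_prefs):
    """Positive values mean students do worse than under the DA benchmark."""
    return mean_rank(decentralized, true_prefs) - mean_rank(benchmark, true_prefs)

true_prefs = {"ann": ["X", "Y"], "bob": ["X", "Y"]}
benchmark = {"ann": "Y", "bob": "X"}       # e.g. output of the DA benchmark mode
decentralized = {"ann": None, "bob": "X"}  # e.g. ann never applied to Y
print(price_of_decentralization(decentralized, benchmark, true_prefs))
```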

7. Calibration Targets

Validate the simulation against known empirical patterns:

  • Acceptance rate vs. yield rate correlation should match IPEDS data

  • Proportion of students within 1 tier of their "match" should be 60-80%

  • Low-SES undermatching rate should be 2-4x higher than high-SES

  • ED acceptance rate advantage should be 2-3x regular admission at top schools

  • Hook multiplier effects should produce demographic compositions matching published CDS data
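
The numeric targets above can be collected into a small check that runs after each simulation. The metric names are hypothetical, and the two IPEDS/CDS comparisons are omitted because they need external data.

```python
# Target ranges taken from the list above; metric names are placeholders.
CALIBRATION_TARGETS = {
    "share_within_one_tier": (0.60, 0.80),
    "low_to_high_ses_undermatch_ratio": (2.0, 4.0),
    "ed_to_rd_acceptance_ratio": (2.0, 3.0),
}

def check_calibration(metrics, targets=CALIBRATION_TARGETS):
    """Return {metric: True/False} for each target; missing metrics fail."""
    results = {}
    for name, (lo, hi) in targets.items():
        value = metrics.get(name)
        results[name] = value is not None and lo <= value <= hi
    return results

run_metrics = {
    "share_within_one_tier": 0.72,
    "low_to_high_ses_undermatch_ratio": 4.6,
    "ed_to_rd_acceptance_ratio": 2.4,
}
print(check_calibration(run_metrics))
```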


References

  1. Gale, D. & Shapley, L.S. (1962). "College Admissions and the Stability of Marriage." American Mathematical Monthly, 69(1), 9-15.
  2. Roth, A.E. (2008). "Deferred Acceptance Algorithms: History, Theory, Practice, and Open Questions." International Journal of Game Theory, 36, 537-569.
  3. Abdulkadiroglu, A., Pathak, P.A., & Roth, A.E. (2005). "The New York City High School Match." American Economic Review P&P, 95(2), 364-367.
  4. Abdulkadiroglu, A., Pathak, P.A., Roth, A.E., & Sonmez, T. (2006). "Changing the Boston School Choice Mechanism." NBER Working Paper 11965.
  5. Abdulkadiroglu, A. & Sonmez, T. (2003). "School Choice: A Mechanism Design Approach." American Economic Review, 93(3), 729-747.
  6. Hoxby, C.M. & Avery, C. (2012). "The Missing 'One-Offs': The Hidden Supply of High-Achieving, Low-Income Students." NBER Working Paper 18586.
  7. Hoxby, C.M. & Turner, S. (2013). "Expanding College Opportunities for High-Achieving, Low-Income Students." Stanford Institute for Economic Policy Research Discussion Paper 12-014.
  8. Reardon, S.F., Kasman, M., Klasik, D., & Baker, R. (2016). "Agent-Based Simulation Models of the College Sorting Process." Journal of Artificial Societies and Social Simulation, 19(1), 8.
  9. Abdulkadiroglu, A., Pathak, P.A., & Roth, A.E. (2009). "Strategy-Proofness versus Efficiency in Matching with Indifferences: Redesigning the NYC High School Match." American Economic Review, 99(5), 1954-1978.
  10. Erdil, A. & Ergin, H. (2008). "What's the Matter with Tie-Breaking? Improving Efficiency in School Choice." American Economic Review, 98(3), 669-689.
  11. Bastedo, M.N. & Flaster, A. (2014). "Conceptual and Methodological Problems in Research on College Undermatch." Educational Researcher, 43(2), 93-99.
  12. Assayed, S.K. & Maheshwari, P. (2023). "Agent-Based Simulation for University Students Admission: Medical Colleges in Jordan Universities."
  13. Assayed, S.K. & Al-Sayed, S. (2025). "Student Behaviors in College Admissions: A Survey of Agent-Based Models." International Journal of Emerging Multidisciplinaries.
  14. Kloosterman, A. & Troyan, P. (2020). "School choice with asymmetric information: Priority design and the curse of acceptance." Theoretical Economics.
  15. Mizala, A. et al. (2026). "Bright but Poor: Undermatching in the Access to Postsecondary Education." American Educational Research Journal.