Common Data Set & Admissions Data: Top 20-50 Selective Colleges


Research compiled from 2023-24 Common Data Sets and institutional admissions reports.


Data Table: All Schools

| School | Acceptance Rate | SAT 25th | SAT 75th | CDS Source | Data Availability |
|---|---|---|---|---|---|
| Columbia | 3.9% | 1510 | 1560 | CDS 2023-24 PDF | High |
| UPenn | 5.9% | 1500 | 1570 | CDS Portal | High |
| Brown | 5.5% | 1510 | 1560 | CDS 2023-24 PDF | High |
| Dartmouth | 6.2% | 1450 | 1550 | CDS Portal | High |
| Cornell | 8.7% | 1510 | 1560 | CDS Portal | High |
| Duke | 5.9% | 1500 | 1570 | CDS Portal | High |
| Northwestern | 7.8% | 1500 | 1560 | CDS 2023-24 PDF | High |
| UChicago | 6.5% | 1510 | 1560 | CDS 2023-24 PDF | High |
| Caltech | 2.6% | 1510 | 1570 | CDS Portal | High |
| Johns Hopkins | 7.5% | 1500 | 1560 | CDS 2023-24 PDF | High |
| Vanderbilt | 5.6% | 1510 | 1560 | CDS Portal | High |
| Rice | 9.5% | 1510 | 1560 | CDS 2023-24 PDF | High |
| Notre Dame | 12.4% | 1440 | 1540 | CDS 2023-24 PDF | High |
| Georgetown | 12.3% | 1400 | 1540 | CDS Portal | High |
| Carnegie Mellon | 11.3% | 1500 | 1560 | CDS 2023-24 PDF | High |
| WashU | 12.0% | 1500 | 1570 | CDS 2023-24 PDF | High |
| Emory | 11.4% | 1480 | 1540 | CDS 2023-24 PDF | High |
| Tufts | 11.4% | 1480 | 1540 | CDS via Fact Book | Medium |
| Boston College | 16.7% | 1450 | 1520 | CDS Portal | High |
| UVA | 20.0% | 1400 | 1540 | CDS Portal | High |
| UCLA | 8.6% | N/A | N/A | CDS 2023-24 PDF | Medium |
| Michigan | 18.0% | 1340 | 1530 | CDS 2023-24 PDF | High |
| Williams | 7.5% | 1500 | 1560 | CDS 2023-24 PDF | High |
| Amherst | 9.0% | 1500 | 1560 | CDS Portal | High |
| Middlebury | 10.0% | 1440 | 1550 | CDS Portal (2024-25) | Medium |

Notes on SAT Data

  • UCLA: Does not use SAT/ACT in admissions (UC system test-free policy). No SAT percentiles reported in CDS.

  • Test-optional schools: Most schools above were test-optional for 2023-24. Their SAT ranges reflect score submitters only (typically 40-60% of enrolled students), which biases the reported percentiles upward relative to the full enrolled class.

  • Michigan: Wide SAT range (1340-1530) reflects large public university with in-state/out-of-state mix. Overall acceptance rate ~18%; in-state ~39%, out-of-state ~18%.

  • Georgetown: Requires SAT/ACT submission (not test-optional), so SAT data is more representative of the full admitted class.
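The submitter-only bias noted above can be illustrated with a small sketch. Everything in it is hypothetical: the enrolled scores are made-up, evenly spaced values, and the rule that only students at or above 1450 submit is an assumption chosen so that the submitter share falls in the 40-60% range mentioned above. It is not drawn from any school's CDS.

```python
from statistics import quantiles


def sat_quartiles(scores):
    """Return the (25th, 75th) percentiles of a list of SAT scores."""
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    return q1, q3


# Hypothetical enrolled class: 11 evenly spaced scores, 1300 through 1600.
enrolled = list(range(1300, 1601, 30))

# Assumed submission rule: only students scoring 1450 or higher send scores.
submitters = [s for s in enrolled if s >= 1450]

full_q1, full_q3 = sat_quartiles(enrolled)
sub_q1, sub_q3 = sat_quartiles(submitters)
submit_share = len(submitters) / len(enrolled)

print(f"Submitter share: {submit_share:.0%}")
print(f"Full class 25th/75th: {full_q1:.0f} / {full_q3:.0f}")
print(f"Submitters 25th/75th: {sub_q1:.0f} / {sub_q3:.0f}")
```

Under these assumptions both submitter percentiles sit above the full-class percentiles, which is the direction of bias a simulation would need to correct for when it treats CDS ranges as describing the whole class.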


Acceptance Rate Tiers (for simulation calibration)

Ultra-Selective (<6%)

| School | Rate | Notes |
|---|---|---|
| Caltech | 2.6% | Smallest class size (~230) |
| Columbia | 3.9% | Highest Ivy selectivity |
| Brown | 5.5% | Test-optional |
| Vanderbilt | 5.6% | Rising selectivity trend |
| UPenn | 5.9% | Strong ED component |
| Duke | 5.9% | Strong ED component |

Very Selective (6-10%)

| School | Rate | Notes |
|---|---|---|
| Dartmouth | 6.2% | Small Ivy |
| UChicago | 6.5% | First to release CDS in 2024 |
| Johns Hopkins | 7.5% | STEM-focused |
| Williams | 7.5% | Top LAC |
| Northwestern | 7.8% | Suburban setting |
| UCLA | 8.6% | Test-free (UC system) |
| Cornell | 8.7% | Largest Ivy class |
| Amherst | 9.0% | Top LAC |
| Rice | 9.5% | Small research university |

Selective (10-20%)

| School | Rate | Notes |
|---|---|---|
| Middlebury | 10.0% | Top LAC |
| Carnegie Mellon | 11.3% | STEM/CS focused |
| Emory | 11.4% | Rising selectivity |
| Tufts | 11.4% | International focus |
| WashU | 12.0% | Strong pre-med |
| Georgetown | 12.3% | Requires test scores |
| Notre Dame | 12.4% | Strong legacy/Catholic tradition |
| Boston College | 16.7% | Jesuit institution |
| Michigan | 18.0% | Large public flagship |
| UVA | 20.0% | Public; in-state/OOS split |
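For calibration code, the tier assignment can be sketched as a small function. The boundary handling is inferred from the tables above rather than stated anywhere: Middlebury (10.0%) and UVA (20.0%) are both listed as Selective, so the sketch treats the tiers as under 6%, 6% to under 10%, and 10% through 20%. The function name and the fallback label are hypothetical.

```python
def selectivity_tier(acceptance_rate_pct: float) -> str:
    """Map an acceptance rate (in percent) to one of the three tiers above.

    Boundaries are inferred from the tier tables: 10.0% and 20.0% both
    fall in "Selective".
    """
    if not 0.0 <= acceptance_rate_pct <= 100.0:
        raise ValueError("acceptance rate must be between 0 and 100 percent")
    if acceptance_rate_pct < 6.0:
        return "Ultra-Selective"
    if acceptance_rate_pct < 10.0:
        return "Very Selective"
    if acceptance_rate_pct <= 20.0:
        return "Selective"
    return "Outside listed tiers"  # hypothetical label; no school above lands here


# A few rates taken from the tables above.
sample_rates = {"Caltech": 2.6, "Dartmouth": 6.2, "Middlebury": 10.0, "UVA": 20.0}
tiers = {school: selectivity_tier(rate) for school, rate in sample_rates.items()}
print(tiers)
```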

Feeder School Data Availability

Schools That Publish or Have Published Feeder Data

No colleges in this list officially publish feeder school matriculation data in their CDS or admissions reports. Feeder school data comes from external sources:

  1. Harvard Crimson (2024 investigation): Identified 21 schools that collectively send ~1 in 11 accepted Harvard students. Top feeders include Phillips Exeter, Phillips Andover, Boston Latin, Stuyvesant. This is the most comprehensive public feeder data for any elite school.

  2. National Student Clearinghouse: Publishes aggregate High School Benchmarks Report with high-school-to-college enrollment data. Does not break out by specific college destination at the individual school level publicly.

  3. Michigan School Data Portal: Michigan publishes state-level data showing which colleges Michigan high school graduates attend (mischooldata.org).

  4. College-Specific Sources:

      • Princeton's Class of 2028: ~60% from public high schools (published aggregate, not school-specific)

      • Yale's Class of 2027: ~60% from public high schools (published aggregate, not school-specific)

      • Brown: Publishes aggregate public/private split but not individual feeder schools

  5. Third-Party Databases:

      • IvyLeagueFeeders.com: Crowdsourced data on feeder school placements

      • Crimson Education research: Lists of top 20 feeder schools to Ivies

      • Niche.com: School profiles sometimes include college matriculation data reported by schools

Feeder Data Availability by School

| School | Official Feeder Data | Public/Private Split | External Feeder Data |
|---|---|---|---|
| Columbia | No | Yes (CDS) | Limited |
| UPenn | No | Yes (CDS) | Limited |
| Brown | No | Yes (CDS) | Limited |
| Dartmouth | No | Yes (CDS) | Limited |
| Cornell | No | Yes (CDS) | Limited |
| Duke | No | Yes (CDS) | Limited |
| Northwestern | No | Yes (CDS) | Limited |
| UChicago | No | Yes (CDS) | Limited |
| Caltech | No | Yes (CDS) | Very Limited |
| Johns Hopkins | No | Yes (CDS) | Limited |
| Vanderbilt | No | Yes (CDS) | Limited |
| Rice | No | Yes (CDS) | Limited |
| Notre Dame | No | Yes (CDS) | Limited |
| Georgetown | No | Yes (CDS) | Limited |
| Carnegie Mellon | No | Yes (CDS) | Limited |
| WashU | No | Yes (CDS) | Limited |
| Emory | No | Yes (CDS) | Limited |
| Tufts | No | Yes (CDS) | Limited |
| Boston College | No | Yes (CDS) | Limited |
| UVA | No | Yes (CDS) | Moderate (state data) |
| UCLA | No | Yes (CDS) | Moderate (UC system data) |
| Michigan | No | Yes (CDS) | Good (MI school data portal) |
| Williams | No | Yes (CDS) | Very Limited |
| Amherst | No | Yes (CDS) | Very Limited |
| Middlebury | No | Yes (CDS) | Very Limited |

Key Finding

No elite college publishes official feeder school lists. The best feeder data comes from:

  • Investigative journalism (Harvard Crimson 2024)

  • State education departments (Michigan, California UC system)

  • Crowdsourced platforms (IvyLeagueFeeders.com)

  • Individual high school college counseling reports (e.g., Naviance data shared by schools)


Data Gaps Assessment

High-Quality Data Available (Simulation-Ready)

These data points are reliably available from CDS for all 25 schools:

  • Overall acceptance rate

  • Total applications received / admitted / enrolled

  • SAT/ACT score percentiles (except UCLA)

  • GPA distributions

  • Class rank distributions

  • Geographic diversity (in-state/out-of-state/international)

  • First-generation student percentage

  • Pell Grant recipient percentage

Moderate Data Gaps

  • ED/EA acceptance rates: Some schools report ED numbers in CDS Section C (e.g., Carnegie Mellon ED: 12.5%), but many do not break out by round. This is critical for simulation round modeling.

  • Hook multipliers: No CDS reports explicit admit-rate boosts for athletes, legacies, or donors. Must be inferred from investigative reporting (e.g., Harvard trial data, Duke/Princeton legacy studies).

  • Yield rates: CDS provides enrolled/admitted ratios, but not by round or demographic segment.

  • Waitlist conversion rates: CDS Section C2 has waitlist data for some schools, but reporting is inconsistent.
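The rates discussed above all derive from a handful of counts. A minimal sketch follows, assuming a school reports applicants, admits, enrollees, and (where available) waitlist figures. The `AdmissionCounts` class and the numbers in the example are hypothetical and do not correspond to any school in the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AdmissionCounts:
    """Hypothetical container for the counts a CDS filing may report."""
    applied: int
    admitted: int
    enrolled: int
    waitlist_accepted: Optional[int] = None  # students who accepted a waitlist spot
    waitlist_admitted: Optional[int] = None  # students later admitted off the waitlist

    @property
    def acceptance_rate(self) -> float:
        return self.admitted / self.applied

    @property
    def yield_rate(self) -> float:
        return self.enrolled / self.admitted

    @property
    def waitlist_conversion(self) -> Optional[float]:
        # Returns None when the school did not report waitlist data.
        if not self.waitlist_accepted or self.waitlist_admitted is None:
            return None
        return self.waitlist_admitted / self.waitlist_accepted


# Made-up counts for illustration only.
example = AdmissionCounts(applied=50_000, admitted=4_000, enrolled=2_000,
                          waitlist_accepted=1_000, waitlist_admitted=50)
no_waitlist = AdmissionCounts(applied=50_000, admitted=4_000, enrolled=2_000)

print(f"acceptance {example.acceptance_rate:.1%}, yield {example.yield_rate:.1%}")
print(f"waitlist conversion: {example.waitlist_conversion}")
```

Returning `None` for missing waitlist data, rather than zero, keeps schools that do not report apart from schools that admitted nobody from the waitlist.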

Significant Data Gaps (Not in CDS)

  • Feeder school matriculation: No official data. See Feeder section above.

  • Legacy admit rates: Not published. Estimates from studies: ~30-40% at Ivies (vs. 3-10% overall).

  • Athlete admit rates: Not published. Estimates: recruited athletes admitted at 2-4x the base rate at most selective schools.

  • Donor/development case rates: Not published. Extremely limited public data.

  • Essay/EC scoring rubrics: Proprietary. Some insight from Harvard litigation discovery.

  • Regional/school-type quotas: Not published. Geographic balancing can only be inferred indirectly from CDS demographic data.

Recommendations for Simulation Parameters

  1. Acceptance rates: Use CDS data directly. All 25 schools have reliable 2023-24 data.
  2. SAT ranges: Use CDS 25th/75th percentiles. Note the test-optional bias: submitters self-select toward higher scores. For UCLA, use a GPA-only model.
  3. ED/EA boosts: Estimate from schools that report (Carnegie Mellon, Dartmouth, Brown). Typical ED boost: 2-3x the RD rate.
  4. Hook multipliers: Use Harvard trial data as baseline: legacy ~5-6x, athlete ~4x (ALDC data), first-gen ~1.5x.
  5. Yield rates: Calculate from CDS enrolled/admitted. Range: ~45-70% for HYPSM+, 30-50% for Ivy+, 20-40% for selective.
  6. Feeder school effects: Model as a quality multiplier based on school tier rather than specific feeder pipelines, since school-specific data is unavailable.
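Recommendations 3 and 4 can be sketched as follows. Several choices here are assumptions rather than anything in the CDS or the Harvard data: the ED application share (0.20) and the 10% overall rate are made up, the legacy multiplier takes the midpoint of the ~5-6x range, multipliers stack multiplicatively, and the result is capped at 1.0. The round split solves overall = ed_share × ed_rate + (1 − ed_share) × rd_rate with ed_rate = ed_boost × rd_rate.

```python
from math import prod

# Multipliers from recommendation 4; legacy uses the midpoint of ~5-6x (assumption).
HOOK_MULTIPLIERS = {"legacy": 5.5, "athlete": 4.0, "first_gen": 1.5}


def split_round_rates(overall_rate: float, ed_share: float, ed_boost: float):
    """Split an overall admit rate into (ed_rate, rd_rate).

    ed_share is the fraction of applications submitted ED (an assumed input),
    and ed_boost is the ED rate expressed as a multiple of the RD rate.
    """
    if not 0.0 <= ed_share <= 1.0:
        raise ValueError("ed_share must be between 0 and 1")
    rd_rate = overall_rate / (ed_share * ed_boost + (1.0 - ed_share))
    ed_rate = ed_boost * rd_rate
    return ed_rate, rd_rate


def admit_probability(base_rate: float, hooks=()) -> float:
    """Apply hook multipliers to a base rate, capped at 1.0.

    Multiplicative stacking is a modeling assumption, not a published result.
    """
    multiplier = prod(HOOK_MULTIPLIERS[h] for h in hooks)
    return min(1.0, base_rate * multiplier)


# Hypothetical school: 10% overall rate, 20% of applications via ED, 2.5x ED boost.
ed_rate, rd_rate = split_round_rates(overall_rate=0.10, ed_share=0.20, ed_boost=2.5)
print(f"ED rate {ed_rate:.1%}, RD rate {rd_rate:.1%}")
print(f"Legacy at 5% base rate: {admit_probability(0.05, ['legacy']):.1%}")
```

Because a raw multiplier can push a probability past 1.0 when hooks stack (legacy plus athlete at a 5% base rate, for instance), the cap is needed; a model working in odds rather than probabilities would avoid it, at the cost of reinterpreting the multipliers.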
