Gera Education ROI Index (GEROI) — Methodology

This page gives the full, reproducible formula behind GEROI. Every number traces to the US Department of Education College Scorecard, which is public domain data that needs no API key or registration. No estimates or synthetic data are used.

What is GEROI?

The Gera Education ROI Index (GEROI) answers a single question: per dollar of average annual net price, how much do alumni earn 10 years after entry, net of their debt at graduation? It combines three official data points per institution:

  • medianEarnings10yr — median earnings 10 years after entry (MD_EARN_WNE_P10)
  • medianDebt — median cumulative debt at graduation (GRAD_DEBT_MDN)
  • netPrice — average annual net price (NPT4_PUB or NPT4_PRIV)

GEROI = (medianEarnings10yr − medianDebt) / netPrice, normalised to 0–100

Institutions where any required value is empty, NULL, NA, “PrivacySuppressed”, or not greater than zero (including zero enrollment) are excluded. GEROI is published only for the top 300 remaining institutions by undergraduate enrollment.
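A minimal sketch of this exclusion rule in Python, assuming each CSV row has already been read into a dict keyed by the Scorecard column names. The helper names parse_float and required_values are illustrative and are not taken from the published pipeline.

```python
import math

def parse_float(text):
    """Return a finite float, or None for empty, sentinel or non-numeric text."""
    try:
        number = float(text)
    except (TypeError, ValueError):
        # "", "NULL", "NA" and "PrivacySuppressed" all end up here.
        return None
    return number if math.isfinite(number) else None

def required_values(row):
    """Return (ugds, earn10, debt, net_price), or None if the row is excluded."""
    # Public institutions report NPT4_PUB, private ones NPT4_PRIV.
    net_price = parse_float(row.get("NPT4_PUB"))
    if net_price is None:
        net_price = parse_float(row.get("NPT4_PRIV"))
    values = (
        parse_float(row.get("UGDS")),
        parse_float(row.get("MD_EARN_WNE_P10")),
        parse_float(row.get("GRAD_DEBT_MDN")),
        net_price,
    )
    if any(v is None or v <= 0 for v in values):
        return None
    return values

# Hypothetical row, not a real institution.
example = {"UGDS": "12000", "MD_EARN_WNE_P10": "50000",
           "GRAD_DEBT_MDN": "15000", "NPT4_PUB": "NULL", "NPT4_PRIV": "20000"}
print(required_values(example))  # (12000.0, 50000.0, 15000.0, 20000.0)
```

Returning None for the whole row keeps the "any one bad field excludes the institution" rule in a single place.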

Data source

Source: US Department of Education, College Scorecard (Most Recent Institution Data)
Data file: Most-Recent-Cohorts-Institution_06102026.zip
Reference period: June 2026
Last computed: 2026-06-20
Licence: US Federal Government Open Data (public domain)
Key required? No (bulk CSV, no registration, no API key)
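The bulk download is a zip archive. Below is a minimal sketch of streaming its rows with the Python standard library, assuming the archive has already been saved locally. The function name iter_scorecard_rows and the choice to read the first .csv member are illustrative assumptions, not documented properties of the download.

```python
import csv
import io
import zipfile

def iter_scorecard_rows(zip_path):
    """Yield each CSV row in the archive as a dict keyed by column name."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the first .csv member is the institution-level file.
        csv_name = next(n for n in archive.namelist() if n.lower().endswith(".csv"))
        with archive.open(csv_name) as raw:
            # utf-8-sig tolerates an optional byte-order mark at the start.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            yield from csv.DictReader(text)
```

A caller would then loop with `for row in iter_scorecard_rows("Most-Recent-Cohorts-Institution_06102026.zip"): ...` and read fields such as `row["UGDS"]` as strings.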

Step-by-step formula

  1. Download the College Scorecard CSV

    Fetch Most-Recent-Cohorts-Institution_06102026.zip from https://collegescorecard.ed.gov/data/ (no API key required) and extract the CSV. The file has one row per institution; 3,844 of them pass the validation in step 2.

  2. Extract and validate required columns

    Parse UGDS, MD_EARN_WNE_P10, GRAD_DEBT_MDN, and NPT4_PUB/NPT4_PRIV. Discard any row where any of these four values is empty, "NULL", "PrivacySuppressed", "NA", or ≤ 0. This left 3,844 valid rows in the June 2026 dataset.

  3. Sort by enrollment and cap to top 300

    Sort valid rows by UGDS (total undergrad enrollment) descending. Take the top 300 as the published cohort. The remaining 3,544 institutions are excluded from individual leaf pages (logged in the data module header).

  4. Compute raw GEROI

    raw_geroi = (MD_EARN_WNE_P10 − GRAD_DEBT_MDN) / NPT4, where NPT4 is NPT4_PUB when present and NPT4_PRIV otherwise. Because step 2 guarantees a positive net price, the ratio is positive when 10-year median earnings exceed cumulative debt and negative when debt exceeds earnings. A negative ratio still yields a valid GEROI, because the normalisation in step 5 handles negative raw values.

  5. Normalise to 0–100 (min-max across the top-300 cohort)

    Find min_raw and max_raw across all 300 institutions. GEROI = (raw_geroi − min_raw) / (max_raw − min_raw) × 100, rounded to 1 d.p. The institution with the highest raw ratio (College of the Sequoias, CA, raw=72.1) scores 100; the lowest (University of Arizona Global Campus, AZ, raw=0.13) scores 0.

  6. Generate slugs and publish

    Lowercase the institution name, replace non-alphanumeric characters with hyphens, and truncate the result to 80 characters. If two institutions produce the same slug, append the UNITID. Pages live at /us-college-roi/[slug].
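Step 6 leaves a few details open, so the following Python sketch makes its assumptions explicit: runs of non-alphanumeric characters collapse to a single hyphen, leading and trailing hyphens are stripped, the UNITID is joined with a hyphen, and the first institution to claim a slug keeps the plain form. None of these choices is confirmed by the text above, and the UNITID values in the example are hypothetical.

```python
import re

def slugify(name, max_len=80):
    """Lowercase, turn non-alphanumeric runs into hyphens, cut to max_len."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return slug[:max_len]

def assign_slugs(institutions):
    """Map UNITID -> slug for (unitid, name) pairs, resolving collisions."""
    seen = set()
    slugs = {}
    for unitid, name in institutions:
        slug = slugify(name)
        if slug in seen:
            # Collision: append the UNITID (separator is an assumption).
            slug = f"{slug}-{unitid}"
        seen.add(slug)
        slugs[unitid] = slug
    return slugs

print(slugify("College of the Sequoias"))  # college-of-the-sequoias
print(assign_slugs([(1, "St. John's College"), (2, "St. John's College")]))
# {1: 'st-john-s-college', 2: 'st-john-s-college-2'}
```

Processing institutions in a fixed order (for example, by enrollment) keeps the slug assignment stable between runs.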

Pseudocode

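# Input: Most-Recent-Cohorts-Institution_06102026.csv
# Output: GEROI_INSTITUTIONS list (top 300 by enrollment)
import csv
import math
import re

def parse_float(value):
    # "", "NULL", "PrivacySuppressed", "NA" and other non-numeric text become None
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return number if math.isfinite(number) else None

def slugify(name):
    # Assumption: runs of non-alphanumerics collapse to one hyphen
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")[:80]

rows = []
path = "Most-Recent-Cohorts-Institution_06102026.csv"
with open(path, newline="", encoding="utf-8-sig") as f:
    for row in csv.DictReader(f):
        ugds = parse_float(row["UGDS"])
        earn10 = parse_float(row["MD_EARN_WNE_P10"])
        debt = parse_float(row["GRAD_DEBT_MDN"])
        net_price = parse_float(row["NPT4_PUB"])
        if net_price is None:
            net_price = parse_float(row["NPT4_PRIV"])
        if any(v is None or v <= 0 for v in (ugds, earn10, debt, net_price)):
            continue  # skip missing, suppressed or non-positive values
        rows.append({"unitid": row["UNITID"], "name": row["INSTNM"],
                     "city": row["CITY"], "state": row["STABBR"], "ugds": ugds,
                     "earn10": earn10, "debt": debt, "net_price": net_price})

rows.sort(key=lambda r: r["ugds"], reverse=True)
top300 = rows[:300]

for r in top300:
    r["raw_geroi"] = (r["earn10"] - r["debt"]) / r["net_price"]

min_raw = min(r["raw_geroi"] for r in top300)
max_raw = max(r["raw_geroi"] for r in top300)

seen_slugs = set()
for r in top300:
    r["geroi"] = round((r["raw_geroi"] - min_raw) / (max_raw - min_raw) * 100, 1)
    slug = slugify(r["name"])
    if slug in seen_slugs:  # collision: append UNITID (separator is an assumption)
        slug = f"{slug}-{r['unitid']}"
    seen_slugs.add(slug)
    r["slug"] = slug

GEROI_INSTITUTIONS = top300  # published cohort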
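The min-max step above can be sanity-checked in isolation. The sketch below uses the two raw extremes quoted in step 5 (72.1 and 0.13); the function name normalise and the guard against a cohort with no spread are additions for illustration, not part of the published method.

```python
def normalise(raw, min_raw, max_raw):
    """Min-max scale a raw ratio to 0-100, rounded to one decimal place."""
    if max_raw == min_raw:
        # Added guard: the formula divides by zero when every ratio is equal.
        raise ValueError("cohort has no spread; min-max scaling is undefined")
    return round((raw - min_raw) / (max_raw - min_raw) * 100, 1)

MIN_RAW, MAX_RAW = 0.13, 72.1  # extremes quoted for the June 2026 cohort

print(normalise(72.1, MIN_RAW, MAX_RAW))  # 100.0
print(normalise(0.13, MIN_RAW, MAX_RAW))  # 0.0
# Hypothetical cohort whose lowest raw ratio is negative:
print(normalise(-1.5, -1.5, 4.0))         # 0.0
print(normalise(1.25, -1.5, 4.0))         # 50.0
```

The last two calls show that a negative raw ratio needs no special handling: it simply becomes the bottom of the scale.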

Coverage (June 2026)

  • Institutions published (hub + leaf pages): 300
  • Institutions in source dataset with all required fields: 3,844
  • Rolled into state context (no individual leaf page): 3,544
  • Cohort median 10-yr earnings: $54,940
  • Cohort median debt: $18,095
  • Cohort median net price: $13,461/yr
  • Data source published: 10 June 2026
  • Update cadence: When US Dept of Education releases new cohort data (typically annual)
  • Licence: US Federal Government Open Data (public domain) — no restriction
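As a rough worked example, the three cohort medians above can be plugged into the formula to give a sense of scale. This is only an illustration: the ratio of medians is not the median institution's ratio, no single institution necessarily has these three values together, and the normalisation bounds are the rounded extremes quoted in step 5.

```python
median_earnings = 54_940   # cohort median 10-yr earnings (USD)
median_debt = 18_095       # cohort median debt (USD)
median_net_price = 13_461  # cohort median net price (USD per year)

raw = (median_earnings - median_debt) / median_net_price

# Rounded cohort extremes from step 5, so the score is approximate.
min_raw, max_raw = 0.13, 72.1
score = round((raw - min_raw) / (max_raw - min_raw) * 100, 1)

print(round(raw, 2), score)  # 2.74 3.6
```

A raw ratio of about 2.74 lands near the bottom of the 0–100 range because the scale is anchored to the maximum raw ratio of 72.1.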

Limitations & caveats

  • 10-year window is post-entry, not post-graduation. MD_EARN_WNE_P10 counts 10 years from first enrollment, not degree completion. Students who took longer to graduate will have fewer post-graduation work years in the window.
  • Debt is at graduation, not 10 years later. GRAD_DEBT_MDN reflects cumulative debt at graduation, not the balance remaining after 10 years of repayment. GEROI is a return-on-investment context metric, not a net-worth projection.
  • Top-300 by enrollment is a size filter, not a quality filter. Smaller specialised institutions that may have a high GEROI are excluded because they fall outside the top 300 by UGDS. The enrollment cap ensures data volume is sufficient for the College Scorecard's privacy thresholds.
  • PrivacySuppressed values are excluded. Institutions where any of the four required columns is privacy-suppressed (too few students for DoE to publish without risking re-identification) are excluded. This typically affects very small or specialised schools.
  • Not financial advice. GEROI is a statistical summary of publicly reported outcomes. It is not a guarantee of individual earnings, a financial recommendation, or an endorsement of any institution. Field of study, location, job market, and individual circumstances all affect actual outcomes.

Data sourced from the College Scorecard (Most Recent Institution Data), published by the US Department of Education. As a work of the US Federal Government, this data is in the public domain and carries no copyright restriction on use, reproduction, or distribution. See College Scorecard documentation for the official terms.