University administration analytics to track institutional effectiveness

Admin Analytics

University of Delaware administrative cost benchmarking using public data (IRS 990, IPEDS, BLS CPI-U). The project ingests this data into a local DuckDB database and serves an interactive Dash dashboard for analysis.

Scope

This project is currently scoped to the University of Delaware as a single institution. It tracks:

  • Executive compensation from IRS 990 Schedule J filings by the University of Delaware (EIN 516000297) and UD Research Foundation (EIN 516017306)
  • Administrative cost ratios from IPEDS finance surveys (expenses by function, staffing levels, enrollment)
  • Endowment performance and philanthropic giving from IPEDS F2 (FASB) financial data
  • Administrative headcount via web scraping, currently focused on the College of Engineering line management (COE Central, department offices) and the Provost's Office

Changing the target institution

The institution scope is controlled by constants in src/admin_analytics/config.py:

  • UD_UNITID = 130943 -- IPEDS institution identifier. Change this to target a different institution. Look up UNITIDs at the IPEDS Data Center.
  • UD_EINS = [516000297, 516017306] -- IRS Employer Identification Numbers for 990 filings. Update these to the EINs of the target institution's nonprofit entities.

All IPEDS loaders accept a unitid_filter parameter. The scraper URLs in src/admin_analytics/scraper/directory.py are UD-specific and would need to be updated for a different institution.
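
As a rough illustration of how the institution constants and a unitid_filter argument could fit together, here is a minimal sketch. The constant names and values are taken from this README; the row layout and the filter_by_unitid helper are hypothetical and are not the project's actual loader code.

```python
# Constants as documented for src/admin_analytics/config.py.
UD_UNITID = 130943
UD_EINS = [516000297, 516017306]


def filter_by_unitid(rows, unitid_filter=UD_UNITID):
    """Keep only rows whose UNITID matches the target institution.

    Hypothetical helper: the real loaders may filter differently.
    """
    return [row for row in rows if row.get("UNITID") == unitid_filter]


# Illustrative rows; 999999 is a placeholder, not a real IPEDS lookup.
sample_rows = [
    {"UNITID": 130943, "year": 2023},
    {"UNITID": 999999, "year": 2023},
]
ud_rows = filter_by_unitid(sample_rows)
```

Retargeting the project would then amount to changing the two constants and the scraper URLs, with the default filter following the new UNITID.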

Multi-institution comparisons (AAU peers, Carnegie peers) are planned for a future phase.

Prerequisites

  • Python 3.11+
  • uv package manager
  • Playwright browsers (only needed for the scrape command)

Setup

# Clone and install
git clone <repo-url>
cd AdminAnalytics
uv sync

# Install Playwright browsers (optional, only for scraping)
uv run playwright install chromium

Ingesting Data

Load data from public sources into the local DuckDB database (data/admin_analytics.duckdb).

# Ingest everything (IPEDS + IRS 990 + CPI + scraper)
uv run admin-analytics ingest all

# Or ingest individual sources
uv run admin-analytics ingest ipeds --year-range 2005-2024
uv run admin-analytics ingest irs990 --year-range 2019-2025
uv run admin-analytics ingest cpi
uv run admin-analytics ingest scrape

Use --force on any command to re-download files that already exist locally.

Downloaded files are stored in data/raw/ (gitignored).
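
The --year-range values above take the form START-END. The sketch below shows one way such a string could be expanded into individual years, assuming the range is inclusive on both ends. The parse_year_range function is hypothetical and is not taken from the project's CLI code.

```python
def parse_year_range(text: str) -> list[int]:
    """Expand a 'START-END' string into a list of years (inclusive).

    Hypothetical helper; the inclusive end is an assumption.
    """
    start_text, separator, end_text = text.partition("-")
    if not separator:
        raise ValueError(f"expected START-END, got {text!r}")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"start year {start} is after end year {end}")
    return list(range(start, end + 1))


years = parse_year_range("2019-2025")
```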

Launching the Dashboard

uv run admin-analytics dashboard

The dashboard opens at http://localhost:8050. Use --port to change the port, or --host 0.0.0.0 to accept connections from other machines on the network (e.g. over Tailscale).

The dashboard must be restarted to pick up newly ingested data (DuckDB opens in read-only mode to avoid lock conflicts).

The dashboard has seven tabs:

  • Executive Compensation -- top earners from IRS 990 Schedule J, President and top-10 CAGR, trends by role, compensation breakdown by component, growth vs CPI-U (2015-2023)
  • Admin Cost Overview -- admin cost ratios, expense breakdown by function, cost per student, admin-to-faculty ratio (IPEDS data, 2005-2024)
  • Staffing & Enrollment -- staff composition, student-to-staff ratios, management vs faculty vs enrollment growth (indexed)
  • Endowment -- endowment value trends, CAGR, investment return rate, CIO compensation vs endowment growth (IPEDS F2)
  • Philanthropy -- total private gifts and grants, gift allocation, President and VP Development compensation growth vs fundraising (IPEDS F2 and IRS 990)
  • Current Headcount -- scraped UD staff directory data with overhead/non-overhead classification by unit
  • About -- data sources, methodology, and limitations
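
Several tabs report CAGR and indexed growth. For reference, the standard formulas can be sketched as follows. The function names are hypothetical and the input numbers are illustrative; the dashboard's own query code may compute these differently.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def index_to_base(values: list[float], base: float = 100.0) -> list[float]:
    """Rescale a series so its first value equals `base`.

    Indexing lets series with different units (dollars, headcount,
    enrollment) be compared on one growth chart.
    """
    first = values[0]
    return [base * value / first for value in values]


# Illustrative numbers only, not project data.
growth = cagr(100.0, 121.0, 2)            # roughly 10% per year
indexed = index_to_base([50.0, 55.0, 60.0])
```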

Validating Data

Check row counts, NULL rates, year coverage, and cross-source consistency:

uv run admin-analytics validate
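
To give a sense of what NULL-rate and year-coverage checks involve, here is a minimal pure-Python sketch over a list of row dictionaries. The helper names and the row layout are assumptions for illustration; the project's validation.py runs its checks as queries against DuckDB and may differ in detail.

```python
def null_rate(rows: list[dict], column: str) -> float:
    """Fraction of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def missing_years(rows: list[dict], start: int, end: int,
                  year_column: str = "year") -> list[int]:
    """Years in [start, end] that have no row at all."""
    seen = {row[year_column] for row in rows}
    return [year for year in range(start, end + 1) if year not in seen]


# Illustrative rows only, not project data.
sample = [
    {"year": 2020, "total_expenses": 10.0},
    {"year": 2021, "total_expenses": None},
    {"year": 2023, "total_expenses": 12.0},
    {"year": 2024, "total_expenses": 13.0},
]
rate = null_rate(sample, "total_expenses")
gaps = missing_years(sample, 2020, 2024)
```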

Running Tests

uv sync --group dev
uv run pytest

Project Structure

src/admin_analytics/
    cli.py              # CLI entry point (typer)
    config.py           # Constants (UD identifiers, URLs, paths)
    db/                 # DuckDB schema and connection
    ipeds/              # IPEDS download, parsing, loading
    irs990/             # IRS 990 XML download, parsing, title normalization
    bls/                # BLS CPI-U download and loading
    scraper/            # UD staff directory scraper and classifier
    dashboard/          # Dash app, queries, page layouts
    validation.py       # Data validation queries
data/raw/               # Downloaded files (gitignored)
docs/data_dictionary.md # Schema documentation
tests/                  # pytest test suite

Data Sources

| Source | What it provides | Years |
|---|---|---|
| IPEDS | Institutional directory, expenses by function, staffing, enrollment | 2005-2024 |
| IRS 990 e-file | UD Foundation filings, executive compensation (Schedule J) | 2019-2025 index years (tax years 2017-2023) |
| BLS CPI-U | Consumer Price Index for inflation adjustment | Full history |
| UD staff directories | Admin office headcounts and overhead classification | Current snapshot |
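
Inflation adjustment with CPI-U follows the usual deflation formula: a nominal amount is multiplied by the ratio of the base-year index to the index for the year the amount was paid. The sketch below uses a hypothetical function name and placeholder index values, not actual BLS figures.

```python
def to_real_dollars(nominal: float, cpi_in_year: float,
                    cpi_in_base_year: float) -> float:
    """Convert a nominal amount into base-year dollars."""
    if cpi_in_year <= 0:
        raise ValueError("cpi_in_year must be positive")
    return nominal * cpi_in_base_year / cpi_in_year


# Placeholder index values chosen for easy arithmetic, not real CPI-U data.
real_pay = to_real_dollars(500_000.0, cpi_in_year=200.0, cpi_in_base_year=250.0)
```

Comparing growth in real dollars rather than nominal dollars separates actual increases from increases that only track inflation.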