Admin Analytics
University of Delaware administrative cost benchmarking using public data (IRS 990, IPEDS, BLS CPI-U). Ingests data into a local DuckDB database and serves an interactive Dash dashboard for analysis.
Scope
This project is currently scoped to the University of Delaware as a single institution. It tracks:
- Executive compensation from IRS 990 Schedule J filings by the University of Delaware (EIN 516000297) and UD Research Foundation (EIN 516017306)
- Administrative cost ratios from IPEDS finance surveys (expenses by function, staffing levels, enrollment)
- Endowment performance and philanthropic giving from IPEDS F2 (FASB) financial data
- Administrative headcount via web scraping, currently focused on the College of Engineering line management (COE Central, department offices) and the Provost's Office
Changing the target institution
The institution scope is controlled by constants in src/admin_analytics/config.py:
- UD_UNITID = 130943 -- IPEDS institution identifier. Change this to target a different institution. Look up UNITIDs at the IPEDS Data Center.
- UD_EINS = [516000297, 516017306] -- IRS Employer Identification Numbers for 990 filings. Update these to the EINs of the target institution's nonprofit entities.
All IPEDS loaders accept a unitid_filter parameter. The scraper URLs in src/admin_analytics/scraper/directory.py are UD-specific and would need to be updated for a different institution.
Multi-institution comparisons (AAU peers, Carnegie peers) are planned for a future phase.
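As a rough illustration of how these constants and a unitid_filter parameter might fit together, the sketch below filters IPEDS-style CSV rows down to a single institution. The function name filter_rows_by_unitid, the UNITID and INSTNM column names, and the sample rows are assumptions made for this example, not the project's actual loader code.

```python
import csv
import io

# Values quoted from the config.py description above.
UD_UNITID = 130943
UD_EINS = [516000297, 516017306]


def filter_rows_by_unitid(rows, unitid_filter=None):
    """Return only the rows for one institution; None keeps every row.

    Hypothetical helper: the real loaders may filter differently.
    """
    if unitid_filter is None:
        return list(rows)
    return [row for row in rows if int(row["UNITID"]) == unitid_filter]


# Made-up sample data standing in for an IPEDS CSV extract.
sample_csv = io.StringIO(
    "UNITID,INSTNM\n"
    "130943,University of Delaware\n"
    "999999,Placeholder College\n"
)
rows = list(csv.DictReader(sample_csv))
ud_rows = filter_rows_by_unitid(rows, unitid_filter=UD_UNITID)
print(ud_rows[0]["INSTNM"])
```

Retargeting would then amount to passing a different UNITID to the filter, with the scraper URLs handled separately as noted above.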
Prerequisites
- Python 3.11+
- uv package manager
- Playwright browsers (only needed for the scrape command)
Setup
# Clone and install
git clone <repo-url>
cd AdminAnalytics
uv sync
# Install Playwright browsers (optional, only for scraping)
uv run playwright install chromium
Ingesting Data
Load data from public sources into the local DuckDB database (data/admin_analytics.duckdb).
# Ingest everything (IPEDS + IRS 990 + CPI + scraper)
uv run admin-analytics ingest all
# Or ingest individual sources
uv run admin-analytics ingest ipeds --year-range 2005-2024
uv run admin-analytics ingest irs990 --year-range 2019-2025
uv run admin-analytics ingest cpi
uv run admin-analytics ingest scrape
Add --force to any ingest command to re-download files that already exist locally.
Downloaded files are stored in data/raw/ (gitignored).
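The --year-range and --force options suggest two small pieces of logic: expanding a range such as 2005-2024 into individual years, and skipping downloads when a local copy already exists. The sketch below shows one way this could work. The function names parse_year_range and needs_download are hypothetical and are not taken from the project's CLI code.

```python
from pathlib import Path


def parse_year_range(text):
    """Turn a string like '2005-2024' into an inclusive list of years."""
    start_text, _, end_text = text.partition("-")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"start year {start} is after end year {end}")
    return list(range(start, end + 1))


def needs_download(path, force=False):
    """Download when forced, or when no local copy exists yet."""
    return force or not Path(path).exists()


years = parse_year_range("2019-2025")
print(years[0], years[-1], len(years))
```

Treating the range as inclusive matches how the commands above are written, where 2005-2024 is described as covering data through 2024.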
Launching the Dashboard
uv run admin-analytics dashboard
The dashboard opens at http://localhost:8050. Use --port to change the port, or --host 0.0.0.0 to allow network access (e.g. over Tailscale).
The dashboard must be restarted to pick up newly ingested data (it opens the DuckDB file in read-only mode to avoid lock conflicts).
The dashboard has seven tabs:
- Executive Compensation -- top earners from IRS 990 Schedule J, President and top-10 CAGR, trends by role, compensation breakdown by component, growth vs CPI-U (2015-2023)
- Admin Cost Overview -- admin cost ratios, expense breakdown by function, cost per student, admin-to-faculty ratio (IPEDS data, 2005-2024)
- Staffing & Enrollment -- staff composition, student-to-staff ratios, management vs faculty vs enrollment growth (indexed)
- Endowment -- endowment value trends, CAGR, investment return rate, CIO compensation vs endowment growth (IPEDS F2)
- Philanthropy -- total private gifts and grants, gift allocation, President and VP Development compensation growth vs fundraising (IPEDS F2 and IRS 990)
- Current Headcount -- scraped UD staff directory data with overhead/non-overhead classification by unit
- About -- data sources, methodology, and limitations
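Several tabs report CAGR and indexed growth. For readers unfamiliar with these measures, the sketch below computes both with the standard formulas. The function names and the sample figures are illustrative assumptions, not values from the dashboard or code from its query layer.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def index_to_base(values, base=100.0):
    """Rescale a series so its first value equals `base`.

    Indexing lets series with different units (headcount, enrollment)
    be compared on a single growth chart.
    """
    first = values[0]
    return [base * value / first for value in values]


# Made-up figures, for illustration only.
growth = cagr(start_value=100.0, end_value=121.0, years=2)
indexed = index_to_base([200, 220, 250])
print(f"{growth:.1%}", indexed)
```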
Validating Data
Check row counts, NULL rates, year coverage, and cross-source consistency:
uv run admin-analytics validate
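To make the checks concrete, here is a minimal sketch of a NULL-rate check and a year-coverage check over in-memory rows. It uses plain Python in place of the project's DuckDB queries. The function names, the column names, and the sample rows are all assumptions for illustration.

```python
def null_rate(rows, column):
    """Fraction of rows where `column` is missing or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def missing_years(rows, first_year, last_year, year_column="year"):
    """Years in the inclusive range that have no row at all."""
    present = {row[year_column] for row in rows}
    return [y for y in range(first_year, last_year + 1) if y not in present]


# Made-up rows standing in for a loaded table.
sample_rows = [
    {"year": 2019, "total_expenses": 1000},
    {"year": 2020, "total_expenses": None},
    {"year": 2022, "total_expenses": 1200},
    {"year": 2023, "total_expenses": 1300},
]
print(null_rate(sample_rows, "total_expenses"))
print(missing_years(sample_rows, 2019, 2023))
```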
Running Tests
uv sync --group dev
uv run pytest
Project Structure
src/admin_analytics/
    cli.py                 # CLI entry point (typer)
    config.py              # Constants (UD identifiers, URLs, paths)
    db/                    # DuckDB schema and connection
    ipeds/                 # IPEDS download, parsing, loading
    irs990/                # IRS 990 XML download, parsing, title normalization
    bls/                   # BLS CPI-U download and loading
    scraper/               # UD staff directory scraper and classifier
    dashboard/             # Dash app, queries, page layouts
    validation.py          # Data validation queries
data/raw/                  # Downloaded files (gitignored)
docs/data_dictionary.md    # Schema documentation
tests/                     # pytest test suite
Data Sources
| Source | What it provides | Years |
|---|---|---|
| IPEDS | Institutional directory, expenses by function, staffing, enrollment | 2005-2024 |
| IRS 990 e-file | Filings by UD and the UD Research Foundation, executive compensation (Schedule J) | 2019-2025 index years (tax years 2017-2023) |
| BLS CPI-U | Consumer Price Index for inflation adjustment | Full history |
| UD staff directories | Admin office headcounts and overhead classification | Current snapshot |
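The CPI-U series is listed as the basis for inflation adjustment. A common way to apply it is to rescale nominal dollars by the ratio of two index values, as in the sketch below. The function names are hypothetical, and the index values are round placeholders, not real BLS figures.

```python
def to_constant_dollars(nominal, cpi_in_year, cpi_in_base_year):
    """Express a nominal amount in base-year dollars."""
    return nominal * cpi_in_base_year / cpi_in_year


def real_growth(nominal_start, nominal_end, cpi_start, cpi_end):
    """Growth left over after removing price inflation."""
    return (nominal_end / nominal_start) / (cpi_end / cpi_start) - 1


# Placeholder index values, NOT actual CPI-U data.
cpi = {2015: 100.0, 2023: 125.0}

pay_2015_in_2023_dollars = to_constant_dollars(500_000, cpi[2015], cpi[2023])
growth = real_growth(500_000, 750_000, cpi[2015], cpi[2023])
print(pay_2015_in_2023_dollars, f"{growth:.1%}")
```

With these placeholder numbers, a 50% nominal increase against 25% cumulative inflation works out to roughly 20% real growth, which is the kind of comparison the "growth vs CPI-U" charts describe.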