Josh Elberg

Data Stories

Interactive explorations at the intersection of machine learning and data visualization. Real public datasets. Real ML. Creative, non-obvious ways to see the world.

D3.js, deck.gl, scikit-learn, UMAP, Next.js

Hospital Quality Survival Landscape

In Progress

Is your ZIP code your destiny?

Clustering 4,700+ hospitals by 150 quality measures, overlaid with socioeconomic data. UMAP dimensionality reduction, random forest with SHAP interpretation, and scrollytelling narrative reveal which communities are served by underperforming hospitals.

CMS Hospital Compare, Census ACS, HRSA AHRF

ML Techniques

  • K-Means Clustering
  • Random Forest + SHAP
  • PCA / UMAP
  • Kaplan-Meier Curves

Visualizations

  • Ridge Plots
  • Beeswarm
  • Choropleth
  • Scrollytelling
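
To make the clustering step concrete, here is a minimal sketch of standardizing quality measures and grouping hospitals with K-Means. It is an illustration only, not the project's pipeline: the six rows of two measures are made-up toy values, `standardize` and `kmeans` are hypothetical helper names, and plain Python stands in for what would more likely be scikit-learn on the real 150-measure table.

```python
import math
import random


def standardize(rows):
    """Z-score each column so measures on different scales are comparable."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm: assign points to the nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(c) / len(members) for c in zip(*members)])
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Toy data (assumed): each row is one hospital, each column one quality measure.
measures = [
    [1.0, 2.0], [1.2, 1.9], [0.9, 2.1],
    [8.0, 9.0], [8.2, 9.1], [7.9, 8.8],
]
labels, _ = kmeans(standardize(measures), k=2)
print(labels)
```

Standardizing first matters because K-Means relies on Euclidean distance, so a measure reported on a larger numeric scale would otherwise dominate the grouping.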

The Wage Topology

In Progress

800 occupations. 30 skill dimensions. One landscape.

UMAP dimensionality reduction transforms O*NET skill profiles into an interactive 3D terrain where elevation is median wage. Discover occupation families, skill-to-wage relationships, and how the landscape shifts across states.

BLS OEWS, O*NET, BLS Employment Projections

ML Techniques

  • UMAP Embedding
  • K-Means Clustering
  • Linear Regression
  • Dimensionality Reduction

Visualizations

  • 3D Terrain Surface
  • UMAP Scatter
  • Ridge Plots
  • Radar Charts
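
One way to picture "elevation is median wage" is a gridded heightmap: bin each occupation's 2D embedding coordinates into cells and average the wages per cell. The sketch below assumes the coordinates were already produced by an embedding step such as UMAP; the function name `wage_heightmap`, the grid size, and the three sample occupations are all hypothetical, and the actual terrain may be built differently.

```python
def wage_heightmap(coords, wages, grid_size=4):
    """Average wage per grid cell; cells with no occupations are None."""
    if not coords or len(coords) != len(wages):
        raise ValueError("coords and wages must be non-empty and the same length")

    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    def cell(value, lo, hi):
        # Map a coordinate to a cell index, clamping the maximum into the last cell.
        if hi == lo:
            return 0
        return min(int((value - lo) / (hi - lo) * grid_size), grid_size - 1)

    sums = [[0.0] * grid_size for _ in range(grid_size)]
    counts = [[0] * grid_size for _ in range(grid_size)]
    for (x, y), wage in zip(coords, wages):
        row, col = cell(y, y_min, y_max), cell(x, x_min, x_max)
        sums[row][col] += wage
        counts[row][col] += 1

    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else None for c in range(grid_size)]
        for r in range(grid_size)
    ]


# Toy input (assumed): embedding coordinates and median wages for three occupations.
coords = [(0.0, 0.0), (0.1, 0.1), (1.0, 1.0)]
wages = [40000, 60000, 120000]
grid = wage_heightmap(coords, wages, grid_size=2)
print(grid)
```

Empty cells are left as `None` here rather than zero, so a renderer can decide whether to interpolate across them or leave gaps in the surface.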

The Anatomy of $700 Billion

In Progress

Who actually gets the money?

Network analysis of federal contract and grant spending reveals hidden ecosystems of contractors, geographic dependencies, and structural patterns invisible in aggregate statistics. Four ML techniques expose what bar charts never could.

USASpending.gov, Census Bureau

ML Techniques

  • Network Community Detection
  • UMAP Contractor Profiles
  • Isolation Forest Anomaly Detection
  • Changepoint Detection

Visualizations

  • Force-Directed Graph
  • Sankey Diagram
  • Hex-Bin Map
  • Small Multiples
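
As a rough illustration of changepoint detection, the sketch below finds the single split in a spending series that minimizes the summed squared error around each segment's mean. This least-squares mean-shift approach is a simplified stand-in chosen for clarity; the page does not say which changepoint algorithm the project uses, and the function name `best_changepoint` and the sample series are made up.

```python
def best_changepoint(series):
    """Return the index where splitting the series best separates two mean levels."""
    if len(series) < 2:
        raise ValueError("need at least two observations")

    def sse(segment):
        # Sum of squared deviations from the segment mean.
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_index, best_cost = None, float("inf")
    for i in range(1, len(series)):
        cost = sse(series[:i]) + sse(series[i:])
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index


# Toy series (assumed): periodic award totals with a level shift partway through.
awards = [10, 11, 9, 10, 30, 31, 29, 30]
print(best_changepoint(awards))
```

The returned index is the first observation of the second segment. Real spending data would likely need multiple changepoints and a penalty term to avoid over-splitting, which this sketch does not attempt.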