Scope note: The AI & digital-health interventions referenced across these analyses are illustrative “low-hanging fruit” — a small, high-feasibility/high-impact sample chosen to demonstrate planning-grade ROI. They are not an exhaustive or prescriptive list; many other interventions may apply.
Shared Methodology · Framework Document · v1.0

County-Level Disease Burden Analysis in Rural Michigan: DALY Framework, Data Sources, and Reproducible Python Workflow

Methodology reference for the Michigan Rural Health Burden Analysis series
Sergey Soshnikov, MD PhD
Public Health Division, Central Michigan University · Mount Pleasant, Michigan
Correspondence: biososh@gmail.com · Project page: lakeslinkedcare.github.io
Version 1.0 · June 2026 · Not peer-reviewed · Planning purposes only · WHO GHE Frontier LE 89.1 yrs (primary) · Michigan LE 78.6 yrs (planning sensitivity) · Monte Carlo n=10,000 · Python 3.10+ · Code/data: by reasonable request · CDC PLACES 2024 · CDC WONDER 2020–22 · IHME GBD 2021
Abstract

Background. County-level DALY disease burden estimates are rarely available for rural U.S. counties. This document describes a reproducible Python workflow for computing planning-grade Disability-Adjusted Life Year (DALY) estimates for rural Michigan counties using exclusively open public data. The DALY framework was developed by IHME for the Global Burden of Disease Study — a body of work we deeply respect and do not attempt to replicate. This project applies GBD's openly published disability weights and WHO reference standards for educational and local health planning purposes only. The workflow is designed to scale to all 83 Michigan counties with minimal per-county configuration changes.

Methods. We constructed DALY estimates from five open data sources: CDC WONDER 2020–2022 (mortality), CDC PLACES 2024 (prevalence), IHME GBD 2021 (disability weights), County Health Rankings 2010–2024 (years of potential life lost), and U.S. Census ACS 2022 (population denominators). YLL uses the WHO GHE frontier reference life table (89.1 years) as primary; Michigan observed LE (78.6 years) is reported as a planning sensitivity. Uncertainty was quantified via Monte Carlo simulation (n = 10,000). Analyses were implemented in Python 3.10 using NumPy, Pandas, SciPy, and Matplotlib; working code, configuration files, derived tables, and validation reports can be shared by the author on reasonable request.

Completed counties (June 2026). Isabella County (FIPS 26073): 15,430 DALYs/yr (95% UI 14,459–16,481), frontier LE primary. Clare County (FIPS 26035): 8,198 DALYs/yr (±20% uncertainty), frontier LE primary. Midland County (FIPS 26111): 14,335 DALYs/yr (MI LE) / ~20,400 (frontier LE) — CDC PLACES 2023 observed county-level prevalence. Three counties in pipeline: Gratiot (Q3 2026), Mecosta (Q4 2026), Washtenaw (Q4 2026, urban reference).

Limitations. This is a mixed-source modeled planning scenario, not a validated GBD-standard county burden estimate. Prevalence inputs are model-based (CDC PLACES multilevel regression and poststratification, MRP). Mortality uses a mean-age-at-death simplification for YLL; rural adjustment factors (×1.2–1.85) proxy suppressed CDC WONDER cells. Results are directional planning approximations for grant writing, CHNA processes, and county health planning.

Keywords: disability-adjusted life years · DALY methodology · rural health · Michigan · Python · reproducible workflow · CDC PLACES 2024 · county-level disease burden · public data

⚙️ Code & Data Availability

The analysis was implemented using a Python-based workflow and publicly available source data. The complete working repository is not presented here as a fixed public release while the manuscript and multi-county workflow remain under revision.

Working code, county configuration files, derived tables, validation reports, and replication notes can be shared by the author on reasonable request. Raw source data remain available from CDC WONDER, CDC PLACES, IHME GBD, County Health Rankings, the U.S. Census Bureau, USDA, CMS, and MDHHS under their respective terms of use.

1. DALY Framework

1.1 Standard formula

This analysis adopts the standard WHO DALY formula with no age-weighting and no time discounting, consistent with the 2010 GBD revision and subsequent global burden analyses:

DALY = YLL + YLD (1)

YLL = Years of Life Lost (premature mortality); YLD = Years Lived with Disability (prevalent morbidity). The two components capture different dimensions of burden: YLL is sensitive to mortality rates and the reference life expectancy standard; YLD is sensitive to prevalence and disability weights. Conditions with high late-life mortality (CVD, cancer) are YLL-dominated; conditions with high prevalence and low mortality (mental health) are YLD-dominated.

1.2 YLL calculation

YLL ≈ Deaths × (L − Ā) (2, planning approximation)

L = reference life expectancy (two standards; see Section 1.5); Ā = mean age at death for the condition (condition-specific constants, listed in the simplification note below). Deaths are estimated as:

Deaths/yr = (WONDER_rate × rural_adj_factor) / 100,000 × population (3)
YLL simplification note. The standard GBD YLL formula is YLL = Σa d(a) × ℓ(a), where d(a) = age-specific deaths and ℓ(a) = remaining life expectancy at age a from the reference life table. The planning approximation above assumes all deaths in a condition group occur at a single mean age, which underestimates uncertainty and introduces systematic bias for conditions with wide age-at-death distributions. Future county-level extensions with age-stratified CDC WONDER data should use the full summation form. Mean ages used (national CDC WONDER 2019–21 aggregates): MH 46 yrs · SUD 44 yrs · Cancer 67 yrs · CVD 72 yrs · COPD 73 yrs · Stroke 73 yrs · Diabetes 70 yrs.
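The difference between the planning approximation (2)–(3) and the full summation form can be sketched as below. This is a minimal illustration, not the project's daly_calculator.py: the function names, the state rate of 160 per 100,000, the three-age death distribution, and the remaining-life-expectancy values are all hypothetical placeholders (they are not taken from the WHO GHE life table).

```python
from typing import Mapping


def deaths_per_year(rate_per_100k: float, rural_adj_factor: float, population: int) -> float:
    """Equation (3): expected annual deaths from a (rural-adjusted) rate per 100,000."""
    return rate_per_100k * rural_adj_factor / 100_000 * population


def yll_planning(deaths: float, reference_le: float, mean_age_at_death: float) -> float:
    """Equation (2): every death is assumed to occur at one mean age."""
    return deaths * max(reference_le - mean_age_at_death, 0.0)


def yll_full(deaths_by_age: Mapping[int, float], remaining_le_by_age: Mapping[int, float]) -> float:
    """Standard form: sum over ages a of d(a) * l(a)."""
    return sum(d * remaining_le_by_age[age] for age, d in deaths_by_age.items())


# Hypothetical inputs, for illustration only.
example_deaths = deaths_per_year(rate_per_100k=160.0, rural_adj_factor=1.20, population=30_013)

# 60 deaths spread over three ages with mean age 67 (the cancer constant above).
deaths_by_age = {57: 20, 67: 20, 77: 20}
# Placeholder remaining life expectancies; a real run would read the reference life table.
remaining_le = {57: 33.0, 67: 24.0, 77: 15.5}

approx = yll_planning(60, reference_le=89.1, mean_age_at_death=67)
full = yll_full(deaths_by_age, remaining_le)
print(f"deaths/yr={example_deaths:.1f}  planning YLL={approx:.0f}  full-form YLL={full:.0f}")
```

With a real life table the two results generally differ, because remaining life expectancy at age a is not simply L minus a; the size and direction of the gap depend on the age-at-death distribution.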

1.3 YLD calculation

YLD = P × DW (4)

P = prevalent cases (CDC PLACES 2024 prevalence × population denominator); DW = IHME GBD 2021 disability weight (moderate severity). Disability weights do not incorporate severity distributions or comorbidity corrections. For mental health, a remission adjustment is applied to convert lifetime-diagnosed BRFSS prevalence to active burden (see Section 1.6).
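Equation (4) can be written as a small function. The sketch below is illustrative only: the function name and the `active_fraction` argument are assumptions, the disability weight of 0.20 is a placeholder rather than a published GBD value, and the prevalence and population figures are used purely as example inputs.

```python
def yld(prevalence_pct: float, population: int, disability_weight: float,
        active_fraction: float = 1.0) -> float:
    """Equation (4): YLD = P x DW, with P = prevalence x population.

    active_fraction < 1 scales lifetime-diagnosed prevalence down to active
    cases (used for mental health only; 1.0 leaves prevalence unchanged).
    """
    prevalent_cases = prevalence_pct / 100 * population * active_fraction
    return prevalent_cases * disability_weight


# Placeholder inputs: 12.7% prevalence, population 30,013, DW 0.20 (hypothetical).
print(round(yld(12.7, 30_013, 0.20), 1))
# Same inputs with a 0.50 active fraction halves the result.
print(round(yld(12.7, 30_013, 0.20, active_fraction=0.50), 1))
```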

1.4 Economic burden

Two parallel methods with distinct conceptual frameworks are reported for each county:

Economic burden (Human Capital) = DALYs × GDP_per_capita (5)
Economic burden (VSL) = Deaths × VSL_adjusted (6)

Human Capital uses Michigan GDP per capita ($62,000, 2024) and represents indirect productivity losses. VSL uses the HHS ASPE 2026 central estimate ($13.4M per statistical life), income-adjusted per county median household income using income elasticity 0.4 (Viscusi & Aldy 2003):

VSL_county = $13.4M × (county_median_income / $80,000) ^ 0.4 (7)

The ~5–7× difference between Human Capital and VSL central estimates is methodologically expected. For grant applications use VSL (aligned with HHS/OMB cost-benefit guidance); for academic GBD comparison use Human Capital.
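Equations (5)–(7) reduce to a few lines of arithmetic. The sketch below uses the constants stated above; the function names are hypothetical, and the county median income of $50,000 and the death count of 100 are illustrative placeholders, not reported values.

```python
GDP_PER_CAPITA = 62_000        # Michigan GDP per capita (2024), from the text
VSL_CENTRAL = 13_400_000       # HHS ASPE central VSL, from the text
REFERENCE_INCOME = 80_000      # denominator in equation (7)
INCOME_ELASTICITY = 0.4        # exponent in equation (7)


def vsl_county(county_median_income: float) -> float:
    """Equation (7): income-adjusted value of a statistical life."""
    return VSL_CENTRAL * (county_median_income / REFERENCE_INCOME) ** INCOME_ELASTICITY


def burden_human_capital(dalys: float) -> float:
    """Equation (5): DALYs valued at GDP per capita."""
    return dalys * GDP_PER_CAPITA


def burden_vsl(deaths: float, county_median_income: float) -> float:
    """Equation (6): deaths valued at the county-adjusted VSL."""
    return deaths * vsl_county(county_median_income)


# 15,430 DALYs is the Isabella primary estimate; income and deaths are placeholders.
print(f"Human Capital: ${burden_human_capital(15_430):,.0f}")
print(f"VSL (100 deaths, $50k median income): ${burden_vsl(100, 50_000):,.0f}")
```

Because the elasticity is below 1, a county with half the reference income is assigned well over half the central VSL, which is the intended damping effect of equation (7).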

1.5 Life expectancy reference standards

Two reference standards are reported for every county:

Table 1. Life expectancy reference standards used in this analysis.
| Standard | Value | Source | Use case | Effect on ranking |
|---|---|---|---|---|
| Primary (frontier) | 89.1 yrs (sex-avg.); M: 86.0 · F: 92.0 | WHO GHE reference life table (global best-observed survival) | Cross-county comparison, academic publication, international GBD alignment | Favors late-death conditions: CVD, Cancer rank higher because more years are lost per late-life death. MH typically ranks lower under this standard. |
| Planning sensitivity (Michigan LE) | 78.6 yrs (sex-avg.); M: 76.2 · F: 80.9 | MDHHS 2024, Michigan observed LE | Local planning documents, CHNA reports, comparisons with prior county analyses | Favors YLD-dominant conditions: MH typically ranks #1 because late-life YLL is compressed. Preferred by county health departments for local planning. |
Why the standard matters for ranking: Under frontier LE (89.1 yrs), a cancer death at mean age 67 yields 22.1 remaining years of life lost; a CVD death at 72 yields 17.1 years. Under Michigan LE (78.6 yrs), those same deaths yield only 11.6 and 6.6 years. Mental health deaths, at a mean age of ~46, yield 43.1 years (frontier) vs. 32.6 years (Michigan), a smaller proportional reduction (about 24%, versus about 48% for cancer and 61% for CVD). The LE standard therefore systematically shifts the ranking of late-death vs. early-death conditions.
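The arithmetic in the note above can be reproduced for all seven condition groups using the mean ages from Section 1.2. This is a small worked example rather than pipeline code; the variable and function names are illustrative.

```python
FRONTIER_LE = 89.1   # WHO GHE frontier standard (primary)
MICHIGAN_LE = 78.6   # Michigan observed LE (planning sensitivity)

# Mean ages at death from the YLL simplification note (Section 1.2).
MEAN_AGE_AT_DEATH = {
    "MH": 46, "SUD": 44, "Cancer": 67, "CVD": 72,
    "COPD": 73, "Stroke": 73, "Diabetes": 70,
}


def years_lost(reference_le: float, age: float) -> float:
    """Years of life lost per death, floored at zero and rounded to 0.1 yr."""
    return round(max(reference_le - age, 0.0), 1)


for condition, age in MEAN_AGE_AT_DEATH.items():
    frontier = years_lost(FRONTIER_LE, age)
    michigan = years_lost(MICHIGAN_LE, age)
    retained = michigan / frontier  # share of per-death YLL kept under Michigan LE
    print(f"{condition:9s} frontier={frontier:5.1f}  michigan={michigan:5.1f}  retained={retained:.0%}")
```

The `retained` column makes the ranking effect explicit: early-death conditions keep most of their per-death YLL under the Michigan standard, while late-death conditions lose roughly half or more.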

1.6 Mental health active disease approximation

CDC PLACES 2024 reports lifetime-diagnosed depressive disorder (BRFSS ADDEPEV3), not current episode. For YLD, a remission factor of 0.50 is applied in the primary estimate to approximate active burden:

MH_active_prevalence = MH_lifetime_prevalence × 0.50 (8)

Evidence basis: NSDUH 2021 national past-year major depressive episode prevalence (~8.3%) is about 40% of PLACES national lifetime-diagnosed prevalence (~20.7%), implying that roughly 60% of lifetime-diagnosed cases are in remission at any given time. Published psychiatric epidemiology reports 40–60% 12-month remission for MDD. The factor in equation (8) is the active (non-remitted) share of lifetime-diagnosed cases; its central value of 0.50 is propagated through Monte Carlo with uncertainty Uniform(0.40, 0.60). The planning sensitivity retains raw PLACES prevalence (no remission adjustment) for continuity with prior county documents.
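The ratio behind the remission factor, and the way it could be sampled, can be sketched as follows. The use of `numpy.random.default_rng` is an assumption (the text states only "seed=42"), and the function name is hypothetical.

```python
import numpy as np

# Ratio of past-year MDE prevalence (NSDUH 2021) to lifetime-diagnosed prevalence (PLACES).
implied_active_fraction = 8.3 / 20.7
print(f"implied active fraction: {implied_active_fraction:.2f}")


def mh_active_prevalence(lifetime_prevalence_pct: float, factor: float = 0.50) -> float:
    """Equation (8): scale lifetime-diagnosed prevalence to approximate active burden."""
    return lifetime_prevalence_pct * factor


# Uncertainty draw for the factor, Uniform(0.40, 0.60), 10,000 iterations.
rng = np.random.default_rng(42)  # assumed generator; seed value from the text
factor_draws = rng.uniform(0.40, 0.60, size=10_000)
active_draws = mh_active_prevalence(20.7, factor_draws)
low, high = np.percentile(active_draws, [2.5, 97.5])
print(f"active prevalence, central: {mh_active_prevalence(20.7):.2f}%")
print(f"active prevalence, 95% interval: {low:.2f}% to {high:.2f}%")
```

Note that the implied ratio sits at the lower edge of the sampled range, so the central factor of 0.50 is the less aggressive adjustment of the two.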

2. Python Analysis Pipeline

All DALY computations are implemented in Python 3.10+ using a modular agent-based pipeline. The pipeline is designed for reproducibility: every computation step is parameterized by a county FIPS code, all intermediate outputs are serialized to JSON, and the full pipeline can be re-run with a single command.

Libraries used: NumPy 1.24+ (Monte Carlo, array operations) · Pandas 2.0+ (data wrangling, CSV I/O) · SciPy 1.11+ (statistical distributions, regression) · Matplotlib 3.8+ (trend charts, CI shading) · Chart.js 4.4.1 (interactive browser dashboards, via CDN) · Requests (API data collection) · python-dotenv (environment variable management for API keys)
Step 1 · Harvester · harvester.py · Live

Collects county data from five open sources: CDC PLACES 2024 (Socrata API, data.cdc.gov), County Health Rankings 2010–2024 (direct download), CDC WONDER (WONDER API for mortality counts; suppressed cells may require manual download), U.S. Census ACS 5-year (Census API, api.census.gov), and USDA Food Environment Atlas (static file download). Output: *_county_harvest.json, containing all raw collected values with source URLs and retrieval timestamps.
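A harvest record of the kind described (value, source URL, retrieval timestamp) might look like the sketch below. It only builds the request URL and the record and makes no network call. The function names and record keys are hypothetical, not taken from harvester.py; the ACS variable B01003_001E (total population) and the URL layout follow the public Census API conventions and should be verified against the current Census documentation before use.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlencode


def acs_population_url(fips: str, year: int = 2022) -> str:
    """Build an ACS 5-year total-population query for a 5-digit county FIPS code."""
    state, county = fips[:2], fips[2:]
    query = urlencode(
        {"get": "NAME,B01003_001E", "for": f"county:{county}", "in": f"state:{state}"},
        safe=":,",  # keep the colons and commas the Census API expects
    )
    return f"https://api.census.gov/data/{year}/acs/acs5?{query}"


def harvest_record(name: str, value: float, source_url: str) -> dict:
    """One collected value with provenance, ready to serialize to JSON."""
    return {
        "name": name,
        "value": value,
        "source_url": source_url,
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
    }


url = acs_population_url("26073")  # Isabella County
record = harvest_record("population", 64_565, url)
print(json.dumps(record, indent=2))
```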

Step 2 · DALY Calculator · daly_calculator.py · Live

Computes YLL, YLD, DALYs, economic burden, and 95% uncertainty intervals from harvest output. Implements equations (1)–(8) above. Applies rural adjustment factors for suppressed WONDER cells. Runs Monte Carlo simulation (n=10,000 iterations, seed=42) across mortality, prevalence, and disability weight uncertainty. Produces: *_county_config.json with all computed estimates, and data/daly_longitudinal_v2.csv for year-by-year trend analysis.

Step 3 · Longitudinal DALY Model · longitudinal_daly_v2.py · Live

Extends the cross-sectional estimate backward (2017–2023) and forward (2026 projection) using CDC PLACES multi-year releases and County Health Rankings YPLL trends. YLD varies year-by-year from observed PLACES prevalence (with linear interpolation for 2021–2022 data gaps); YLL is scaled by the CHR 15-year YPLL trend relative to 2018 baseline. Outputs: data/daly_longitudinal_v2.csv (7 conditions × 7 years) and data/daly_2026_projection_v2.csv (OLS trend projection with 95% prediction intervals).
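The two longitudinal adjustments described above (linear interpolation of missing PLACES years and YLL scaling against a 2018 YPLL baseline) can be sketched with pandas. The prevalence series and the 2018 YPLL value are hypothetical placeholders, the function name is illustrative, and the simple ratio used for scaling is one reading of "scaled by the YPLL trend", not a confirmed description of longitudinal_daly_v2.py.

```python
import numpy as np
import pandas as pd

# Hypothetical prevalence series (%); 2021 and 2022 are the missing PLACES years.
years = list(range(2017, 2024))
frame = pd.DataFrame(
    {"prevalence_pct": [18.0, 18.5, 19.0, 19.6, np.nan, np.nan, 21.1]},
    index=pd.Index(years, name="year"),
)

# Flag the gaps before filling them, so the output keeps a record of what was interpolated.
frame["interpolated"] = frame["prevalence_pct"].isna()
frame["prevalence_pct"] = frame["prevalence_pct"].interpolate(method="linear")


def scale_yll(yll_baseline: float, ypll_by_year: dict, baseline_year: int = 2018) -> dict:
    """Scale a baseline-year YLL by each year's YPLL relative to the baseline year."""
    base = ypll_by_year[baseline_year]
    return {year: yll_baseline * ypll / base for year, ypll in ypll_by_year.items()}


# 8,100 is the 2021 value quoted in Section 4.2; the 2018 value is a placeholder.
scaled = scale_yll(1_000.0, {2018: 7_000.0, 2021: 8_100.0})
print(frame)
print(scaled)
```

`Series.interpolate(method="linear")` treats rows as equally spaced, which is appropriate here because the index is a complete run of consecutive years.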

Step 4 · Fact Checker · fact_checker.py · Live

Mathematical, statistical, and epidemiological quality control gatekeeper. Re-derives all DALY totals from config, checks internal consistency of YLL + YLD = DALYs, verifies economic burden formulas, flags stale values in dashboard HTML, checks for overclaiming in manuscript text (e.g., unsupported causal language). Severity levels: HIGH (material error), MEDIUM (inconsistency), LOW (disclosure). Output: *_fact_check_report.md/.json. Pipeline rule: no dashboard may be published if any HIGH issues remain unresolved.
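The internal-consistency check and the publication rule can be illustrated with a short sketch. The `Issue` class, the function names, the config layout, and the tolerance of 0.5 DALYs are all hypothetical; only the severity levels and the "no HIGH issues" rule come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    severity: str  # "HIGH", "MEDIUM", or "LOW"
    message: str


def check_daly_totals(conditions: dict, tolerance: float = 0.5) -> list[Issue]:
    """Flag any condition where YLL + YLD does not reproduce the stored DALY total."""
    issues = []
    for name, values in conditions.items():
        recomputed = values["yll"] + values["yld"]
        if abs(recomputed - values["dalys"]) > tolerance:
            issues.append(Issue("HIGH", f"{name}: YLL + YLD = {recomputed:.1f}, "
                                        f"config says {values['dalys']:.1f}"))
    return issues


def may_publish(issues: list[Issue]) -> bool:
    """Pipeline rule: publication is blocked while any HIGH issue remains."""
    return not any(issue.severity == "HIGH" for issue in issues)


# Hypothetical config fragment; the second entry contains a deliberate mismatch.
config = {
    "cancer": {"yll": 1200.0, "yld": 300.0, "dalys": 1500.0},
    "cvd": {"yll": 900.0, "yld": 250.0, "dalys": 1200.0},
}
found = check_daly_totals(config)
for issue in found:
    print(issue.severity, issue.message)
print("publishable:", may_publish(found))
```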

Step 5 · Scientific Reviewer · chatgpt_reviewer.py · Live

Sends computed estimates to GPT-4o with a structured epidemiological peer-review prompt (based on Naghavi et al. 2010 GBD methodology standards). Reviews for: methodological validity, appropriate uncertainty framing, evidence strength for intervention claims, and cross-county consistency. Requires OPENAI_API_KEY in .env file. Output: *_review_report.md/.json.

Step 6 · Dashboard Generator · html_generator.py · In development

Generates county dashboard HTML from config JSON. Will produce home.html (or {county}.html) as a templated output replacing the current manually-maintained HTML files. Uses Chart.js for interactive charts. Planned for Q3 2026.

Step 7 · Deployment · deploy.py · In development

Commits updated HTML files to the GitHub Pages repository and pushes to production. Will enforce the fact-checker publication rule (no HIGH issues) as a pre-deploy gate. Planned for Q3 2026.

The pipeline is orchestrated by director.py, which accepts a county name and a list of steps to run:

    # Full pipeline for a new county
    python3 director.py --county gratiot --steps harvest,calculate,factcheck,review

    # Run all 6 preset counties sequentially
    python3 director.py --all

    # Check pipeline status
    python3 director.py --status

    # Dry run
    python3 director.py --county midland --dry-run

Each county requires a minimal configuration entry in director.py (FIPS code, population, name) and the harvester automatically retrieves all other data from public APIs. Adding a new county to the pipeline is a one-line configuration change.
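A command-line interface with the flags shown above could be built with argparse along the following lines. This is a sketch, not the contents of director.py: the `COUNTIES` registry layout, the function names, and the step validation are assumptions, and the step list contains only the four step names that appear in the example command.

```python
import argparse

# Hypothetical registry: one entry per county (name, FIPS, population from Table 5).
COUNTIES = {
    "isabella": {"fips": "26073", "population": 64_565},
    "clare": {"fips": "26035", "population": 30_013},
    "midland": {"fips": "26111", "population": 82_884},
}

VALID_STEPS = ("harvest", "calculate", "factcheck", "review")


def parse_steps(text: str) -> list[str]:
    """Split a comma-separated step list and reject unknown step names."""
    steps = [step.strip() for step in text.split(",") if step.strip()]
    unknown = [step for step in steps if step not in VALID_STEPS]
    if unknown:
        raise argparse.ArgumentTypeError(f"unknown step(s): {', '.join(unknown)}")
    return steps


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="director.py")
    target = parser.add_mutually_exclusive_group()
    target.add_argument("--county", help="county name, e.g. gratiot")
    target.add_argument("--all", action="store_true", help="run every preset county")
    parser.add_argument("--steps", type=parse_steps, default=list(VALID_STEPS))
    parser.add_argument("--status", action="store_true", help="report pipeline status")
    parser.add_argument("--dry-run", action="store_true", help="print the plan without running it")
    return parser


args = build_parser().parse_args(["--county", "midland", "--steps", "harvest,calculate", "--dry-run"])
print(args.county, args.steps, args.dry_run, COUNTIES[args.county]["fips"])
```

Under this layout, adding a county means adding one dictionary entry; everything else is looked up by name.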

3. Data Sources

Table 2. Primary data sources used across all county analyses. Original source datasets are publicly available at no cost. Working API notes, download scripts, and derived county files are maintained by the author and can be shared on reasonable request.
| Source | Version / Year | DALY Component | Access Method | County-level available? | Quality |
|---|---|---|---|---|---|
| CDC PLACES (multilevel regression and poststratification, MRP, on BRFSS) | 2024 release (2022 BRFSS data) | YLD: prevalence input for all conditions | Socrata API (data.cdc.gov) | Yes, by county FIPS | Model-based |
| CDC WONDER (Multiple Cause of Death, 1999–2022) | 2020–2022 (3-yr pooled) | YLL: cause-specific death counts and rates | WONDER web API or manual download | Yes, but cells <10 suppressed; rural adjustment required | Observed / Suppressed |
| IHME GBD 2021 (Global Burden of Disease Study 2021) | 2021 release | Disability weights (DW) for all conditions; reference LE (frontier standard) | GHDx download (ghdx.healthdata.org) | National/state only; applied uniformly to county analyses | Global meta-analysis |
| County Health Rankings (Robert Wood Johnson Foundation) | 2010–2024 (15-year trend) | YLL scaling: YPLL trend used for longitudinal YLL adjustment | Direct CSV download (countyhealthrankings.org) | Yes, by county FIPS | Observed |
| U.S. Census ACS (American Community Survey, 5-Year Estimates) | 2022 release (2018–2022) | Population denominators; income for VSL adjustment; poverty, insurance SDOH | Census API (api.census.gov) | Yes, by county FIPS | Survey sample |
| USDA Food Environment Atlas | 2020 | SDOH context (food access, food insecurity) | Static Excel download (ers.usda.gov) | Yes, by county FIPS | Administrative |
| CMS Medicare Geographic Variation | 2021 | Healthcare utilization context | CMS data portal (data.cms.gov) | Yes, by county FIPS | Administrative claims |
| MDHHS PCNA / Michigan vital statistics | 2020 baseline | All-cause mortality comparison; county health rankings context | MDHHS.gov, Michigan data portal | Yes, by county FIPS | Administrative |

3.1 Rural adjustment factors for suppressed CDC WONDER cells

CDC WONDER suppresses death count cells with fewer than 10 events in a 3-year pooled window. For small rural counties (Clare: pop. 30,013; upcoming Mecosta: pop. ~43,000), most cause-specific death counts are suppressed. Rural adjustment factors are applied to Michigan state-level rates as proxies:

Table 3. Rural adjustment factors applied to Michigan state-level mortality rates for suppressed county cells. Factors represent central estimates within published rural/urban differential ranges.
| Condition | Rural adj. factor | Source range | Primary reference |
|---|---|---|---|
| SUD / Drug overdose | ×1.85 | 1.6–2.0× | Hedegaard et al. (2021), Mack et al. (2017) |
| Mental Health | ×1.40 | 1.3–1.5× | Searight (2018); WWAMI Rural Health Research |
| CVD | ×1.30 | 1.2–1.4× | Moy et al. (2017), Garcia et al. (2017) |
| COPD | ×1.35 | 1.2–1.5× | Moy et al. (2017) |
| Stroke | ×1.25 | 1.1–1.4× | Garcia et al. (2017) |
| Diabetes | ×1.30 | 1.2–1.4× | Moy et al. (2017) |
| Cancer | ×1.20 | 1.1–1.3× | Zahnd et al. (2021) |
These factors introduce ±30% additional uncertainty for suppressed cells, propagated through Monte Carlo simulation (Section 4). County-specific CDC WONDER 5-year pooled data (2019–2023) may reduce suppression and should be used when available.
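Applying a Table 3 factor to a state-level rate, following equation (3), might look like the sketch below. The factor values and ranges are those in Table 3; the dictionary layout, the function name, and the state rate of 25 deaths per 100,000 are hypothetical placeholders.

```python
# (central, low, high) rural adjustment factors from Table 3.
RURAL_ADJ = {
    "SUD": (1.85, 1.6, 2.0),
    "MH": (1.40, 1.3, 1.5),
    "CVD": (1.30, 1.2, 1.4),
    "COPD": (1.35, 1.2, 1.5),
    "Stroke": (1.25, 1.1, 1.4),
    "Diabetes": (1.30, 1.2, 1.4),
    "Cancer": (1.20, 1.1, 1.3),
}


def adjusted_deaths(state_rate_per_100k: float, condition: str, population: int) -> dict:
    """Proxy annual deaths for a suppressed cell at the low, central, and high factor."""
    central, low, high = RURAL_ADJ[condition]
    scale = population / 100_000
    return {
        "low": state_rate_per_100k * low * scale,
        "central": state_rate_per_100k * central * scale,
        "high": state_rate_per_100k * high * scale,
    }


# Placeholder state rate; population is Clare County's from Section 3.1.
estimate = adjusted_deaths(25.0, "SUD", 30_013)
print({key: round(value, 1) for key, value in estimate.items()})
```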

4. Uncertainty Quantification

4.1 Monte Carlo simulation design

A structured Monte Carlo simulation (n = 10,000 iterations, NumPy seed=42) propagates uncertainty from three parameter sources. Reported 95% uncertainty intervals represent simulation-based percentile intervals (2.5th to 97.5th percentile across 10,000 iterations). They do not capture systematic biases in underlying data models.

Table 4. Monte Carlo uncertainty parameter specification.
| Parameter | Distribution | Notes |
|---|---|---|
| Mortality, observed WONDER cells (cancer, CVD, COPD, stroke) | Poisson(λ = observed annual deaths) | Sampling variability in small-count observed deaths |
| Mortality, suppressed cells (SUD, MH, diabetes) | Uniform(lower, upper) for rural adj. factor | Ranges from Table 3; see Section 3.1 |
| Prevalence (CDC PLACES) | Beta(α, β) matching reported 95% CI | Method of moments parameterization from PLACES reported CIs |
| MH remission factor | Uniform(0.40, 0.60) | Active-disease approximation uncertainty; primary estimate only |
| Disability weights (IHME GBD 2021) | Beta(α, β) matching ±15–25% at moderate severity | IHME-reported 95% uncertainty bounds |
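The design in Table 4 can be illustrated for a single condition with an observed (non-suppressed) mortality cell. This is a sketch under stated assumptions, not the project's implementation: the helper `beta_from_ci` converts a 95% CI to a standard deviation using a normal approximation (one common method-of-moments choice), `numpy.random.default_rng(42)` is an assumed generator, and every numeric input other than the life expectancy, the mean age, and n = 10,000 is a hypothetical placeholder.

```python
import numpy as np


def beta_from_ci(mean: float, lower: float, upper: float) -> tuple[float, float]:
    """Method-of-moments Beta(alpha, beta) from a mean and a 95% CI.

    The CI width is converted to a standard deviation with a normal
    approximation (width / 3.92), which is an assumption of this sketch.
    """
    variance = ((upper - lower) / (2 * 1.96)) ** 2
    common = mean * (1 - mean) / variance - 1
    if common <= 0:
        raise ValueError("CI too wide for a Beta distribution with this mean")
    return mean * common, (1 - mean) * common


rng = np.random.default_rng(42)
N_ITER = 10_000

# Hypothetical inputs for one condition (COPD-like).
observed_deaths = 40
population = 30_013
years_lost_per_death = 89.1 - 73          # frontier LE minus COPD mean age at death
prev_alpha, prev_beta = beta_from_ci(0.127, 0.110, 0.145)
dw_alpha, dw_beta = beta_from_ci(0.20, 0.16, 0.24)

deaths = rng.poisson(observed_deaths, N_ITER)          # mortality sampling variability
prevalence = rng.beta(prev_alpha, prev_beta, N_ITER)   # prevalence uncertainty
disability_weight = rng.beta(dw_alpha, dw_beta, N_ITER)

yll = deaths * years_lost_per_death
yld = prevalence * population * disability_weight
dalys = yll + yld

lower, median, upper = np.percentile(dalys, [2.5, 50, 97.5])
print(f"DALYs: {median:,.0f} (95% UI {lower:,.0f} to {upper:,.0f})")
```

For a suppressed cell, the Poisson draw would be replaced by a state-rate calculation multiplied by a `rng.uniform(lower, upper, N_ITER)` draw of the rural adjustment factor.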

4.2 Longitudinal uncertainty

For the 2017–2023 longitudinal analysis, additional uncertainty sources apply: (1) CDC PLACES 2021 and 2022 county-level data are unavailable — values for those years are linearly interpolated between 2020 and 2023 observations and flagged with interpolated: True in the output CSV; (2) CHR YPLL scaling introduces year-specific uncertainty in YLL estimates; (3) the 2021 all-cause mortality spike (COVID-19 driven YPLL = 8,100 per Isabella CHR) is preserved in the longitudinal model but flagged as an anomalous year. For 2026 projections, OLS regression is applied where ≥3 observed data points exist; conditions with fewer than 3 points receive flat projections and are flagged reliability: insufficient_data.
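The projection rule (OLS where at least three observed points exist, a flat projection otherwise) can be sketched with SciPy. The function name, the output keys other than `reliability: insufficient_data`, the `ols_trend` label, and the choice to hold the last observed value for flat projections are assumptions of this sketch. The prediction interval uses the standard formula for a new observation in simple linear regression.

```python
import numpy as np
from scipy import stats


def project(years, values, target_year: int = 2026, min_points: int = 3) -> dict:
    """Project a series to target_year; assumes at least one observed point."""
    x = np.asarray(years, dtype=float)
    y = np.asarray(values, dtype=float)
    n = len(x)
    if n < min_points:
        # Flat projection: hold the last observed value (assumed convention).
        return {"projection": float(y[-1]), "reliability": "insufficient_data"}

    fit = stats.linregress(x, y)
    predicted = fit.intercept + fit.slope * target_year

    # 95% prediction interval for a single new observation at target_year.
    dof = n - 2
    residuals = y - (fit.intercept + fit.slope * x)
    s = np.sqrt(np.sum(residuals**2) / dof)
    sxx = np.sum((x - x.mean()) ** 2)
    se_pred = s * np.sqrt(1 + 1 / n + (target_year - x.mean()) ** 2 / sxx)
    t_crit = stats.t.ppf(0.975, dof)
    return {
        "projection": float(predicted),
        "pi95": (float(predicted - t_crit * se_pred), float(predicted + t_crit * se_pred)),
        "reliability": "ols_trend",  # hypothetical label for the non-flat case
    }


# Hypothetical series for illustration.
trend = project([2017, 2018, 2019, 2020, 2023], [100.0, 108.0, 121.0, 130.0, 162.0])
flat = project([2020, 2023], [5.0, 6.0])
print(trend)
print(flat)
```

With only a handful of observed points the t critical value is large, so the resulting intervals are wide; that is the expected behavior rather than a defect.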

5. County-Specific Supplements

This shared framework document covers methodology common to all county analyses. County-specific data quality tables, condition rankings, population characteristics, and analytic considerations are documented in each county's supplement. The supplement also notes where county-specific data availability required modifications to the standard pipeline (e.g., Clare County: WONDER fully suppressed for all causes, requiring full state-rate substitution; Isabella County: the Central Michigan University (CMU) student population inflates the ACS poverty rate).

County supplements: Isabella County (v5.0) · Clare County (v1.0) · Midland County (v1.0) · 🌾 Gratiot (Q3 2026) · 🎓 Mecosta (Q4 2026) · 🏙️ Washtenaw (Q4 2026)
Table 5. Cross-county comparison of primary burden estimates (June 2026). Frontier LE 89.1 yrs primary standard.
| County | FIPS | Population | DALYs/yr (primary) | DALYs/yr (MI LE) | Top condition | Unique feature | Status |
|---|---|---|---|---|---|---|---|
| Isabella | 26073 | 64,565 | 15,430 (95% UI 14,459–16,481) | 10,579 | Cancer #1 · CVD #2 · SUD #3 | CMU university pop; MH HPSA 176,938:1 | Complete v5.0 |
| Clare | 26035 | 30,013 | 8,198 (±20% uncertainty) | 5,943 (remission-adj) | Cancer #1 · CVD #2 · SUD #3 (frontier) | MUA-designated; opioid ~47/100k; COPD 12.7%; median age 46.8 | Complete v1.0 |
| Midland | 26111 | 82,884 | ~20,400 (CDC PLACES 2023) | 14,335 | Cancer #1 · SUD #2 · MH #3 | Dow Chemical HQ; higher income ($62k); industrial SDOH; CDC PLACES 2023 observed prevalence | Complete v1.0 |
| Gratiot | 26057 | ~40,000 | | | | Agricultural economy, no university anchor, Gratiot County CMH | Analysis planned Q3 2026 |
| Mecosta | 26107 | ~43,000 | | | | Ferris State University, rural health desert designation | Analysis planned Q4 2026 |
| Washtenaw | 26161 | ~374,000 | | | | Urban reference county (U-M Ann Arbor), rural/urban comparison benchmark | Analysis planned Q4 2026 |

6. Scope, Limitations, and Intended Use

Intended use. These analyses are intended for: county health department planning, Community Health Needs Assessment (CHNA) process support, grant application development (SAMHSA SOR, HRSA Rural Health, MDHHS, NIH R21), academic collaboration, and public health education. They are NOT intended for clinical decision-making, regulatory submissions, or direct policy implementation without additional validation.

Educational use of GBD framework — not a GBD estimate. The IHME Global Burden of Disease Study is the world's most comprehensive disease burden framework, representing decades of rigorous global scholarship by thousands of researchers. We hold this work in deep respect and make no claim to replicate it. This project applies GBD's publicly available disability weights and the WHO GHE frontier life table to open county-level data for educational and rural health planning purposes only. The DALY metric — borrowed from GBD methodology — allows rural planners to express local health burden in the same units used globally, enabling grant writing, community health needs assessments, and program prioritization. A formal GBD estimate requires systematic literature reviews, DisMod-MR incidence/prevalence/duration modeling, Bayesian meta-regression for risk attribution, and international peer validation — none of which this analysis provides or attempts to provide. Results should be cited as "planning-grade DALY estimates derived from open public data, applying GBD disability weights and WHO reference standards" — never as "GBD burden estimates."

Conditions not modeled. This analysis covers seven major condition groups (cancer, CVD, SUD, mental health, COPD, stroke, diabetes). Excluded conditions include: musculoskeletal disorders (leading global YLD source), maternal/neonatal conditions, injuries and violence, communicable diseases (except indirectly via SUD/opioids), oral health, and vision/hearing loss. Total burden is therefore underestimated relative to full GBD. Expansion to additional conditions is planned for v6.0.

Reproducibility caveats. CDC PLACES prevalence estimates are updated annually (new BRFSS data) and may differ between releases. CDC WONDER data are periodically revised. Re-running the pipeline with current-year data will produce updated estimates that may differ from those reported here. County config JSON files are version-stamped; always verify the data vintage when citing specific values.

7. References

  1. Salomon JA, Vos T, Hogan DR, et al. Common values in assessing health outcomes from disease and injury: disability weights measurement study for the Global Burden of Disease Study 2010. Lancet. 2012;380(9859):2129–2143.
  2. GBD 2021 Diseases and Injuries Collaborators. Global incidence, prevalence, years lived with disability (YLDs), disability-adjusted life-years (DALYs), and healthy life expectancy (HALE) for 371 diseases and injuries: a systematic analysis for the Global Burden of Disease Study 2021. Lancet. 2024;403(10440):2133–2161.
  3. Naghavi M, Makela S, Foreman K, et al. Algorithms for enhancing public health utility of national causes-of-death data. Popul Health Metr. 2010;8(1):9.
  4. Centers for Disease Control and Prevention. PLACES: Local Data for Better Health. Atlanta, GA: CDC; 2024. cdc.gov/places.
  5. Centers for Disease Control and Prevention. CDC WONDER: Multiple Cause of Death, 1999–2022. Atlanta, GA: CDC; 2023. wonder.cdc.gov.
  6. University of Wisconsin Population Health Institute. County Health Rankings & Roadmaps. Madison, WI: UWPHI; 2024. countyhealthrankings.org.
  7. Viscusi WK, Aldy JE. The Value of a Statistical Life: A Critical Review of Market Estimates throughout the World. J Risk Uncertainty. 2003;27(1):5–76.
  8. Hedegaard H, Miniño AM, Warner M. Urban–rural differences in drug overdose death rates, by sex, age, and type of drugs involved, 2017–2019. NCHS Data Brief. 2021;(423):1–8.
  9. Mack KA, Goding Sauer MM, Bhalla S, et al. Trends in drug overdose deaths among US women, 1999–2017. MMWR. 2019;68(1):1–10.
  10. Moy E, Garcia MC, Bastian B, et al. Leading causes of death in nonmetropolitan and metropolitan areas — United States, 1999–2014. MMWR Surveill Summ. 2017;66(1):1–8.
  11. Garcia MC, Rossen LM, Bastian B, et al. Potentially excess deaths from the five leading causes of death in metropolitan and nonmetropolitan counties — United States, 2010–2017. MMWR Surveill Summ. 2019;68(10):1–11.
  12. Zahnd WE, James AS, Jenkins WD, et al. Rural-urban differences in cancer incidence and trends in the United States. Cancer Epidemiol Biomarkers Prev. 2021;30(5):860–869.
  13. Michigan Department of Health and Human Services. Michigan Primary Care Needs Assessment 2020. Lansing, MI: MDHHS; 2020.
  14. Substance Abuse and Mental Health Services Administration. Key Substance Use and Mental Health Indicators in the United States: Results from the 2021 National Survey on Drug Use and Health. Rockville, MD: SAMHSA; 2022.
  15. Soshnikov S. Health Burden from NCDs, SUDs, and MHCs in Isabella County, Michigan: A Mixed-Source Modeled Estimate. Preprint in preparation, 2026. Supporting code and derived county files available from the author on reasonable request.