@MaxGhenis
Contributor

Summary

This PR moves ALL random number generation from policyengine-us into the dataset generation in policyengine-us-data. The country package is now a purely deterministic rules engine.

⚠️ MERGE ORDER: This PR must be merged BEFORE the companion policyengine-us PR

Changes

New take-up rate parameters

Added YAML parameter files in policyengine_us_data/parameters/take_up/:

  • snap.yaml (0.82)
  • medicaid.yaml (0.93)
  • aca.yaml (0.672)
  • eitc.yaml (0.65/0.86/0.85 by children)
  • dc_ptc.yaml (0.32)
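As a rough illustration of how a rate that varies by number of children could be applied, here is a minimal sketch. It assumes the three EITC values listed above correspond to 0, 1, and 2+ children, which the description does not spell out. The function name `eitc_take_up_rate` and the sample `child_count` array are hypothetical, and the rates are hard-coded here rather than loaded from `eitc.yaml`.

```python
import numpy as np

# Rates from the PR description (eitc.yaml): 0.65 / 0.86 / 0.85.
# ASSUMPTION: the three values apply to 0, 1, and 2+ children respectively.
EITC_TAKE_UP_BY_CHILDREN = np.array([0.65, 0.86, 0.85])


def eitc_take_up_rate(child_count: np.ndarray) -> np.ndarray:
    """Look up a take-up rate for each tax unit from its child count."""
    # Cap the child count at the last bracket so larger families reuse it.
    bracket = np.minimum(child_count, len(EITC_TAKE_UP_BY_CHILDREN) - 1)
    return EITC_TAKE_UP_BY_CHILDREN[bracket]


# Hypothetical tax units with 0, 1, 2, and 4 children.
child_count = np.array([0, 1, 2, 4])
rates = eitc_take_up_rate(child_count)

# Compare one uniform draw per unit against that unit's own rate.
rng = np.random.default_rng(seed=100)
takes_up_eitc = rng.random(child_count.size) < rates
```

Indexing an array of rates by a capped child count keeps the lookup vectorized, so it scales to the full dataset without a Python loop.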

CPS dataset generation

  • Load take-up rates from YAML parameter files
  • Generate all stochastic boolean take-up decisions
  • Use seeded RNG (seed=100) for full reproducibility
  • Store boolean take-up decisions in the dataset instead of seed variables
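
The generation step can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: it assumes each decision is made by comparing one uniform draw per unit against the take-up rate. The helper `draw_take_up` and the unit count are hypothetical, and the SNAP rate is hard-coded instead of being read from its YAML file.

```python
import numpy as np

# Rate taken from the PR description (snap.yaml); in the real pipeline it
# would be loaded from the YAML parameter file rather than hard-coded.
SNAP_TAKE_UP_RATE = 0.82


def draw_take_up(n_units: int, rate: float, rng: np.random.Generator) -> np.ndarray:
    """Return a boolean array: True where the unit takes up if eligible."""
    # One uniform draw in [0, 1) per unit; a unit takes up when its draw
    # falls below the take-up rate.
    return rng.random(n_units) < rate


# Hypothetical number of units, for illustration only.
rng = np.random.default_rng(seed=100)
takes_up_snap_if_eligible = draw_take_up(10_000, SNAP_TAKE_UP_RATE, rng)

# A fresh generator with the same seed reproduces the same decisions.
rerun = draw_take_up(10_000, SNAP_TAKE_UP_RATE, np.random.default_rng(seed=100))
```

Because the booleans are stored in the dataset, the rules engine only reads `takes_up_snap_if_eligible` and never needs to call a random number generator itself.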

Stochastic variables generated

Take-up decisions (boolean):

  • takes_up_snap_if_eligible
  • takes_up_aca_if_eligible
  • takes_up_medicaid_if_eligible
  • takes_up_eitc (already boolean, now uses YAML rates)
  • takes_up_dc_ptc (already boolean, now uses YAML rates)

Trade-offs

IMPORTANT: Take-up rates can no longer be adjusted dynamically via policy reforms or in the web app. They are fixed in the microdata at generation time. This is an acceptable trade-off for the cleaner architecture of keeping the country package purely deterministic.

To adjust take-up rates for analysis, the microdata must be regenerated with updated parameter values.
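
To make the regeneration requirement concrete, here is a hedged sketch of re-drawing decisions under an alternative rate. It assumes decisions come from comparing seeded uniform draws to the rate. The function `regenerate_take_up`, the unit count, and the alternative rate of 0.90 are all hypothetical.

```python
import numpy as np


def regenerate_take_up(n_units: int, rate: float, seed: int = 100) -> np.ndarray:
    """Re-draw boolean take-up decisions for a given rate and seed."""
    rng = np.random.default_rng(seed=seed)
    return rng.random(n_units) < rate


# Baseline uses the SNAP rate from this PR; 0.90 is a made-up alternative.
baseline = regenerate_take_up(1_000, 0.82)
higher = regenerate_take_up(1_000, 0.90)

# Both runs share the seed, so they share the underlying uniform draws:
# every unit that takes up at 0.82 still takes up at 0.90.
still_takes_up = bool(np.all(higher[baseline]))
```

Under this assumption, keeping the seed fixed across regenerations means a change in the rate only flips units near the threshold, which makes comparisons between dataset builds easier to interpret.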

Test Plan

  • CPS dataset generation completes successfully
  • All stochastic variables are generated correctly
  • Companion policyengine-us PR passes all tests after this is merged

Related PRs

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

MaxGhenis and others added 2 commits October 5, 2025 15:11
This change moves ALL random number generation from policyengine-us into the
dataset generation in policyengine-us-data. The country package is now a
purely deterministic rules engine.

## Key Changes

### policyengine-us-data:
- Add take-up rate YAML parameter files in `parameters/take_up/`
- Generate all stochastic boolean take-up decisions in CPS dataset
- Use seeded RNG (seed=100) for full reproducibility

### Stochastic variables generated:
**Take-up decisions (boolean):**
- takes_up_snap_if_eligible
- takes_up_aca_if_eligible
- takes_up_medicaid_if_eligible
- takes_up_eitc (already boolean)
- takes_up_dc_ptc (already boolean)

All random generation now uses np.random.default_rng(seed=100) for full
reproducibility across dataset builds.

## Trade-offs

**IMPORTANT**: Take-up rates can no longer be adjusted dynamically via policy
reforms or in the web app. They are fixed in the microdata. This is an
acceptable trade-off for the cleaner architecture of keeping the country
package purely deterministic. To adjust take-up rates, the microdata must be
regenerated.

Related: policyengine-us PR (must be merged after this)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Create takeup parameter files with rates from NIEER report
- Head Start: 40% (pre-pandemic), 30% (pandemic 2020-2021)
- Early Head Start: 9%
- Generate stochastic takeup in CPS dataset using same pattern as SNAP/Medicaid
- Coordinates with policyengine-us PR adding takeup variables
@MaxGhenis
Contributor Author

Closing fork PR - recreating from upstream branch to enable CI

@MaxGhenis MaxGhenis closed this Dec 3, 2025