Releases: cachevector/hashprep

v0.1.0a1

02 Oct 19:31

v0.1.0a1 Pre-release

Improved correlation checks and reduced false positives in missing patterns

Improvements

  • Refined correlation checks in calculate_correlations
    • Fixed type inference errors by iterating over analyzer.column_types instead of analyzer.df
    • Updated mixed-variable thresholds to {'warning': 0.5, 'critical': 0.8} for consistency with Cramer’s V
    • Ensured seamless integration with run_checks
  • Reduced over-flagging in missing patterns detection
    • Introduced effect size thresholds:
      • Categorical: Cramer’s V > 0.1
      • Numeric: Cohen’s d > 0.2
    • Tightened p-value threshold to 0.01
    • Increased minimum samples per group to 10
    • Replaced ANOVA (f_oneway) with Mann-Whitney U test for better handling of skewed distributions
    • Added pattern grouping to summarize correlations per missing column (top 3 shown for conciseness)
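The numeric branch of the missing-pattern check described above can be sketched as follows. This is a minimal illustration, not HashPrep's actual code: the helper names `cohens_d` and `missingness_is_associated` are hypothetical, and only the thresholds (p < 0.01, Cohen's d > 0.2, at least 10 samples per group) and the use of the Mann-Whitney U test come from the notes. It assumes NumPy and SciPy are installed.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Thresholds quoted in the release notes above.
P_VALUE_MAX = 0.01
COHENS_D_MIN = 0.2
MIN_GROUP_SIZE = 10


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled_var = (
        (len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)
    ) / (len(a) + len(b) - 2)
    if pooled_var == 0:
        return 0.0
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


def missingness_is_associated(values, is_missing):
    """Hypothetical helper: flag a numeric column whose distribution differs
    between rows where another column is missing and rows where it is present.
    """
    values = np.asarray(values, dtype=float)
    is_missing = np.asarray(is_missing, dtype=bool)
    keep = ~np.isnan(values)  # ignore NaNs in the numeric column itself
    group_missing = values[keep & is_missing]
    group_present = values[keep & ~is_missing]

    # Skip the test when either group is too small to be reliable.
    if min(len(group_missing), len(group_present)) < MIN_GROUP_SIZE:
        return False

    # Rank-based test, less sensitive to skew than ANOVA.
    _, p_value = mannwhitneyu(group_missing, group_present, alternative="two-sided")
    effect = abs(cohens_d(group_missing, group_present))

    # Require both statistical significance and a non-trivial effect size.
    return bool(p_value < P_VALUE_MAX and effect > COHENS_D_MIN)
```

Requiring an effect size on top of the p-value is what cuts over-flagging: on large datasets a tiny difference can be statistically significant without being practically meaningful.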

Fixes

  • Corrected correlation dictionary iteration (analyzer.column_types)
  • Prevented spurious warnings by filtering weak associations
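
As an illustration of filtering weak categorical associations, the sketch below computes Cramer's V (without bias correction) from two categorical sequences and applies the cutoffs quoted in these notes. The function names are hypothetical, and both the comparison direction (`>=` for severity, `>` for the 0.1 cutoff) and the reuse of the 0.5/0.8 thresholds for Cramer's V are assumptions, not HashPrep's confirmed behavior.

```python
from collections import Counter
from math import sqrt

# Values quoted in the release notes; how they are compared is an assumption.
SEVERITY_THRESHOLDS = {"warning": 0.5, "critical": 0.8}
WEAK_ASSOCIATION_CUTOFF = 0.1


def cramers_v(xs, ys):
    """Cramer's V (no bias correction) for two equal-length categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        return 0.0  # association is undefined with fewer than two categories

    # Pearson chi-square statistic over the full contingency table.
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


def is_worth_reporting(strength):
    """Drop weak associations before they produce a warning."""
    return strength > WEAK_ASSOCIATION_CUTOFF


def severity(strength):
    """Map an association strength to 'critical', 'warning', or None."""
    if strength >= SEVERITY_THRESHOLDS["critical"]:
        return "critical"
    if strength >= SEVERITY_THRESHOLDS["warning"]:
        return "warning"
    return None
```

With two perfectly aligned binary columns, `cramers_v` returns 1.0; with two independent ones it returns 0.0, so the pair is dropped by `is_worth_reporting`.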

v0.1.0a0

27 Sep 19:24
91b2ae2

v0.1.0a0 Pre-release

First alpha release of HashPrep