Releases: cachevector/hashprep
v0.1.0a1
Improved correlation checks and reduced false positives in missing patterns
Improvements
- Refined correlation checks in calculate_correlations
  - Fixed type inference errors by iterating over analyzer.column_types instead of analyzer.df
  - Updated mixed-variable thresholds to {'warning': 0.5, 'critical': 0.8} for consistency with Cramer’s V
  - Ensured seamless integration with run_checks
- Reduced over-flagging in missing patterns detection
  - Introduced effect size thresholds:
    - Categorical: Cramer’s V > 0.1
    - Numeric: Cohen’s d > 0.2
  - Tightened p-value threshold to 0.01
  - Increased minimum samples per group to 10
  - Replaced ANOVA (f_oneway) with Mann-Whitney U test for better handling of skewed distributions
  - Added pattern grouping to summarize correlations per missing column (top 3 shown for conciseness)
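The numeric criteria listed above can be sketched as a single gate. This is a minimal illustration, not HashPrep's actual code: the names cohens_d and should_flag_numeric are hypothetical, the p-value is assumed to be computed elsewhere (for example with scipy.stats.mannwhitneyu), and treating the p-value cutoff as a strict "less than 0.01" is an assumption.

```python
from math import sqrt
from statistics import mean, variance

# Thresholds quoted in the release notes.
P_VALUE_MAX = 0.01
COHENS_D_MIN = 0.2
MIN_SAMPLES_PER_GROUP = 10


def cohens_d(a, b):
    """Absolute Cohen's d using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled_var == 0:
        return 0.0  # identical constant groups: no measurable effect
    return abs(mean(a) - mean(b)) / sqrt(pooled_var)


def should_flag_numeric(values_when_missing, values_when_present, p_value):
    """Hypothetical gate: flag only when sample size, p-value and effect size all pass.

    p_value is assumed to come from a two-sample test run by the caller,
    such as a Mann-Whitney U test.
    """
    smallest_group = min(len(values_when_missing), len(values_when_present))
    if smallest_group < MIN_SAMPLES_PER_GROUP:
        return False  # too few samples in one group to trust the test
    if p_value >= P_VALUE_MAX:
        return False  # not significant at the tightened threshold
    return cohens_d(values_when_missing, values_when_present) > COHENS_D_MIN
```

Requiring an effect size on top of significance is what reduces over-flagging: with large datasets, a tiny difference between groups can produce a very small p-value, and the Cohen's d check filters those cases out.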
Fixes
- Corrected correlation dictionary iteration (analyzer.column_types)
- Prevented spurious warnings by filtering weak associations
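For the categorical case, filtering weak associations could look like the sketch below. It computes Cramer's V from a contingency table of counts using the standard uncorrected formula, then applies the 0.1 cutoff and the warning/critical thresholds from these notes. The function names are hypothetical, whether HashPrep applies a bias correction is not stated, and treating the severity boundaries as inclusive is an assumption.

```python
from math import sqrt

# Values quoted in the release notes.
WEAK_ASSOCIATION_MAX = 0.1
SEVERITY_THRESHOLDS = {'warning': 0.5, 'critical': 0.8}


def cramers_v(table):
    """Uncorrected Cramer's V for a contingency table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k < 1:
        return 0.0  # empty table or a single row/column: no association defined
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


def is_weak(v):
    """True when the association does not exceed the 0.1 cutoff."""
    return v <= WEAK_ASSOCIATION_MAX


def severity(v, thresholds=SEVERITY_THRESHOLDS):
    """Map an association strength to 'critical', 'warning' or None."""
    if v >= thresholds['critical']:
        return 'critical'
    if v >= thresholds['warning']:
        return 'warning'
    return None
```

A table such as [[12, 8], [8, 12]] gives V of about 0.2, which passes the weak-association filter but stays below the warning level.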
v0.1.0a0
First alpha release of HashPrep