Added APCS primary diagnosis and ONS deaths #3
Conversation
rw251
commented
Nov 28, 2025
- The APCS report now includes primary diagnosis counts in addition to the counts from the "all diagnoses" field.
- New output that does the same for ONS deaths: one count for the primary cause of death and another for all the supplementary cause-of-death codes.
- A validation script that reports any financial years or ICD10 codes that don't match our expected regex. These might be OK, but it should be a quick way to spot anything unusual, and it eliminates the possibility of outputting erroneous patient identifiers from the large all diagnoses field.
- Now gives separate counts for occurrences in primary diagnosis and the all diagnoses field
Query to count the occurrences of ICD10 codes in the ONS death data. Counts the primary diagnosis, and the contributing factors separately. Rounds counts to 10, suppresses values <15, and excludes type 1 opt outs.
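The rounding and suppression described above can be sketched in Python. This is only an illustration, not the query in this PR: the function name is hypothetical, and it assumes that "rounds to 10" means the nearest 10 with halves rounded up, that suppression is applied to the raw count before rounding, and that a suppressed value is represented as `None`.

```python
from typing import Optional

SUPPRESSION_THRESHOLD = 15  # raw counts below this are suppressed
ROUNDING_BASE = 10          # assumed: round to the nearest 10


def disclosure_safe_count(raw_count: int) -> Optional[int]:
    """Return the rounded count, or None if the raw count is suppressed.

    Hypothetical helper: the order of operations (suppress, then round)
    and the half-up rounding are assumptions, not taken from the PR.
    """
    if raw_count < SUPPRESSION_THRESHOLD:
        return None
    # Integer arithmetic avoids Python's round-half-to-even behaviour.
    return (raw_count + ROUNDING_BASE // 2) // ROUNDING_BASE * ROUNDING_BASE
```

Using integer arithmetic rather than the built-in `round` keeps the behaviour at values such as 25 predictable, since `round` rounds halves to the nearest even number.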
It occurs to me that we might get unexpected data. It might be hard for the output checkers to validate this, so let's add a report to help us:
- The financial_year column in apcs might contain unexpected data, as I don't think anyone has used it before. It should be in the format "202425" or maybe "2024-25", so if not we report it.
- The ICD10 codes in ONS have, I think, already been validated, but I'm not 100% sure, and the ones in the all_diagnoses field probably haven't. So we check each one against a regex and report those that don't match. Some of these may be valid, in which case we can update the regex. But this removes the risk that patient identifiable data somehow appears in that field.
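A minimal sketch of the financial year check, assuming the two formats mentioned above ("202425" and "2024-25") are the only expected ones. The names `FINANCIAL_YEAR_PATTERN` and `is_expected_financial_year` are hypothetical, and the extra test that the second part is the year following the first is an added assumption that goes beyond the format check described in this PR.

```python
import re

# Assumed formats: "202425" or "2024-25" (four-digit start year, two-digit end year).
FINANCIAL_YEAR_PATTERN = re.compile(r"(\d{4})-?(\d{2})")


def is_expected_financial_year(value: str) -> bool:
    """Return True if value looks like a financial year such as 202425 or 2024-25."""
    match = FINANCIAL_YEAR_PATTERN.fullmatch(value.strip())
    if match is None:
        return False
    start_year, end_suffix = int(match.group(1)), int(match.group(2))
    # Added assumption: the end year immediately follows the start year.
    return end_suffix == (start_year + 1) % 100


def unexpected_financial_years(values):
    """Return the distinct values that fail the check, sorted for reporting."""
    return sorted({v for v in values if not is_expected_financial_year(v)})
```

For example, `unexpected_financial_years(["202425", "2024-25", "2024/25"])` would report only `"2024/25"`.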
Jongmassey
left a comment
Couple of minor nits.
Also, having re-read the docs, there's a secondary diagnosis as well - we should probably take the opportunity to include this for completeness. The information as to how often primary/secondary appear in "all" or not is very useful on its own!
# - A000 or A00X (4 chars without dot)
# - A00.00 or A00.0X (6 chars with dot)
# - A0000 or A000X (5 chars without dot)
ICD10_PATTERN = re.compile(r"^[A-Z][0-9]{2}\.?[0-9X]?[0-9]?$")
This seems to correctly match A00X0 and A00.X0 so I think the comment is wrong?
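A quick way to check the pattern against the examples in the code comment is to run it over each one. The sketch below copies the regex from the diff; the sample list and the `results` name are just for illustration.

```python
import re

# Pattern copied from the diff under review.
ICD10_PATTERN = re.compile(r"^[A-Z][0-9]{2}\.?[0-9X]?[0-9]?$")

# Examples from the code comment, plus the two forms raised in this review.
samples = [
    "A000", "A00X",      # 4 chars without dot
    "A00.00", "A00.0X",  # 6 chars with dot
    "A0000", "A000X",    # 5 chars without dot
    "A00X0", "A00.X0",   # X in the fourth position
]

results = {code: bool(ICD10_PATTERN.match(code)) for code in samples}

for code, matched in results.items():
    print(f"{code:8} {'match' if matched else 'no match'}")
```

Because `X` is only allowed in the fourth position, the pattern accepts `A00X0` and `A00.X0` but rejects `A00.0X` and `A000X`, which is the opposite of what the comment documents.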
Der_Financial_Year,
apcs.APCS_Ident,
apcs.Der_Financial_Year,
LTRIM(RTRIM(der.Spell_Primary_Diagnosis)) as primary_diagnosis,
der.Spell_Secondary_Diagnosis too for good measure?
Code indicating secondary diagnosis. This is a single code giving the first listed secondary diagnosis, but there may be other secondary diagnoses listed in the all_diagnoses field below.