HSE Automates GDPR with Data X-Ray
The UK’s Health and Safety Executive (HSE) was under pressure to anonymise 600,000 safety records to enable data sharing, policy evaluation, and internal research without risking GDPR violations.
Doing it manually would have taken an estimated 12.5 person-years, an untenable demand on budget and staffing. HSE needed a fast, accurate, and legally defensible solution to automate the redaction and anonymisation of sensitive data across mixed-format records.
Problem
Manual redaction across 600,000 structured and unstructured documents
Risk of a GDPR breach if records were incorrectly anonymised
Data trapped in formats that limited sharing, analysis, and reuse
Urgent need to support safety reporting (e.g., RIDDOR) while protecting individual privacy
Solution
HSE implemented Data X-Ray to automate the identification, flagging, and anonymisation of sensitive information across health and safety records.
Key capabilities included:
Automated extraction of content from complex, mixed-format datasets
Accurate detection of sensitive tokens using policy-driven redaction logic
99%+ anonymisation accuracy on 1,998 RIDDOR reports, with only 19 flagged as potential compliance risks
Output of clean, shareable datasets in GDPR-safe formats
Data X-Ray completed the full job in just 1 machine-day, versus an estimated 12.5 person-years manually.
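To make "policy-driven redaction logic" concrete, here is a minimal sketch of the general idea: a policy maps each category of sensitive data to a detection rule, and every match is replaced with a placeholder and logged for review. This is not Data X-Ray's implementation or API. The `POLICY` table, the `redact` function, the regular expressions, and the sample report text are all hypothetical and chosen only for illustration; a production system would need far more robust detection than three regular expressions.

```python
import re

# Hypothetical policy: category label -> detection pattern.
# These patterns are illustrative assumptions, not rules used by Data X-Ray.
POLICY = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "UK_PHONE": re.compile(r"\b0\d{3,4}[ ]?\d{3}[ ]?\d{3,4}\b"),
    "NI_NUMBER": re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"),
}


def redact(text, policy=POLICY):
    """Replace every policy match with a [LABEL] placeholder.

    Returns the redacted text and a list of (label, original_value)
    findings so that a reviewer can audit what was removed.
    """
    findings = []
    for label, pattern in policy.items():
        # Bind the current label so the callback records the right category.
        def _replace(match, label=label):
            findings.append((label, match.group(0)))
            return f"[{label}]"

        text = pattern.sub(_replace, text)
    return text, findings


# Made-up sample text, not taken from any real RIDDOR report.
report = (
    "Injured person: contact j.smith@example.com or 01632 960123, "
    "NI QQ123456C."
)
clean, found = redact(report)
print(clean)
print(found)
```

Keeping the rules in a data table rather than in code is the design point: the policy can be changed and audited without touching the redaction routine.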
Business Value
4,500x time reduction in document redaction
49x cost savings over manual anonymisation
99%+ accuracy, approaching human-level detection of sensitive data
Enabled HSE to share data safely and analyse it at scale
Supported GDPR compliance across health data environments
Easily integrated with existing data pipelines and workflows
HSE now has an efficient, scalable model for data desensitisation.
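The headline figures can be sanity-checked with simple arithmetic. The sketch below rests on two assumptions that the case study does not state: that a person-year is counted as 365 calendar days, and that the 19 flagged RIDDOR reports are the only ones that fell short of full anonymisation. The function names are hypothetical.

```python
# Figures quoted in the case study.
PERSON_YEARS_MANUAL = 12.5
MACHINE_DAYS_AUTOMATED = 1
RIDDOR_REPORTS = 1998
FLAGGED_REPORTS = 19

# Assumption: one person-year counted as 365 calendar days.
DAYS_PER_PERSON_YEAR = 365


def time_reduction(person_years, machine_days, days_per_year=DAYS_PER_PERSON_YEAR):
    """Ratio of manual effort (in days) to automated run time (in days)."""
    return person_years * days_per_year / machine_days


def flagged_share(flagged, total):
    """Fraction of reports flagged as potential compliance risks."""
    return flagged / total


speedup = time_reduction(PERSON_YEARS_MANUAL, MACHINE_DAYS_AUTOMATED)
flag_rate = flagged_share(FLAGGED_REPORTS, RIDDOR_REPORTS)

print(f"Time reduction: about {speedup:,.1f}x")
print(f"Flagged share: {flag_rate:.2%}")
print(f"Unflagged share: {1 - flag_rate:.2%}")
```

Under these assumptions the ratio comes to roughly 4,560x, in line with the rounded 4,500x figure, and the flagged share is just under 1%, in line with the 99%+ accuracy claim. Counting a person-year in working days rather than calendar days would give a lower ratio.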
