600,000 Documents Redacted in 1 Machine-Day | 4,500x Faster at 49x Lower Cost

HSE Automates GDPR with Data X-Ray

The UK’s Health and Safety Executive (HSE) was under pressure to anonymise 600,000 safety records to enable data sharing, policy evaluation, and internal research without risking GDPR violations.

Doing it manually would have taken an estimated 12.5 person-years, and the budget and staffing demands were untenable. HSE needed a fast, accurate, and legally defensible way to automate the redaction and anonymisation of sensitive data across mixed-format records.

Problem

  • Manual redaction across 600,000 structured and unstructured documents

  • Risk of a GDPR breach if records were incorrectly anonymised

  • Data trapped in formats that limited sharing, analysis, and reuse

  • Urgent need to support safety reporting (e.g., RIDDOR, the Reporting of Injuries, Diseases and Dangerous Occurrences Regulations) while protecting individual privacy

Solution

HSE implemented Data X-Ray to automate the identification, flagging, and anonymisation of sensitive information across health and safety records.

Key capabilities included:

  • Automated extraction of content from complex, mixed-format datasets

  • Accurate detection of sensitive tokens using policy-driven redaction logic

  • 99%+ anonymisation accuracy on 1,998 RIDDOR reports, with only 19 flagged as potential compliance risks

  • Output of clean, shareable datasets in GDPR-safe formats
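As an illustration of what "policy-driven redaction logic" can mean in general, here is a minimal Python sketch. It is not Data X-Ray's implementation or API, which the case study does not describe. The policy labels, the regular expressions, and the `redact` helper are all hypothetical, and simple pattern matching stands in for whatever detection method the product actually uses.

```python
import re

# Hypothetical policy: each label maps to a pattern for one kind of sensitive token.
# These patterns are illustrative only and far from exhaustive.
POLICY = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "UK_PHONE": re.compile(r"\b0\d{2,4}[ -]?\d{3,4}[ -]?\d{3,4}\b"),
    "NI_NUMBER": re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"),
}


def redact(text, policy=POLICY):
    """Replace every policy match with a [LABEL] placeholder.

    Returns the redacted text and the number of matches per label,
    so a reviewer can see what was removed and why.
    """
    counts = {}
    for label, pattern in policy.items():
        text, n = pattern.subn(f"[{label}]", text)
        counts[label] = n
    return text, counts


# Made-up sample text, not taken from any real report.
sample = "Injured worker: contact j.smith@example.com or 020 7946 0958. NI no. QQ123456C."
redacted, counts = redact(sample)
print(redacted)
print(counts)
```

Keeping the policy as data, separate from the redaction function, means the rules can be reviewed and changed without touching the code that applies them. Free-text identifiers such as personal names would need more than regular expressions.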

Data X-Ray completed the full job in just 1 machine-day, versus an estimated 12.5 person-years manually.
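The headline figures can be sanity-checked with a few lines of arithmetic. The sketch below makes two assumptions that the case study does not state: a person-year is counted as 365 calendar days, and the 19 flagged RIDDOR reports are the complement of the "99%+" figure. The 49x cost figure cannot be reproduced because no cost inputs are given.

```python
# Figures quoted in the case study.
PERSON_YEARS_MANUAL = 12.5
MACHINE_DAYS = 1
RIDDOR_REPORTS = 1998
FLAGGED_REPORTS = 19

# Assumption: one person-year is counted as 365 calendar days.
DAYS_PER_YEAR = 365

# Ratio of manual effort to machine time.
speedup = PERSON_YEARS_MANUAL * DAYS_PER_YEAR / MACHINE_DAYS

# Assumption: the share of reports not flagged is what "99%+" refers to.
unflagged_share = 1 - FLAGGED_REPORTS / RIDDOR_REPORTS

print(f"Speed-up: about {speedup:.1f}x")
print(f"Reports not flagged: {unflagged_share:.2%}")
```

Under these assumptions the speed-up comes to 4,562.5x, in line with the rounded 4,500x claim, and 99.05% of the sampled reports were not flagged.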

Business Value

  • 4,500x time reduction in document redaction

  • 49x cost savings over manual anonymisation

  • 99%+ accuracy, approaching human-level detection of sensitive data

  • Enabled HSE to share data safely and analyse it at scale

  • Supported GDPR compliance across health data environments

  • Easily integrated with existing data pipelines and workflows

HSE now has an efficient, scalable model for data desensitisation.

