Having spent many years working in healthcare IT, every now and then I encounter someone who is struggling to understand why data quality is an issue. What difference does it make if the data is not great? What is the impact? I can always come up with examples, often scary ones, where poor-quality or missing data results in something bad happening. They illustrate the point but are often cloaked in informatics-speak or clinical nuance. So, when I come across an example that is fairly accessible and that I can put a light-hearted spin on, I am delighted to share it.

This story was reported widely in the UK. The protagonist is a strapping young-ish journalist named Liam. As you likely know, every patient has a unique identifier, and their health information resides in their GP's medical system. As part of the COVID-19 response, this data was shared with the Clinical Commissioning Group (CCG). Algorithms were run against this data to help determine who should be prioritised for vaccination based on age and medical conditions.

As a person who has committed much of their adult life to healthcare information technology, I see this as exactly the kind of thing we are trying to enable: software that helps make good decisions quickly. So, I want to say that I applaud the initiative, and nothing I say from this point forward should detract from that…

The algorithms were run and invitations sent out. When our hero, Liam, got his invitation, though, he was ‘really confused’ as to why he was receiving it so early in the process. After all, he is in his thirties and has no chronic conditions. Why was he being prioritised above other people who were obviously more vulnerable?

Liam contacted the CCG to find out more. What did they know that he did not? Did he have some terrible condition that his doctor had not informed him of? The CCG informed Liam that he was morbidly obese…

This was a surprise to Liam, who did not feel morbidly obese. Could he lose a few pounds? Maybe, but morbidly obese? “You are, indeed,” he was informed by the CCG, who must have assumed Liam was in denial, because he had a Body Mass Index (BMI) of over 28,000. For those of you who are not familiar with the formula used to calculate BMI, it is your weight in pounds times a constant of 703, divided by the square of your height in inches. At a height of 6 feet 2 inches (188 cm), to achieve a BMI of greater than 28,000, Liam would have had to weigh in at an impressive 218,000 pounds or so, which is roughly the weight of a railway locomotive. Liam informed them that there must be some mistake. They confirmed that his weight on record was a bit over 200 pounds, but that was not the issue. The issue was his diminutive recorded height of 6.2 centimetres. That’s right, according to the NHS, Liam was half the size of Morph, who stands at 12 cm!
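For anyone who wants to check the arithmetic, here is a minimal Python sketch of the BMI formula described above, in both its imperial and metric forms. The 240-pound weight is my own stand-in for "a bit over 200 pounds", not a figure from the story, and the function names are purely illustrative.

```python
LB_PER_KG = 2.20462  # pounds in one kilogram


def bmi_imperial(weight_lb: float, height_in: float) -> float:
    """BMI from pounds and inches: 703 * weight / height squared."""
    return 703 * weight_lb / height_in ** 2


def bmi_metric(weight_kg: float, height_cm: float) -> float:
    """BMI from kilograms and centimetres: weight / (height in metres) squared."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


def weight_lb_for_bmi(target_bmi: float, height_in: float) -> float:
    """Rearranged imperial formula: the weight needed to reach a given BMI."""
    return target_bmi * height_in ** 2 / 703


if __name__ == "__main__":
    assumed_weight_lb = 240  # assumption, not a figure from the story
    height_in = 6 * 12 + 2   # 6 feet 2 inches

    # The BMI Liam should have had on record.
    print(round(bmi_imperial(assumed_weight_lb, height_in), 1))

    # The BMI produced when 6.2 is stored as a height in centimetres.
    print(round(bmi_metric(assumed_weight_lb / LB_PER_KG, 6.2)))

    # The weight a 6 ft 2 in person would need for a BMI of 28,000.
    print(round(weight_lb_for_bmi(28_000, height_in)))
```

Nothing in the formula is wrong in either case. The same calculation simply gives a sensible answer or an absurd one depending on the unit attached to the height.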

The moral of the story is that data quality matters the minute we ask software to help us make decisions using that data. In this case it resulted in a whimsical story that conjures adorable images of a tiny, albeit slightly chunky, figure, maybe Morph’s cousin, Chas (sorry Liam), getting an early immunisation. But it could also have been something more serious, something that resulted in a missed intervention or a wrong intervention. What I would also like to highlight is that data quality is not always about terminology. In this scenario it was a ubiquitous entity that lives between terminology and quantitative data, the unit of measure, that caused the issue. One little mistake with our friend the unit of measure opens the door to algorithmic mayhem.
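To make the point concrete, here is a rough sketch of the kind of plausibility check that could sit in front of an algorithm like this one. It is not how the NHS or any GP system works. The bounds, the names and the "looks like feet and inches" heuristic are all assumptions of mine for illustration, and a real check would flag a record for human review rather than silently reject or correct it.

```python
from dataclasses import dataclass

# Hypothetical plausibility bounds for an adult height in centimetres.
MIN_ADULT_HEIGHT_CM = 100.0
MAX_ADULT_HEIGHT_CM = 250.0


@dataclass
class HeightCheck:
    value_cm: float
    plausible: bool
    note: str


def check_adult_height_cm(value_cm: float) -> HeightCheck:
    """Flag recorded adult heights that are implausible as centimetres."""
    if MIN_ADULT_HEIGHT_CM <= value_cm <= MAX_ADULT_HEIGHT_CM:
        return HeightCheck(value_cm, True, "within plausible adult range")
    if 3.0 <= value_cm < 8.0:
        # Assumed heuristic: values such as 6.2 look like feet and inches
        # typed into a centimetre field.
        return HeightCheck(
            value_cm, False,
            "implausible as cm; possibly feet and inches, needs review",
        )
    return HeightCheck(value_cm, False, "outside plausible adult range, needs review")


if __name__ == "__main__":
    for recorded in (188.0, 6.2):
        print(check_adult_height_cm(recorded))
```

A check like this would have held Liam's record back for a second look before any BMI was calculated from it.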

Thanks to Metro.co.uk for sharing Liam’s story and providing me with the opportunity to turn it into a cautionary tale, and thanks to Liam for being a thoughtful member of society and thinking of others.