Ethnicity data from health records of over 61 million people studied in detail for first time  

In research published in Nature Scientific Data, ethnicity data from general practice and hospital records of more than 61 million people in England has been studied in detail for the first time.

Researchers assessed the level of detail available in ethnicity data from different sources of NHS records in England. They showed that a much more detailed classification of ethnicity is possible than health researchers typically use. They also highlighted that ethnicity information was missing for almost one in 10 patients, while around 12% of patients had conflicting ethnicity codes in their records.

The study was led by researchers at the University of Oxford, University College London and the Centre for Ethnic Health Research, and made possible through the support of Health Data Research UK (HDR UK) and the British Heart Foundation (BHF) Data Science Centre. The researchers analysed de-identified data on ethnicity and other characteristics from general practice and hospital health records, accessed safely within NHS England’s Secure Data Environment (SDE). It is the first part of a three-phase project aiming to reduce bias in AI health prediction models. 

Sara Khalid, Associate Professor of Health Informatics and Biomedical Data Science at NDORMS, said: “Because AI-based healthcare technology depends on the data that is fed into it, a lack of representative data can lead to biased models that ultimately produce incorrect health assessments. Better data from real-world settings, such as the data we have collected, can lead to better technology and ultimately better health for all.”