A Framework to Assess Healthcare Data Quality

Table 1: Final Data Quality Framework with Definitions

Criteria	Measurement	Definition
Accessibility	Which researchers need access to the data, and does access need to be authorised?	Ensure that only those who need to use the dataset have access to the file
	Has the data been protected from deliberate bias?	Can the process of acquiring the dataset be traced?
	Will the appropriate steps be undertaken to ensure the dataset cannot be damaged or misused?	Ensure the dataset is saved in a secure file for analysis
Relevance	Are the concepts in the dataset needed for the current user?	Refer to hypotheses and evaluate whether the dataset is relevant
	Are the produced statistics needed by the user?	Investigate whether statistics have been formulated and whether these could be used in the present research
Accuracy	Is the coefficient of variation available?	Compare the degree of variation from one data series to another
	What is the response rate?	Reported as the percentage of participants who responded to the data collection
	Does the data represent a complete list of eligible persons or units, and not just a fraction of the list?	Review the response rate and determine whether any datasets were not submitted or were incomplete. Depending on the severity of this issue, contact the data source or consider using statistical tests to account for missing values.
	Is the imputation rate available?	How many fields have been inserted to account for missing data
	Has the dataset been revised?	Check the number of revisions and ensure the researchers access the latest version
	Were data cleansing methods used?	Identify the responsible statistician and review the cleansing methods
Reliability	Is the data generated based on protocols and procedures that do not change according to who is using them? That is, is the data completely objective, independent of user or use?	Search for published guidelines for data collection and examine the process.
	Are variables defined, and are these definitions standardised and based on a referenced source?	Determine whether definitions of variables are available
Timeliness	Can the amount of time between the dataset and reference point be calculated?	Important when planning further research and comparisons.
Clarity	Is the metadata completed?	Imperative to assess data quality. Contact the source if metadata is not available
Comparability	What is the length of the time-series?	The occurrence of the publication of the dataset
	Which geographical areas are used, and can these be transformed into larger geographies?	List the levels of geographical granularity, for example, County and District
	Can the data be easily manipulated and presented as needed?	Can the dataset be modified to suit the researcher’s needs, for example, can units be converted?
Coherence	Taking the above questions into account, can the current data be compared to other datasets?	Prompts the researcher to reflect on the information
Validity	Is engagement with researchers evident?	Were relevant researchers consulted during the data collection process and the publication of the dataset?
	Are the reports provisional and subject to change or have inaccuracies been reported separately?	Find out whether the report is provisional and/or search for documentation of inaccuracies
	Is there evidence of positive reports and no negative reports on the findings?	Review the data source. Negative reports will be those that suggest that there are contradictions between different data sources for the same data.
	Overall, does the dataset meet validation criteria?	Depends on the answers to the preceding questions. Mark the dataset as Pass, Borderline or Fail.
Confidentiality	Does the dataset meet the BPS code of conduct for confidentiality?	Check that the data contains no identifiable information
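
Several of the Accuracy and Timeliness measurements in Table 1 reduce to simple arithmetic once the underlying counts are known. The sketch below is illustrative only and is not part of the framework: the function names, arguments and example figures are hypothetical, and it assumes the coefficient of variation is taken as the sample standard deviation divided by the mean, that rates are expressed as percentages, and that timeliness is measured in whole days between the reference point and the release of the dataset.

```python
from datetime import date
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean (assumed definition)."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined when the mean is zero")
    return stdev(values) / m * 100


def response_rate(returned, invited):
    """Percentage of invited participants who responded to the data collection."""
    if invited <= 0:
        raise ValueError("invited must be a positive count")
    return returned / invited * 100


def imputation_rate(imputed_fields, total_fields):
    """Percentage of fields that were filled in to account for missing data."""
    if total_fields <= 0:
        raise ValueError("total_fields must be a positive count")
    return imputed_fields / total_fields * 100


def timeliness_days(reference_point, release_date):
    """Number of days between the reference point and the dataset's release."""
    return (release_date - reference_point).days


# Hypothetical figures, used only to show how the functions are called.
series_a = [10, 12, 14]
cv_a = coefficient_of_variation(series_a)
rate = response_rate(returned=150, invited=200)
imputed = imputation_rate(imputed_fields=5, total_fields=200)
lag = timeliness_days(date(2020, 12, 31), date(2021, 3, 31))
```

Because the coefficient of variation is unitless, it can be compared from one data series to another even when the series are recorded on different scales, which is the comparison the Accuracy row asks for. What counts as an acceptable response rate, imputation rate or time lag is left to the researcher, since the framework does not state thresholds.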
