EjSBS - The European Journal of Social & Behavioural Sciences

Online ISSN: 2301-2218
European Publisher

A Framework to Assess Healthcare Data Quality

Table 1: Final Data Quality Framework with Definitions

| Criteria | Measurement | Definition |
|---|---|---|
| Accessibility | Which researchers need access to the data, and does access need to be authorised? | Ensure only those who need to use the dataset have access to the file |
| | Has the data been protected from deliberate bias? | Can the process of acquiring the dataset be traced? |
| | Will the appropriate steps be undertaken to ensure the dataset cannot be damaged or misused? | Ensure the dataset is saved in a secure file for analysis |
| Relevance | Are the concepts in the dataset needed for the current user? | Refer to hypotheses and evaluate whether the dataset is relevant |
| | Are the produced statistics needed by the user? | Investigate whether statistics have been formulated and whether these could be used in the present research |
| Accuracy | Is the coefficient of variation available? | Compare the degree of variation from one data series to another |
| | What is the response rate? | Reported as the percentage of participants who returned the data collection |
| | Does the data represent a complete list of eligible persons or units, and not just a fraction of the list? | Review the response rate and determine whether datasets were not submitted or incomplete. Depending on the severity of this issue, contact the data source or consider using statistical tests to account for missing values. |
| | Is the imputation rate available? | How many fields have been inserted to account for missing data |
| | Has the dataset been revised? | Check the number of revisions and ensure the researchers access the latest version |
| | Were data cleansing methods used? | Investigate the responsible statistician, and review the cleansing methods |
| Reliability | Is the data generated based on protocols and procedures that do not change according to who is using them? So, is the data completely objective, independent of user or use? | Search for published guidelines for data collection, and examine the process. |
| | Are variables defined, and are these definitions standardised and based on a referenced source? | Determine whether definitions of variables are available |
| Timeliness | Can the amount of time between the dataset and reference point be calculated? | Important when planning further research and comparisons. |
| Clarity | Is the metadata complete? | Imperative to assess data quality. Contact the source if metadata is not available |
| Comparability | What is the length of the time-series? | The occurrence of the publication of the dataset |
| | Which geographical areas are used? And can these be transformed into larger geographies? | List of geographical granularity, for example, County and District |
| | Can the data be easily manipulated and presented as needed? | Can the dataset be modified to suit the researcher’s needs, for example, can units be converted? |
| Coherence | Taking the above questions into account, can the current data be compared to other datasets? | Prompts the researcher to reflect on the information |
| Validity | Is engagement with researchers evident? | Were relevant researchers liaised with during the data collection process and the publication of the dataset? |
| | Are the reports provisional and subject to change, or have inaccuracies been reported separately? | Find out whether the report is provisional and/or search for documentation of inaccuracies |
| | Is there evidence of positive reports and no negative reports on the findings? | Review the data source. Negative reports will be those that suggest that there are contradictions between different data sources for the same data. |
| | Overall, does the dataset meet validation criteria? | Dependent on the aforementioned. Mark the dataset as Pass, Borderline or Fail. |
| Confidentiality | Does the dataset meet the BPS code of conduct for confidentiality? | Check the data contains no identifiable information |
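
Several of the Accuracy and Timeliness measurements in Table 1 are simple quantities that can be computed once the underlying counts and dates are known. The following Python sketch illustrates one way to do so. It is not part of the published framework: the function names are hypothetical, the framework does not prescribe formulas, and the use of the sample standard deviation for the coefficient of variation is an assumption.

```python
from datetime import date
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean (assumed formula)."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100 * stdev(values) / abs(m)


def response_rate(returned, invited):
    """Percentage of invited participants who returned the data collection."""
    if invited <= 0:
        raise ValueError("invited must be positive")
    return 100 * returned / invited


def imputation_rate(imputed_fields, total_fields):
    """Percentage of fields inserted to account for missing data."""
    if total_fields <= 0:
        raise ValueError("total_fields must be positive")
    return 100 * imputed_fields / total_fields


def timeliness_lag_days(reference_point, release_date):
    """Days between the reference point of the data and the dataset's release."""
    return (release_date - reference_point).days


if __name__ == "__main__":
    # Illustrative values only; they are not taken from any real dataset.
    print(coefficient_of_variation([10, 12, 14]))
    print(response_rate(returned=150, invited=200))
    print(imputation_rate(imputed_fields=30, total_fields=1000))
    print(timeliness_lag_days(date(2020, 12, 31), date(2021, 3, 31)))
```

Comparing the coefficient of variation across two data series, as the Definition column suggests, then amounts to calling `coefficient_of_variation` on each series and comparing the two percentages.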