Data health refers to the quality and condition of an organization’s data. Good data health means that a company’s data is fit for its intended use: it can support sound analysis and reliable decision-making. For data to be considered healthy, it must meet these seven criteria:
- Accurate: values must correctly describe the real-world things they represent
- Complete: all required data should be present
- Consistent: the dataset shouldn’t contain conflicting values
- Timely: must be relevant and up to date
- Valid: must meet defined formats, rules, and standards
- Uncorrupted: must be free of corruption and unauthorized alteration
- Unique: duplicate data entries should be avoided
A dataset that meets all seven criteria can be considered healthy and can be relied on to support the organization’s processes and decisions.
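
Some of these criteria can be checked automatically. The sketch below is a minimal illustration in Python, not a complete data-quality tool. It covers only four of the seven criteria (complete, valid, timely, unique); the other three usually need more context, such as a trusted reference source, cross-field rules, or checksums. The record layout, the field names, the loose email pattern, the one-year freshness threshold, and the `check_health` function are all assumptions made for the example.

```python
import re
from datetime import date, timedelta

# Hypothetical schema: each record is a dict with these required fields.
REQUIRED_FIELDS = ("id", "email", "country", "updated")
# Deliberately loose email pattern, for illustration only.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
# Assumed freshness threshold: records older than this count as stale.
MAX_AGE = timedelta(days=365)


def check_health(records, today):
    """Return a dict mapping each checked criterion to the indexes of failing records."""
    issues = {"complete": [], "valid": [], "timely": [], "unique": []}
    seen_ids = set()
    for i, rec in enumerate(records):
        # Complete: every required field is present and non-empty.
        if any(rec.get(field) in (None, "") for field in REQUIRED_FIELDS):
            issues["complete"].append(i)
            continue  # the remaining checks need all fields
        # Valid: the email matches the expected format.
        if not EMAIL_PATTERN.match(rec["email"]):
            issues["valid"].append(i)
        # Timely: the record was updated within the allowed window.
        if today - rec["updated"] > MAX_AGE:
            issues["timely"].append(i)
        # Unique: no earlier record has the same id.
        if rec["id"] in seen_ids:
            issues["unique"].append(i)
        seen_ids.add(rec["id"])
    return issues


records = [
    {"id": 1, "email": "ana@example.com", "country": "PT", "updated": date(2024, 5, 1)},
    {"id": 2, "email": "not-an-email", "country": "US", "updated": date(2024, 4, 1)},
    {"id": 3, "email": "li@example.com", "country": "", "updated": date(2024, 3, 1)},
    {"id": 1, "email": "ana@example.com", "country": "PT", "updated": date(2022, 1, 1)},
]

report = check_health(records, today=date(2024, 6, 1))
is_healthy = not any(report.values())
print(report)
print("healthy:", is_healthy)
```

With this sample data, the second record fails the validity check, the third is incomplete, and the fourth is both stale and a duplicate of the first, so the dataset as a whole is not healthy. In practice each rule would come from the organization's own data standards rather than from hard-coded constants.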