An easy way to think about the complex field of data governance is the following simple triad:
- Know the data
- Protect the data
- Maximize the data value
Before an association can effectively govern its data, it needs to understand what data it has. It is very difficult to protect our data and manage its value if we don’t have a clear view of what data we own, what it means, and the scope of any related risks.
The first step, “know the data”, can be split into three key activities:
- Create a business glossary
- Create a data catalog
- Create a data dictionary
Compiling this information will not only help us track and communicate what data assets we have, but also enable us to focus security and quality efforts on the data that really needs to be managed.
Business Glossary
The intent of the business glossary is to track and communicate the official terms and definitions commonly used by our association. Having this information consolidated and easily accessible will reduce confusion related to conflicting terminology and help create a common business language.
The business glossary can be as simple as a shared document listing the business terms and their definitions, or much more sophisticated, with information on acronyms, synonyms, hierarchies and categories.
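As a rough illustration of the more sophisticated end of that range, the sketch below models a glossary entry with an acronym, synonyms, a category and a broader term. The class name, field names and sample terms are hypothetical, chosen only for this example; a shared document or spreadsheet with the same columns would serve equally well.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GlossaryTerm:
    """One official business term (hypothetical structure for illustration)."""
    term: str
    definition: str
    acronym: Optional[str] = None
    synonyms: List[str] = field(default_factory=list)
    category: Optional[str] = None
    broader_term: Optional[str] = None  # parent term, for simple hierarchies


def build_lookup(terms: List[GlossaryTerm]) -> Dict[str, GlossaryTerm]:
    """Map every term, synonym and acronym (case-insensitive) to its entry.

    Raises ValueError when two different entries claim the same name,
    which is exactly the conflicting terminology a glossary should expose.
    """
    lookup: Dict[str, GlossaryTerm] = {}
    for entry in terms:
        names = [entry.term] + entry.synonyms
        if entry.acronym:
            names.append(entry.acronym)
        for name in names:
            key = name.strip().lower()
            if key in lookup and lookup[key] is not entry:
                raise ValueError(f"'{name}' is claimed by more than one term")
            lookup[key] = entry
    return lookup


# Sample entries; the definitions are placeholders, not official wording.
glossary = [
    GlossaryTerm(
        term="Member",
        definition="A person or organization with an active membership.",
        synonyms=["Constituent"],
        category="People",
    ),
    GlossaryTerm(
        term="Annual General Meeting",
        definition="The yearly meeting open to all members.",
        acronym="AGM",
        category="Events",
    ),
]

lookup = build_lookup(glossary)
print(lookup["agm"].term)
```

Resolving synonyms and acronyms to a single official entry is one way to support a common business language: whichever word someone searches for, they land on the same definition.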
Data Catalog
The objective of the data catalog is to provide a consolidated view of the data sets which exist in the association. “Data set” in this context refers to business concepts such as Member, Employee, Registration, Sale, Download, or other similar entities or activities related to the operations of the association.
A data catalog can be a simple document that lists the data sets that exist in the association with a brief description.
Data Dictionary
The purpose of a data dictionary within the realm of data governance is to track and communicate the technical information related to the data items which are elements of data sets. These are individual data fields in a report or table.
The data dictionary should document the definition, origin, usage and format of the data as well as the business rules which are applied. More sophisticated dictionaries include:
- Stewardship assignment
- Relationships to data catalog and business glossary
- Security classification
- Quality classification
- Quality metrics
We can include many details in a data dictionary; however, we want to make sure it is sustainable. We want to create something we can keep up to date in the future.
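To make the shape of a dictionary entry concrete, here is a minimal sketch that keeps the core fields (definition, origin, usage, format, business rules) required and the more sophisticated ones optional, so the dictionary can start small and grow. All names and sample values are assumptions made for the example, not a prescribed schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataSet:
    """A data catalog entry: a data set and a brief description."""
    name: str
    description: str


@dataclass
class DataItem:
    """A data dictionary entry for one field in a report or table."""
    name: str
    data_set: str            # name of the catalog entry it belongs to
    definition: str
    origin: str
    usage: str
    format: str
    business_rules: str = ""
    # Optional details, added only when they can be kept up to date.
    steward: Optional[str] = None
    glossary_term: Optional[str] = None
    security_classification: Optional[str] = None


def missing_core_fields(item: DataItem) -> List[str]:
    """Return the names of core fields that are still blank."""
    core = ("definition", "origin", "usage", "format")
    return [name for name in core if not getattr(item, name).strip()]


def orphaned_items(items: List[DataItem], catalog: List[DataSet]) -> List[str]:
    """Return items whose data set is not listed in the catalog."""
    known = {data_set.name for data_set in catalog}
    return [item.name for item in items if item.data_set not in known]


# Hypothetical sample content.
catalog = [DataSet("Member", "People and organizations holding a membership.")]
items = [
    DataItem(
        name="join_date",
        data_set="Member",
        definition="Date the membership first became active.",
        origin="Membership system",
        usage="Tenure and retention reports",
        format="YYYY-MM-DD",
        security_classification="Internal",
    ),
    DataItem(
        name="badge_id",
        data_set="Registration",
        definition="Identifier printed on an event badge.",
        origin="",
        usage="Event check-in",
        format="8-character string",
    ),
]

print(missing_core_fields(items[1]))
print(orphaned_items(items, catalog))
```

Simple checks like these are one possible way to keep the dictionary sustainable: they point to entries that are incomplete or no longer tied to the catalog, rather than relying on a full manual review.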
Compiling all the information for the business glossary, data catalog and data dictionary can be a significant task. We find it helpful to first break the work into smaller, manageable chunks and then continually expand the breadth and depth of the information collected. By focusing our efforts on the data published in current reports, and on new reports as they are published, we can limit the initial scope to what is most important. We also recommend targeting data sets where there is a privacy risk as early as possible.
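The scoping approach described above could be sketched as a simple ordering rule. The flags, sample data sets and the exact ordering (privacy risk first, then data used in current reports) are assumptions made for illustration; each association will weigh these factors differently.

```python
from typing import Dict, List

# Hypothetical inventory: one row per data set, with two yes/no flags.
data_sets: List[Dict] = [
    {"name": "Download", "in_current_reports": True, "privacy_risk": False},
    {"name": "Archive", "in_current_reports": False, "privacy_risk": False},
    {"name": "Employee", "in_current_reports": False, "privacy_risk": True},
    {"name": "Member", "in_current_reports": True, "privacy_risk": True},
]


def initial_scope(inventory: List[Dict]) -> List[str]:
    """Return the data sets to document first, in suggested order.

    A data set is in scope if it carries a privacy risk or appears in a
    current report. Privacy-risk data sets sort first, then reported
    ones, then alphabetically by name.
    """
    in_scope = [
        row for row in inventory
        if row["privacy_risk"] or row["in_current_reports"]
    ]
    ordered = sorted(
        in_scope,
        key=lambda row: (
            not row["privacy_risk"],
            not row["in_current_reports"],
            row["name"],
        ),
    )
    return [row["name"] for row in ordered]


print(initial_scope(data_sets))
```

Data sets left out of the initial scope are not ignored; they are picked up in later passes as the breadth and depth of the documentation expands.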