In addition to managing data, corporations must be able to manage and control the definition of the data elements used in databases. Without an understanding of the structure, limitations, definition, and description of data, it is likely that data will be misinterpreted or misused; further, data that is not well-defined can cause database integrity problems. This is a metadata issue.
But What Is Metadata?
Have you ever watched the Antiques Roadshow program on television? In this show people bring items to professional antique dealers to have them examined and evaluated. The participants hope to learn that their items are long-lost treasures of immense value. The antique dealers always spend a lot of time talking to the owners about their items. They always ask questions like “Where did you get this item?” and “What can you tell me about its history?” Now, the item is sitting right there in front of them, yet they ask these questions. Why? Because these details provide knowledge about the authenticity and nature of the item. The dealer also carefully examines the item looking for markings and dates that provide clues to the item’s origin.
Users of data must know what the data is before it becomes useful as information. Information about data is referred to as metadata. The simplest definition of metadata is “data about data.” But, to be a bit more precise, metadata describes data, providing information like data type, length, textual description, and other characteristics of the data. So, for example, metadata allows the user to know that the customer number is a five-digit numeric field, whereas the data itself might be 56789.
So, using our Antiques Roadshow example, the item being evaluated is the “data.” The answers to the antique dealer’s questions and the markings on the item are the “metadata.” Value is assigned to an item only after the metadata about that item is discovered and evaluated.
Metadata characterizes data. It is used to provide documentation such that data can be understood and more readily consumed by your organization. Metadata answers the who, what, when, where, why, and how questions for users of the data.
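To make the customer number example a little more concrete, here is a minimal sketch in Python. The `FieldMetadata` class, its attribute names, and the `describes` check are all hypothetical, invented for illustration; they are not drawn from any particular DBMS catalog or metadata repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMetadata:
    """Hypothetical metadata record: data about one data element."""
    name: str
    data_type: str      # assumed vocabulary, e.g. "numeric" or "character"
    length: int         # number of characters/digits in the field
    description: str    # textual description for people using the data

    def describes(self, value: str) -> bool:
        """Return True if the raw value fits this type and length."""
        if len(value) != self.length:
            return False
        if self.data_type == "numeric":
            return value.isascii() and value.isdigit()
        return True


# The metadata: what a customer number is.
customer_number = FieldMetadata(
    name="customer_number",
    data_type="numeric",
    length=5,
    description="Number that identifies a customer",
)

# The data: one particular customer number.
value = "56789"
print(customer_number.describes(value))
```

The value `56789` on its own says nothing about customers; it is the separate `FieldMetadata` record that tells a user what the value is and whether it is well-formed.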
From Data to Knowledge and Beyond
The basic building block of knowledge is data. Data is a fact represented as an item or event out of context and with no relation to other things. Examples of data are 27, 010110, and JAN. Without additional details we know nothing about any of these three pieces of data. Consider:
- Is 27 a number in base ten, or is it in octal (which would translate to 23 in base ten)?
- If 27 is a number in base ten what does it represent? Is it an age, a dollar amount, an IQ, a shoe size, or something else entirely?
- What about 010110? Is it a binary number? Or is it a representation of a date, perhaps January 1, 1910? January 1, 2010? Or something else entirely?
- Finally, what does JAN represent? Is it a woman’s name (or a man’s name)? Or does it represent the first month of the year? Or perhaps it is something else entirely?
All of these are examples of data because of the lack of context.
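The ambiguity in the questions above can be sketched in a few lines of Python. The `interpret` function and its `kind` labels are hypothetical stand-ins for metadata; the point is only that the same characters yield different values depending on what we are told about them. Note that the two-digit-year handling shown here is Python's own convention for `%y`, so reading 010110 as a date in 1910 would require still more metadata.

```python
from datetime import date, datetime
from typing import Union


def interpret(raw: str, kind: str) -> Union[int, date]:
    """Interpret raw characters according to a (hypothetical) metadata label."""
    if kind == "decimal":
        return int(raw, 10)
    if kind == "octal":
        return int(raw, 8)
    if kind == "binary":
        return int(raw, 2)
    if kind == "date_mmddyy":
        # %y maps two-digit years 00-68 to 2000-2068 in Python
        return datetime.strptime(raw, "%m%d%y").date()
    raise ValueError(f"unknown kind: {kind}")


print(interpret("27", "decimal"))          # 27
print(interpret("27", "octal"))            # 23
print(interpret("010110", "binary"))       # 22
print(interpret("010110", "date_mmddyy"))  # 2010-01-01
```

Without the second argument there is no way to choose among these readings, which is exactly the position a user is in when handed data with no metadata.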
Information, on the other hand, adds context through relationships between data, and possibly other information. Data in context with metadata makes information. The relationships may represent information, yet the relations do not actually constitute information until they are understood. Also, the relationships that represent data have a tendency to be limited in context, mostly about the past or present, with little if any implication for the future.
Webster’s New Collegiate Dictionary defines knowledge as “the fact or condition of knowing something with familiarity gained through experience or association.” Knowledge adds understanding and retention to information. It is the next natural progression after information. To have “knowledge” requires information in conjunction with patterns between data, information, and other knowledge. So knowledge couples data with understanding and cognition.
The final step would be to move from knowledge to wisdom. Wisdom can be thought of as knowledge applied. You may have the knowledge that fatty foods are bad for you, but if you eat them anyway, you are not wise.
So, in order for data to be anything more than simply data, metadata is required. Without metadata, data has no identifiable meaning; it is merely a collection of digits, characters, or bits. Metadata gives data its form and makes it usable by information professionals.
Furthermore, metadata management is a prerequisite for truly treating data as a corporate asset... more on this in future blog postings.