Neon Enterprise Software Blog


Data Management Today by Craig Mullins

News, views, and issues involved in managing data as a valuable corporate asset.

The Importance of Metadata, Part 2

In The Importance of Metadata, Part 1, we defined metadata at a high level and discussed why metadata is a necessity if we want to move from merely managing data to actively gaining benefit from data as information. In Part 2, we will discuss the different types of metadata.

 

Even though all metadata describes data, there are many different types and sources of metadata. At a basic level, though, all metadata boils down to one of two types: technology metadata or business metadata. Technology metadata describes the technical aspects of the data as they relate to storing and managing the data in computerized systems. Business metadata, on the other hand, describes how the data is used by the business, and is needed for the data to have value to the organization. So, knowing that the LICNO column is a positive integer between 1 and 9,999,999 is technology metadata. Of course, this information is also useful to, and required by, the business user. Knowing that the LICNO column is the practitioner license number for certified course instructors, that it must be unique, and that every instructor can have one and only one license number is business metadata (though these details are also useful to the DBA in creating the database appropriately and effectively).
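To make the LICNO example concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in DBMS. The technology metadata (integer type, range, uniqueness) is declared in the DDL, while the business meaning lives in a separate record. The table name INSTRUCTOR, the NAME column, and the BUSINESS_METADATA dictionary are assumptions made for illustration; only LICNO and its rules come from the discussion above.

```python
import sqlite3

# Technology metadata expressed as DDL. INSTRUCTOR and NAME are hypothetical.
DDL = """
CREATE TABLE INSTRUCTOR (
    LICNO INTEGER NOT NULL UNIQUE
          CHECK (LICNO BETWEEN 1 AND 9999999),
    NAME  TEXT
)
"""

# Business metadata: the meaning of the column, which the DDL cannot express.
BUSINESS_METADATA = {
    "LICNO": "Practitioner license number for certified course instructors; "
             "unique, and every instructor has one and only one.",
}


def try_insert(conn, licno, name):
    """Return True if the row satisfies the declared constraints."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO INSTRUCTOR (LICNO, NAME) VALUES (?, ?)",
                (licno, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(DDL)

results = [
    try_insert(conn, 1234567, "first instructor"),    # within range
    try_insert(conn, 0, "second instructor"),         # below the range
    try_insert(conn, 1234567, "third instructor"),    # duplicate license
    try_insert(conn, 10000000, "fourth instructor"),  # above the range
]
print(results)
```

The DBMS can enforce the range and uniqueness rules, but nothing in the DDL says what a license number is or who it belongs to. That part has to be recorded somewhere else.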

For DBAs, the DBMS itself is a good source of metadata. The system catalog (or data dictionary, depending on your DBMS of choice), which stores information about database objects, is a vital store of DBA metadata, specifically technology metadata. DBAs and developers make regular use of the metadata in the system catalog to better understand database objects and the data contained therein. Depending on the DBMS, the user can write queries against the system catalog tables or views, or can execute system-provided stored procedures to return metadata from the system catalog tables. Just about any type of descriptive information about the composition of the data may be found in the system catalog. For example, most DBMSs store all of the following metadata in the system catalog:

  • The names of every database, table, column, index, view, relationship, stored procedure, trigger, and so on.
  • The primary key for each table and any foreign keys that refer back to that primary key.
  • Which tables are in which views.
  • The data type, length, and constraints for each column of every table.
  • The names of the physical files used to store database data, as well as information about file storage, extents, and disk volumes.
  • Authorization and security information detailing which users have what type of authority on which database objects.
  • The date and time of the last database definition change, as well as the ID of the user who implemented the DDL for the change.
  • Database organization information.
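
As a rough illustration of querying a system catalog, the sketch below uses SQLite through Python's sqlite3 module. SQLite is only a stand-in here: its catalog is the sqlite_master table plus PRAGMA statements, whereas other DBMSs expose different catalog tables, views, or stored procedures. The COURSE and INSTRUCTOR tables, the index, and the view are hypothetical objects created just for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical objects, created only so the catalog has something to report.
conn.executescript("""
CREATE TABLE COURSE (
    COURSE_ID INTEGER PRIMARY KEY,
    TITLE     TEXT NOT NULL
);
CREATE TABLE INSTRUCTOR (
    LICNO     INTEGER PRIMARY KEY,
    COURSE_ID INTEGER REFERENCES COURSE (COURSE_ID)
);
CREATE INDEX IX_INSTRUCTOR_COURSE ON INSTRUCTOR (COURSE_ID);
CREATE VIEW V_INSTRUCTOR AS SELECT LICNO FROM INSTRUCTOR;
""")

# Names and kinds of the objects recorded in the catalog.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()

# Column metadata; each row is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(INSTRUCTOR)").fetchall()

# Foreign keys; each row is
# (id, seq, referenced_table, from_column, to_column, on_update, on_delete, match).
foreign_keys = conn.execute("PRAGMA foreign_key_list(INSTRUCTOR)").fetchall()

for kind, name in objects:
    print(kind, name)
for _, name, data_type, _, _, is_pk in columns:
    print(name, data_type, "primary key" if is_pk else "")
for fk in foreign_keys:
    print(f"{fk[3]} -> {fk[2]}.{fk[4]}")
```

Nothing in this sketch was entered into the catalog by hand; every value returned was recorded by the DBMS when the DDL ran.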

The DBMS system catalog is a particularly effective source of metadata because it is active, integrated, and nonsubvertible. The system catalog is active because the metadata is automatically built and maintained as database objects are created and modified. As the DBA creates databases, the DBMS automatically collects and populates metadata in the system catalog. The integration of the system catalog and the DBMS, coupled with the active nature of the system catalog, keeps the technology metadata in the system catalog accurate and up-to-date. Additionally, the DBMS system catalog is nonsubvertible, meaning that normal DBMS operations are the only mechanism for populating the system catalog. Of course, the subvertibility of the system catalog will differ from DBMS to DBMS. Some DBMSs provide options to enable direct updates to the system catalog – but such an option is to be used only in emergency situations and generally under the direction of the DBMS vendor’s technical support personnel.
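
The active and nonsubvertible properties can be observed in a small experiment. The sketch below again uses SQLite via Python's sqlite3 module as an assumed stand-in. The helper catalog_names and the INSTRUCTOR table are hypothetical, and the refusal of direct catalog writes shown here is SQLite's default behavior, which other DBMSs may not share.

```python
import sqlite3


def catalog_names(conn):
    """Return the set of object names currently recorded in the catalog."""
    return {row[0] for row in conn.execute("SELECT name FROM sqlite_master")}


conn = sqlite3.connect(":memory:")

# Active: the catalog is maintained by the DBMS as DDL is executed.
before = catalog_names(conn)
conn.execute("CREATE TABLE INSTRUCTOR (LICNO INTEGER PRIMARY KEY)")
after_create = catalog_names(conn)
conn.execute("DROP TABLE INSTRUCTOR")
after_drop = catalog_names(conn)

# Nonsubvertible: with default settings, a direct write to the catalog
# table is rejected, so DDL is the only way to change its contents.
try:
    conn.execute("DELETE FROM sqlite_master")
    direct_write_allowed = True
except sqlite3.DatabaseError:
    direct_write_allowed = False

print(before, after_create, after_drop, direct_write_allowed)
```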

Although a wealth of metadata can be found in the system catalog, this DBMS metadata usually is insufficient to fully describe data. For example, descriptions of database objects are not commonly found in the DBMS system catalog. Some DBMSs provide system catalog description columns that can be populated at the DBA’s discretion. But many DBAs avoid doing so for fear of disorganizing the system catalog or perhaps just because descriptions for the database objects were not available when the objects were created. Additional metadata that is useful, but not found in the system catalog, includes:

  • Metadata for non-database files (flat or sequential files).
  • Modification information indicating when data in the database was last changed, and by whom.
  • Copybook information for the database table (or non-database file), as well as which programs use that information.
  • Information on batch jobs and transactions that access the data.
  • Operational metadata on IT infrastructure components.
  • Data model metadata describing the logical database design and how it maps to the physical database implementation.
  • Data warehousing and ETL metadata defining data source(s), system of record, the date and time the data was last updated, and other analytical information.
  • Data ownership and stewardship metadata.
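
One way to hold metadata that the catalog lacks is a side table keyed by object name, which can then be compared against the catalog to find objects that still need documentation. The sketch below is only one possible approach, using SQLite via Python's sqlite3 module. The OBJECT_METADATA table, its columns, and the sample description and steward values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE INSTRUCTOR (LICNO INTEGER PRIMARY KEY);
CREATE TABLE COURSE (COURSE_ID INTEGER PRIMARY KEY);

-- Hypothetical side table for descriptions and stewardship,
-- which the system catalog does not record.
CREATE TABLE OBJECT_METADATA (
    OBJECT_NAME TEXT PRIMARY KEY,
    DESCRIPTION TEXT NOT NULL,
    STEWARD     TEXT NOT NULL
);
INSERT INTO OBJECT_METADATA VALUES
    ('INSTRUCTOR', 'Certified course instructors', 'Training department');
""")

# Join the catalog to the side table to list tables with no description yet.
undocumented = [
    row[0]
    for row in conn.execute("""
        SELECT m.name
        FROM sqlite_master AS m
        LEFT JOIN OBJECT_METADATA AS d ON d.OBJECT_NAME = m.name
        WHERE m.type = 'table'
          AND m.name <> 'OBJECT_METADATA'
          AND d.OBJECT_NAME IS NULL
        ORDER BY m.name
    """)
]
print(undocumented)  # ['COURSE']
```

Unlike the system catalog, a table like this is passive: it stays accurate only if someone updates it whenever objects are created, changed, or dropped.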

Of course, this is an incomplete list. A myriad of different metadata types and purposes exists that can be cataloged and managed. Capturing and maintaining metadata better documents databases and systems, thereby making them easier to use. The more metadata available to business users, the more value they will be able to extract from their information systems.

 

Published Monday, December 18, 2006 2:47 PM by cmullins


About cmullins

Craig S. Mullins is a data management strategist for NEON Enterprise Software, Inc. Craig has extensive experience in the field of database management, having worked as an application developer, a DBA, and an instructor with multiple database management systems, including working with DB2 for z/OS since Version 1. Craig is also an IBM Gold Consultant and is the author of two books: "DB2 Developer’s Guide" and "Database Administration: Practices and Procedures."