I received the following question and thought I'd answer it on the blog:
Question: I work in a data-modeling environment with the responsibility
of logically modeling (LM) data for mainframe and server databases. LM
for the OLTP systems is done to 3rd NF. Recently, the DBA recommended
LM for localized data in a much less rigid format, i.e. less or
potentially no normalization.
Are you aware of any discipline of LM for localized data? If so, can you recommend some reading on this topic?
Here is my Answer:
Not only am I not aware of any resource that would
advocate logical modeling of data in an unnormalized fashion, I would
be against anyone reading it if it existed. The logical data model
should be normalized. Normalization is the process of identifying the
one best place each fact belongs.
Normalization is a design approach that minimizes data redundancy and
optimizes data structures by systematically and properly placing data
elements into the appropriate groupings. A normalized data model can be
translated into a physical database that is organized correctly.
So the goal of normalization is to eliminate redundancy from data. An
entity is in third normal form (3NF) if and only if all non-key columns
are (a) mutually independent and (b) fully dependent upon the primary
key. Mutual independence means that no non-key column is dependent upon
any combination of the other non-key columns. I won't go into a full
explanation of normalization here, though.
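To make the "mutual independence" rule a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical and made up for illustration: the employee rows, the column names and the fd_holds helper. It checks whether one non-key column determines another in a handful of sample rows and then splits the offending fact out into its own entity. Keep in mind that sample data can only suggest a dependency; whether it truly holds is a business rule, not something the rows can prove.

```python
def fd_holds(rows, lhs, rhs):
    """True if, in these sample rows, equal lhs values always come with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # Remember the first rhs seen for each lhs; any later mismatch breaks the dependency.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical entity with primary key emp_id.
employees = [
    {"emp_id": 1, "emp_name": "Ann", "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 2, "emp_name": "Bob", "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 3, "emp_name": "Cy", "dept_id": 20, "dept_name": "Audit"},
]

# dept_name depends on dept_id, another non-key column, so the non-key
# columns are not mutually independent and the entity is not in 3NF.
violates_3nf = fd_holds(employees, ["dept_id"], ["dept_name"])

# Give the department name its one best place: a separate department entity.
employee = [{k: r[k] for k in ("emp_id", "emp_name", "dept_id")} for r in employees]
department = {r["dept_id"]: r["dept_name"] for r in employees}
```

After the split, each department name is recorded once, and renaming a department touches a single row instead of every employee in it.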
Suffice it to say that normalization was created by E.F. Codd in the
early 1970s. Like the relational model of data, normalization is based
on the mathematical principles of set theory. Although normalization
evolved from relational theory, the process of normalizing data is
generally applicable to any type of data.
It is important to remember that normalization is a logical process and
does not necessarily dictate physical database design. A normalized
data model will ensure that each entity is well formed and that each
attribute is assigned to the proper entity. Of course, the best
situation is when a normalized logical data model can be physically
implemented without major modifications. However, there are times when
the physical database must differ from the logical data model due to
physical implementation requirements and deficiencies in DBMS products.
Take the proper steps to ensure performance in the physical database
implementation for the type of applications you will need to create,
the service level agreements you will need to support and the DBMS that
you will use. This may mean "de-normalizing" by combining tables or
carrying redundant data (and so on), but this should be undertaken for
performance reasons only. And the logical model should not have any of
these "processing" artifacts in it.
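As a rough illustration of keeping those artifacts out of the logical model, here is a small sketch using Python's built-in sqlite3 module. The table and column names (department, employee, employee_read) are hypothetical, and SQLite is used only because it runs anywhere; it is not a recommendation for any particular DBMS. The first two tables mirror a normalized logical model. The third is a physical-only table that carries the department name redundantly to avoid a join at read time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Tables that follow the normalized logical model.
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (10, 'Sales'), (20, 'Audit');
    INSERT INTO employee VALUES (1, 'Ann', 10), (2, 'Bob', 10), (3, 'Cy', 20);

    -- Physical-only, denormalized copy: dept_name is carried redundantly.
    -- It is derived from the tables above and is not part of the logical model.
    CREATE TABLE employee_read AS
        SELECT e.emp_id, e.emp_name, e.dept_id, d.dept_name
        FROM employee e
        JOIN department d ON d.dept_id = e.dept_id;
""")

rows = conn.execute(
    "SELECT emp_id, dept_name FROM employee_read ORDER BY emp_id"
).fetchall()
```

The price of the redundant column is that the copy has to be refreshed whenever a department is renamed, which is exactly why this kind of change should be justified by a measured performance need.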
For more details on normalization, check out
this Wikipedia entry.