Neon Enterprise Software Blog


Data Management Today by Craig Mullins

News, views, and issues involved in managing data as a valuable corporate asset.

  • The Tao of DBA

    Can the ancient Chinese philosophy of Tao be applied to database administration?

     

    Go with the flow of things, and you
    will find yourself at one with
    the mysterious ways of the universe.
                                            -Chuang Tzu

    The senior DBA rocked back in his chair and smiled. The newly hired apprentice DBA was running back and forth from his terminal to the developer’s cube back to his phone and over to the fax machine. Frustrated, the new hire yelled over at the senior DBA “How can you be so calm? Things are falling apart!”


    “Things are always falling apart and yet life goes on,” muttered the senior DBA under his breath. He was a practitioner of Tao.

     

    What is Tao? More accurately, it is the Tao Te Ching and it proposes a model by which the individual and society can embody a philosophy to achieve a harmonious and balanced existence. Written by Chinese philosopher Lao Tzu over 2000 years ago, the Tao (or the Way) allegedly is the universal power through which all life flows. Whether you subscribe to the philosophy of Tao or not, Lao Tzu’s manuscript contains nuggets of wisdom on how to go with the flow of life. Our senior DBA has applied the philosophy of Tao to his job. Let’s continue with our eavesdropping on his on-going situation.

     

    As the apprentice DBA continued running around like a chicken with its head cut off the Taoist DBA calmly glided his mouse across the top of his desk, clicked, and typed something on his keyboard. In a minute or two the apprentice DBA walked into the senior DBA’s office and said, “Boy am I glad that is over. I don’t know what exactly I did to solve that problem, but I am exhausted.”

     

    The senior DBA shook his head and pointed to a printout he had thumb tacked to his wall that read “Don’t Panic!” He showed the apprentice the script he had run and told him why it solved the problem. A good DBA is prepared to resolve issues he has seen before because he is a packrat with a good memory. “That means you must learn from every action you take,” said the senior DBA. “Running around like a crazy man when things are going wrong will never help. Remaining calm and taking things one step at a time will help. And always remember the situation and the steps you took to remedy it. Things have a way of repeating themselves.”

     

    The senior DBA continued, “Skill and knowledge are required to be a successful DBA, but sometimes you need to master the art of getting out of your own way. You must learn how to act without forcing conclusions.”

    Lay plans for the accomplishment of the difficult
    before it becomes difficult;
    make something big by starting
    with something small.
                                            -Lao Tzu

    As the apprentice DBA took notes an application programmer walked around the corner and nervously stated “There’s a problem with the database!” The senior DBA muttered “guilty until proven innocent” under his breath. The apprentice DBA wheeled around in his chair and brought up his performance monitor. He clicked and typed and moved through menu after menu shaking his head and muttering to himself. The senior DBA chuckled and clucked out a cryptic acronym: “PEBCAK” he said.

     

    After about 15 minutes or so the apprentice DBA started to grill the application programmer with questions. “What is the problem?” he asked. The programmer replied, “My program worked and ran quickly yesterday and now it just takes forever to finish. I figured you must have changed something in the system, right?”

     

    “No, but what did you change in the program?” asked the apprentice DBA. “Nothing that should matter,” replied the programmer. “It worked yesterday.” The senior DBA, observing the conversation as he sat in the corner like a bemused Buddha, said “It is another day.”

     

    Ignoring him the apprentice ran back to his terminal and kept looking for a database problem. The senior DBA looked at the programmer and said “If your changes do not matter, take them out of the program.” The programmer scrunched up his nose and muttered “I can’t do that. I guess they might matter. I’ll double-check things and get back to you.”

    Pure in heart, like uncut jade
    he cleared the muddy water
    by leaving it alone.
                                            -Lao Tzu

    The senior DBA walked over to his apprentice and said “We’re not likely to see him again today.” The apprentice turned away from his monitor and asked “I guess it was a program change that caused it after all, huh? By the way, what does PEBCAK mean?”

     

    The senior DBA smiled and nodded, whispering “Problem Exists Between Chair And Keyboard” into his apprentice’s ear.

     

    Slowly, the apprentice was learning the way.

     

  • Why you should not ignore normalization during logical data modeling.

    I received the following question and thought I'd answer it on the blog:

    Question: I work in a data-modeling environment with the responsibility of logically modeling (LM) data for mainframe and server databases. LM for the OLTP systems are done to 3rd NF. Recently, the DBA recommended LM for localized data in a much less rigid format, i.e. less or potentially no normalization.

    Are you aware of any discipline of LM for localized data? If so, can you recommend some reading on this topic?

    Here is my Answer:

    Not only am I not aware of any resource that would advocate logical modeling of data in an unnormalized fashion, I would be against anyone reading it if it existed. The logical data model should be normalized. Normalization is the process of identifying the one best place each fact belongs.

    Normalization is a design approach that minimizes data redundancy and optimizes data structures by systematically and properly placing data elements into the appropriate groupings. A normalized data model can be translated into a physical database that is organized correctly.

    So the goal of normalization is to eliminate redundancy from data. An entity is in third normal form (3NF) if and only if all non-key columns are (a) mutually independent and (b) fully dependent upon the primary key. Mutual independence means that no non-key column is dependent upon any combination of the other columns. I won't go into a full explanation of normalization here, though. Suffice it to say that normalization was created by E.F. Codd in the early 1970s. Like the relational model of data, normalization is based on the mathematical principles of set theory. Although normalization evolved from relational theory the process of normalizing data is applicable generally, to any type of data.
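    To make the 3NF definition a little more concrete, here is a minimal Python sketch. The employee and department data, and every name in it, are invented for illustration. In the flat rows, dept_location depends on dept (another non-key column) rather than directly on the key emp_id, so the same fact is stored more than once. Splitting the rows into two entities gives each fact one place to live.

```python
# Hypothetical example data: dept_location depends on dept, a non-key
# column, so the non-key columns are not mutually independent (not 3NF).
flat_rows = [
    {"emp_id": 1, "name": "Ann", "dept": "Sales", "dept_location": "Chicago"},
    {"emp_id": 2, "name": "Bob", "dept": "Sales", "dept_location": "Chicago"},
    {"emp_id": 3, "name": "Cy", "dept": "IT", "dept_location": "Houston"},
]


def to_third_normal_form(rows):
    """Split flat rows into an employee entity and a department entity."""
    employees = {}    # keyed by emp_id
    departments = {}  # keyed by dept
    for row in rows:
        employees[row["emp_id"]] = {"name": row["name"], "dept": row["dept"]}
        # The location is recorded once per department, not once per employee.
        departments[row["dept"]] = {"location": row["dept_location"]}
    return employees, departments


employees, departments = to_third_normal_form(flat_rows)
```

    After the split, the Chicago location is recorded once under Sales instead of once per salesperson, which is the redundancy that normalization is meant to remove.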

    It is important to remember that normalization is a logical process and does not necessarily dictate physical database design. A normalized data model will ensure that each entity is well formed and that each attribute is assigned to the proper entity. Of course, the best situation is when a normalized logical data model can be physically implemented without major modifications. However, there are times when the physical database must differ from the logical data model due to physical implementation requirements and deficiencies in DBMS products.

    Take the proper steps to assure performance in the physical database implementation for the type of applications you will need to create, the service level agreements you will need to support and the DBMS that you will use. This may mean denormalizing by combining tables or carrying redundant data (and so on), but this should be undertaken for performance reasons only. And the logical model should not have any of these "processing" artifacts in it.

    For more details on normalization, check out this Wikipedia entry.
  • What Are the Hottest IT Roles Today?

    According to a Global Knowledge article which cites Forrester Research data, there are 16 hot roles that IT professionals should consider pursuing. The trend toward IT workers taking on multiple responsibilities and filling hybrid positions is in full swing this year. Forrester Research polled its analysts to narrow down the numerous demands on IT departments to 16 roles that are critical to an organization's success in the coming months. The demand won't necessarily send CIOs to the job boards looking for new candidates; instead, IT managers will assess their in-house talent to match existing skills with emerging roles.

    As I read through the linked article, the thing that struck me was the high number of roles that focus on data. The article states that Level 1 includes the "hottest IT roles (which) are information/data architect and information security expert." Reading a bit further down, "(i)n the extremely hot, Level 2 category, Forrester includes data- or content-oriented business analyst, business architect, enterprise architect, and vendor-management expert."

    So, once again, we have a study confirming that managing data and turning it into useful information for business consumption is a hot IT skill. Can't say I'm surprised... how about you?

     

     

     

  • DBAs Becoming Application Administrators?

    I ran across an interesting opinion piece on the CIO Magazine web site by Juan Irizarry on the expanding role of database administration. His thesis is as follows: The field of database administration is constantly evolving and growing. Enterprise wide applications, higher efficiency demands and the need to align IT roles more with the business create the necessary elements to engage DBAs into fields in the periphery of their databases. The role of the DBA is evolving quickly from managing the database to managing the information.

    He goes on to talk about the importance of enterprise-wide applications and aligning more closely with business. These are themes I've touched upon in the past and I agree with Mr. Irizarry. Over time DBAs will need to adopt a better understanding of the content (information) within the containers (databases) that they manage.

    It is no longer sufficient for DBAs to just understand "bits and bytes" and "speeds and feeds"; they also need to know what the impact of their actions is on the business. This requires not just data, but also metadata.

    What do you think? What does the future hold for the DBA position?

  • The Old "Information Overload" Issue

    Yesterday, Computerworld ran an article titled "Information overload: Is it time for a data diet?" which piqued my curiosity. My whole career has been based on managing data and producing information and, as such, I am intrigued with the issue of information overload -- or the perception that there is too much information. A former boss called me an information bottom-feeder because I always seemed to have a nugget of information or two that applied to her projects and quests. You see, I'm of the opinion that you can never have enough information.

    The Computerworld article raises many of the pertinent issues, citing that "(t)he idea of information overload has been discussed for decades, but never before has it seemed so relevant." I agree with this statement, and I can relate to the ton of information sources "out there" to contend with. And yes, better technologies to manage and sift through all the data and information would be nice.

    But I just cannot bring myself to embrace the concept that there is too much information available. I would rather be overwhelmed with an avalanche of details than have little to no data available to use for business analysis and decision-making. Can you imagine ever going back to the pre-Web, pre-Google days of actually having to go to a library or pay exorbitant fees to use something like LexisNexis to find relevant information?

    Let me put this whole discussion into a different framework. Think about the topic or thing about which you are most passionate. Could be a hobby, a sport, or anything, really. Can you ever get enough information about that topic? Does your day brighten when you happen across a tidbit of information you did not know? If you are a fantasy football nut don't you dig through every resource available looking at stats and predictions before you pick your team? If you are a collector (coins, stamps, books, etc.), don't you enjoy scanning through information and tracking down the missing items from your collection?

    Myself, I am an avid music fan. I have in excess of 6,000 CDs and albums. I scour the web looking for information about my favorites and trying to fill gaps in my collection. There is no such thing as too much information about these areas... to me. Now your hobby, OK, there is too much information out there about that... see what I mean? 

    So how can we deal with "information overload?" One key is having the ability to discern quickly what data is applicable to your current needs and to rapidly move through that which does not apply. Another key is to stop looking when you feel comfortable that you have enough information with which to make an intelligent decision. I think many who claim to suffer from "information overload" do not apply these two keys to their information gathering process.

  • The Impact of Data Volume on Operational Databases

    Operational databases are growing in size for many reasons. There is the overarching trend of more and more data being generated every year. But also, there is the growing need to store more data for longer periods of time due to regulatory compliance issues (see previous blog postings).

    As data volumes expand, it impacts operational databases in two ways:

    1. additional data stresses transaction processing by slowing things down; and
    2. database administration tasks are negatively impacted.

    In terms of performance, the more data in the operational database, the less efficient transactions running against that database tend to be. Table scans must reference more pages of data to return a result. Indexes grow in size to support larger data volumes, causing access by the index to degrade because there are more levels to traverse to return an answer. Such performance impacts are causing many companies to seek solutions that offload older data to either reference databases or to archive data stores.
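    As a rough illustration of why index access degrades, the sketch below estimates how many levels a balanced index tree needs for a given number of rows. The figure of 100 entries per index page is purely an assumption for the arithmetic; real values depend on the DBMS, the page size, and the key length.

```python
def index_levels(row_count, entries_per_page):
    """Estimate the number of levels in a balanced index tree.

    Assumes every index page holds entries_per_page entries, which is a
    simplification; real indexes have partially filled pages.
    """
    if row_count < 1 or entries_per_page < 2:
        raise ValueError("need at least 1 row and 2 entries per page")
    levels = 1
    capacity = entries_per_page  # rows addressable with the current levels
    while capacity < row_count:
        capacity *= entries_per_page
        levels += 1
    return levels


# Assumed fan-out of 100 entries per page, for illustration only.
small = index_levels(1_000_000, 100)      # one million rows
large = index_levels(1_000_000_000, 100)  # one billion rows
```

    Under these assumptions the million-row index needs 3 levels and the billion-row index needs 5, so every indexed lookup touches more pages as the table grows.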

    The other impact, database administration complexity, causes longer processing time and outages to perform such functions as backups, unloads, reorganizations, recoveries, and disaster recoveries.  In many cases the lengthened outages have become unacceptable, causing companies to again seek ways to lighten up the operational databases.

    Even so, these performance and administration issues are ancillary to the regulatory issues. Although both are driving the need to move data from the operational database into an archive data store, it is the legal requirements that have the biggest impact in terms of data volume expansion.

    One approach to handling this growing problem is database archiving. For more details on database archiving, consult these previous blog posts:

  • Musical Metadata, Part 5 - Musical Genre Can Be Tricky.

     Genre can be a tricky piece of metadata to populate accurately.

    All MP3 files have a metadata tag known as Genre. The genre is the type or category of music for the song in question. Is it rock or country, classical or blues? Unless you keep it very simple, populating the genre of your computerized music can become a big hassle. Now, if you’ve been reading this series, you know that when it comes to music, I rarely keep things simple. So my MP3 files are tagged with all sorts of genres. I’m constantly trying to clean them up, with varying degrees of success.

    Backing up a bit, remember that I have a Filemaker database of all my CDs and albums. I did not originally have a genre field in that database. When I started, back in the 1980’s, I decided that it would be too complicated for me to assign a single genre to every disc I owned. As I struggle with this field in my MP3 files today, I realize just how prescient I was back then! But I gave in and added one – and then meticulously, record by record, added a genre to each recording. I’m sure, though, that I’d recoil in horror if I ever actually looked at a list of the genres that I created.

    So, what advice can I give you about genres? Either stick to a small, simple list of genres without breaking them into a myriad of categories. You know, like using “Rock” to cover Aerosmith, Black Sabbath, Kid Rock, Nirvana and Sweet. Or, if you decide you want more specific genres, be sure to create a domain list and stick to it. Might I suggest the following:


     Classical

     Jazz

     Fusion

     Pop

     Bubblegum

     Dance

     Disco

     Comedy

     Novelty

     Ambient

     Electronica

     Folk

     Easy Listening

     Country

     Blues

     Rock

     Rock and Roll

     Rockabilly

     Glam

     Heavy Metal

     Punk

     Grunge

     New Wave

     Alternative

     Industrial

     Goth

     Power Pop

     R&B

     Soul

     Funk

     Rap

     Latin

     Hawaiian

     Musical

     Polka

     Reggae

     Ska

     Gospel

     World

     Pop Vocal

     

    Of course, your list might vary. The important point is to create a specific domain and then use only those genres, adding to the list only rarely. If you don’t stick to the domain you will wind up with all kinds of weird genres. Here are a few that existed in my MP3s until I cleaned them up: Hair Metal, Showtunes, Oldies, Jungle, Indie, Experimental, Euro Pop, and Choral Pop (well, what else would you call the Polyphonic Spree?).
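    If you would rather enforce the domain than police it by hand, something like the following Python sketch could help. It uses only a subset of the list above, and the clean-up mappings are hypothetical choices made for this example, not a record of how I actually re-tagged my own files.

```python
# A subset of the genre domain, for brevity.
GENRE_DOMAIN = {
    "Rock", "Heavy Metal", "Musical", "Pop",
    "Alternative", "Country", "Electronica",
}

# Hypothetical clean-up rules mapping off-domain genres to domain values.
GENRE_FIXES = {
    "Hair Metal": "Heavy Metal",
    "Showtunes": "Musical",
    "Euro Pop": "Pop",
}


def clean_genre(raw):
    """Return a genre from the domain, or raise if none applies."""
    genre = raw.strip()
    genre = GENRE_FIXES.get(genre, genre)
    if genre not in GENRE_DOMAIN:
        # Refuse to guess: a new genre should be a deliberate decision.
        raise ValueError(f"genre {genre!r} is not in the domain")
    return genre
```

    Raising an error for an unknown genre, rather than quietly accepting it, is what keeps additions to the list rare and intentional.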

    I also have a bunch of hyphenated genres that I may, or may not, clean up eventually. Things like Alt-Pop, Alt-Country, Dance-Pop, etc. I think it may be better for searching if I just listed multiple genres separated by commas instead of all the hyphenating. For example, “Alternative, Country”. This way, the song will show up on iTunes in both genres when you search.
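    Converting the hyphenated genres could be scripted along these lines. This is only a sketch: the abbreviation table is an assumption, and it would need an entry for every shorthand actually present in your tags.

```python
# Hypothetical expansions for abbreviated parts of hyphenated genres.
ABBREVIATIONS = {"Alt": "Alternative"}


def split_hyphenated_genre(genre):
    """Turn a hyphenated genre into a comma-separated list of genres."""
    parts = [part.strip() for part in genre.split("-")]
    expanded = [ABBREVIATIONS.get(part, part) for part in parts]
    return ", ".join(expanded)
```

    With this table, "Alt-Country" becomes "Alternative, Country" and "Dance-Pop" becomes "Dance, Pop", while a genre without a hyphen passes through unchanged.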

    You might wonder about some of the genres I list in my domain. How about Pop Vocal? I use that for artists like Tony Bennett, Frank Sinatra, Dean Martin, Bing Crosby, and the like. I could have used Easy Listening, but I wanted to distinguish these artists from the typical easy listening fare (you know, artists like Barry Manilow, Air Supply, and The Carpenters).

    And some folks might quibble with having both Soul and R&B. After struggling with this distinction for a bit, I might agree. But I would list Ray Charles, Al Green, and Marvin Gaye as soul; Aretha Franklin, James Brown, and LaBelle as R&B. But in trying to come up with solid examples here I wound up with a whole bunch of artists that I could not reasonably defend as being one or the other (or even, perhaps, bordering on disco), such as Kool & The Gang (disco or R&B?), Otis Redding (R&B or soul?), and The Pointer Sisters (R&B or pop?). Actually, I’ve been thinking about eliminating the Soul genre for quite some time now.

    But that’ll take a lot of clean-up time. I am very close to the 160 GB limit on my iPod. And any time you monkey around with the metadata and its definition it means you will have work to do to synchronize the metadata with the data... which is sort of the whole point of this series of postings.

    I bet you wondered if I was ever going to come to a point, didn’t you?

  • Musical Metadata, Part 4 - Dealing with artist name issues.

    Today I will continue my series on musical metadata by discussing the various issues involved in managing one specific piece of metadata: artist name. The artist name is the moniker of the performer of the song contained in the MP3 file. First of all, I’m not sure that that definition is actually very good, but at least it avoids one of the biggest metadata traps: definition by restatement. In other words, it is better than saying “the artist name is the name of the artist.” My friend, Bob Seiner, refers to these types of definitions as cheeseburger definitions (you know, a cheeseburger is a burger with cheese!).

    So, we have a definition that seems quite simple. All we have to do is enter the artist name in the appropriate field or capture it using Gracenote (or some other MP3 metadata service). Not so fast!

    How do you want to handle actually using the artist name in your iPod? If you click on the Menu, choose Music, and then Artists, your iPod will bring up an alphabetical list of the artists for which you have songs stored on the device. Sounds good… but where would you find the godfather of soul, James Brown (RIP)? Under “B” or under “J”?

    Let me digress for a minute. In an earlier posting in this series I divulged the fact that I have a Filemaker database of all of my CDs and records. This started out as a dBase database back in the 1980’s. Being the stickler that I am, I wanted to store “James Brown” in the artist name but be able to sort by “Brown, James”… OK, maybe we need to normalize this data and have a last name and first name for the artist? Nope, too complicated for my simple database. I wanted the artist name as a whole regardless of whether it was a solo performer (James Brown), a duo (Simon & Garfunkel) or a band (The Fountains of Wayne). But I wanted James Brown to sort with the B’s… so I created a separate column called SORT_KEY.

    The SORT_KEY column, which I meticulously populated, contains the exact term that I wanted my reports to sort on – which is the order in which I physically store my CDs and LPs. So, the ARTIST column is set to “James Brown” (actually all caps in my database) and the SORT_KEY is set to “BROWN, JAMES”.

    This also works well for groups. Consider, for example, “The Beatles”. I don’t want every group that starts with “The” to sort with the T’s. So into the SORT_KEY goes “Beatles, The” or “Who, The”.
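    I populated SORT_KEY by hand, but the rules described above could be approximated in code. The Python sketch below is hypothetical: the function name and the is_person flag are inventions for this example, and the flag has to be supplied by a human because a name alone does not reveal whether it belongs to a solo performer or a band.

```python
def make_sort_key(artist, is_person=False):
    """Build a SORT_KEY-style value (all caps) from an artist name.

    is_person marks solo performers, whose keys are "LAST, FIRST".
    """
    name = artist.strip()
    if is_person and " " in name:
        # Treat the final word as the surname; an assumption that will
        # not hold for every performer.
        first, last = name.rsplit(" ", 1)
        return f"{last}, {first}".upper()
    if name.lower().startswith("the "):
        # Move a leading "The" to the end so the group sorts by its name.
        return f"{name[4:]}, The".upper()
    return name.upper()
```

    For example, make_sort_key("James Brown", is_person=True) gives "BROWN, JAMES" and make_sort_key("The Beatles") gives "BEATLES, THE", while "Simon & Garfunkel" is simply upper-cased.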

    At first I tried to continue something like this with my first ventures into MP3 players but there was no SORT_KEY in the MP3 metadata fields. So I compromised and used the artist field. So I entered “Clash, The” where I just had to have the “The” there to make me comfortable. And where I didn’t require the “The” I’d just enter the band name (for example, I was fine with “Supremes” and “Long Ryders”, but not with “Knack” and “Who”). But I just couldn’t bring myself to rename all of those solo artists, so I search on first name in my MP3 player (but I don’t like it).

    Now, lo and behold, the iPod works better than my previous devices because it ignores all of the The’s in the artist names (I wonder what it would do with that 1980’s band The The – I don’t have any of their music on MP3 yet, but I’ll have to investigate this one!). So, “The Manhattans” are now perfectly sandwiched between “The Mamas & The Papas” and “Marc Cohn” – just where they belong. Well, sorta… Marc Cohn really belongs up there with the C’s. I like this and I don’t like this. I like it because I think this is the way it SHOULD work (the “The” should be ignored) but I don’t like it because I had to go in and change all of my metadata! OK, I didn’t have to, but I did.

    Which brings me to a special case (sort of). Being an 80’s music fan, I have recordings by A Flock of Seagulls and A Certain Ratio. And I always, always sort them under “A” and not “F” or “C”. But that is me. I can understand either way. But you, as a user of your device, have to decide which you want and populate that artist field appropriately.

    The other nice thing about the iPod and iTunes is that it gives you a Sort Artist field, which is exactly like my SORT_KEY field! So I can enter the artist name any way I like and sort it any way I like… the best of both worlds!
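    To show how the two mechanisms fit together, here is a small Python sketch of the sorting behavior as I have described it: use the Sort Artist value when one is present, otherwise use the artist name with any leading "The" ignored. This is an illustration of the idea, not Apple's actual implementation, and the dictionary keys are made up for the example.

```python
def artist_sort_value(track):
    """Return the text a track's artist should be sorted by."""
    sort_artist = track.get("sort_artist")
    if sort_artist:
        return sort_artist.lower()
    name = track["artist"]
    if name.lower().startswith("the "):
        name = name[4:]  # ignore a leading "The"
    return name.lower()


tracks = [
    {"artist": "The Manhattans"},
    {"artist": "Marc Cohn", "sort_artist": "Cohn, Marc"},
    {"artist": "The Mamas & The Papas"},
]

ordered = [t["artist"] for t in sorted(tracks, key=artist_sort_value)]
```

    With the Sort Artist value filled in, Marc Cohn moves up to the C's, ahead of The Mamas & The Papas and The Manhattans, while the displayed artist name stays exactly as entered.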

    Keep in mind, those online services (e.g. Gracenote) that automatically populate your musical metadata won’t necessarily set things the way you want them to be. So always take a look at how they populate your metadata before blindly accepting it.

     

  • Musical Metadata, Part 3 - Automated Metadata Population

    Be careful about automatically populating your metadata from online databases because the information is not consistent, nor is it always accurate.

     

    In the first two installments of this series (Part 1 and Part 2), we discussed how metadata can impact the usability and enjoyment of your iPod. Now it is not just metadata, but accurate metadata that truly unleashes the potential of these devices. Unfortunately, there are all sorts of barriers to accuracy.

    Perhaps the biggest contributor to getting all that metadata into your iPod will be the online music database Gracenote. Most digital music software relies on the Gracenote database to automatically populate your musical metadata. If you are connected to the Internet when you pop a CD into your drive to rip the songs, you’ve probably used Gracenote (it used to be called CDDB).

    And Gracenote is awesome for many reasons. Foremost among them, though, is that it automatically identifies the CD based on its content (and almost always gets it right). And then Gracenote automatically populates the artist, title, recording name, and other metadata fields so you don’t have to. Without this technology a lot of people would have multiple songs out on their hard disks and iPods with titles like Track 1, Track 2, etc.

    OK, sounds great, right? So what is the problem? Well, if you are a stickler for accuracy, like me, Gracenote might at times annoy you (even though you’d never do without it). For example, one of the things I like is consistency. Say I am ripping a double-disc set (something like Hymns to the Silence by Van Morrison). The way I want the album/CD title to appear is “Hymns to the Silence (Disc 1)” and “Hymns to the Silence (Disc 2)”. Gracenote will not always be this consistent. Sometimes it will put the parenthetical subtitles in, sometimes it won’t; sometimes it will use parentheses, sometimes brackets, sometimes a dash (it depends on the actual album and what is stored for it).
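    Cleaning up those variations can be automated to a degree. The Python sketch below rewrites a trailing disc marker into the one form I prefer. The pattern covers only the variations mentioned above (parentheses, brackets, or a dash, with "Disc", "Disk", or "CD"); anything else is left untouched, so treat it as a starting point rather than a complete solution.

```python
import re

# Trailing disc marker such as "(Disc 1)", "[Disk 1]", or "- CD 1".
_DISC_SUFFIX = re.compile(
    r"\s*(?:[\(\[]\s*|-\s*)(?:disc|disk|cd)\s*(\d+)\s*[\)\]]?\s*$",
    re.IGNORECASE,
)


def normalize_disc_title(title):
    """Rewrite a trailing disc marker as "(Disc N)"; leave other titles alone."""
    match = _DISC_SUFFIX.search(title)
    if not match:
        return title.strip()
    base = title[:match.start()].strip()
    return f"{base} (Disc {int(match.group(1))})"
```

    So "Hymns to the Silence [Disc 1]" and "Hymns to the Silence - Disc 1" both come out as "Hymns to the Silence (Disc 1)".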

    But where things can really get inconsistent is box sets. Take something like Beg, Scream & Shout!: The Big Ol' Box Of 60's Soul. There are six discs in this box. But it also has a subtitle. So, making sure that we consistently label each of the six discs, as well as making sure that the subtitle is there (or not there) for all six discs, can be quite a chore. Of course, you may not care if things are quite that consistent, but I find it makes it easier to navigate through thousands of titles if they are. And I'm also a bit of a perfectionist when it comes to documenting my record collection.

    By the way, Beg, Scream & Shout! is a great collection, but is a bit hard to find nowadays. It was released in 1997 and the six CDs are disguised as full-size 45s, dropped into authentically designed sleeves, and packed into a replica of the old 7" 45 carrying case we all had as kids (well, I did anyway). If you're looking for a nice, broad R&B overview, then this box is not to be missed.

    Sorry about that, but I sometimes have to comment on the discs I mention (I guess it is the repressed record reviewer inside of me struggling to get out). But back to metadata…

    Of course, there are other metadata consistency issues you’ll likely struggle with. How about artist name? Do you want complete accuracy, or should we fudge things to make finding things easier? For example, do you have both Ben Folds and Ben Folds Five, or is everything Ben-related just under Ben Folds? This is probably a poor example because they’d sort right next to each other. How about Paul McCartney? Do we have Paul McCartney, Paul McCartney & Wings, and Wings all based on the actual artist name associated with the disc in question? Or do we just lump all Paul into one of these categories? I’d choose one, Paul McCartney, and be done with it. But that doesn’t mean Gracenote will ensure that consistency – you’ll have to do it.

    And, of course, you can specify different Artist names while using the Sort Artist tag (available in iTunes, not sure about others) to group them all together. So you could have 'Them' as the Artist, but 'Morrison, Van' as the Sort Artist, if you like.

    But this gets us into another issue and I think we've discussed enough for today… In the next installment I’ll talk a bit more about the metadata consistency and usability issues when dealing with Artist Name…

  • Musical Metadata, Part 2 - If the Music is the Data...

    This is the second part of a multi-part series on metadata for MP3 music files such as those used in iPods and other portable music players. Part 1 can be found here.

    So, the music is the data... which begs the question, what is the metadata?

    Getting the metadata correct can be one of the most important aspects of setting up your iPod (or any MP3 player) for maximum enjoyment. You see, the music is the data. It is the reason you bought the device in the first place, right? The whole purpose of the device is to entertain you by allowing you to carry around and listen to songs. Of course, newer devices also allow you to transport and watch video, as well, but let’s focus on music for the time being.

    Accurate and up-to-date metadata makes the iPod experience more enjoyable. What type of metadata? Well, most people, at a bare minimum want to know the song name and probably the artist performing the song. This information – this metadata – makes the music on your device accessible by some means other than random playing. If you make sure that the metadata about the music is accurate when you move it to your device then you can pick and choose the songs you want to play using the device’s interface.

    Now there are some MP3 players – like the iPod shuffle – that have no user interface so all they can do is play songs randomly. But I bet you had the metadata before you downloaded the songs to that player? In other words, you didn’t just populate the player with a bunch of random MP3 files without knowing what songs they were. So even these types of devices benefit from metadata as you populate them with your music.

    So what type of metadata do you need? Think about it before you go about downloading music. As we mentioned earlier, at a bare minimum you’ll want artist and song name. You’ll probably also want to know the album name the song is from, especially if you want to be able to listen to entire albums on your iPod. And it can be crucial if you have different versions of the same song, for example a studio version and a live version... If you know the album name, such as "Frampton Comes Alive!" you can probably safely assume that that version of "Do You Feel Like We Do" is the live one.

    And the Artist metadata field can be problematic. How would you handle the group Panic! At the Disco? The exclamation point was in their name for their first album, but removed for their second. Or how about Matchbox Twenty, which has also gone by Matchbox 20? I suggest choosing one spelling and using it consistently regardless of the actual name used on the recording. That way the songs will group together nicely. Of course, there are other issues with the Artist field, but let's move on for now...
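    One way to hold yourself to a single spelling is an alias table. In this Python sketch the table is hypothetical, and which spelling counts as the "official" one is an arbitrary choice made for the example; the point is only that every variant resolves to the same value.

```python
# Hypothetical alias table: each known variant (lower-cased) maps to the
# single spelling chosen for the library.
ARTIST_ALIASES = {
    "panic at the disco": "Panic! At the Disco",
    "panic! at the disco": "Panic! At the Disco",
    "matchbox 20": "Matchbox Twenty",
    "matchbox twenty": "Matchbox Twenty",
}


def canonical_artist(name):
    """Return the chosen spelling for an artist, or the cleaned name."""
    cleaned = " ".join(name.split())  # collapse stray whitespace
    return ARTIST_ALIASES.get(cleaned.lower(), cleaned)
```

    Artists that are not in the table pass through unchanged, so the table only needs entries for names that actually vary.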

    Probably the next piece of metadata you’ll want is one of the most vexing to get: genre of music. At least it has been troublesome for me. Why? Well, the term is not rigorously defined. Is there a difference between Rock and Hard Rock? What about Hard Rock and Heavy Metal? Do you want to slice genre even finer so that you’d have Thrash Metal, Death Metal, Rap Metal, Hardcore, and maybe even Hair Metal? It might make all the difference in the world to you if you are a metal fan. Or would you classify Led Zeppelin, Judas Priest, Poison, and Slayer all as simply Heavy Metal? Or maybe you don’t care enough about metal music at all, so you’d classify anything even remotely metal-ish simply as Hard Rock… or maybe just Rock.

    That is not the only example either. How would you classify Pink Floyd? Rock? Hard rock? Progressive? Some might even classify it as electronic. What about Pop, Power Pop, Bubblegum, and Glam? And then how would you classify Sweet? And would “Little Willy” be classified the same as “Fox On The Run,” or “Love is Like Oxygen,” for that matter?

    You really do need to put some thought into the categories you want for genre. If you download all of your songs from online stores like iTunes then the metadata should be set up for you. But it might not agree with what you want. One of the most frustrating things I’ve found is genres like “General Country” and “General Alternative.” Why would anyone want the word “General” in there? Making it simply Country or Alternative makes it easier to search later when you are looking for songs by genre. And it seems like every Punk song is classified as "Alternative & Punk" instead of just Punk, which also annoys me.
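
    If you settle on your own genre list, the cleanup can be scripted. Here is a rough Python sketch under some loud assumptions: the override table and the clean_genre function are hypothetical, and mapping every "Alternative & Punk" song to Punk is a blanket rule that you would probably want to review song by song.

```python
# Hypothetical overrides for store-supplied genres you want to rename.
GENRE_OVERRIDES = {
    "Alternative & Punk": "Punk",  # assumption: treat the whole bucket as Punk
}


def clean_genre(genre: str) -> str:
    """Map a store-supplied genre to a shorter personal category."""
    genre = genre.strip()
    if genre in GENRE_OVERRIDES:
        return GENRE_OVERRIDES[genre]
    # Drop a leading "General " so "General Country" becomes "Country".
    prefix = "General "
    if genre.startswith(prefix):
        return genre[len(prefix):]
    return genre
```

    Genres that match neither rule, such as Rock, come back unchanged, so the function is safe to apply to the whole library.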

    And be careful about automatically populating your metadata from online databases because the information is not consistent, nor is it always accurate. But I’ll leave that discussion for my next post…

    But before I leave today’s post, I want to elaborate on other pieces of metadata you might want.

    • You might want to know the Composer of the song – that is, who wrote it. This comes in handy if you are looking, say, for all Lennon & McCartney songs, even if they aren’t done by The Beatles.
    • Album Artist can come in handy, too. For example, you might have the song “The Saints Are Coming” by U2 and Green Day (Artist), from the Album “18 Singles” by U2. In this case, the artist is U2 and Green Day, but the Album Artist is U2.
    • Another piece of metadata I like, but that isn’t usually associated with MP3 files, is the number from the spine of the album or CD. I store this in my FileMaker database of CDs, but not with my MP3 files or in my iPod.
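
    To see why Composer and Album Artist earn their keep as separate fields, it helps to picture each song as a record. The Python sketch below is illustrative only: the Track class, its field names, and the tiny sample library are assumptions for the example, not the layout of any real tag format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """One song's metadata; field names are illustrative, not a tag standard."""
    title: str
    artist: str          # who performs this particular track
    album: str
    album_artist: str    # who the album as a whole is credited to
    composer: str = ""   # who wrote the song; blank when not recorded


# A tiny sample library, just enough to show the two lookups.
LIBRARY = [
    Track("The Saints Are Coming", "U2 and Green Day", "18 Singles", "U2"),
    Track("Help!", "The Beatles", "Help!", "The Beatles", "Lennon & McCartney"),
    Track("With a Little Help from My Friends", "Joe Cocker",
          "With a Little Help from My Friends", "Joe Cocker",
          "Lennon & McCartney"),
]


def by_composer(tracks, composer):
    """All tracks written by the given composer, whoever performs them."""
    return [t for t in tracks if t.composer == composer]


def by_album_artist(tracks, album_artist):
    """All tracks from albums credited to the given album artist."""
    return [t for t in tracks if t.album_artist == album_artist]
```

    Searching by composer finds the Joe Cocker cover alongside The Beatles track, and searching by Album Artist keeps “The Saints Are Coming” with the rest of the U2 album even though its Artist field is different.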

    Let’s stop here for today. Future entries will discuss the benefits and problems with automatic music metadata services, metadata sorting and display issues, dealing with duplicates, using metadata to create playlists, and more. So stay tuned.

  • Musical Metadata, Part 1

    I wrote a series of popular blog posts about metadata and MP3 files for a previous blog (which has since been mismanaged by its owner and the content is no longer available). Since that is the case, and I recently upgraded my iPod to a 160 GB version, I thought it would be a good idea to re-run an updated version of these posts. So, here is the first in a series of postings on musical metadata...

    Anyone who knows me reasonably well knows that I'm an avid music fan. I own more than 6,000 CDs and record albums. Yes, I still own and play (and even sometimes buy) vinyl records. In fact, last year I went through a phase where I converted some of my vinyl to MP3s so I could listen to them on my PC and my iPod.

    Anyway, as I rip my collection to MP3s I can't help but think about metadata. I know, only a true geek like me would think about metadata when listening to music and playing around with my iPod, but stick with me.

    Database guy that I am, I have my entire record/CD collection in a FileMaker database that I sync up with my Treo smartphone so that I always have information on my collection handy. Without that, unfortunately, I’ve been known to buy a CD or two that I already own.

    Now you might be thinking, “You’re just getting around to using an MP3 player?” Good question – and the answer is no. I bought one of the first-generation MP3 players on the market many moons ago, as well as several others over the years, but only recently did I upgrade to an iPod. I got my first iPod (the 80 GB version) a year or so ago when my brother got one for Christmas. He let me play with it, envy set in, and I just had to get one for myself. And now I've upgraded to the iPod with the largest amount of storage (160 GB)... but what I really want is even more storage along with the new interface of the iPod touch - you know, the one that looks like an iPhone - and, by the way, throw in phone capability, too!

    OK, so why did I start thinking about metadata? Well, MP3s are all about the metadata! If you don’t get the metadata right, then the MP3 player is not as easy to use as it could be. I’ll be talking more about this in this series of blog entries (didja notice that Part 1? Yes, that means there will be additional parts upcoming.)

    The goal of this series is to be fun, yet educational. If you want to read something less “fun” about metadata, check out another of my multi-part blog series on The Importance of Metadata: Part 1, Part 2, and Part 3. And thanks for coming along on this musical metadata voyage with me...

     

  • Is Your Pen Smart?

    I just installed and started using the Livescribe Pulse smartpen. I read about it in the USA Today a couple of weeks ago and then watched the demo online and I was hooked.

    The smartpen has a tiny camera that captures what you write. It will capture text, diagrams, doodles, basically anything you write. Of course, you have to use special paper with "microdots" that the camera uses as it captures your scribbles. Don't worry, the dots won't be visible in your text. Really, you have to squint very hard to even see them on the supplied paper (one pad).

    Another interesting capability of the Pulse smartpen is that it can record audio as well -- and sync that audio with the notes you take. This makes it an intriguing gadget for capturing notes in a meeting, a presentation, or a classroom.

    My plan is to use it in several ways:

    1. I want to personalize my technical presentations by making crude drawings that I then capture and include in my PowerPoint slides. I thought this might lend my presentations a bit of warmth with a personal touch... and, as an aside, it'll make it easier to recognize when folks are using my slides in their presentations because I'll recognize my own scribblings.
    2. I hope to use it to take notes in important meetings where I may need to refer back to the actual audio of what was said. This can be especially helpful in technical product meetings and presentations.
    3. Maybe I'll use it when I attend sessions at conferences like IDUG, SHARE, and IOD -- at least for some of the sessions.
    So if you see me in one of your presentations, speak up... I want to make sure the microphone on my smartpen captures every word you say!

    If you are interested, the Pulse SmartPen is now available on amazon.com.
  • On Becoming a DBA

    One of the most common questions I am asked is: How can I become a DBA? The answer, of course, depends a lot on what you are currently doing. Programmers who have developed applications using a database system are usually best-suited to becoming a DBA. They already know some of the trials and tribulations that can occur when accessing a database.

    If you are a programmer and you want to become a DBA, you should ask yourself some hard questions before you pursue that path. First of all, are you willing to work additional, sometimes crazy, hours? Yes, I know that many programmers work more than 40 hours already, but the requirements of the DBA job can push people to their limits. It is not uncommon for DBAs to work late into the evening and on weekends, and you had better be ready to handle technical calls at 2:00 a.m. when database applications fail.

    Additionally, you need to ask yourself if you are insatiably curious. A good DBA must become a jack-of-all-trades. DBAs are expected to know everything about everything -- at least in terms of how it works with databases. From technical and business jargon to the latest management and technology fads, the DBA is expected to be "in the know." And do not expect any private time: A DBA must be prepared for interruptions at any time to answer any type of question -- and not just about databases, either.

    And how are your people skills? The DBA, often respected as a database guru, is just as frequently criticized as a curmudgeon with vast technical knowledge but limited people skills. Just about every database programmer has his or her favorite DBA story. You know, those anecdotes that begin with "I had a problem..." and end with "and then he told me to stop bothering him and read the manual." DBAs simply do not have a "warm and fuzzy" image. However, this perception probably has more to do with the nature and scope of the job than with anything else. The DBMS spans the enterprise, effectively placing the DBA on call for the applications of the entire organization. As such, you will interact with many different people and take on many different roles. To be successful, you will need an easy-going and somewhat amiable manner.

    Finally, you should ask yourself how adaptable you are. A day in the life of a DBA is usually quite hectic. The DBA maintains production and test environments, monitors active application development projects, attends strategy and design meetings, selects and evaluates new products and connects legacy systems to the Web. And, of course: Joe in Accounting just resubmitted that query from hell that's bringing the system to a halt. Can you do something about that? All of this can occur within a single workday. You must be able to embrace the chaos to succeed as a DBA.

    Of course, you need to be organized and capable of succinct planning, too. Being able to plan for changes and implement new functionality is a key component of database administration. And although this may seem to clash with the need to be flexible and adaptable, it doesn't really. Not once you get used to it.

    So, if you want to become a DBA you should already have some experience with the DBMS, be willing to work long and crazy hours, have excellent communication and people skills, be adaptable and excel at organization. If that sounds like fun, you'll probably make a good DBA.

  • Q+A: Tell Me Everything About Databases, Please

    As regular readers of my blogs know, every now and then I take the opportunity to answer e-mail questions via blogging. I can't always answer e-mail questions directly, but I try to whenever time permits. Which brings me to today's Q+A... 

    The following series of questions came to me all within a single e-mail. That is not a problem; sometimes folks have more than one question. But take a look at these questions.

    1. How important is the database management system on the firm?
    2. How to maintain the database management system?
    3. How often is the proper upgrading of an information system?
    4. How does an information system work?

    When confronted with these types of questions I attempt to give some broad, high-level answers and then direct the questioner to additional resources for education. Blogs and e-mail are not the proper forum for learning everything that is implied in these very broad-ranging questions.

    Anyway, here is the answer I provided:

    These types of questions are very difficult to answer succinctly in this type of forum, except perhaps the first one. And the answer to that is "very important."

    Data is the lifeblood of the modern organization and every company should use DBMS technology to store, retrieve, modify and manage their critical data. Failure to use a DBMS opens data up to be corrupted or accessed without authorization.

    To answer the remainder of your questions, I am going to recommend a couple of books. The first is one that I wrote titled "Database Administration: The Complete Guide to Practices and Procedures" (2002, Addison-Wesley, ISBN: 0201741296). This book provides complete coverage of the role of the DBA and will answer your question about how to maintain and manage a DBMS. It can be ordered at this link.

    I would also recommend a good book on the basics of database processing and database management systems. The best book for that is probably Chris Date's "An Introduction to Database Systems, 7th Edition" (2000, Addison-Wesley, ISBN: 0201385902). It can be ordered here.

    For some, though, Date's book can be intimidating. If you wish a simpler introduction to database systems try one of these books instead of (or in addition to) Date's book:

    Now regarding your other questions (3 and 4), they are a bit too general for me to be able to give you any type of useful advice. It seems like you would benefit from some comprehensive training in MIS and IT. Again, perhaps some reading material will prove helpful. Suggested reading includes:

    Good luck!

  • Follow Me on Twitter

    After listening to John Dvorak (on Cranky Geeks) talk about Twitter I decided to give it a go. What is Twitter, you ask? Well, according to Wikipedia, Twitter is a free social networking and micro-blogging service that allows users to send "updates" (or "tweets"; text-based posts, up to 140 characters long) to the Twitter website, via short message service (SMS), instant messaging, or a third-party application. Folks can sign up and follow the tweets of the Twitterers. 

    After signing up and starting to use Twitter, I found some other database folks out there twittering (Willie, Troy, James). And it is quite addictive! I put up Twitter feeds on my home page and on my DB2portal blog, too. If you want to try it out yourself, my Twitter page is at twitter.com/craigmullins - or click here to go straight over there!

    This week I am twittering about my experience at the IDUG conference in Dallas. So sign up on Twitter before next week if you want to virtually attend IDUG by following our twittering. If it is successful, I'll do it for other conferences I attend (like IOD and SHARE).
