A database administrator’s most important mission is “protect the data.” It is a mantra we live and breathe, one pounded into our heads by teachers, mentors, and white papers. Every month it seems as if the news is reporting yet another “data theft” horror story. White-knuckled with the fear of security breaches, we slog through our day with those three words echoing in our heads, knowing that if we fail in our duty we can lose our jobs and our reputations.
But is “protect the data” really the most important element of a DBA’s job?
I’m going to swim against the current on this one and say “no.” Don’t get me wrong. Data protection is an essential core mission, but it is not THE mission. The way I see it, my real mission is to “protect the information.”
I know a lot of people who would tell me that data and information are exactly the same thing. I would beg to differ. There is a huge difference between the two, which I can sum up in one word: context.
Data is nothing more than letters and numbers. For example, I have a piece of data that equals “5296,” a piece of data that equals “N,” a piece of data that equals “Bower” and a piece of data that equals “St.” By themselves, each piece of data is meaningless. Once I put them together, though, they have context. If I put them together in order of appearance, I get 5296 N Bower St and many people’s first reaction is “Hey, that’s a street address.”
But what if I get the data in a different order? What if “Bower,” “5296,” “St,” and “N” are the order in which I receive these bits? Well, the context is missing, so I don’t even know if this data is supposed to go together, or if it is a street address, unless I have a data definition document that tells me so. The act of defining context, of wrapping random data in boundaries and rules, turns the letters and numbers of data into information.
Now here’s a thought. I can have 2.5 terabytes of data and not a single bit of information. I cannot, however, have information without data. Data is the core of information, and information is the core of business.
Here’s an example of data:
89652 C 321477934 Bower St UV 33321
How do you translate this? What if “St” doesn’t stand for “street,” but is meant to be the abbreviation of “state”? What if one of those numbers is a checking account number, and one is a United States zip code? How do you tell the difference? Remember, U.S. zip codes have either 5 or 9 digits, depending on whether it’s an extended zip. And bank account numbers can be whatever size the bank in question chooses to use. Oh, and let’s not forget the “What the heck does C stand for?” question.
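To make that concrete, here is a minimal Python sketch of the same problem. The two data definitions and every field name in them are my own inventions for illustration; nothing in the raw line tells us which one, if either, is right. All the sketch shows is that the same seven tokens turn into two very different pieces of “information” depending on which definition gets applied.

```python
# Raw data: the tokens from the example line, with no context attached.
record = "89652 C 321477934 Bower St UV 33321".split()

# Two hypothetical data definitions. The field names are assumptions made up
# for this sketch; the raw data gives no hint about which (if any) is correct.
definition_a = [
    "zip_code", "account_type", "account_number",
    "street_name", "street_suffix", "region_code", "customer_id",
]
definition_b = [
    "customer_id", "status_flag", "zip_plus_4",
    "last_name", "state_label", "state_code", "account_number",
]


def apply_definition(tokens, field_names):
    """Pair each token with a field name; refuse if the counts don't match."""
    if len(tokens) != len(field_names):
        raise ValueError("definition does not fit this record")
    return dict(zip(field_names, tokens))


# Same data, two different contexts, two different readings.
info_a = apply_definition(record, definition_a)
info_b = apply_definition(record, definition_b)

print(info_a)
print(info_b)
```

Under the first definition, “89652” is a zip code and “321477934” is an account number. Under the second, “321477934” is an extended zip and “33321” is the account number. The data never changed. Only the context did.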
This matters because if a data thief gets a file full of 1s and 0s, he might have a binary file that he can translate into information such as names, addresses, and bank account numbers. Then again, he might just have a file full of 1s and 0s that are only data, not information translated to binary. In the latter case, he can’t cause harm to anyone, because 1s and 0s that are just 1s and 0s are harmless without the context that defines them as information.
Data thieves don’t steal data. They steal information because information is actionable and worth money. No one wants data without context. It’s too much work to transform data into something that can be used. IMHO, data thieves should rightly be called information thieves, but that’s too big a mouthful for most people.
So as we return to the daily grind, ready to do our best and fulfill our mission, let’s ask ourselves again: What is our most important task? Protecting the data, or protecting the information?

