Have you heard the one where a man wins an election with a database?
If you’d paid any attention to Election Day in the United States, then the answer is a resounding “YES.” CNN recently posted a story about how President Barack Obama used databases to win the election. Read the article, see what I mean, then come back here. I’ll wait.
It’s called data mining or business intelligence. This is my job. Well, okay, not the part where the numbers actually get analyzed. But taking all that wonderful information about George Clooney and Sarah Jessica Parker and their fans and mashing it all together into a data soup served up to data analysts for their use, that’s me. Creating the data mine, that’s me. Database administrators across the planet, me included, build the houses that store this information (the database) and design the windows (the reports, cubes, and other analytical tools) so that the number crunchers can peer into the house and see what’s going on.
Tell me I did not just describe analysts as perverts… Whoops! Not the image I intended. Still, a house is the best metaphor I can think of to describe a database to a non-DBA.
The politics of data mining is becoming a real issue in our modern world. Privacy concerns, identity theft, and business usage are on a lot of people’s minds. There are all sorts of laws (and bills) that make the life of a database administrator difficult. I can only see these issues becoming more important and relevant to the masses now that President Obama successfully used data mining in his campaign.
I’ve spent quite a bit of time thinking about how the Obama-DB might be set up. Truth be told, the campaign probably used a more specialized kind of database called a data warehouse, but we’ll stick with the term “database” and its short form “db” for the rest of this blog.
The first thing we need is a Voter table. We keep it very simple with only two columns: Name and Registered Party Affiliation. A secondary table (Address) has the street address, a second line for additional address information (commonly called Address2 or Addr2), City, State, and Zip. We could include county in that Address table. This is, after all, an election campaign where we want those kinds of details, but there’s a better way of doing this setup. I envision a District table which includes Council District, Congress District, Senate District, House District, and School District. (We might even have an EffectiveDate on the District table to account for the gerrymandering and redrawing of voting districts.) Then I create a “join table” between Voter and District called VoterDistrict, which includes the identifiers VoterID and DistrictID, as well as a Precinct # and an EffectiveDate (after all, people do move, so we want to make sure we can select their records by their most recent district info).
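For the curious, here is a rough sketch of how those four tables and the "most recent district" lookup might look, using Python's built-in sqlite3 module. This is only my guess at a layout, not anything the campaign published. The ID columns, the data types, the choice to store dates as ISO text, and the function names build_voter_schema and current_district are all assumptions made up for illustration.

```python
import sqlite3


def build_voter_schema(conn):
    """Create the hypothetical Voter, Address, District and VoterDistrict tables."""
    conn.executescript("""
        CREATE TABLE Voter (
            VoterID INTEGER PRIMARY KEY,          -- assumed surrogate key
            Name TEXT NOT NULL,
            RegisteredPartyAffiliation TEXT
        );
        CREATE TABLE Address (
            AddressID INTEGER PRIMARY KEY,        -- assumed surrogate key
            VoterID INTEGER NOT NULL REFERENCES Voter(VoterID),
            Address1 TEXT,
            Address2 TEXT,
            City TEXT,
            State TEXT,
            Zip TEXT
        );
        CREATE TABLE District (
            DistrictID INTEGER PRIMARY KEY,       -- assumed surrogate key
            CouncilDistrict TEXT,
            CongressDistrict TEXT,
            SenateDistrict TEXT,
            HouseDistrict TEXT,
            SchoolDistrict TEXT,
            EffectiveDate TEXT                    -- ISO date, e.g. '2011-06-01'
        );
        CREATE TABLE VoterDistrict (
            VoterID INTEGER NOT NULL REFERENCES Voter(VoterID),
            DistrictID INTEGER NOT NULL REFERENCES District(DistrictID),
            Precinct TEXT,
            EffectiveDate TEXT NOT NULL,          -- ISO dates sort correctly as text
            PRIMARY KEY (VoterID, DistrictID, EffectiveDate)
        );
    """)


def current_district(conn, voter_id):
    """Return (CongressDistrict, Precinct, EffectiveDate) for the voter's
    most recent VoterDistrict row, or None if the voter has no rows."""
    return conn.execute(
        """
        SELECT d.CongressDistrict, vd.Precinct, vd.EffectiveDate
        FROM VoterDistrict AS vd
        JOIN District AS d ON d.DistrictID = vd.DistrictID
        WHERE vd.VoterID = ?
        ORDER BY vd.EffectiveDate DESC
        LIMIT 1
        """,
        (voter_id,),
    ).fetchone()


# Made-up sample data: one voter who moved from one district to another.
demo = sqlite3.connect(":memory:")
build_voter_schema(demo)
demo.execute("INSERT INTO Voter VALUES (1, 'Jane Doe', 'Independent')")
demo.execute("INSERT INTO District (DistrictID, CongressDistrict) VALUES (10, 'CD-1')")
demo.execute("INSERT INTO District (DistrictID, CongressDistrict) VALUES (20, 'CD-2')")
demo.execute("INSERT INTO VoterDistrict VALUES (1, 10, 'P-3', '2008-01-15')")
demo.execute("INSERT INTO VoterDistrict VALUES (1, 20, 'P-7', '2011-06-01')")
print(current_district(demo, 1))
```

The EffectiveDate on the join table is what makes the lookup work: sort a voter's rows newest first and take the top one. The older rows stay put, so the analysts can still see where someone used to vote.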
There’s probably a VoterHistory table that includes their past party affiliation, the last time they voted (it’s public record, folks), and the last place they voted. Additionally, there’s probably an AddressHistory table to track where the voters have lived before, because this information can hint at their economic status and possible political issue preferences (as well as other tastes). After all, if someone moved from the wealthy side of town to the local ghetto, an analyst can assume that job security and economic issues are at the forefront of their minds.
And then there are the social media tables. A table for Email Addresses that includes known email addys (gotten from people who signed up to receive campaign news or registered on a political campaign page). That would be simple: VoterID, EmailAddress, EffectiveDate, ValidAddress (a flag for marking bounceback emails so we don’t waste our time emailing bad addresses). Then the SocialNetwork table, which would include VoterID, SocialNetworkName, SocialNetworkHandle, EffectiveDate, ValidRecord. These tables would be normalized, so that a single voter might have 5 entries in the SocialNetwork table: one for Twitter, one for Google+, one for Facebook, one for LinkedIn, and one for GreatestThingSinceSlicedBread.com.
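Here is a quick sketch of those two contact tables, again with Python's built-in sqlite3 module. The column names come straight from the description above. Everything else (the table name EmailAddress, the data types, storing the flags as 0/1, and the helper names build_contact_schema, mark_bounced, and mailable_addresses) is my own assumption for illustration.

```python
import sqlite3


def build_contact_schema(conn):
    """Create the hypothetical EmailAddress and SocialNetwork tables."""
    conn.executescript("""
        CREATE TABLE EmailAddress (
            VoterID INTEGER NOT NULL,
            EmailAddress TEXT NOT NULL,
            EffectiveDate TEXT NOT NULL,             -- ISO date text
            ValidAddress INTEGER NOT NULL DEFAULT 1  -- 1 = good, 0 = bounced
        );
        CREATE TABLE SocialNetwork (
            VoterID INTEGER NOT NULL,
            SocialNetworkName TEXT NOT NULL,         -- one row per network
            SocialNetworkHandle TEXT NOT NULL,
            EffectiveDate TEXT NOT NULL,
            ValidRecord INTEGER NOT NULL DEFAULT 1
        );
    """)


def mark_bounced(conn, email):
    """Flag an address as invalid after a bounceback."""
    conn.execute(
        "UPDATE EmailAddress SET ValidAddress = 0 WHERE EmailAddress = ?",
        (email,),
    )


def mailable_addresses(conn):
    """Return every address still flagged as valid, in alphabetical order."""
    rows = conn.execute(
        "SELECT EmailAddress FROM EmailAddress "
        "WHERE ValidAddress = 1 ORDER BY EmailAddress"
    ).fetchall()
    return [row[0] for row in rows]


# Made-up sample data: two addresses, one of which bounces.
demo = sqlite3.connect(":memory:")
build_contact_schema(demo)
demo.execute(
    "INSERT INTO EmailAddress (VoterID, EmailAddress, EffectiveDate) "
    "VALUES (1, 'jane@example.com', '2012-03-01')"
)
demo.execute(
    "INSERT INTO EmailAddress (VoterID, EmailAddress, EffectiveDate) "
    "VALUES (2, 'old@example.com', '2010-09-15')"
)
mark_bounced(demo, "old@example.com")
print(mailable_addresses(demo))
```

Because each network gets its own row rather than its own column, adding the next big site is just another INSERT, not a change to the table.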
Now this is all the basic stuff the campaign has to have before it can move forward with the analyzing. But it needs a lot more to get things really rolling. A volunteers table to track voters who have also volunteered to work the campaign. A table of non-registered or non-voting voters in each district (very important for Get Out The Vote movements). A table of what voters have signed up for, purchased, listened to, watched, and done over the years.
“Wait, what?” you might say.
Remember, there’s a lot of information about you out in the world. All the more so since the advent of the internet. Now all those Facebook checkins and Likes, all those shared Twitter and Google+ posts, all those announcements of “I just bought Great American Novel #10” are freely available to anyone with a bot capable of scraping the information off the internet. If you answered surveys, if you gave demographic information such as age, gender, salary, etc., this information can all be tracked back to you and used.
Go back and read the above article. Read the part about Parker and Clooney. They didn’t just come up with that information out of the blue.
The 2012 election is a perfect example of what businesses want to do with their data. Know who you’re selling to. Target consumers with coupons and ads that will get them into the store. Make offers to those (and only those) you KNOW will buy new product X because you know they love to shop on Tuesday, their favorite color is purple, and they have just enough money burning a hole in their pocket that they’ll pay that little extra for the “Walmart Special Release” instead of waiting for the “toy” to come out at another store. (Target has this data mining thing down to a science, BTW.)
Well, guess what? Someone other than the business world finally figured out there’s more to information than selling sweaters. I promise you that when 2016 rolls around, every campaign that can afford it will be snapping up DBAs, data analysts, and database hardware / software to make sure their numbers get crunched effectively.
The question is, will you be in a position to take advantage of this sudden need?
Brush up on your DBA / Data Mining skills, everyone. The one IT job that won’t get easily outsourced (and hasn’t been in the past) is this one. And the open positions for database administrators, business intelligence experts, and data miners are rising in number even as other IT positions get cut.
If you want to see my faux-Obama-DB, post a comment and let me know.

