A few years ago I was working on databases and related applications for the Corporate Security department of the world's largest software company, and I still can't forget the good times. I had written some software that included a lot of moving parts: I had to work with end users to help them use my reporting tool, and I had to teach some developers about Star Schema datamart designs and how to write ETL automation tools. I had also written a fair number of Data Management applications using ASP, but those didn't need much additional development and support.

When I started this project we had a 10 TB database running on a Pentium 3 server. It was amazing to me that a single 32-bit server could manage that much data. Data was being fed from 200,000 other machines into this database; intermediate servers staged the data, and it all flowed into one SQL Server database. I had been told by someone in the marketing department that this project was one of the largest database implementations on this platform anywhere. Three or four years later, when I was at Safeco, we had some processes that ran once a quarter and took a month and a half to complete. The limitations the industry faced in the 32-bit world were lifted around 2004, and these days hardware is about 100 times faster than it was 15 years ago. The 10 TB dataset I was working on held proxy log entries for every time any app connected to the internet. Things were easier 20 years ago than they are today, because an agent was required on every machine, which allowed us to store more information than what is available today.

There were hundreds of proxy servers that we were reporting on, and those proxy servers reported their data to a handful of servers, which allowed us to find virus activity and other problems like P2P, porn, and overall bandwidth abuse. Our success wasn't just due to the security benefit of finding problems faster; we also had to combat threats that carried legal consequences.

During the project I heard things like:

"Your reports have saved us millions of dollars in bandwidth alone."

– Jenn Lemond, Corporate Security

The ability to do a better job of finding data was just one small part of what we accomplished on this project. There was a team of about a dozen people who spent their whole day writing queries by hand to look for virus activity, executing them against an unindexed heap table. There was no normalization, no relationships, and no way to fix the performance issues.
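
To make the problem concrete, here is a minimal sketch of that setup. The table and column names are hypothetical stand-ins, not the original schema; the point is that a heap with no indexes forces a full table scan for every search.

```sql
-- Hypothetical stand-in for the original ODS table: a heap with no
-- primary key and no indexes, so every search reads the entire table.
CREATE TABLE ProxyLog (
    LogTime   DATETIME,
    SourceIP  VARCHAR(15),
    DestHost  VARCHAR(255),
    AppName   VARCHAR(100),
    BytesSent BIGINT,
    BytesRecv BIGINT
);

-- The kind of query the team wrote by hand all day. Against a heap
-- this is a full table scan, roughly an hour per answer at this scale.
SELECT LogTime, SourceIP, DestHost
FROM   ProxyLog
WHERE  AppName = 'kazaa';

-- For comparison, the textbook fix would be an index that turns the
-- scan into a seek. The real fix here ended up being the dimensional
-- datamart described later in this post.
CREATE NONCLUSTERED INDEX IX_ProxyLog_AppName
    ON ProxyLog (AppName);
```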

Before Aaron

  • Avg Response Time: 1 hour
  • The ability to query a database in order to find important security information was critical. The CEO of the company had written a treatise about the importance of improving the security of their software, because they had recently gotten bad press over security incidents. Things had to get better.
  • People maintained spreadsheets to keep track of different types of malware. Sometimes an important security outbreak would get escalated to the point that a key search term was shared around the team. In general, there were a dozen people working independently on the same search strings, and they could only identify one application name per search. If they moved towards using an IN clause, such as AppName IN ('kazaa', 'napster', 'bearshare'), the query would run for an extended period of time and then fail. In its rawest form, the ability to scan a 1 TB Operational Data Store was limited to looking for a single app at a time. People could write a query to look for Kazaa, and they could write a second query to search for Napster, but a single query to search for Kazaa OR Napster would not complete successfully because of the scope of the data and the amount of traffic to and from this database server (see the sketch after this list).
  • At this time, this software company had uncontrollable virus and malware issues, and it was a struggle to merely connect a machine to the network and NOT get infected. Our department wrote the policies on how to avoid malware and viruses, and it was our department that decided the necessary next steps if you had been infected. (This is probably the timeframe when I fell in love with PXE deployment and installs of Windows. At the time this was delivered by Remote Installation Services, but today that function is served by Windows Deployment Services. Being able to re-image a machine in ten minutes is a nice alternative to using DVD media.)
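
Here is a minimal sketch of that single-app limitation, reusing the hypothetical ProxyLog table from the earlier sketch. Each analyst ran one-app scans like the first two queries; the third is the obvious consolidation, and it is the version that would churn for hours against the unindexed heap and then fail.

```sql
-- What a dozen analysts ran independently: one app name per query,
-- each query a separate full scan of the 1 TB heap.
SELECT LogTime, SourceIP, DestHost
FROM   ProxyLog
WHERE  AppName = 'kazaa';

SELECT LogTime, SourceIP, DestHost
FROM   ProxyLog
WHERE  AppName = 'napster';

-- The obvious consolidation. At this scale, with hundreds of proxy
-- servers writing into the same table, this version would run for an
-- extended period of time and then fail.
SELECT LogTime, SourceIP, DestHost
FROM   ProxyLog
WHERE  AppName IN ('kazaa', 'napster', 'bearshare');
```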

After Aaron

  • Avg Response Time: 1 second
  • I had written screen-scraping scripts to look up all of the applications and build a hierarchy of applications. This allowed users to double-click on the category labelled 'Malware' and then drill down to traffic for Code Red and Nimda. In that same hierarchy, they could drill down from the category called 'P2P' and see individual apps like Kazaa, Napster, and Bearshare.
  • Dimensional datamart with data-entry web apps that let users modify the categories and classifications of traffic by Port, AppName, Source, and Destination (sketched below)
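
A minimal sketch of what that dimensional design can look like; the table and column names are illustrative, not the original schema. An application dimension carries the Category-to-AppName hierarchy that made the drill-down possible, and a fact table joins to it by surrogate key.

```sql
-- Hypothetical application dimension: one row per app, carrying the
-- category hierarchy used for the drill-down.
CREATE TABLE DimApplication (
    AppKey   INT IDENTITY PRIMARY KEY,
    Category VARCHAR(50),    -- e.g. 'Malware', 'P2P'
    AppName  VARCHAR(100)    -- e.g. 'Code Red', 'Kazaa'
);

-- Hypothetical fact table: proxy traffic rolled up by day and app.
CREATE TABLE FactProxyTraffic (
    DateKey    INT,
    AppKey     INT REFERENCES DimApplication (AppKey),
    Requests   BIGINT,
    BytesTotal BIGINT
);

-- Top level of the drill-down: traffic by category.
SELECT d.Category, SUM(f.BytesTotal) AS TotalBytes
FROM   FactProxyTraffic AS f
JOIN   DimApplication   AS d ON d.AppKey = f.AppKey
GROUP BY d.Category;

-- One level down: individual apps within the 'P2P' category.
SELECT d.AppName, SUM(f.BytesTotal) AS TotalBytes
FROM   FactProxyTraffic AS f
JOIN   DimApplication   AS d ON d.AppKey = f.AppKey
WHERE  d.Category = 'P2P'
GROUP BY d.AppName;
```

Pre-aggregated facts joined to a small dimension is the kind of design that takes a search from an hour-long heap scan down to a one-second lookup.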

When you suddenly make a team of a dozen people 3,600 times faster (a one-hour response time down to one second), you begin to realize other efficiencies. Being able to see multiple applications at once, and to see the trend of traffic from one day to the next, made it an exciting project. To this day it is my favorite project of all time, and I wish I was still there.
