2012-05-01

Business intelligence and empty space.



In 1995 I inherited a 4 GB SCSI hard disk from my boss. After much work with drivers and SCSI terminators I had oceans of space in my computer. (I suspect my boss didn’t have the time and patience to install it in his PC, and that is the reason I got that super large hard disk.) With my new large disk I realized I could build a Business Intelligence (BI) application on my desktop computer. Even though I had worked a lot with space management on IBM mainframes during the early 1990s, I had never had so much free disk space available before. I know I once said ‘space management is shuffling files around with a shoehorn, defragmenting disks and compacting files’.

Disk space was scarce and expensive. Then the spintronics revolution reached computer hard disks and inflated their capacity. Within a few years of the turn of the century you could find affordable 200 GB disks, and today you find 3 TB disks at the computer retail store. The last server I built can handle 8 TB of data, and those 8 TB cost me less than 1500€, dirt cheap I would say. These are SATA 3 disks. Experts tell me, ‘You cannot use SATA disks in demanding server environments; they are too slow, they cannot handle the load, they break down, etc.’ But that is not the case. I have built my own BI servers for 12 years now (about 15 servers) and used about 100 hard disks, and not one has crashed. Power supplies break down occasionally, but SATA disks don’t crash, touch wood.
Since the active BI data is about 500 GB on my new 8 TB server, I have ‘oceans of space’, a phrase I use a lot; colleagues say ‘Lars has oceans of space’ as a joke. In BI you often want an extra backup while you ‘massage’ data, or want to take ‘snapshots’ in time for data marts, etc. I always keep a complete daily and a complete weekly backup of my data on the database server. This way a restore is simple and fast, and you can allow yourself the luxury of sometimes cutting corners, since you know that if things go wrong you can restore quickly.
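
To put a rough number on ‘oceans of space’, here is a small back-of-the-envelope sketch in Python. It uses only the figures from this post (8 TB of disk, about 500 GB of active data, one daily and one weekly backup, less than 1500€). The function names are made up for illustration, and it assumes that each backup is a full, uncompressed copy of the active data and that 1 TB counts as 1000 GB; real backups may well be smaller.

```python
# Back-of-the-envelope headroom check with the figures from this post.
# Assumption: every backup is a full, uncompressed copy of the active data.
# Assumption: decimal units, 1 TB = 1000 GB.


def free_space_gb(total_gb, active_gb, full_backup_copies):
    """Space left after storing the active data plus its full backups."""
    used_gb = active_gb * (1 + full_backup_copies)
    return total_gb - used_gb


def cost_per_gb(total_cost_eur, total_gb):
    """Price per GB; an upper bound here, since the disks cost less than 1500 EUR."""
    return total_cost_eur / total_gb


if __name__ == "__main__":
    total_gb = 8_000   # 8 TB server
    active_gb = 500    # active BI data
    backups = 2        # one daily and one weekly full backup

    print("free space (GB):", free_space_gb(total_gb, active_gb, backups))
    print("cost per GB (EUR), at most:", cost_per_gb(1500, total_gb))
```

Under these assumptions the active data and both backups take about 1.5 TB, which leaves roughly 6.5 TB for snapshots, extra copies and unexpected requests.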

For BI, cheap disk space is as important as reliable disk space, yes, even more important. It doesn’t matter what your boss says up front or what the stakeholders promise you. If the disk space is expensive you will not get ‘oceans of space’; you will have to use the old shoehorn, defrag and compact. This will take time and make you inflexible. I have seen this many times and I still do. BI is large volumes of data. You collect historical data, and as soon as you start compromising on what you store you become inflexible. I no longer slim down historical data by storing only the ‘important’ figures; I store it all. In BI I would rather go for cheap than super reliable hard disks. If, God forbid, a disk breaks down, it’s a simple task to replace it and restore. You will be down for an hour or so, but who cares? It is only a BI system.

In BI it is not unusual to have customers asking for new large databases: ‘We acquired this company and now we have some figures to analyze, and we need to do it now’. Where do they go? ‘We go to Lars, he has oceans of space’. (I have not had exactly this request, but similar ones, and it makes a cool ending for this post.)
