2020-06-27

The Swan Song of my Data Warehouse

It is not the end, but the beginning of the end. The finale is finally approaching for my Data Warehouse. For some months I have participated in a project to migrate my Data Warehouse to the cloud. For reasons I do not understand, the decision has been taken that the first step to decommission, or rather replace, the Data Warehouse is to move it to the cloud. My Data Warehouse must live for some years yet, until the replacement, whatever it is, can take over. Something must be done to upgrade the Data Warehouse: it runs on 2012 hardware, and most of the software is from about the same time. After years of hesitation we will now get a grip on the accumulated technological debt, upgrade all software to current versions and move it to the cloud. There it will run happily until it is replaced in its entirety. In the end there will be no funeral feast or commotion of any sort; it will just fade away with a silent Fzzzz.

The Data Warehouse started as a concept, or some ideas I had back in the mid-90s while working as a Business Intelligence analyst consultant (now it is called data scientist). In 2001 I had an opportunity to put my ideas into practice. Some years later I developed version 2; after 2008 I only added a few changes and fixes. The last hardware upgrade was in 2012, and since then the Data Warehouse has been 'maintenance free', just accumulating tech debt over the years.

Data Warehouse is a misnomer: it is a hybrid of a Data Lake and a Data Warehouse, more of a Data Lake than a Data Warehouse, but the name 'Data Lake' had not been invented at that time. In the company no one knew what a Data Warehouse was, so I humbly took the name 'the Data Warehouse' for my system. Data is imported from various sources and stored in tables and Business Query Sets, each a self-contained superset of reports in tabular form. The Business Query Sets are excellent for self-service BI or as a source for graphical front ends.

From the beginning my Data Warehouse was separated into a processing server and storage, i.e. a database server, with a dedicated gigabit switch in between for uninterrupted fast communication between the processor and the storage. This has now proved to be a hurdle when migrating the Data Warehouse to the cloud. I wanted to move the database server to the cloud first, but the response time from the cloud database server has been way too slow: a round trip between the two is about 25 times slower than in the present setup. We are investigating this delay right now, but since this is in the cloud we cannot analyze the problem ourselves. We need to ask the cloud supplier, and we have still not got an answer. In the meantime I will start looking at upgrading the processing server. This is not a simple task, as it involves replacing and adjusting code, quite a lot of code. The total upgrade and migration of my Data Warehouse will probably keep me fairly busy for the rest of the year.
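
To illustrate why a slower round trip hurts a setup with separate processing and database servers, here is a minimal back-of-the-envelope sketch. Only the factor of 25 comes from the measurement above; the baseline round trip of 0.5 ms and the 100,000 round trips per job are made-up numbers for illustration, not figures from my system.

```python
# Rough estimate of time spent waiting on the network between the
# processing server and the database server.
# ASSUMPTIONS (not measured): the local round trip time and the number of
# round trips per job are hypothetical. Only the factor 25 is from the text.

LOCAL_ROUND_TRIP_MS = 0.5      # hypothetical: via the dedicated gigabit switch
SLOWDOWN_FACTOR = 25           # from the text: cloud is about 25 times slower
ROUND_TRIPS_PER_JOB = 100_000  # hypothetical: a chatty job with many small queries


def network_wait_seconds(round_trips: int, round_trip_ms: float) -> float:
    """Total time spent waiting on the network, ignoring query execution time."""
    return round_trips * round_trip_ms / 1000.0


local_wait = network_wait_seconds(ROUND_TRIPS_PER_JOB, LOCAL_ROUND_TRIP_MS)
cloud_wait = network_wait_seconds(
    ROUND_TRIPS_PER_JOB, LOCAL_ROUND_TRIP_MS * SLOWDOWN_FACTOR
)

print(f"local: {local_wait:.0f} s, cloud: {cloud_wait:.0f} s")
```

With these made-up numbers the same job goes from under a minute of waiting to about 21 minutes. The fewer round trips a job needs, the less the slowdown matters.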

Right now we have a heat wave here in Stockholm, with temperatures of 30 degrees Celsius. This is very hot for us Swedes. I cannot think about work now; I just sit in the shade sipping ice-cold beer. I should not complain though, there are those who have worse problems these days.