2022-07-20

Transpose away about 92 percent

It has been a long time since I wrote here about The Data Warehouse. I have spend time upgrading PHP from 5.6

to 7.4 and Mysql latin1 to utf8. None of this is in production yet. I also spent time on the long overdue Mysql from

5.6 to 5.7, which makes me go bonkers. In the old days when I had full control of the environment I just downloaded

and installed the MySQL version I wanted. Now I try to follow Ubuntu's rules, and I just fail, I can upgrade a test server,

but the prod server refuses to be upgraded. I have no clue what goes wrong. This is really the backside of complex system

procedures that intercept you from compromising the system, in this case the Ubuntu system. I'm not interested in digging into the

finer details of Ubutu's APT system, but it seems I will in the end.  

Anyhow I did some SQL tuning lately, this is boring laborious work, but at the same time satisfyingly rewarding.

Some weeks ago I stumbled upon this code: 

insert into MAEX_US (   

            SELECT

            A.MATNR,

            A.ALAND,

            A.GEGRU,

            A.ALNUM

            FROM MASTERDATA.MAEX A

            WHERE A.GEGRU='US'      

            );

insert into MAEX_EU (   

            SELECT

            A.MATNR,

            A.ALAND,

            A.GEGRU,

            A.ALNUM

            FROM MASTERDATA.MAEX A

            WHERE A.GEGRU='EU'      

            );

insert into MAEX_E1 (    

            SELECT

            A.MATNR,

            A.ALAND,

            A.GEGRU,

            A.ALNUM

            FROM MASTERDATA.MAEX A

            WHERE A.GEGRU='E1'      

            );


INSERT INTO LEGAL_REGULATION_BOM

    SELECT 

            A.PLANT

            ,A.MATNR

            ,A.DESCRIPTION

            ,A.EXT_MAT_GRP

            ,A.MATNR_COST

            ,A.MATNR_COST_CURR

            ,A.PARENT

            ,A.COMPONENT

            ,A.BOM_LEVEL

            ,A.QTY

            ,A.COMPONENT_COST 

            ,A.COMPONENT_COST_CURR

            ,A.UNIT_OF_MEASURE

            ,A.COMP_DESCRIPTION

            ,A.ORIGIN

            ,A.DUTYNO

            ,A.BOM_PLANT_IS_PTD_0101_ACQPLANT

            ,CASE

                WHEN A.PLANT IN ('0101','0104') THEN B.ALNUM 

                WHEN A.PLANT='6100' THEN C.ALNUM

                WHEN A.PLANT='6130' THEN D.ALNUM

                WHEN A.PLANT IN ('1100','1110') THEN E.ALNUM

                WHEN A.PLANT IN ('3110','3115') THEN N.ALNUM

                WHEN A.PLANT='3125' THEN Q.ALNUM

                ELSE 'MISSING_LEGI'

            END AS 'E1_LEG'

            ,CASE

                WHEN A.PLANT IN ('0101','0104') THEN F.ALNUM 

                WHEN A.PLANT='6100' THEN G.ALNUM

                WHEN A.PLANT='6130' THEN H.ALNUM

                WHEN A.PLANT IN ('1100','1110') THEN I.ALNUM

                WHEN A.PLANT IN ('3110','3115') THEN O.ALNUM

                WHEN A.PLANT='3125' THEN R.ALNUM

                ELSE 'MISSING_LEGI'

            END AS 'EU_LEG'

            ,CASE

                WHEN A.PLANT IN ('0101','0104') THEN J.ALNUM 

                WHEN A.PLANT='6100' THEN K.ALNUM

                WHEN A.PLANT='6130' THEN L.ALNUM

                WHEN A.PLANT IN ('1100','1110') THEN M.ALNUM

                WHEN A.PLANT IN ('3110','3115') THEN P.ALNUM

                WHEN A.PLANT='3125' THEN S.ALNUM

                ELSE 'MISSING_LEGI'

            END AS 'US_LEG'


            FROM LEGAL_REGULATION_TEMP_BOM A

                LEFT JOIN MAEX_E1 B ON A.COMPONENT=B.MATNR AND B.ALAND='BE'

                LEFT JOIN MAEX_E1 C ON A.COMPONENT=C.MATNR AND C.ALAND='HU'

                LEFT JOIN MAEX_E1 D ON A.COMPONENT=D.MATNR AND D.ALAND='IT'

                LEFT JOIN MAEX_E1 E ON A.COMPONENT=E.MATNR AND E.ALAND='SE'

                LEFT JOIN MAEX_E1 N ON A.COMPONENT=N.MATNR AND N.ALAND='FR'

                LEFT JOIN MAEX_E1 Q ON A.COMPONENT=Q.MATNR AND Q.ALAND='US'

                LEFT JOIN MAEX_EU F ON A.COMPONENT=F.MATNR AND F.ALAND='BE'

                LEFT JOIN MAEX_EU G ON A.COMPONENT=G.MATNR AND G.ALAND='HU'

                LEFT JOIN MAEX_EU H ON A.COMPONENT=H.MATNR AND H.ALAND='IT'

                LEFT JOIN MAEX_EU I ON A.COMPONENT=I.MATNR AND I.ALAND='SE'

                LEFT JOIN MAEX_EU O ON A.COMPONENT=O.MATNR AND O.ALAND='FR'

                LEFT JOIN MAEX_EU R ON A.COMPONENT=R.MATNR AND R.ALAND='US'

                LEFT JOIN MAEX_US J ON A.COMPONENT=J.MATNR AND J.ALAND='BE'

                LEFT JOIN MAEX_US K ON A.COMPONENT=K.MATNR AND K.ALAND='HU'

                LEFT JOIN MAEX_US L ON A.COMPONENT=L.MATNR AND L.ALAND='IT'

                LEFT JOIN MAEX_US M ON A.COMPONENT=M.MATNR AND M.ALAND='SE'

                LEFT JOIN MAEX_US P ON A.COMPONENT=P.MATNR AND P.ALAND='FR'

                LEFT JOIN MAEX_US S ON A.COMPONENT=S.MATNR AND S.ALAND='US'     

It took me quite a while to figure out the meaning of all these left joins.It is a  nifty way of transpose ALNUM from the MAEX table over MATNR and ALAND  into E1_LEG, EU_LEG and US_LEG like:

This code is clever or rather it was clever when constructed but now the data volumes have grown to large numbers

disclosing the intrinsic weakness of the many joins, today they take a very long time to execute (some 1100 seconds).

I failed to come up with a better way to do the transpose in the last SQL. I then ‘parked’ this problem.

Two days ago while jogging I realized I attacked the problem wrongly, it was not the last query that was the root problem,

but the first three queries splitting the MAEX table. If I do a full transpose over MATNR, I end up with one row per MATNR

which is what I need in the next phase.I remembered I read about a similar problem some years ago where the max

function was combined with group by.

insert into MAEX_PIVOT (    

    SELECT

            A.MATNR

           ,max(if (A.GEGRU='E1' and A.ALAND = 'BE', A.ALNUM, NULL)) as E1BE_ALNUM

           ,max(if (A.GEGRU='E1' and A.ALAND = 'HU', A.ALNUM, NULL)) as E1HU_ALNUM

           ,max(if (A.GEGRU='E1' and A.ALAND = 'IT', A.ALNUM, NULL)) as E1IT_ALNUM

           ,max(if (A.GEGRU='E1' and A.ALAND = 'SE', A.ALNUM, NULL)) as E1SE_ALNUM

           ,max(if (A.GEGRU='E1' and A.ALAND = 'FR', A.ALNUM, NULL)) as E1FR_ALNUM

           ,max(if (A.GEGRU='E1' and A.ALAND = 'US', A.ALNUM, NULL)) as E1US_ALNUM

           ,max(if (A.GEGRU='EU' and A.ALAND = 'BE', A.ALNUM, NULL)) as EUBE_ALNUM

           ,max(if (A.GEGRU='EU' and A.ALAND = 'HU', A.ALNUM, NULL)) as EUHU_ALNUM

           ,max(if (A.GEGRU='EU' and A.ALAND = 'IT', A.ALNUM, NULL)) as EUIT_ALNUM

           ,max(if (A.GEGRU='EU' and A.ALAND = 'SE', A.ALNUM, NULL)) as EUSE_ALNUM

           ,max(if (A.GEGRU='EU' and A.ALAND = 'FR', A.ALNUM, NULL)) as EUFR_ALNUM

           ,max(if (A.GEGRU='EU' and A.ALAND = 'US', A.ALNUM, NULL)) as EUUS_ALNUM

           ,max(if (A.GEGRU='US' and A.ALAND = 'BE', A.ALNUM, NULL)) as USBE_ALNUM

           ,max(if (A.GEGRU='US' and A.ALAND = 'HU', A.ALNUM, NULL)) as USHU_ALNUM

           ,max(if (A.GEGRU='US' and A.ALAND = 'IT', A.ALNUM, NULL)) as USIT_ALNUM

           ,max(if (A.GEGRU='US' and A.ALAND = 'SE', A.ALNUM, NULL)) as USSE_ALNUM

           ,max(if (A.GEGRU='US' and A.ALAND = 'FR', A.ALNUM, NULL)) as USFR_ALNUM

           ,max(if (A.GEGRU='US' and A.ALAND = 'US', A.ALNUM, NULL)) as USUS_ALNUM

            FROM MASTERDATA.MAEX A

            WHERE A.GEGRU in ('E1','EU','US') and A.ALNUM is not NULL 

            group by A.MATNR

);    

220720 075316.546624 8628 Note SQLSTATE=00000, ERRORNO=0    

220720 075316.546695 8628 Note SQLINFO Records: 32606 

This request replacing the three original requests splitting the MAEX table gave MATNR with 18 ‘transpose’ columns

on each row:

Then it was just a matter of remove joins and map the ‘transpose’ columns to the right position:


INSERT INTO LEGAL_REGULATION_BOM

    SELECT 

            A.PLANT

            ,A.MATNR

            ,A.DESCRIPTION

            ,A.EXT_MAT_GRP

            ,A.MATNR_COST

            ,A.MATNR_COST_CURR

            ,A.PARENT

            ,A.COMPONENT

            ,A.BOM_LEVEL

            ,A.QTY

            ,A.COMPONENT_COST 

            ,A.COMPONENT_COST_CURR

            ,A.UNIT_OF_MEASURE

            ,A.COMP_DESCRIPTION

            ,A.ORIGIN

            ,A.DUTYNO

            ,A.BOM_PLANT_IS_PTD_0101_ACQPLANT

            ,CASE

                WHEN A.PLANT IN ('0101','0104') THEN B.E1BE_ALNUM

                WHEN A.PLANT='6100' THEN B.E1HU_ALNUM

                WHEN A.PLANT='6130' THEN B.E1IT_ALNUM

                WHEN A.PLANT IN ('1100','1110') THEN B.E1SE_ALNUM

                WHEN A.PLANT IN ('3110','3115') THEN B.E1FR_ALNUM

                WHEN A.PLANT='3125' THEN B.E1US_ALNUM

                ELSE 'MISSING_LEGI'

            END AS 'E1_LEG'

            ,CASE

                WHEN A.PLANT IN ('0101','0104') THEN B.EUBE_ALNUM 

                WHEN A.PLANT='6100' THEN B.EUHU_ALNUM

                WHEN A.PLANT='6130' THEN B.EUIT_ALNUM

                WHEN A.PLANT IN ('1100','1110') THEN B.EUSE_ALNUM

                WHEN A.PLANT IN ('3110','3115') THEN B.EUFR_ALNUM

                WHEN A.PLANT='3125' THEN B.EUUS_ALNUM

                ELSE 'MISSING_LEGI'

            END AS 'EU_LEG'

            ,CASE

                WHEN A.PLANT IN ('0101','0104') THEN B.USBE_ALNUM

                WHEN A.PLANT='6100' THEN B.USHU_ALNUM

                WHEN A.PLANT='6130' THEN B.USIT_ALNUM

                WHEN A.PLANT IN ('1100','1110') THEN B.USSE_ALNUM

                WHEN A.PLANT IN ('3110','3115') THEN B.USFR_ALNUM

                WHEN A.PLANT='3125' THEN B.USUS_ALNUM

                ELSE 'MISSING_LEGI'

            END AS 'US_LEG'

            FROM LEGAL_REGULATION_TEMP_BOM A

            LEFT JOIN MAEX_PIVOT B ON B.MATNR = A.COMPONENT  

    ;    

220720 081115.168632 19727 Note SQLSTATE=00000, ERRORNO=0    

220720 081115.168706 19727 Note SQLINFO Records: 2583726

This reduced the time with 1012 seconds (from 1099 seconds to 87)

2022-01-16

Finally DW runs on PHP 7.4

After quite some work I have upgraded my Data Warehouse to PHP7 (4.29). I started by switching from Mageia 4 to first openSUSE where I developed the PHP7 version, and finally to Azure Ubuntu. I still use KDE as desktop though. In the beginning of DW 2001 I used Mandrake Linux and PHP4. Since 2014 the DW was upgraded to PHP5.6. I thought that was the last upgrade of the Data Warehouse PHP code. But last year the big guys asked me to migrate the Data warehouse to Azure. The big thing was the SAP connection, I had to replace the SAP connector to saprfc-Kralik a very fine piece of software, works like a charm. Relacing the SAP connector was not the big problem, but all the code I built around it was a big problem now that code is changed to support the new SAP connector. 
I now have a version ready for beta testing. After beta testing we move the Data Warehouse to Azure. That will be a lot of work, the Azure 'lift-and-shift' slogan is just a slogan.The way to Azure has been painful, the first attempt just failed, the so called experts who help did not do a very good job. Going from in-house to cloud, does not only mean you do not have to have hardware skills anymore, instead you need very good network skills and 'cloud skills'. The cloud providers are shouting out load the first 'You do not need any hardware skills or hardware support', but they do not even wisper about the latter network and cloud skills. this time we have had the help of a competent solutions architect, with necessary cloud skills, this is a 'sine qua non' condition for a successful transision to the cloud.

What about PHP8.1? 
I will start the PHP8.1 upgrade tomorrow, I expect that to be 'a walk in the park' compared to the PHP5.6 to PHP 7.4 upgrade.

2021-12-31

Another Year Has Passed

It's been a while since I wrote a post now, this year is not so much to write home to mum about. Not much €work, not much work at all really. Since the Covid struck business have been bad. But more work than Yesteryear, actually a little more work than I want. I'm finishing an upgrade of my Data Warehouse to PHP 7 (and 8.1 soon). The big problem was the SAP connector, it was much more work than anticipated. I'm now using Kralik's SAP connector, a very nice piece of software! Now I will start port the Data Warehouse to Azure, this will be interesting eventhough I question the wisdom of the move. The company is writing a replacement for the Data Warehouse. I think it's best just leave my data warehouse as it is, but the company really want to move it to Azure. Many years ago I predicted I would retire before the Data Warehouse, I was right, I was retired years ago. Now I start question myself if I will stop work before the Data Warehouse will be decommisioned. When that happens I will probably ask myself if I will die before the Data warehouse:) Next year 2022 apart from porting the Data Warehouse to Azure (wich is much more than 'lift-and-shift') I will help the DW guys with DW support, and maybe a bit with the DW-successor, I have discussed the Bill-of-Material logic which is a bit complex, and they are under-staffed so they may ask for some help with the . Next year I have promised myself to pick up another (computer) language, I'm interested in Raku and D. Also Svelte and Webassembly looks very interesting. I'm also qurious about Spark. And Yesterday Andreas told me I would enjoy an Oculus headset. So I'm not out of ideas for the future, the question is am I eager enough to start up something new. Time will tell. Anyway apart from peace on Earth I wish you all a happy 2020.