2022-07-20

Transpose away about 92 percent

It has been a long time since I wrote here about The Data Warehouse. I have spend time upgrading PHP from 5.6

to 7.4 and Mysql latin1 to utf8. None of this is in production yet. I also spent time on the long overdue Mysql from

5.6 to 5.7, which makes me go bonkers. In the old days when I had full control of the environment I just downloaded

and installed the MySQL version I wanted. Now I try to follow Ubuntu's rules, and I just fail, I can upgrade a test server,

but the prod server refuses to be upgraded. I have no clue what goes wrong. This is really the backside of complex system

procedures that intercept you from compromising the system, in this case the Ubuntu system. I'm not interested in digging into the

finer details of Ubutu's APT system, but it seems I will in the end.  

Anyhow I did some SQL tuning lately, this is boring laborious work, but at the same time satisfyingly rewarding.

Some weeks ago I stumbled upon this code: 

insert into MAEX_US (   

            SELECT

            A.MATNR,

            A.ALAND,

            A.GEGRU,

            A.ALNUM

            FROM MASTERDATA.MAEX A

            WHERE A.GEGRU='US'      

            );

insert into MAEX_EU (   

            SELECT

            A.MATNR,

            A.ALAND,

            A.GEGRU,

            A.ALNUM

            FROM MASTERDATA.MAEX A

            WHERE A.GEGRU='EU'      

            );

insert into MAEX_E1 (    

            SELECT

            A.MATNR,

            A.ALAND,

            A.GEGRU,

            A.ALNUM

            FROM MASTERDATA.MAEX A

            WHERE A.GEGRU='E1'      

            );


INSERT INTO LEGAL_REGULATION_BOM

    SELECT 

            A.PLANT

            ,A.MATNR

            ,A.DESCRIPTION

            ,A.EXT_MAT_GRP

            ,A.MATNR_COST

            ,A.MATNR_COST_CURR

            ,A.PARENT

            ,A.COMPONENT

            ,A.BOM_LEVEL

            ,A.QTY

            ,A.COMPONENT_COST 

            ,A.COMPONENT_COST_CURR

            ,A.UNIT_OF_MEASURE

            ,A.COMP_DESCRIPTION

            ,A.ORIGIN

            ,A.DUTYNO

            ,A.BOM_PLANT_IS_PTD_0101_ACQPLANT

            ,CASE

                WHEN A.PLANT IN ('0101','0104') THEN B.ALNUM 

                WHEN A.PLANT='6100' THEN C.ALNUM

                WHEN A.PLANT='6130' THEN D.ALNUM

                WHEN A.PLANT IN ('1100','1110') THEN E.ALNUM

                WHEN A.PLANT IN ('3110','3115') THEN N.ALNUM

                WHEN A.PLANT='3125' THEN Q.ALNUM

                ELSE 'MISSING_LEGI'

            END AS 'E1_LEG'

            ,CASE

                WHEN A.PLANT IN ('0101','0104') THEN F.ALNUM 

                WHEN A.PLANT='6100' THEN G.ALNUM

                WHEN A.PLANT='6130' THEN H.ALNUM

                WHEN A.PLANT IN ('1100','1110') THEN I.ALNUM

                WHEN A.PLANT IN ('3110','3115') THEN O.ALNUM

                WHEN A.PLANT='3125' THEN R.ALNUM

                ELSE 'MISSING_LEGI'

            END AS 'EU_LEG'

            ,CASE

                WHEN A.PLANT IN ('0101','0104') THEN J.ALNUM 

                WHEN A.PLANT='6100' THEN K.ALNUM

                WHEN A.PLANT='6130' THEN L.ALNUM

                WHEN A.PLANT IN ('1100','1110') THEN M.ALNUM

                WHEN A.PLANT IN ('3110','3115') THEN P.ALNUM

                WHEN A.PLANT='3125' THEN S.ALNUM

                ELSE 'MISSING_LEGI'

            END AS 'US_LEG'


            FROM LEGAL_REGULATION_TEMP_BOM A

                LEFT JOIN MAEX_E1 B ON A.COMPONENT=B.MATNR AND B.ALAND='BE'

                LEFT JOIN MAEX_E1 C ON A.COMPONENT=C.MATNR AND C.ALAND='HU'

                LEFT JOIN MAEX_E1 D ON A.COMPONENT=D.MATNR AND D.ALAND='IT'

                LEFT JOIN MAEX_E1 E ON A.COMPONENT=E.MATNR AND E.ALAND='SE'

                LEFT JOIN MAEX_E1 N ON A.COMPONENT=N.MATNR AND N.ALAND='FR'

                LEFT JOIN MAEX_E1 Q ON A.COMPONENT=Q.MATNR AND Q.ALAND='US'

                LEFT JOIN MAEX_EU F ON A.COMPONENT=F.MATNR AND F.ALAND='BE'

                LEFT JOIN MAEX_EU G ON A.COMPONENT=G.MATNR AND G.ALAND='HU'

                LEFT JOIN MAEX_EU H ON A.COMPONENT=H.MATNR AND H.ALAND='IT'

                LEFT JOIN MAEX_EU I ON A.COMPONENT=I.MATNR AND I.ALAND='SE'

                LEFT JOIN MAEX_EU O ON A.COMPONENT=O.MATNR AND O.ALAND='FR'

                LEFT JOIN MAEX_EU R ON A.COMPONENT=R.MATNR AND R.ALAND='US'

                LEFT JOIN MAEX_US J ON A.COMPONENT=J.MATNR AND J.ALAND='BE'

                LEFT JOIN MAEX_US K ON A.COMPONENT=K.MATNR AND K.ALAND='HU'

                LEFT JOIN MAEX_US L ON A.COMPONENT=L.MATNR AND L.ALAND='IT'

                LEFT JOIN MAEX_US M ON A.COMPONENT=M.MATNR AND M.ALAND='SE'

                LEFT JOIN MAEX_US P ON A.COMPONENT=P.MATNR AND P.ALAND='FR'

                LEFT JOIN MAEX_US S ON A.COMPONENT=S.MATNR AND S.ALAND='US'     

It took me quite a while to figure out the meaning of all these left joins.It is a  nifty way of transpose ALNUM from the MAEX table over MATNR and ALAND  into E1_LEG, EU_LEG and US_LEG like:

This code is clever or rather it was clever when constructed but now the data volumes have grown to large numbers

disclosing the intrinsic weakness of the many joins, today they take a very long time to execute (some 1100 seconds).

I failed to come up with a better way to do the transpose in the last SQL. I then ‘parked’ this problem.

Two days ago while jogging I realized I attacked the problem wrongly, it was not the last query that was the root problem,

but the first three queries splitting the MAEX table. If I do a full transpose over MATNR, I end up with one row per MATNR

which is what I need in the next phase.I remembered I read about a similar problem some years ago where the max

function was combined with group by.

insert into MAEX_PIVOT (    

    SELECT

            A.MATNR

           ,max(if (A.GEGRU='E1' and A.ALAND = 'BE', A.ALNUM, NULL)) as E1BE_ALNUM

           ,max(if (A.GEGRU='E1' and A.ALAND = 'HU', A.ALNUM, NULL)) as E1HU_ALNUM

           ,max(if (A.GEGRU='E1' and A.ALAND = 'IT', A.ALNUM, NULL)) as E1IT_ALNUM

           ,max(if (A.GEGRU='E1' and A.ALAND = 'SE', A.ALNUM, NULL)) as E1SE_ALNUM

           ,max(if (A.GEGRU='E1' and A.ALAND = 'FR', A.ALNUM, NULL)) as E1FR_ALNUM

           ,max(if (A.GEGRU='E1' and A.ALAND = 'US', A.ALNUM, NULL)) as E1US_ALNUM

           ,max(if (A.GEGRU='EU' and A.ALAND = 'BE', A.ALNUM, NULL)) as EUBE_ALNUM

           ,max(if (A.GEGRU='EU' and A.ALAND = 'HU', A.ALNUM, NULL)) as EUHU_ALNUM

           ,max(if (A.GEGRU='EU' and A.ALAND = 'IT', A.ALNUM, NULL)) as EUIT_ALNUM

           ,max(if (A.GEGRU='EU' and A.ALAND = 'SE', A.ALNUM, NULL)) as EUSE_ALNUM

           ,max(if (A.GEGRU='EU' and A.ALAND = 'FR', A.ALNUM, NULL)) as EUFR_ALNUM

           ,max(if (A.GEGRU='EU' and A.ALAND = 'US', A.ALNUM, NULL)) as EUUS_ALNUM

           ,max(if (A.GEGRU='US' and A.ALAND = 'BE', A.ALNUM, NULL)) as USBE_ALNUM

           ,max(if (A.GEGRU='US' and A.ALAND = 'HU', A.ALNUM, NULL)) as USHU_ALNUM

           ,max(if (A.GEGRU='US' and A.ALAND = 'IT', A.ALNUM, NULL)) as USIT_ALNUM

           ,max(if (A.GEGRU='US' and A.ALAND = 'SE', A.ALNUM, NULL)) as USSE_ALNUM

           ,max(if (A.GEGRU='US' and A.ALAND = 'FR', A.ALNUM, NULL)) as USFR_ALNUM

           ,max(if (A.GEGRU='US' and A.ALAND = 'US', A.ALNUM, NULL)) as USUS_ALNUM

            FROM MASTERDATA.MAEX A

            WHERE A.GEGRU in ('E1','EU','US') and A.ALNUM is not NULL 

            group by A.MATNR

);    

220720 075316.546624 8628 Note SQLSTATE=00000, ERRORNO=0    

220720 075316.546695 8628 Note SQLINFO Records: 32606 

This request replacing the three original requests splitting the MAEX table gave MATNR with 18 ‘transpose’ columns

on each row:

Then it was just a matter of remove joins and map the ‘transpose’ columns to the right position:


INSERT INTO LEGAL_REGULATION_BOM

    SELECT 

            A.PLANT

            ,A.MATNR

            ,A.DESCRIPTION

            ,A.EXT_MAT_GRP

            ,A.MATNR_COST

            ,A.MATNR_COST_CURR

            ,A.PARENT

            ,A.COMPONENT

            ,A.BOM_LEVEL

            ,A.QTY

            ,A.COMPONENT_COST 

            ,A.COMPONENT_COST_CURR

            ,A.UNIT_OF_MEASURE

            ,A.COMP_DESCRIPTION

            ,A.ORIGIN

            ,A.DUTYNO

            ,A.BOM_PLANT_IS_PTD_0101_ACQPLANT

            ,CASE

                WHEN A.PLANT IN ('0101','0104') THEN B.E1BE_ALNUM

                WHEN A.PLANT='6100' THEN B.E1HU_ALNUM

                WHEN A.PLANT='6130' THEN B.E1IT_ALNUM

                WHEN A.PLANT IN ('1100','1110') THEN B.E1SE_ALNUM

                WHEN A.PLANT IN ('3110','3115') THEN B.E1FR_ALNUM

                WHEN A.PLANT='3125' THEN B.E1US_ALNUM

                ELSE 'MISSING_LEGI'

            END AS 'E1_LEG'

            ,CASE

                WHEN A.PLANT IN ('0101','0104') THEN B.EUBE_ALNUM 

                WHEN A.PLANT='6100' THEN B.EUHU_ALNUM

                WHEN A.PLANT='6130' THEN B.EUIT_ALNUM

                WHEN A.PLANT IN ('1100','1110') THEN B.EUSE_ALNUM

                WHEN A.PLANT IN ('3110','3115') THEN B.EUFR_ALNUM

                WHEN A.PLANT='3125' THEN B.EUUS_ALNUM

                ELSE 'MISSING_LEGI'

            END AS 'EU_LEG'

            ,CASE

                WHEN A.PLANT IN ('0101','0104') THEN B.USBE_ALNUM

                WHEN A.PLANT='6100' THEN B.USHU_ALNUM

                WHEN A.PLANT='6130' THEN B.USIT_ALNUM

                WHEN A.PLANT IN ('1100','1110') THEN B.USSE_ALNUM

                WHEN A.PLANT IN ('3110','3115') THEN B.USFR_ALNUM

                WHEN A.PLANT='3125' THEN B.USUS_ALNUM

                ELSE 'MISSING_LEGI'

            END AS 'US_LEG'

            FROM LEGAL_REGULATION_TEMP_BOM A

            LEFT JOIN MAEX_PIVOT B ON B.MATNR = A.COMPONENT  

    ;    

220720 081115.168632 19727 Note SQLSTATE=00000, ERRORNO=0    

220720 081115.168706 19727 Note SQLINFO Records: 2583726

This reduced the time with 1012 seconds (from 1099 seconds to 87)