2015-12-31

Happy New Year

This was supposed to be a rather lengthy post, but Zscaler was tired today, making me wait for gateway.zscaler.net far too long, and now time is running out, but...

Another year has passed. At work I didn't accomplish what I had hoped for at all. But I have done some other things, like helping out with evaluating software and some integration problems. I did not write one line of JavaScript code, which I really had hoped to do; this is only partly due to lack of time. I didn't have any good projects for JavaScript, and I do not code just for the sake of coding. I didn't buy a Raspberry Pi; same thing there, I didn't have a good project for a Raspberry Pi. I didn't try Perl6. There are a lot of things I didn't do. But I tested the Amazon infrastructure and learned a few things about the Azure cloud. I managed to create a travel expense report by myself; I had never done that before, so I'm happy about that.
Two colleagues have left this year; that is never fun. A data warehouse specialist left, and that was not only no fun, it was almost painful. I'm still emotionally attached to my old data warehouse and those who work there.
Actually, Zscaler just went bananas; it blocked my internet access with this prompt.
I would never type a password into a frame like this. "Need help? Contact your IT team", it says; this is the 31st of December at 16:45, and I do not think anyone will answer.
Year's end is the most important IT day of the year, and you should make an extra effort to avoid anything that can interrupt or disrupt IT services on this and the following day.

2015-12-06

Extracting SAP projects with BAPI - 3

I am restructuring some long-running Data Warehouse extraction workflows. In total these workflows take some 4 hours today, which is ridiculous. One part of restructuring the workflows is to use modern Integration Tag Language idioms so newcomers can understand them; the present archaic syntax is a bit tricky. I have rewritten the second part in much the same way as the first.
So far I have cut execution time down from 4 hours to 30 minutes. This is achieved by parallelizing the jobs running the BAPIs. I have rewritten the last parts of the workflow in much the same way I rewrote the first part.
The result is good, but not good enough. The runtime is still dependent on the number of objects defined in SAP; in a few years, when the number of projects has doubled, so will the runtime. I strongly advocate full load over delta load, since full load is much simpler to set up and is self-healing if something goes wrong. But here is a case where full load is not performant enough: 30 minutes and growing by the day. I will rewrite these workflows from full load into a hybrid delta load, where I aim at a more stable run time below 10 minutes.
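
By hybrid delta load I mean something along these lines: a cheap delta run on most days, with a periodic full load as a safety net so the self-healing property is kept. A rough PHP sketch of the idea; all the helper functions here are hypothetical placeholders for the real workflow:

    <?php
    // Hypothetical hybrid load: full load on Sundays or after a failed run,
    // otherwise only the objects changed since the last successful run.
    $lastRun  = get_last_successful_run();               // hypothetical helper
    $fullLoad = (date('w') == 0) || $lastRun === null;
    if ($fullLoad) {
        $networks = all_network_ids();                   // every object in SAP
    } else {
        $networks = network_ids_changed_since($lastRun); // the delta
    }
    process_networks($networks);                         // the existing BAPI workflow
    set_last_successful_run(date('Y-m-d H:i:s'));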

One job in the last rewrite is of interest. SAP information BAPIs are structured with one BAPI giving a list of keys to all objects, and then you have an army of BAPIs giving detailed information about individual objects. BAPI_NETWORK_GETINFO is a bit different: it takes an array of network identities and responds with detailed info for all objects in one go. Here the € list operator comes to the rescue: it takes a PHP array and reformats it into an RFC import array.
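
In plain PHP, outside ITL, the reformatting the € operator does boils down to appending one row per network id to the BAPI's import table. A minimal sketch using the saprfc PHP extension; the table name NETWORK_LIST and its NETWORK field are my assumptions about the BAPI's interface, so verify them in SE37 before trusting this:

    <?php
    // What the € list operator does for you: turn a PHP array of ids
    // into the RFC import table of BAPI_NETWORK_GETINFO.
    $networks = array('900001', '900002', '900003');   // sample ids
    $rfc = saprfc_open($sapLogin);                     // $sapLogin: your connection parameters
    $fce = saprfc_function_discover($rfc, 'BAPI_NETWORK_GETINFO');
    saprfc_table_init($fce, 'NETWORK_LIST');           // assumed import table name
    foreach ($networks as $nw) {
        saprfc_table_append($fce, 'NETWORK_LIST', array('NETWORK' => $nw));
    }
    saprfc_call_and_receive($fce);                     // one call, all networks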

The BAPI_NETWORK_GETINFO is run once for all networks, in sequence (a plain PHP sketch of the pattern follows the list).

  1. The <forevery> job iterator creates the iterator from the NWDRVR MySQL table.
  2. Then runs the BAPI_NETWORK_GETINFO BAPI for each row in the SQL result table, one by one. (Addressed by @J_DIR/driver1)
  3. Lastly stores all results in the corresponding MySQL tables.
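
Stripped of ITL, the serial job is just this pattern: one SELECT to build the driver, then one BAPI call per driver row. A rough PHP equivalent, where the connection credentials and the two helper functions are hypothetical stand-ins for the real BAPI call and table writes:

    <?php
    // Serial version: iterate the NWDRVR driver table row by row.
    $db  = new mysqli('localhost', 'dwuser', 'secret', 'dw');      // assumed credentials
    $res = $db->query('SELECT NETWORK FROM NWDRVR');
    while ($row = $res->fetch_assoc()) {
        // One BAPI call per network, like @J_DIR/driver1 above.
        $info = call_bapi_network_getinfo(array($row['NETWORK'])); // hypothetical helper
        store_network_info($db, $info);                            // hypothetical helper
    }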

If the list of network objects is large enough you have to split the array into chunks and execute them separately, to overcome SAP session limits and performance problems. We have some 9000 projects, and that is too many in our environment to execute in one go.
A small rewrite of the job splits the SQL result into 8 chunks, distributes them over separate workers and executes them in parallel:

Here BAPI_NETWORK_GETINFO is run in 8 parallel workers (again sketched in PHP after the list).

  1. The <forevery> iterator splits the SQL result into 8 chunks; each chunk is executed by a separate worker in parallel.
  2. Each worker then runs the BAPI_NETWORK_GETINFO BAPI for each row in its SQL result table, one by one. (Addressed by @R_DIR/driver1)
  3. Lastly each worker stores all results in the corresponding MySQL tables.
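
The rewritten <forevery> corresponds roughly to chunking the driver and forking one worker per chunk. A sketch with PHP's pcntl extension, reusing the hypothetical helpers from the serial sketch; note that every worker must open its own SAP and MySQL connections:

    <?php
    // Parallel version: split the driver into 8 chunks, one worker per chunk.
    $db   = new mysqli('localhost', 'dwuser', 'secret', 'dw');     // assumed credentials
    $rows = array();
    $res  = $db->query('SELECT NETWORK FROM NWDRVR');
    while ($r = $res->fetch_assoc()) {
        $rows[] = $r;
    }
    $chunks = array_chunk($rows, max(1, (int) ceil(count($rows) / 8)));
    $pids   = array();
    foreach ($chunks as $chunk) {
        $pid = pcntl_fork();
        if ($pid === 0) {                                // child: process one chunk
            $ids  = array_column($chunk, 'NETWORK');
            $info = call_bapi_network_getinfo($ids);     // hypothetical, one call per chunk
            // Each worker opens its own connection; never share the parent's.
            store_network_info(new mysqli('localhost', 'dwuser', 'secret', 'dw'), $info);
            exit(0);
        }
        $pids[] = $pid;                                  // parent: remember the worker
    }
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $status);                    // wait for all workers to finish
    }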

With this slight modification the run time for the job is cut by a factor of 8. This really is parallel programming made easy. Compare this with the visual workflow programming so popular today; I think you will agree this is easier to set up.