Some years ago I was asked to extract project information from SAP for reporting/BI purposes. I decided to base the extraction solely on BAPIs. I wanted to test the BAPIs and thereby avoid writing ABAP code or tracing which SAP tables contain project information. It sounded like a good strategy: no SAP development, just clean use of SAP's premade extraction routines. It turned out that I had to deploy quite a few BAPIs for a complete extraction. I started with the project list BAPI:
BAPI_PROJECTDEF_GETLIST, to get all projects (if you are not familiar with BAPI extraction, read this first). Then I just had to run all the other BAPIs one by one for each project:
In the beginning it was fine to run these BAPIs in sequence: there were very few projects, and only one company (VBUKR) used projects. Last time I looked, the routine took about 30 minutes to run. That is a long time, but what the heck, 30 minutes during the night is no big deal. Last week I had a call from the present maintainers of the Data Warehouse: “Your project schedule takes hours and hours each night. The code is a bit ‘odd’. Can you explain how it works, so we can do something about it?” To understand the ‘archaic’ code in the schedule, the first thing I had to do was clean it up, replacing obsolete idioms with more modern code constructs that others could understand. Then I split the original schedule into smaller, more logical schedules. The first one, consisting of:
took more than two hours to run. A look at the project data showed 16000+ projects, belonging to more companies than I had created the extraction for. We replaced BAPI_PROJECTDEF_GETLIST with direct extraction from the SAP PROJ table, selecting only the company of interest (about 8000 projects), and ran the BAPIs in parallel. This brought the execution time down to about 1 hour 20 minutes. Analysing the job statistics showed that the first three BAPIs took only a little more than 500 seconds each, BAPI_PROJECT_GETINFO 5000 seconds and finally BAPI_BUS2054_GETDATA about 1000 seconds. Distributing BAPI_PROJECT_GETINFO over 9 workers and BAPI_BUS2054_GETDATA over 2 workers should make every BAPI execute in 500 to 600 seconds, assuming the work splits evenly (5000 / 9 ≈ 556 and 1000 / 2 = 500). This is a balanced scheme and the execution time is acceptable: from over 2 hours down to about 10 minutes. In the next post I will show the new, improved execution schedule.
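The worker counts above can be reproduced with a little arithmetic. The following is only a minimal sketch in Python, not the actual schedule code: the 600-second target, the function names and the dummy project keys are my assumptions for illustration, and it presumes that a BAPI's run time divides evenly across its workers, which real jobs will only approximate.

```python
import math


def workers_needed(total_seconds, target_seconds):
    """Smallest worker count that brings one BAPI under the target time."""
    return max(1, math.ceil(total_seconds / target_seconds))


def plan(durations, target_seconds):
    """Map each BAPI to (worker count, estimated seconds per worker)."""
    result = {}
    for name, seconds in durations.items():
        n = workers_needed(seconds, target_seconds)
        result[name] = (n, seconds / n)
    return result


def split_evenly(items, n):
    """Deal the items round-robin into n slices of nearly equal size."""
    return [items[i::n] for i in range(n)]


# Measured single-worker run times from the job statistics (seconds).
durations = {
    "BAPI_PROJECT_GETINFO": 5000,
    "BAPI_BUS2054_GETDATA": 1000,
}

# Assumption: aim for the upper end of the 500 to 600 second window.
TARGET_SECONDS = 600

allocation = plan(durations, TARGET_SECONDS)
for name, (n, per_worker) in allocation.items():
    print(f"{name}: {n} workers, about {per_worker:.0f} s each")

# Hypothetical project keys standing in for the roughly 8000 PROJ rows.
projects = [f"P{i:05d}" for i in range(8000)]
slices = split_evenly(projects, allocation["BAPI_PROJECT_GETINFO"][0])
print([len(s) for s in slices])
```

With a 600-second target this yields 9 workers for BAPI_PROJECT_GETINFO and 2 for BAPI_BUS2054_GETDATA, matching the figures above. Dealing the projects round-robin keeps the slices within one project of each other in size, although it does nothing about individual projects that are unusually slow to extract.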