In Part 1 I described how I wanted to add a tiny little feature to the Data Warehouse’s SAP connector. And as far as part 1 goes it looks like a walk in the park, but then reality hit me.  
I wanted to implement support for prereqs in  <sql> statements  like this:
| 
<sql> 
   <prereq>@TABLE_ROWS</prereq> 
   <autoload duplicate='replace' database='@C_DB' truncate='yes'/> 
</sql> | 
.
But after spending some four five hours looking at the code, I realised it would take me a weekend implementing this, so I decided to implement prereq on autoload level, which was much easier like this: 
| 
<sql> 
   <autoload duplicate='replace' database='@C_DB' truncate='yes/> 
       <prereq>@TABLE_ROWS</prereq> 
   </autoload> 
</sql> | 
Then I needed to carry the new @TABLE_ROWS symbolic variable over to subsequent jobs, so I placed it into the job symbolic table. But it turned out that the symbolic table was already ‘fixed’ at this late phase of the execution of the job. It turned out to be a somewhat tedious job to analyse and code the ability to carry over the @TABLE_ROWS, since it appeared in two different symbolic tables (I have still not figured out why, it looks awkward) written to disk. Finally after some ten hours of looking at the code I decided to do a somewhat ugly hack in the SAP connector:   
| 
if (array_key_exists('prereq',$job['sql'][0]['autoload'][0])){ 
  if (file_exists('TABLE_ROWS.CSV')) { 
     file_put_contents('TABLE_ROWS.CSV',trim(trim(file_get_contents('TABLE_ROWS.CSV')),"$Zsfd")); 
     $TABLE_ROWS = trim(trim(file_get_contents('TABLE_ROWS.CSV')),"$Zsfd"); 
     file_put_contents('TABLE_ROWS.CSV',$TABLE_ROWS); 
  } else { 
    $TABLE_ROWS = 0; 
  } 
  $Zjdir = $job['_directory']; 
  $job['_xlate']['TABLE_ROWS'] = $TABLE_ROWS; 
  file_put_contents("$Zjdir".'/JOB',serialize($job)); 
  $log->enter('Note',"Table rows inserted into $Zjn _xlate; Saving job in $Zjdir"); 
  $Zsdir = $schedule['_directory']; 
  $Zjn = $job['name']; 
  $Zji = findJobIndex($Zjn,$schedule['job']); 
  $schedule['job'][$Zji]['_xlate']['TABLE_ROWS'] = $TABLE_ROWS; 
  file_put_contents("$Zsdir".'/SCHEDULE',serialize($schedule)); 
  $log->enter('Note',"Table rows inserted into $Zjn($Zji) _xlate; Saving schedule in $Zsdir"); 
} | 
Yes, it took me some ten hours to get those lines of code right. Now I can do this:
| 
<job name=’extractFromSAP’> <sap> ... </sap> 
   <sql> 
      <autoload duplicate='replace' database='@C_DB' truncate='yes/> 
          <prereq>@TABLE_ROWS</prereq> 
      </autoload> 
   </sql> 
</job> 
<job name=’postProcessSAPData’> 
   <prereq>@extractFromSAP.TABLE_ROWS</prereq 
   … 
</job> | 
TABLE_ROWS is the number of data lines extracted from SAP
All in all adding just a tiny little feature took some twenty hours. It is long time since I looked at the code, most of the time I spent on understanding the code and the consequences of the code change. As you have seen the actual code added and changed is small. If I had a better understanding of the code it would have taken one to three hours complete the change. Now it took about 20 hours, it could have taken 200 hours, you never know what it will cost or what the end result will be when you add a tiny little ‘thing’ to a complex system. Was this change worth the efford? Nah can’t say that, but it was a bit fun.
Still a question remains, what is the result of the extractFromSAP job if the prereq prevent autoloading? Was it a success or a failure? Or was it bypassed?
You can argue for all three alternatives, now it is counted as a success, but I think 'bypassed' better reflects the result. I have to think about that.
Excerpt from the job log:
| 
160522 090816.992827 20590 Note Enter execJob, name=extractFromSAP 
160522 090816.997957 20590 Note Enter evaluateTagArray, Stage=execute     
160522 090817.021057 20590 Note Enter include, script=sap2.php     
160522 090817.039549 20590 Note Enter connect2sap, Connecting to SAP=acta_prod with codepage=1100     
160522 090821.602623 20590 Note Enter writeSqlCreateLoad, Table=DATA, id=6 
160522 090821.604143 20590 Note Enter writeSqlCreateLoad, Table=FIELDS, id=7 
160522 090821.605823 20590 Note Enter writeSqlCreateLoad, Table=OPTIONS, id=8 
160522 090821.607159 20590 Note Enter writeSqlCreateLoad, Table=TABLE_ROWS, id=0 
160522 090821.608432 20590 Note Enter writeRRT_stmts, table=VBAP 
160522 090821.609183 20590 Note SAP rfc_read_table row length=101 
160522 090821.610056 20590 Note Enter fixRfcReadTableRowsDelim, from=DATA.CSV, to=VBAP.CSV 
160522 090821.611289 20590 Note Enter include, Table rows inserted into  _xlate; Saving job 
160522 090821.612002 20590 Note Enter include, Table rows inserted into extractFromSAP(0) _xlate; Saving schedule 
160522 090821.625621 20590 Note Enter connect2mysql, DB=OPS,database=OPS,user=xxx  
160522 090821.626233 20590 Note Current character set in Mysql: utf8 
160522 090821.626996 20590 Note Prereq Autoload  0 statement 0 result=False 
160522 090821.627232 20590 Warning Prereq(s) for autoload noname not satisfied, processing intercepted. 
160522 090821.627557 20590 Note Exit script=/home/tooljn/dw/pgm/opsproc/sap2.php result=1 autoload bypassed (prereq) 
160522 090821.628088 20590 Note script /home/tooljn/dw/pgm/opsproc/sap2.php ending with rc=1 
160522 090821.646820 20575 Note JOB extractFromSAP result=Success,(True,True,True) start=090816, duration=5 
160522 090821.650473 20575 Note Enter connect2mysql, DB=OPS,database=OPS,user=KalleAnka - dwdb via TCP/IP 
160522 090821.651115 20575 Note Current character set in Mysql: utf8 
160522 090821.656999 20575 Warning Prereq(s) for job PostProcessSAPData not satisfied, processing intercepted.  
160522 090821.657559 20575 Note SCHEDULE hua_bulkrep result=Success,(True,True,True) start=090816, duration=5   
160522 090821.658128 20575 Log  executed successfully (True,True,True), of 2 jobs, executed OK=1 Failed=0, Bypassed=1 | 
 
 
No comments:
Post a Comment