The other day I created an interesting job in the Data Warehouse. It’s not a Data Warehouse job as such, I read logs from another application PricePoint and mails the result to concerned. PricePoint is a Windows application and store it’s data in an MS SQL server database. Many here at the company thinks the Data Warehouse is an odd application Linux only, not able to communicate with ‘proper’ software at all (proper is MS). This is of course false, the DW is compatible with many things. The DW have no problem accessing MS SQL server databases or MS office document if required.
There was a little twist with the job I was asked to do. There is a table PendingMessages containing transactions not yet processed. ‘Can you send a mail with failed transactions attached when all transactions are processed, but not later than eight thirty in the morning?’
I could just schedule a job 08:30, but I decided to be a bit creative, so I created a job that checks when all transactions are processed and then sends the mail but no later than 08:30. First I created prereq that checked for this:
This is what this prereq does; It call the PricePoint MSSQL database via PDO and selects number of pending transactions. Since we want the prereq to be false as long as there are pending transactions and true when there is no transactions left we simply negate the result, (we could have created a correct result by writing a more complex SQL query, but just negate the prereq is simpler). We check the prereq every 300 seconds and we do this 20 times or until 08:30 whichever happens first. But there is a snag to this, when time is up we should proceed no matter what the prereq says but this prereq fails when time is up and that is what you want most of the time, wait for an event and if that event does not happen fail. There are some ways to circumvent this problem. You can put the prereq into an <init> section and then explicitly set the init return code to TRUE with a rectify statement like this:
But this not only looks ugly, the logic behind is not obvious and it is a misuse of both the <init> section and the rectify statement. I do not like to trick my way around programming difficulties so I did not go for this solution. Instead I inserted a <prereqwait> which handles the timing functionalities of the prereq. When I implemented prereqwait I thought it might make sense to allow for a happy ending so I introduced timesup=’ok’ like this:
This is definitely an improvement, actually it’s quite neat, but having second thoughts about the entire job I decided to skip the prereq altogether and go for the simpler solution, just schedule the job at 08:30 like this:
Much better, now no one has to figure out what the odd looking prereq is good for, but can appreciate the simple MS SQL connectivity via PDO. Take a look at the schedule, there is some nice functionality in there. And I also showed some nice timing functionality by <prereqwait> and <prereq>. I think my jobscheduler with the Integration Tag Language can give any job scheduler on the market a run for their money. (I would love to see someone prove me wrong:)
At last this is what the laconic result from a run looks like in MS Outlook:
Here you see not all transactions were processed at 08:30 and the attachment is a CSV file, it should have been an xmlx file but I missed that, however its just a matter of changing the sql converter in the getMsgs job above.