2012-04-30

Job scheduling with PHP - 3

In the previous post, Job scheduling with PHP - 2, I introduced the three main XML scripts my PHP job scheduler uses for defining scheduled events. In this post we take a more detailed look at the context and the schedule scripts.
In the next post, Job Scheduling with PHP - 4, I will describe the job XML script and the job iterator, which is the most interesting feature of my job scheduling system.

The context

There is not much to tell about the context. It is the master script of the config library and, as the name suggests, it sets the context for the job scheduling.
The <prereq> tag contains boolean statements that must evaluate to TRUE for execution to commence. Most of the other tags are self-explanatory: they declare the files and databases the scheduler uses.
An example of a context script

The schedule

The schedule XML script defines a scheduled instance, a unit of work you submit for execution. The schedule defines a chain of jobs, runtime parameters, prerequisites, alert lists, logging, how to interpret the success of job executions etc. A lot goes into a schedule. Here I will discuss the basics and a little more. In the previous post I ended by running an empty schedule, which is my scheduler's version of ‘Hello, World’. Here is a more verbose schedule.
The control_day schedule contains a lot of things; I will explain the most important, starting from the top:
mustcomplete='yes'    - all actions must be successful, otherwise execution is aborted
notify='admfail.xml'  - after execution those in the list are notified, depending on the result
<variant>             - defines runtime parameters, e.g. sap=acta_prod defines a SAP subsystem
<prereq>              - boolean conditions that must be true for execution to commence
<job>                 - job declarations
<exit>                - operating system commands executed after the job has executed
wait='no'             - spin off the command and proceed with the next task
parallel='yes'        - a hint to the scheduler to execute in parallel (fork) if possible
<init>                - operating system commands executed before the job has executed
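The control_day listing itself is not reproduced here, so what follows is only a minimal sketch of how attributes like these could be read with SimpleXML. The schedule name demo_day, the job names, the element each attribute sits on, the yes/no defaults and the helper flag() are all assumptions made for illustration, not the scheduler's actual syntax.

```php
<?php
// Hypothetical schedule, NOT the real control_day. Where each attribute
// lives (schedule, job or exit element) is an assumption for illustration.
$xml = <<<'XML'
<?xml version='1.0' encoding='UTF-8' standalone='yes'?>
<schedule name='demo_day' mustcomplete='yes' notify='admfail.xml'>
  <variant sap='acta_prod'/>
  <job name='load_orders' parallel='yes'/>
  <job name='load_stock'/>
  <exit wait='no'>echo done</exit>
</schedule>
XML;

// Read a yes/no attribute; a missing or unrecognized value yields the default.
function flag(SimpleXMLElement $node, string $name, bool $default): bool
{
    $value = strtolower(trim((string) $node[$name]));
    if ($value === 'yes') {
        return true;
    }
    if ($value === 'no') {
        return false;
    }
    return $default;
}

$schedule     = simplexml_load_string($xml);
$mustComplete = flag($schedule, 'mustcomplete', true);
$notify       = (string) $schedule['notify'];
$sap          = (string) $schedule->variant['sap'];
$waitForExit  = flag($schedule->{'exit'}, 'wait', true);

// Collect the names of the jobs that carry the parallel hint.
$parallelJobs = [];
foreach ($schedule->job as $job) {
    if (flag($job, 'parallel', false)) {
        $parallelJobs[] = (string) $job['name'];
    }
}

printf("mustcomplete=%s notify=%s sap=%s\n", $mustComplete ? 'yes' : 'no', $notify, $sap);
printf("exit waits: %s, parallel jobs: %s\n", $waitForExit ? 'yes' : 'no', implode(', ', $parallelJobs));
```

Reading every attribute through one small helper keeps the defaults in a single place, so a misspelled or missing attribute simply falls back to the default.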
The control_day schedule is a production schedule and it is kicked off from cron via a shell script:
Here you see how the control_day schedule is kicked off from a bash shell script.
As you see, my job controller starts by invoking a PHP script, scriptS.php (this is not a good name, but it hangs on from my very first PHP script, which I named scriptS, where S stands for Start). I end this post by showing scriptS.php; I left out the initial documentation.
(In Job Scheduling with PHP - 4  I will describe the job XML script and the job iterator.)  

scriptS.php

2012-04-22

Job Scheduling with PHP - 2

Background 2

In my previous post, Job Scheduling with PHP - 1, I described the basics of job scheduling in general. Here I write about the background and the underlying design principles of my job scheduling system written in PHP.

I created my job scheduling system for a Business Intelligence (BI) system. An important process in any BI system is Extract, Transform & Load (ETL). The ETL process is background job scheduling, and good scheduling is essential for BI systems. I didn't have any funds for my BI project, so I had to use existing (scrapped) hardware and Free Software. As I mentioned in my previous post, I was (and still am) not very impressed by the job scheduling systems on the market. I am a programmer by trade and I had used scripting languages for controlling processes before. I looked around for software to use, and Linux and MySQL were easy ones to pick. A programming language was harder to find. First I looked for ReXX, my scripting language of preference, but I couldn't find ReXX for Linux; the languages I found were Perl and PHP. Perl was the better language; my impression of PHP was that it was a tool for simple web apps. But I couldn't resist the challenge of using PHP for advanced background processing. I wrote the first ETL controller as two crude, simple PHP scripts: scriptS.php (S as in start) and scriptF.php (F as in function), where I stored the functions used in scriptS.php. All ETL processes were hard coded and everything was very primitive, but it worked pretty well. But as the BI system grew, it became clear that my two scripts were a dead end. My hard coded scripts were not scalable, so I had to go back to the drawing board and begin from scratch.

Redo and do it right.

By now (2005-2006) object orientation had arrived in PHP. I'm not fond of OO programming, but a logging subsystem is perfect to objectify, so I learned PHP OO by creating a logger class before I started designing version 2 of my job scheduler. Next I disconnected all configuration and job definitions from the PHP scripts. I wanted a strict but extensible syntax that was easy to parse. This was an easy pick: XML was the obvious choice, and PHP had a very simple and capable enough XML parser, SimpleXML. Now I had to define the environment, the scheduling and the jobs.

The context

 

The environment defines execution elements like databases, programs, directories etc. Everything is defined in XML scripts; the main context script points to other XML scripts that together define the total execution. Here you see a <sap> tag pointing to an XML script defining a SAP system. The <prereq> tags are Boolean statements that must be true.

The Schedule

 

The schedule XML script defines a chain of jobs that is scheduled for execution with or without dependencies. The variant defines startup parameters, and the first job points to an XML script, 'exp8_generate_iterators'.

The Job

                 ...

This job XML script defines the execution of a series of SQL statements.

The Execution

Having laid a sound foundation with three well-defined entities (context, schedule & job) and a logger, I also needed an execution plan for my scheduling system. I decided to divide execution into three phases:

  1. Read, parse and syntax check all XML scripts, and check prerequisites, i.e. access to input files and subsystems like MySQL, predecessor conditions etc.
  2. Create the execution environment: a directory structure where everything from the execution of a schedule is stored, e.g. log files.
  3. The actual execution of a schedule.

Phase 1 creates an execution tree, which basically is the parsed XML files in a PHP array structure. This tree is passed to phase 2, where more 'things' are added to the execution tree as the execution environment is created. This environment is then passed to phase 3, which executes the schedule job by job and records the outcome of each job in the execution tree.
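The three phases could be sketched roughly like this. It is only an illustration under stated assumptions: the function names phase1_parse(), phase2_environment() and phase3_execute(), the layout of the tree array and the demo schedule are all made up here, prerequisite checking is left out, and job execution is replaced by a stand-in callback.

```php
<?php
// Phase 1: parse and syntax check the XML, return the execution tree or FALSE.
// (Prerequisite checks are left out of this sketch.)
function phase1_parse(string $xml)
{
    libxml_use_internal_errors(true);   // collect XML errors instead of printing warnings
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return false;
    }
    $tree = ['schedule' => (string) $doc['name'], 'jobs' => []];
    foreach ($doc->job as $job) {
        $tree['jobs'][] = ['name' => (string) $job['name'], 'result' => null];
    }
    return $tree;
}

// Phase 2: create the execution directory and add its paths to the tree.
function phase2_environment(array $tree, string $baseDir)
{
    $dir = $baseDir . '/' . $tree['schedule'] . '_' . uniqid();
    if (!mkdir($dir, 0700, true)) {
        return false;
    }
    $tree['dir'] = $dir;
    $tree['log'] = $dir . '/schedule.log';
    return $tree;
}

// Phase 3: run job by job and record each outcome in the tree and the log.
function phase3_execute(array $tree, callable $runJob): array
{
    foreach ($tree['jobs'] as $i => $job) {
        $ok = (bool) $runJob($job['name']);
        $tree['jobs'][$i]['result'] = $ok;
        file_put_contents($tree['log'], $job['name'] . ': ' . ($ok ? 'ok' : 'failed') . "\n", FILE_APPEND);
    }
    return $tree;
}

$xml  = "<schedule name='demo'><job name='extract'/><job name='load'/></schedule>";
$tree = phase1_parse($xml);
$tree = $tree === false ? false : phase2_environment($tree, sys_get_temp_dir());
$tree = $tree === false ? false : phase3_execute($tree, function (string $name): bool {
    return true;                        // stand-in for real job execution
});
```

Each phase takes the tree from the previous one and hands back an enriched copy, so the outcome of the whole run ends up in one array structure.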

This is a schematic view of a schedule execution (the picture is old and some entities have been renamed).

Design patterns

I often use my own Garbage In - Garbage Out design pattern, which means that everything not recognized is treated as noise and defaults are non-destructive. This design pattern is both code friendly (the code does not have to consider unknown parameters etc.) and user friendly (you do not need to know all the details; try the software, you will not destroy anything by misspelling a parameter or leaving something out).

But you have to give defaults some thought; they should be non-destructive and sensible. This is actually quite hard. I’m sure you have seen idiotic defaults many times.
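As a minimal sketch of this idea, assuming yes/no options: the option names and their defaults below are chosen for illustration, and effective_options() is a hypothetical helper, not a function from the scheduler.

```php
<?php
// Hypothetical option names and defaults, chosen to be non-destructive.
const DEFAULTS = ['mustcomplete' => 'yes', 'parallel' => 'no', 'wait' => 'yes'];

function effective_options(array $given): array
{
    // Unknown keys are noise: keep only the keys we know about.
    $known = array_intersect_key($given, DEFAULTS);
    // Unrecognized values are noise too: keep only 'yes' and 'no'.
    $valid = array_filter($known, function ($value) {
        return in_array($value, ['yes', 'no'], true);
    });
    // Whatever survives overrides the defaults.
    return array_replace(DEFAULTS, $valid);
}

// 'paralel' is misspelled and 'maybe' is not a valid value; both are ignored.
$options = effective_options(['paralel' => 'yes', 'wait' => 'maybe', 'mustcomplete' => 'no']);
print_r($options);
```

The misspelled key and the odd value do no harm; the caller just gets the defaults for those options.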

Another design pattern I often use is something I invented years ago when I built large systems in assembler language. I posit that everything goes wrong: I use a boolean FALSE return code and return as soon as I find something wrong. This often gives submodules with many FALSE returns and only one TRUE or non-FALSE return at the end. If you are careful and design your submodules or functions with minimal side effects, you can often avoid cleanup code. The caller either deals with the FALSE return code or exits itself with the FALSE return code. This design pattern gives flat, efficient and robust programs.
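In PHP the pattern could look something like the sketch below. The functions load_job() and load_all() and the requirement that the root element is named job are assumptions for the example only.

```php
<?php
// Hypothetical submodule: load a job definition, returning FALSE at the first problem.
function load_job(string $file)
{
    if (!is_readable($file)) {
        return false;
    }
    $text = file_get_contents($file);
    if ($text === false) {
        return false;
    }
    libxml_use_internal_errors(true);   // no warnings on malformed XML
    $xml = simplexml_load_string($text);
    if ($xml === false) {
        return false;
    }
    if ($xml->getName() !== 'job') {
        return false;
    }
    return $xml;                        // the single non-FALSE return at the end
}

// The caller either handles FALSE or passes it on to its own caller.
function load_all(array $files)
{
    $jobs = [];
    foreach ($files as $file) {
        $job = load_job($file);
        if ($job === false) {
            return false;
        }
        $jobs[] = $job;
    }
    return $jobs;
}
```

Note the strict comparison with false: an empty SimpleXML element is itself falsy in PHP, so a loose test like !$xml would reject a perfectly valid empty job.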

Return Codes

I already stated that I prefer a simple boolean return code structure, TRUE or FALSE. Either an action is a success or not, black or white if you wish, no gray zones. You have probably seen other return code schemes. Because of the branch on count assembler instruction, very many return code schemes are based on zero=success, 4=remark, 8=warning, 12=serious warning etc. This code scheme is not only confusing and error prone, it is also out of sync with modern computer languages where zero=Boolean FALSE and everything else is Boolean TRUE. Multi-value return code schemes may also force you to write code like ‘if not ok then ok else not ok’ or even more horrid constructs.

For a job scheduling system return codes are very important: jobs depend on predecessor jobs, errors must be fixed in successor jobs, and you must be able to set up guards that kick in when things go wrong. A simple return code structure is a boon not only in the code of the job scheduling system itself, but also to the job scheduling. In my system a schedule can only execute successfully or fail, and the same goes for the jobs, or almost.

In job scheduling you have to deal with the situation where a job is bumped over because its preconditions are not met. Is that a failure or a success? IBM’s Job Control Language treats that as a success. This may seem absurd, but the opposite may be equally absurd; it depends entirely on the angle from which you approach the problem of a bumped over job. Even considering only non-executed jobs is a simplification; there are more things to consider when deciding the outcome of a job.

My defaults (all jobs must execute successfully, bumped over jobs are considered a success, and the schedule execution is stopped when a failure is detected) cover more than 95% of all job planning as long as you run one job after another, single threaded. When you run jobs in parallel, things get more complicated.
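A rough sketch of these defaults for a single threaded chain follows. The job structure (a precondition and an action, both closures returning TRUE or FALSE), the job names and the function run_chain() are assumptions for illustration.

```php
<?php
// Run a chain of hypothetical jobs one after another.
// Returns [overall success, per-job outcome].
function run_chain(array $jobs): array
{
    $outcome = [];
    foreach ($jobs as $name => $job) {
        if (!$job['precondition']()) {
            $outcome[$name] = 'bumped';      // not executed, counted as a success
            continue;
        }
        if (!$job['action']()) {
            $outcome[$name] = 'failed';
            return [false, $outcome];        // stop at the first failure
        }
        $outcome[$name] = 'ok';
    }
    return [true, $outcome];
}

$yes = function (): bool { return true; };
$no  = function (): bool { return false; };

[$success, $outcome] = run_chain([
    'intake'   => ['precondition' => $yes, 'action' => $yes],
    'saturday' => ['precondition' => $no,  'action' => $yes],  // precondition not met
    'mailout'  => ['precondition' => $yes, 'action' => $no],   // fails
    'cleanup'  => ['precondition' => $yes, 'action' => $yes],  // never reached
]);
```

In this run the schedule fails because of mailout, the bumped over saturday job does not count against it, and cleanup is never started.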

Parallel processing

Remember I wrote that job scheduling is important for the BI ETL process? BI systems contain large amounts of data and import large amounts of data via ETL processes, so parallel processing to cut ETL execution time is essential for BI systems. My job scheduler deals with parallel processing in basically two ways: by processing jobs in parallel and by cutting jobs up into smaller pieces/chunks. This I have described in when fast in not enough; in parallel processing of workflows I described parallel execution in detail.

Now, parallel processing and multi-threading are crude and awkward in PHP, but it can be done.
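One way it can be done is by forking child processes with the pcntl extension. This is a minimal sketch under assumptions, not how my scheduler necessarily does it: run_parallel() and the task names are made up, pcntl is only available in the CLI on Unix-like systems, and the sketch falls back to running the tasks one by one when the extension is missing or a fork fails.

```php
<?php
// Run each task in its own child process and collect TRUE/FALSE per task.
function run_parallel(array $tasks): array
{
    if (!function_exists('pcntl_fork')) {
        // No pcntl extension: run sequentially instead.
        return array_map(function ($task) { return (bool) $task(); }, $tasks);
    }
    $results = [];
    $pids    = [];
    foreach ($tasks as $name => $task) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            $results[$name] = (bool) $task();   // fork failed, run in this process
        } elseif ($pid === 0) {
            exit($task() ? 0 : 1);              // child: report outcome as exit status
        } else {
            $pids[$name] = $pid;                // parent: remember the child
        }
    }
    foreach ($pids as $name => $pid) {
        pcntl_waitpid($pid, $status);           // wait for this child to finish
        // Process exit status 0 means success; translate it back to a boolean.
        $results[$name] = pcntl_wifexited($status) && pcntl_wexitstatus($status) === 0;
    }
    return $results;
}

$results = run_parallel([
    'chunk1' => function (): bool { usleep(100000); return true; },
    'chunk2' => function (): bool { return false; },
]);
```

The children only report success or failure through their exit status; anything more, like passing data back to the parent, needs files, pipes or a database.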

I will try to write some more posts about this and more. I end this post with the execution of an empty schedule. This is how the empty.xml schedule file looks:

<?xml version='1.0' encoding='UTF-8' standalone='yes'?>

<schedule/>

Remember what I wrote about defaults, non-destructive and sensible? Here we have quite some defaults to fill in. Here we go:

What could be more appropriate for a default job than to type to the TTY the contents of the little red box in the middle of the log display?

Note the first line: the job scheduler still starts with the scriptS.php module.

I hope I will be able to continue  to write about my PHP job scheduler.

Some examples can be found here.

2012-04-21

Job scheduling with PHP - 1


Background

I will try to write some posts about a job planning system I have written in PHP. In this first post I give a background to job planning and job scheduling.
I have been working at least partly with computer operations my entire life, or so it feels anyway. Job planning and supervision is an important part of computer operations. By job I mean a planned background task that does some work in a computer; most often a job is part of an application, like importing sales orders from another application or automatically sending requirement forecasts to suppliers.

Job scheduling is complex.

If you want to create a process that starts by importing Customer Sales Orders and ends with mailing out Purchase Orders for components to your suppliers, there are a hell of a lot of things to do from inbound Sales Order to outbound Purchase Order. There are many dependent tasks that must be carried out before the Purchase Order is produced. These tasks and dependencies must be defined in a Job Scheduling System. If we make this extremely simple, we create three jobs.

  1. First we create a job for Sales Order Intake.
  2. Then a job for Material Requirement Planning (calculate how many components are missing).
  3. And at last a job to mail out Purchase Orders for the missing components.
With only these three jobs a lot of questions arise. E.g. what shall we do if there is no Sales Order? Is this an error? Shall we notify someone? By mail? SMS? Twitter? Shall we execute the next step(s)? What do we do if there is a problem with a Sales Order? Shall we run this job on Saturdays? If the Sales Order application is delayed, shall we wait? If so, for how long?
If our database server is down, what should our three jobs do? If the mail system is down? Etc.
There are endless ways background jobs can go wrong, one way or another. In a Job Scheduling System you must be able not only to describe your processes but also alternative actions, notifications, error corrections and relations to other processes or scheduled events.

When I created a Business Intelligence system some years ago, I decided to build a job scheduling system of my own, based on my experience of computer operations. I have never worked with a Job Scheduling System I really liked, and I always thought I could do better. As a matter of fact I thought I could do a lot better. The Job Scheduling Systems I have worked with have been too limited, awkward and inflexible, with poor plugin capability and bad social skills, i.e. they do not communicate with other job scheduling systems; the list goes on and on. Lately I have seen graphical Job Scheduling Systems, and they are probably the worst. First, I do not like point-and-click programming; you miss the detailed knowledge of what you are doing. Second, the graphical interfaces cannot do everything necessary. Too often you end up with ‘this cannot be done’ or ‘for this task you must use the TTY interface’. I do not want to give explicit examples, but for those not involved in job planning, believe me, there are a lot of ‘limited’ job planning tools on the market.

In the post Job scheduling with PHP - 2 I will describe my Job Scheduling System. Here you find some examples.

Me and LinkedIn

Some days ago I joined LinkedIn.
I have had mails in the past asking me to join various people and friends on LinkedIn. I didn't know what LinkedIn was. "It's like Facebook but for old farts", a younger friend told me. Thank you very much, but yeah, then maybe it's for me. I'm not very much for computer social networking, but I'm undeniably an old fart. I was actually pushed in by a colleague, and today I edited my LinkedIn profile and added a photo. This is out of respect for LinkedIn and its other members: present yourself decently so others know who you are.

Without knowing how, I'm now connected to four other LinkedInners. I'm a bit thrilled by this new adventure in social networking. If you, reader, are on LinkedIn, please feel free to invite me to your 'circle' or whatever it is called. I doubt this will be read by many; so far I suspect the only ones reading my posts here are Google bots and bots from strange Russian sites.

P.S. I still do not know very much about Facebook, except that everyone except me is there. My sons are there, and if I join I might meet them there, and I'm not sure I want that. I think I prefer physical family meetings. But what do I know, I'm just an old fart just entering LinkedIn, my next step in social networking.

2012-04-08

Dress code for IT professionals.


The other day I was discussing how to dress as an IT professional. What you wear signals a lot about you. Now, dress codes depend a lot on culture, nationality and context. The clothes that make you blend in at a fashionable club in Berlin may make you stand out at a PHP user group meeting in Boston. Where I come from, the southern suburbs of Stockholm, we dressed in white T-shirts, jeans and worn out clogs. That was sort of a standard uniform of my ‘college’ days. And this is how I dressed when I started to work. It was practical and comfy clothing; I was absolutely uninterested in fashion and clothing and not interested in making an impression (I still am uninterested, by the way).
But as the years progressed I got more and more qualified work tasks, and gradually my IT skills grew. I began to appear at user group meetings, holding presentations about software development, MRP applications, job control, database administration and security, to name a few. Still most often wearing T-shirt and jeans, I started to realize that even though my skills were good, few really listened to what I said. At that time, the 1980s, most people were older than me and dressed in strict business suits. At a database software user meeting in Munich I appeared in neon-green pants, a leopard pattern T-shirt and sneakers. Although the questions I raised were both relevant and important (I had found some serious deficiencies in the software), no one took any notice. Coming home, I discussed this with my at the time very new and much loved girlfriend (now my not so much loved ex-wife). She had a better understanding of dress codes and told me that if I wanted to be taken seriously I had to dress seriously. She took me out and bought some business suits, pinstriped shirts and some ties. And that made a difference: now people not only listened to what I had to say, they asked for advice. Soon after, I was recruited by a Norwegian software company as a consultant. As a consultant I always dressed in decent business suits. I worked as a successful consultant for some 15 years in Holland, Sweden and Boston, US (for a short time). I have been on assignment in most west European countries, and part of my success is the changed dress code.

I was recruited in the mid-90s by a consultant company specializing in software for mobile telephone operators. The owner and CEO was my junior by 15 years; he hired some youngsters and me. He once told me ‘I hired the best computer technicians in Stockholm, dressed them up in smart business suits and deployed them as management consultants. You I hired for the age and looks! I needed someone that looked senior, trustworthy and intelligent’. He also said ‘when you visit a client, wear an ordinary business suit, not an Armani, and go in an ordinary car’. I think that was very wise of a newly rich guy not yet 30 years old. This was advice that some SAP consultants working for Ericsson in the beginning of the 2000s had not heard. At that time I had many friends working with IT for Ericsson. They were annoyed by and envied the SAP consultants, and they told me many times, ‘of course you have to charge that much when you dress in those fancy suits and drive those expensive German cars’.
Later, when we started our SAP migration project, we told our consultants (who did dress in normal business suits and drive ordinary cars) to remove their ties when they went to our factories. The consultant boss tore off his tie and said happily, ‘we will not wear ties in this project’. This was out of respect for the factory workers and to avoid creating artificial barriers between them and us. And that is very much what dress codes are about, IT professional or not.

SAP Business Warehouse and Easter Eggs from Evry

Last week, just before Easter, I went down to Hoeselt in Belgium on a business trip. We had our annual physical Application Steering Committee meeting in Hoeselt this year. We pick a town where one of the committee members lives, which happens to be Essen, Hoeselt, Nantes or Stockholm. Next year it might be Gölshausen in Germany, since we have acquired SCA Schucker, a company specializing in applying glue. This is far more than it sounds; it’s actually a hi-tech industry with a future, as more and more assemblies are tied together with glue. Anyway, at the meeting in Hoeselt I learned from a Belgian colleague that all Church Bells had gone to Rome. I had asked him why there were Chocolate Bells for sale together with Easter Eggs and Bunnies. In Sweden we only have Easter Eggs; most of us know of the Easter Hare, but not what he is supposed to do. My colleague told me all Church Bells go to Rome to bring back Easter Eggs; during the week before Easter Belgian churches are silent because the bells are gone. On their return to Belgium the bells hand over the Easter Eggs to the Easter Bunnies, who distribute and hide the eggs for the children to go and look for.

This winter we have finalized a Proof of Concept for SAP Business Warehouse, or rather we are successfully finalizing the PoC together with two consultants from Evry. The consultants, Thomas and Lars, have done an excellent job. The PoC was about importing SAP COPA data and creating the monthly report to the Group reporting system, which turned out to be excruciatingly hard with our COPA, our cost distribution and our reporting. Now the job was done and it was Easter time. Thomas and Lars thought it would be a good idea to send us some Easter Eggs to celebrate the good work we had done together, so they ordered 12 Easter Eggs to give to me and some colleagues who had worked with them. Now, normal sized Swedish Easter Eggs are big enough to hold 2-3 tennis balls and they are filled with sweets. I was on my way down to Belgium when Thomas and Lars delivered the eggs. They called me up and told me there had been a slight misunderstanding when they ordered the eggs, but now the eggs were delivered. Coming back to the office I saw what the misunderstanding was about. The Easter Eggs were huge, probably containing about two kilos of sweets each. Lars and Thomas had supplied the entire HQ with sweets, or almost, and still there were lots left, so I took one of these gigantic eggs home with me. On Good Friday I ate about one kilo of sweets myself; together with the rest of the Easter eating I have probably gained one or two kilos. This is not a good start on beach-2012; I have promised myself to lose about five kilos before summer. Today, Sunday, I still feel drowsy after my excessive sugar intake. I have just been out for a light 10 km jog, but this is not even close to balancing out the Easter Egg. And today my boys and I are going to their Granny's home: more food, more Easter Eggs.