When fast is not enough

Tree Structures

T h e other day I was reading an interesting article about three structures Moving Subtrees in Closure Table Hierarchies . The article describes a design pattern for tree structures, which makes it easy to create hierarchical ‘tree reports’ as BOM (Bill-Of-Material) lists and their counter parts where-used lists. Trees are everywhere, not only the BOM structures, they exist in accounting and HR defining organization and reporting structures, genealogical trees etc. Tree structures are a natural way of visualize structures of the real world. There are some problems with tree structures, they are complex to manage in most programming language and more important they do not fit in the relational database model, which is based on the mathematical theory of sets, where there is no order or structure ^[1] . Tree structures are also a formidable problem for the human programmer ^[2] .

Tree structures and Business Intelligence

Apart from the challenge of creating and maintaining tree structures, the storage model has to be performant. Analyzing via three structures is very common ‘what components are most critical for next week production?’, ‘what happens if we replace part X with Y and Z?’ Almost all financial reports are ‘sum thrus’ of tree structures. At period closing efficient tree structures are essential for fast and expedient reporting.

When we introduced tree structures in our business intelligence application ‘The Data Warehouse’, we put in a lot of effort into create a simple yet effective tree structure design. After some initial testing we decided to try a modified traversed tree structure design pattern. I did a prototype where each node in the tree structures is stored as separate rows much like the above mentioned ‘closure table hierarchy’. We also store the top node in all rows. This makes it easy and efficient to query the tree structure. We do not store any pointers for ‘SQL exploding’ the tree, we rely on the physical order in the database table. This is an ugly hack, and is in breach with every database design guideline I ever read or written. (The traversed three structure model contains pointers that enable plain SQL to navigate up and down in the tabular tree, but we never implemented them in our prototype.) The upside of our tree structure, it’s simple, performant and easy to work with. The downside it’s very hard to update! We do not update our tree, we always recreate the entire tree table.

Prototype Traversed Tree Model – One Example

Part/Material 0909100260 is a lubrication kit , shown in Figur 1. As you can see 0909100260 consists of five components or sub nodes.

Figur 1 - Tree structure for 0909100267

Figur 2 shows the table representation of 0909100260 plus the SQL query to select the database rows. The relevant columns are:

T_MATNR The top node
P_MATNR The parent node
C_MATNR The child node
TREELVL The tree structure depth
TREEQTY The tree structure quantity

SELECT * FROM `DB3_BOM_BASIC_TREE` WHERE `T_MATNR` = '0909100260'

Figur 2 - Table representation of tree structure for 0909100267

SELECT * FROM `DB3_BOM_BASIC_TREE` WHERE `T_MATNR` = '4150166760'

Figur 3 - Table representation of tree structure for 4150166760

As you can see in Figur 1 there are actually two structures in there. First we have the entire structure with top node 0909100260 and in this structure we have a sub structure 4150166760 which consists of two sub nodes shown in Figur 3. This table representation of a tree structure is both simple and fast to query. There are basically two problems with the prototype, create it and update it.

Create the prototype traversed tree structure

We receive a tree structure from SAP in the form of the adjacency list model from SAP see Figur 4.

SELECT * FROM `DB3_BOM_BASIC` WHERE `P_MATNR` = '0909100260'

OR `P_MATNR` = '4150166760’

Figur 4 - Tree structure for 0909100260 according to SAP

Looking at Figure 4 two things become apparent, first for those who knows SQL, it is very complex to write an SQL query to show the tree 1. Not only is it complex to traverse up and down the tree it is also inefficient. This tree representation is not good enough for performant business intelligence applications ^[3] . The other notable thing, there is less rows. Our prototype tree structure store a lot more rows. AC Tools store 421.131 design BOM rows in SAP, these rows become 1.787.016 when we transform them into the prototype traversed format, luckily it is just disk space we waste, and we happily sacrifice disk space for performance.

A normal data warehouse ETL process extract, transform and loads the SAP tree structure into our prototype traversed notation. Figur 5 shows a part of the job that transform and loads the design BOM, it consists of three jobs:

createTopNodes list all material that do not have any parent nodes
explode recursively find all child nodes of the top nodes
loadit load the result from explode into the Data Warehouse

At the time we introduced the tree structure in our Data Warehouse it took some 40 minutes ^[4] to run the explode job, the other jobs are negligible. But what the hey, 40 minutes is not bad, it’s some heavy processing going on here and we only load the structures once each night. We decided to use the prototype for a while, as it was very fast and simple for reporting. And we didn’t have a problem with the nightly recreate of the entire structure table.

Figur 5 ETL job Transform & Load SAP tree structure

Network impact

The SIKLAN project replaced our network switches with new 100 Megabit switches. Before we had our servers hooked up to Gigabit switches which are ten times faster ^[5] . Now we had a network setup that looked like Figur 6.

Figur 6

Our tree structure transform procedure shovel a lot of data between our ETL and Database servers and the impact of the slower switch was notable (from about 40 minutes to 60 minutes). This was a move in the wrong direction so we rerouted the data traffic through a special Data Warehouse switch, a D-Link Gigabit switch for about 120€. See Figur 7.

Figur 7

After rerouting the data traffic we were back to normal execution times again, which soon dropped to 30 to 20 minutes due to server and software upgrades. Now this got me thinking, ‘if these changes affect the execution time that much, why don’t we try to do something clever about the explode job (in Figur 5), the job that consumes all resources. In the explode job there is an iterator , the < forevery > that iterates this job for every top node coming from the job createTopNodes. Now this iterator has the capability to chop itself up in chunks and run them in parallel following the map and reduce design pattern ^[6] . And this is exactly what the reworked version of the explode job in Figur 8 does. Now the explode job chops the top nodes into chunks with 1000 entries in each chunk. These chunks are run in parallel but not more than 7 processes at a time. Why not more than 7 processes? Well I measured this job and 7 parallel processes seemed to be the optimum ^[7] . When all chunks are processed, the < exit pgm=’reduce… > assemble all transformed structures into one file that is handed over to the last loadit job (in Figur 5). Now we are down to less than five minutes for the entire transformation process! This is more than fast, it is phenomenal.

Figur 8 Transform evenly chunks in parallel

But we can do better, once you start optimize a process, you soon come close to the edge of meaningless sub optimization, and this is what Figur 9 is about. Here we use knowledge about material number distribution and structures. The chunks in our iterator can vary in size, this pattern of progressively smaller chunks chops another minute of the execution time down to four minutes! This chunk-size pattern is far from ideal, it’s just better than even chunks, and it is the best the iterator can do out of the box. With little trouble we could create a job that produce a better optimized chunk-size pattern, but that would be going over the top… (at least for now).

There is always a danger doing this kind of optimization, when conditions change the assumptions may be invalid, but it’s good to know you can do more if needed.

Figur 9 Transform 'optimized' chunks in parallel

At the moment of writing I think the present bottle neck is the D-Link switch, you just do not get a super-duper switch for 120€, in a simple test I did it was capable of deliver 600Mb/s, this is certainly more than the SIKLAN 100Mb switch, but that high quality switch deliver 100Mb. Sometimes you only get what you pay for, I’m convinced a pricy Gigabit switch like those we had before would do better than the D-Link. Then we have the database server, we have done no tuning at all on that server. But today it does not matter, we are very fast and that gives us the luxury to use an awkward tree structure that is very fast and easy to work with for reporting and that is what matters at the end of the day. And yes it is still called prototype traversed tree structure.

Real life Example

In this example a SQL SELECT joins three tables:

DB3_BOM_PROD_TREE - 880269 rows (= nodes in the production BOM)
DB3ARTSTATDAY - 74947 rows
CB1ARTSTATDAY - 34570 rows

As you can see the result table contains 32116 rows and the query took 0.8272 seconds to execute.

What is the query is good for? It is used to help to decide at which assembly lines materials should be stored in our Tierp factory .

More examples of my PHP job scheduler can be found here .

[1] When I first learned about relational databases, my initial reaction was ’show me the tree, without decent tree support this will never fly’. Since then I have learned ways to deal with trees in relational databases, (nowadays I consider them to be the best way to store data in computers by far so far).

[2] It’s common to say ‘trivial’ about programming problems, but I never had said so about tree structures. My biggest mistake ever in applications design was about tree structures. In an MRP application for the CMT Simba workshop (and later for Industrial Technique) I decided to hide away all technical complexity with ‘phantom structures’ from the user. This lead to some extremely complicated BOM update programs. The application lived for some 25 years and I was always terrified someone would call and say ‘Hey, we have a problem with the phantom structures, can you fix it?’ (This application also had a year 2K bug which I ignored ‘this application is scrapped long before year 2000, anyway I will not be here. Little did I know then, year 2000 I was back and among the first things I had to do was to fix the bug, the shop calendar for 2001 was off by one day.)

[3] Then why do SAP store three structures this way? SAP is an operational system and online updates of tree structures are frequent. This simple parent-child representation is simpler to update than other tabular tree structures. Simpler but not simple, it is still very complex to maintain this tree structure.

[4] The figures are not exact; there are a lot of processes running in our Data warehouse servers so timing for individual jobs varies.

[5] This is a simplification, I have written about the SIKLAN impact on our Data Warehouse more in detail ’Why Gigabit matters’.

[6] Reading Wikipedia, it looks like I have invented a PHP/XML variation  . I’m confused about the Google patent, I’ve used map and reduce for ages. I called it M assive Parallel Processing , it’s very natural when you parallel processing 哈哈 .

[7] There is a lot more running in parallel together with these jobs, e.g. we transform the production BOMs in parallel along with the design BOMs.

2012-03-27