A BI developer's journey
Part 1: The Single Source of the Truth
Many moons ago, a young BI developer went in search of the Single Source of the Truth.
First he saw a priest, who held up a copy of the Bible and happily proclaimed that he held in his hands the Single Source of the Truth, delivered from high above.
Next he sought out an Imam, who held up a copy of the Koran and happily proclaimed that he held in his hands the Single Source of the Truth, delivered from high above.
Frustrated, he sought out a Scientologist, who held up a copy of Dianetics and happily proclaimed that he held in his hands the Single Source of the Truth, delivered from high above.
He turned to his venerable Data Warehouse manager. “I am so confused. All I sought was a Single Source of the Truth.”
The Data Warehouse manager held up a copy of the Corporate Information Management Policy, and happily proclaimed that within it was defined the Single Source of the Truth – the Data Warehouse – and this was delivered from high above by the Enterprise Architects.*
At this point, the young BI developer suspected all was not right with the world.
* In no way am I suggesting Enterprise Architects are godlike, except perhaps in their ability to give vague and largely useless directives
Part 2: What is this Truth thing, anyway?
The young developer turned to someone who cared deeply about numbers and facts, the Finance manager.
“Finance manager. What is this Truth business that I seek to define? Surely in your world it is simple? Cash goes in, Cash goes out, at some point the CEO takes a big pile of it to the Italian car dealership and buys something red and expensive?”
The Finance manager nodded sagely. “Well, yes. And... er, no.” She looked at the young developer and asked a question. “When you say cash in, do you mean our bank balance? Or monies received? Do we include or exclude cheques that have not cleared? How about late instalments from our debtors?” She whispered conspiratorially to the young developer. “The truth is often whatever it needs to be to justify paying for the CEO’s next shiny toy.”
The young developer was even more confused at this point. If precision was not a guide, then uncertainty perhaps could be – so the developer sought out the Marketing manager. “Marketing manager. What is this truth business I seek to define?”
The Marketing manager grinned broadly, showing his perfect array of white teeth. “That depends on who I am trying to sell to. If I announce to the market the success of a new product, should I tell them how many we sold? Or perhaps shipped? Or how much we sold something for? It depends on who I am trying to please or impress.” He whispered conspiratorially to the young developer. “The truth is sometimes we don’t know and just tell them something that sounds good.”
The young developer was now terribly confused. Not only were there many sources of the truth, but sometimes the truth could be interpreted in different ways – and even worse, sometimes it didn’t even matter.
Part 3: The revelation of the nature of truth
So, in his frustration he turned to his friend, a physicist, for surely a scientist should know what is true and what is false.
“Physicist friend. Surely the truth is not so hard to define? A thing is a thing, is it not? It is where it is, and it has characteristics that can be defined?”
The Physicist shook his head. “Let me tell you about the universe. At the tiniest scale we can know where something is, and how fast it is going. But the more we know about its speed, the less we know about its location. The more we know about its location, the less we know about its speed. We cannot know everything about it with certainty.”*
At this, the young BI Developer had a revelation.
“So, we cannot know everything. Just some things with certainty! We shouldn’t attempt to know everything, but just know what we can within the constraints that we put around it!”
With that, the young BI Developer wandered off and built one of many sources of truth, defining clearly when each would be true, and made some of the users happy.
* This is real physics – the Uncertainty Principle
Epilogue: What on earth is the BI Monkey on about now?
This odd story came about following a conversation I had recently with a Data Warehouse manager who was very concerned that users would mash the official Data Warehouse data up with non-official data if given access to some of the PowerPivot technology. It’s not an unusual or invalid concern: when users get the numbers wrong, the DW team can get the blame.
However, I believe (and am not alone in doing so) that the world has moved on from a Data Warehouse being the Single Source of Truth for the enterprise. It cannot hope to keep up with, or even reliably store, all the data users may want, nor can it keep up with all the different views of what “the truth” actually is. So the Data Warehouse needs to change its role: become a Single Source of Truth for certain things, within defined boundaries.
A simple example of this could be Employee headcount. There are plenty of different interpretations of this depending on how you count full-time and part-time employees, contractors, volunteers and so on. The DW can step back from this and say: we will apply this rule, to get this number, and that’s the corporate standard. If a user wants to do it differently in their own analysis, feel free. Just don’t expect the DW to have your back or help you when the numbers look wrong or don’t tie up to the Annual Report.
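To make the headcount example a little more concrete, here is a rough Python sketch. Everything in it is made up for illustration: the Worker record, the staff list, and both counting rules are assumptions, not anyone's real corporate standard. The point is only that the same data gives different "true" numbers depending on which rule you declare.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    name: str
    kind: str   # hypothetical categories: "employee", "contractor", "volunteer"
    fte: float  # fraction of a full-time week, e.g. 0.5 for half time


def corporate_headcount(workers):
    """Assumed DW rule: count employees only, one per person regardless of hours."""
    return sum(1 for w in workers if w.kind == "employee")


def fte_headcount(workers, kinds=("employee", "contractor")):
    """A user's alternative rule: sum FTE across whichever kinds they choose."""
    return sum(w.fte for w in workers if w.kind in kinds)


# Invented sample data, purely for illustration.
STAFF = [
    Worker("Ana", "employee", 1.0),
    Worker("Ben", "employee", 0.5),
    Worker("Cy", "contractor", 1.0),
    Worker("Di", "volunteer", 0.25),
]

print(corporate_headcount(STAFF))  # 2
print(fte_headcount(STAFF))        # 2.5
```

Neither number is wrong. The first is the one the DW would stand behind under its stated rule; the second is a user's own analysis, and it is on them to explain why it doesn't match.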
This approach allows the production of all corporate-sanctioned stuff (Annual Reports, Regulatory reporting, etc.) out of the DW as a traditionally defined Single Source of Truth, as those boundaries are very well defined. It also allows users to do wild and crazy things, mashing up official and non-official data and doing their own analysis and interpretation, without having to wait for the development cycles of the DW team.
The Single Source of Truth isn’t dead… but perhaps it is time for it to develop boundaries.