I recently had to demo SSIS to an enterprise as part of an ETL tool evaluation. One of the Microsoft BI stacks weaknesses is the lack of Data Lineage tracking. What this means is there is nothing embedded in the toolset that allows you to identify clearly the source of a data item in a package / report / cube without digging into the development environment. Rumours are this will be fixed in the next release, however nothing has yet been confirmed.
However, where the Microsoft BI stack has a competitor beating edge is the 3rd Party ecosystem – so where there is a gap in the toolset, often another company will step in and fill it. In this case, Data Lineage issues are addressed by a tool created by Pragmatic Works (which is run by Brian Knight, an SSIS heavyweight) called BI Documenter.
BI Documenter Review
So what does the tool do? It has 3 key functions:
Documentation Generation: The tool auto-generates documentation for Databases, SSIS packages, SSAS Cubes and SSRS reports. It’s quick, and the output is pretty – and it’s really a bit useless. Its benefit is if you have to say you’ve produced some documentation and need to do so with minimal effort. The reason I say this is because it provides no context for why things have been done, what the purpose of the component is, where it fits in to the framework etc. My usual gripe with documentation I come across is that it only tells me the what, not the why, and the why helps me solve a problem. This tool can’t do anything to address this.
Data Lineage: Now this is where the main value lies. Through simple navigation you can select any object (table / view / package etc) in the solution and see what objects depend on it and what objects it depends on in turn. This is great in a complex system where if you need to make a change and need to find out what that impacts.
Now its not perfect – it seems to skip documenting some sources, such as flat files, so they get missed in the impact analysis. And the level of granularity is at the object level – for example you can’t see the impact of an individual column change, just at the table level, but its still a great start and a useful tool.
Snapshot comparison: A final piece of value which can be useful in troubleshooting. BI Documenter takes snapshots of your solution to document – and you can compare these to identify changes in the solution. The detail level is pretty good and will be a great place to start tracing changes in your system when your source control systems fail.
Is the tool worth it? At a maximum cost of US$500 a seat, it’s definitely got a place somewhere in your organisation. The documentation tool is of limited use but the Data Lineage and Snapshot comparisons are worth the cost of the product. Full details here: Pragmatic Works – BI Documenter