The Slowly Changing Dimension Transformation, part 2 – Type 2 Dimensions
In this post I will be covering how to use the Slowly Changing Dimension (SCD) Transformation to update a Type 2 Dimension, that is, one that tracks changes in values over time. The sample package can be found here for 2005 and guidelines on use are here. This is the second post in the series looking at some slightly more advanced behaviour – for the basics of the Slowly Changing Dimension (SCD) Transformation please read my post: The Slowly Changing Dimension Transformation, part 1 – Type 1 Dimensions.
Configuring the SCD for a Type 2 Dimension
The first work you need to do for a Type 2 dimension actually resides in your dimension table design – you need to decide whether you are going to track changes in your table using either a simple indicator to identify current and expired records, or if you want to use effective dates – the component doesn’t natively allow you to use both, though you can customise the output to do so. The Current / Expired indicator actually uses a small text string which can either be set to the string value pairs ”True” / ”False” or “Current” / “Expired” – no customisation of these is allowed in the component (again, you can customise the output to change this, but the wizard will only allow mapping of the column to one that will accept text strings). The Effective dates option requires a start and end date datetime column, and in the wizard you use a variable to set the time used. The sample package demonstrates a few possibilities but below I will describe using effective dates.
First of all, note when on the first page, Select a Dimension Table and Keys the Effective dates (and Current indicator) are not mapped. Because I have named the columns in line with what the SCD expects for such indicators, it ignores them completely in the mapping – they cannot even be selected as Input Columns. If you name them differently in your design, simply map them as “Not a Key Column”.
In the Slowly Changing Dimension Columns page set the change type of each column to “Historic” so the component will track history of changes.
The wizard will present a page that is only displayed when you have selected Historic change type columns. Here the start and end date columns are specified, and the component needs a datetime variable to use to set the expiry of old records and the start date of new records. Here I have just used the Package Start time variable – in practice you may well want to specify a variable populated with something else, such as the extract date of the data.
When go to finish the wizard, you will note the additional “Historical Attribute Output” will be generated. In practice this means a setof components will be output to manage the changes, which are illustrated below (click to zoom in). The derived columns add the effective start and end date columns and the OLE DB Command expires old records. Please review the sample component to see how this works in practice.
SCD Considerations for Type 2 Dimensions
One of the most important things to bear in mind is that the component is not intelligent in terms of knowing which data is new – so if you had two records for a given key in the sample file, you would have to sort it so it would feed it the most recent item last so that item would be the current one. It also provides no support for data which has its own change dates – for example if a record had an update date and you wished to use that to form the effective date.
The SCD component is only really suitable for tracking Type 2 changes in sources where there will be one record per key per extract and the source itself has no change tracking capabilities. Given this weakness and the difficulties with using this component generally (in terms of configuration and performance) – you may well want to look at the alternatives I mentioned in my original post about the SCD. This is a component that definitely needs an overhaul for the next release.