Data Quality Services Composite Domains and SSIS

The handling of Composite Domains in Data Quality Services is not intuitive. You can map a column to a composite domain in the DQS Cleansing Component, as below:

Fig 1: Mapping Composite Domains in SSIS DQS Cleansing Component
Fig 1: Mapping Composite Domains in SSIS DQS Cleansing Component

 

 

 

 

 

However there is no under the hood cleverness to automap the Sides column into the Domain as well. This makes some sort of sense as there is no reliable way for SSIS to guess what columns should be able to be mapped into the other parts of the composite domain.

What would make sense is that you should be able to map multiple columns into a Composite Domain along with a column sequence so you can manage it through the UI. Something like this:

Fig 2: Possible Composite Domain Mapping UI
Fig 2: Possible Composite Domain Mapping UI

 

 

 

 

 

The effect of the mapping as in Fig 1 is that a naive user would expect the DQS Cleansing component is being clever, when actually it is doing what it is told – passing the single column’s value to the Composite Domain in DQS. This results in records being treated as if only the first part of the CD is being passed in – in this case Shape. The result being that records are by default Invalid (or, as in my KB, sometimes Corrected).

How to correctly manage Composite Domains

The correct way to pass in data to a Composite Domain in SSIS for evaluation is to build it into a concatenated string using a Derived Column, and map that string into the Composite Domain – thanks to Matt Masson of the SSIS team for explaining this – see the comments at the bottom of this post.

My example string would be built as below, space delimited:

Fig 3: Creating the Concatenated String for the Composite Domain in a Derived Column
Fig 3: Creating the Concatenated String for the Composite Domain in a Derived Column

 

 

 

 

Which would then be mapped into the Composite Domain in the DQS Cleansing Component as:

Fig 4: Mapping the Concatenated String into the Composite Domain
Fig 4: Mapping the Concatenated String into the Composite Domain

 

 

 

 

 

This will then start churning out the expected results.

The database, DQS Knowledge Base & SSIS Package can be found here.

I’ve also put up a Connect to cover this issue, please vote it up if you think it important.

2 thoughts on “Data Quality Services Composite Domains and SSIS

  1. Dear BiMonkey,
    You forgot to mention a very importante step. How you convert the corrected result to the corrected columns in the SSIS…?
    I dont understand why Microsoft didint fiz the CD rules…
    Regards and thanks.
    Pedro

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>