The handling of Composite Domains in Data Quality Services is not intuitive. You can map a column to a composite domain in the DQS Cleansing Component, as below:
However there is no under the hood cleverness to automap the Sides column into the Domain as well. This makes some sort of sense as there is no reliable way for SSIS to guess what columns should be able to be mapped into the other parts of the composite domain.
What would make sense is that you should be able to map multiple columns into a Composite Domain along with a column sequence so you can manage it through the UI. Something like this:
The effect of the mapping as in Fig 1 is that a naive user would expect the DQS Cleansing component is being clever, when actually it is doing what it is told – passing the single column’s value to the Composite Domain in DQS. This results in records being treated as if only the first part of the CD is being passed in – in this case Shape. The result being that records are by default Invalid (or, as in my KB, sometimes Corrected).
How to correctly manage Composite Domains
The correct way to pass in data to a Composite Domain in SSIS for evaluation is to build it into a concatenated string using a Derived Column, and map that string into the Composite Domain – thanks to Matt Masson of the SSIS team for explaining this – see the comments at the bottom of this post.
My example string would be built as below, space delimited:
Which would then be mapped into the Composite Domain in the DQS Cleansing Component as:
This will then start churning out the expected results.
The database, DQS Knowledge Base & SSIS Package can be found here.
I’ve also put up a Connect to cover this issue, please vote it up if you think it important.