SQL Server Data Quality Services – Composite Domains

One of the things I skimmed over in previous posts was the concept of Composite Domains. This is a combination of domains that are assessed with interdependencies on content.

At a very simple level, a Composite Domain addresses these kind of problems:

  • If City = “London”, Country must equal “England”
  • If Wealth Category = “Millionaire”, Bank Balance must be greater than 1 million

They allow us to validate separate data items in combination, thus allowing the writing of more complex rules beyond the already capable single field ones.

Implementing Composite Domains

Below is a couple of screenshots around setting up composite domains. First we have the definition of what fields need to be included:

DQS Client - Composite Domain Properties
DQS Client - Composite Domain Properties

This is fairly straightforward – just pick at least two fields that are interrelated from the available Domain list. The other screen of interest is under the Rules tab:

DQS Client - Composite Domain Rules
DQS Client - Composite Domain Rules

Here it can be seen that a composite domain rule has the capability to evaluate two components at a time. This is a deliberate limitation, so if you wanted to validate 3 fields in combination, you would have to do it via a set of rules that cross over. If you were looking at validating the rule:

If City = “London”, Country = “UK” & Region = “Europe”

You would have to do it through the following rules

If City = “London”, Country = “UK”
If City = “London”, Region = “Europe”

The rules allow for AND / OR at the field level, so you could have rules that read

If City = “London” OR “Birmingham”, Country = “UK”

If Bank Balance >= 2 million and < 1 billion, Wealth Category = “Multi Millionaire”

There seems to be scope for improvement here – the rule capability is a little simplistic but I imagine will meet most scenarios and does make cross field validation possible.

Other features and summary

There are two features I skipped over – Reference Data (for a bigger future post on the whole concept) and the Value Relations tab which, at this point in time seems not to be working and is just a statistical summary of the values found in the data.

There’s not much to close out on otherwise – Composite Domains allow fields to influence other fields from a data quality perspective. The documentation on this feature is sparse at this point so hopefully we’ll get more information soon.

2 thoughts on “SQL Server Data Quality Services – Composite Domains

  1. Thanks James for overview of CD, my question is can we have sql query in domain rules by which validity is checked. For example, if my set of columns in same row repeats or set of columns might have some other underlying business functionality which is failing that can be identified by query. Hope you got my question

  2. The domain rules are limited to what DQS will support within the client – there’s no option to call out to an external engine such as a database via T-SQL.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>