This is a snippet of a post on the performance of the DQS engine when called from SSIS. I’ve created a simple number based Domain rule and replicated it 5 times in my knowledge base. My package then feeds copies of the same set of data into the DQS component (5000 rows) and runs it through 1 – 5 domains.
The performance profile is as below:
There seems to be a fairly linear relationship between the number of domains being processed and execution time. Note that I’ve created a dummy value for “0” to indicate what the start-up time of the DQS component might be, as it’s impossible to have a DQS Cleansing Component in the flow with no columns mapped.
I’d ignore the actual numbers – this is on a development VM which is definitely not configured for performance – and I’m aware the DQS Team are working on performance issues (though by the looks of it, better be working hard).