Loops and Lookups – a performance problem

A bit of a performance tuning nugget around loops and lookups which I faced recently.

We have a scenario where we are looping a few hundred times to execute a series of changing SQL statements, then passing the data through lookups before writing to target. We are doing this with a mix of the standard lookup and Cozyroc’s Lookup Plus task. However we were finding that while executing each SQL query took less than a second, the SSIS package was taking a long time to run each loop.

The loop was taking the query and then pausing for a while before moving any data. Eventually we realised this pause was SSIS building up the caches for the lookups. The standard lookups were running off a pre-built cache but the CozyRoc task doesn’t support caching and had to rebuild each lookup cache every time the loop was instantiated – not a big job but repeated a few hundred times those few seconds soon mount up.

The lessons here are:

  1. If you are reusing a lookup multiple times, use the Cache Transform to preload the cache into memory (2008 onwards)
  2. Be aware of the performance implication of caching lookups and the fact that they will be rebuilt each time the Data Flow is started

Read More