Preventing SQL Agent Jobs calling SSIS reporting failure

The ETL Control system I have in place has a series of parent and child packages, the lowest child of which can possibly and legitimately fail, and the Control flows I have in place gracefully handle that possibility, logging the failures back to Audit / Control database.

However, whenever a task fails, the error gets reported up the Parent / Child package chain all the way to the SQL Agent Job running the control framework and the SQL Agent job itself reports failure. This gives the wrong impression to anyone viewing the job history – it looks like the job has failed when in fact it has run just fine, handling the problems it encountered.

Rushabh Mehta presents an elegant solution in this post: Gracefully Handing Task Error in SSIS Package. His solution is to create an OnError Event Handler for the task that can possibly fail, and in that Event Handler find the System Variable Propagate and set its value to False. What the Propagate property does is decide whether Error messages are propagated up through the package. By setting it to false, you tell the package not to send Error messages up the execution chain. The Event Handler itself doesn’t have to do anything, it just has to exist and have the Propagate property set to False. This way the task can fail, not cause the SQL Agent Job to fail, but still allow handling of Errors in the Control flow through the normal use of Precendence constraints.

Unfortunately if you have a chain of Parent – Child packages this doesn’t quite work. If you try to apply this technique to an Execute Package task, it doesn’t work because if the Child package fails at any point – even if you prevent errors propagating up from inside the Child package, the Parent package will still raise errors and report failure. What I have found is that you have to have an Event Handler that prevents the propagation of errors at each step in the chain of packages that call that child package – so if you have Parent > Parent > Parent > Child, stopping propagation of errors in the Child isn’t enough to get the SQL Agent Job to believe it succeeded, you need to stop error propagation at each Execute Package task. I have a set of sample packages demonstrating this process here. It appears the only way to ignore errors at the bottom level is to ignore errors at every level, which is not a good approach.

As far as I can tell this is a bug, so I have reported it on connect.

If anyone has a solution to this problem I would be grateful to hear it – I don’t want to disable errors in every called package, just stop them propagating from the Execute Package Task at the bottom of the chain.

Connect Update 18 Aug 2010: 

Looks like this won’t get fixed as it currently stands, the response from Microsoft was to Close it as “By Design”:

The “propagate” value only controls if an event should be propagated up
repeatedly. It doesn’t affect if the event is propagated to the client.
For child task, client is the parent package.

We looked the issue and decided that we cannot change the current
behavior. To address the issue properly, we will consider introducing
option like “Propagate to client” in the future.

 This does actually make sense architecturally – so we will just have to hope that the “Propagate to client” property becomes a reality.

About BI Monkey

Comments

5 Responses to “Preventing SQL Agent Jobs calling SSIS reporting failure”
  1. Onedreamer says:

    Thanks a lot BI Monkey,

    You summrized very well this “missfunctionality” in SSIS
    I wasted nearly all the day trying to stop propagating erros from child package to the parent but it doesn’t work.
    Thanks…

    ******** It was said : Microsoft stop making money and fix your software **************

  2. abagag says:

    Thanks for an interesting post.

    I have a design in which there is a master package fired by an agent. The master package, in turn, fires subpackages (which fire packages and so on).

    I do not recognise the problem of the failing agent (running R2). As long as you make sure not to set FailPackageOnError and FailParentOnError to true and MaximumErrorCount to desired value packages will continue to execute and the things you do not want to execute can be handled by control flow design. To the agent running the dtexec the return code will be (0) successful regardless of how many sub-packages or tasks failed in the package.

    The problem of error propagation to my master I solved by setting propagation to false for any packages and containers that I do not want to be presented at the master package level.

  3. BI Monkey says:

    Abagag, sounds like you have a theoretically similar approach – i.e. to get SQL Agent to not report failure, you have to ignore a lot of failures at the package level.

  4. Shovan Mukherjee says:

    Hi,
    Changing the Propagate property value works fine while working with OnError event handler.But while working with OnTaskFailed event handler doesn’t matter whether Event handler level, you set the Propagate property>>False or not,you must have to make the MaxErrorCount property of both the Task and ForLoop set to 0,otherwise the Package execution completely stops on the first iteration itself,causing Task & ForLoop failure.

    I didn’t’t get any explanation for such behaviour.Pls share your idea why it’s not happening in case of OnTaskFailed event handler.

    One more doubt here.I’ve taken a ExecuteSQL task inside a ForLoop container and tested your way making Propagate property–>False,that absolutely works fine.
    1) But if I take ExecuteSQL task inside the ForLoop container, the OnError event handler fires twice,inspite of once.
    2) Again if I take Script task inside the ForLoop container,the OnError event handler fires only once as per the expected behaviour.

    I didn’t get, Why the Event Handler is firing twice in case of failure of ExecuteSQL task,inside the ForLoop.

    Many Thanks,
    Shovan

  5. BI Monkey says:

    It’s been a while since I’ve dealt with this so forgive my vague responses:

    OnTaskFailed and OnError are different events so probably behave differently. I’m not sure of the details though.

    Why you are seeing multiple events for a single failure is by design though; you get one event for the ExecuteSQL, then one for the Container it’s in. I’m not very sure why the Script task doesn’t have the same effect though.

    Cheers

Speak Your Mind

Tell us what you're thinking...
and oh, if you want a pic to show with your comment, go get a gravatar!