Using .NET Framework 3.5 in SSIS Scripts

Thanks to Valentino Vranken who provides this useful post: Using A .Net 3.5 Assembly In SSIS 2008

By default SSIS Script Tasks / Components reference .NET Framework 2.0, which was confusing me as I tried to implement a solution around managing times in different Time Zones which required the TimeZoneInfo Class which only exists in 3.5 and higher. (This was what I was trying to implement: Daylight Savings in Script Task). Because the script referenced .NET Framework 2.0 the TimeZoneInfo class wouldn’t work. Switch the target framework to 3.5 and bingo – all good.

The tl/dr version: In your script component – while editing the script – open the Properties and change the Target Framework to 3.5 to use functions available in 3.5.

Read More

The Script Transformation part 3 – as a Destination

Fig 1: The Script Transformation
Fig 1: The Script Transformation

In this post I will be covering using the Script Transformation as a Destination. The sample package can be found here for 2005 and guidelines on use are here.

How can you use the Script Component as a destination?

As before, when dropping the Script Component on the Data Flow surface, you get three choices for what you want to use it as – in this case, you would select Destination. This by default creates a component with a single input and no outputs.You will have to manually define and type the input columns, and the component will only accept a single input.

Each option selected provides a different code template as well, and for a Destination you get code designed to consume incoming rows via the ProcessInputRow method. My example code below simply counts the rows pushed through the component.

Public Class ScriptMain
Inherits UserComponent

Dim RowCount As Integer = 0

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

‘ Increase row count
RowCount = RowCount + 1

‘ Message box giving progress

MsgBox(RowCount.ToString, MsgBoxStyle.OkOnly, “I have processed this many rows:”)

End Sub

End Class

This pretty much concludes my review of the Script Component, though there is more that can be done with it. However to get much further i’d have to get deeper into the coding, an area I have sketchy knowledge of!

Where would you use the Script Component as a destination?

This component can be used in this manner in order to deliver data to a destination that cannot be reached by a standard component, such as a Web Service or non standard flat file.

MSDN Documentation for the Script Transformation can be found here for 2008 and here for 2005.

If you are still struggling, try these additional resources:

If you need specific help or advice, or have suggestions on the post, please leave a comment and I will do my best to help you.

Read More

The Script Transformation part 2 – as a Source

Fig 1: The Script Transformation
Fig 1: The Script Transformation

In this post I will be covering using the Script Transformation as a Source. The sample package can be found here for 2005 and guidelines on use are here.

How can you use the Script Component as a source?

One thing I skipped over in my previous post was that when you drag a Script Component on to the Data Flow surface, you are prompted as to whether you want to use it as a source, destination or transformation (the default). If you opt to use it as a source, it configures itself differently, so it has no inputs and a single output by default, to which you need to add columns, as below:

b
Fig 2: Adding Output Columns to the Script Transformation

To populate rows, in the code you add a row to the Output Buffer using the AddRow method. You then assign values to the columns that you specified – the columns will be properties of the Output Buffer which will appear on Intellisense. Each time you hit the AddRow method you close off the previous row for that buffer. Below is a simple example generating some rows of data using a simple code loop. Each time the loop starts a iteration, I call the AddRow method for the Output0Buffer (the default one created when you set up the component). Then I set the values for the RowId and RowName columns using them as properties of the Output0Buffer. In this case, the code speaks a thousand words:

Public Overrides Sub CreateNewOutputRows()

‘ Define a loop counter
Dim i As Int32

‘ Loop the counter between 1 and 10 with a For / next loop
For i = 1 To 10

‘ Call the AddRow method on the Output Buffer variable
Output0Buffer.AddRow()

‘ Set the RowId property of the Output Buffer to the Loop counter
Output0Buffer.RowId = i

‘ Use a Case statement to decide the RowName property value
Select Case i

Case 1, 2, 3
Output0Buffer.RowName = “Under 3″

Case 4, 5, 6, 7
Output0Buffer.RowName = “Between 4 and 7″

Case 8, 9, 10
Output0Buffer.RowName = “Over 8″

End Select

Next i ‘ Do the next iteration of the loop

End Sub

So it’s not really that complicated to build a simple source. For those looking for some more advanced capabilities i’ve also included a couple of extra examples in the sample package. In the first (2b) I output rows to multiple outputs, sending a different set of rows to each output. In the second I read data from a table in the AdventureWorks database within the script source and modify it a little before sending it out to the data flow. It’s important to note you need to use a ADO .NET (Datareader) connections to access a database through a script’s own connection manager collection. Adding an ADO.NET connection manager for use in the script is simple – as shown below:

b
Fig 3: Adding a Connection Manager to a Script Component

Note the Connection Manager is from the Connection Managers for the package, not internalised to the Script Component itself.

Where would you use the Script Component as a source?

I’ve used it a couple of times to generate rows (such as default values for dimension tables), and the other common use would be when you need to access a data source that cannot be reached by standard SSIS components, but can be coded up to return a set of rows out of VB.Net / C#, for example a web service, or a particularly awkwardly formatted text file.

MSDN Documentation for the Script Transformation can be found here for 2008 and here for 2005.

If you are still struggling, try these additional resources:

If you need specific help or advice, or have suggestions on the post, please leave a comment and I will do my best to help you.

Read More

The Script Transformation part 1 – a simple Transformation

Fig 1: The Script Transformation
Fig 1: The Script Transformation

In this post I will be covering the Script Transformation. The sample package can be found here for 2005 and guidelines on use are here.

What does the Script Transformation do?

The question here should really be what can’t it do? The Script Transform fills in the gap when standard components don’t fit the bill by allowing you to create, consume or access rows and columns in the Data Flow and process them within VB.Net code (and also C# in 2008). This opens up a whole array of functionality to access as both VB.Net and C# are powerful and flexible languages. The Script Transformation can function as a Source (providing rows to the Data Flow), as a Destination (consuming rows from the Data Flow) or as a Transformation (changing or creating column values in the Data Flow).

Functioning as a Transformation, you aren’t restricted to just row in, row out operations either – you add new outputs, create multiple rows from single rows and create single rows from multiple rows. The capacity to do impressive tricks with your data is – well, impressive! Because of this array of options I will break each one out into a separate post. This first post will cover a simple one row in – one row out transformation.

Sadly there are some downsides to all this, two of which really stand out for me. The first of these is if you aren’t a programmer (I never moved much beyond VBA) then writing the code, debugging the code, or even knowing what can and can’t be done in the code can make working with this component a bit of a struggle. The second is you lose a lot of visibility over what is being done in the component – unlike with most other transforms there is no nice GUI to show what column is going where and what is being done to it. You need to be able to actually read the code to understand what is going on, and I must warn that the BI Monkey becomes one Angry Ape when code is insufficiently commented!

Configuring the Script Transformation

It is fairly easy to set up the Script Transformation, but you need to use a little more of the Advanced Editor type features than basic developers are probably used to. Key actions are selecting input columns, defining output columns, choosing input variables and connection managers being used within the component.

First up simply check the columns from the Input that you want to access in the script component. By “access” I mean read or alter the value of.

b
Fig 2: Selecting the Input Columns

Second, define the output columns. If you are adding new columns to the Data Flow as I do in the example, click the “Add Column” button which becomes enabled when you select the “Output Columns” folder. Then name it and select the data type. By default when using a Script Transformation as a Transformation a single output Output 0 is created for you to add columns to.

b
Fig 3: Configuring the Output Columns

Finally (in this case – I won’t be using any connection managers here and will cover those in a future post) enter the variables you want to be able to access in the script – there are two options – ReadOnlyVariables and ReadWriteVariables. Fairly self explanatory – if you want to change the value of the variable in the script enter it into the ReadWriteVariables line – and if you don’t want the value to change enter it into the ReadOnlyVariables line. Two quick notes of warning – if there’s a space in your list of variables in the ReadOnlyVariables line it will cause an error in 2005 Script tasks. Secondly, remember variable names are case sensitive.

b
Fig 4: Specifying the Variables

The last thing you need to do is click on the Design Script button that will open your code editor. There is a commented section where you can add your code.

Below is my sample code from the package – there’s nothing too fancy going on here. Note how the columns selected in the Inputs are available as a property of the Row – they simply pop up on Intellisense as you code. Similarly variables are accessible from Me.Variables.

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

‘ Use a standard VB Function
Row.ShipYear = Row.ShipDate.Year

‘ Use a variable in the Me.Variables.<VariableName> format
Row.DoubleDue = Row.TotalDue * CType(Me.Variables.Multiplier, Decimal)

End Sub

I’ll dig into how code is structured in one of the future posts in this mini-series.

Where should you use the Script Transformation?

The Script transformation is there to be used when native SSIS component functionality doesn’t meet your needs – the most common use I have had for it is when the Derived Column editor doesn’t give me what I need – such as when I want to use Regular Expressions, or reuse a function across many columns.

MSDN Documentation for the Script Transformation can be found here for 2008 and here for 2005.

If you are still struggling, try these additional resources:

If you need specific help or advice, or have suggestions on the post, please leave a comment and I will do my best to help you.

Read More

Getting File Information with the Script Task

A requirement for a current client is to capture some information about Flat Files being loaded for logging purposes. Fortunately this is a easy job for the Script Task, using the following VB.Net code:

Public Sub Main()

‘ String variable to hold file name
Dim strFileName As String

‘ File name (including path) passed in from container
strFileName = Dts.Variables(“User::FileName”).Value.ToString

‘ Create a System.IO.FileInfo object to retrieve the data
Dim FileObject As System.IO.FileInfo = New System.IO.FileInfo(strFileName)

‘ Return the information via a message box
MsgBox(“File name ” + FileObject.FullName + ” created on ” + FileObject.CreationTime.ToString)

End Sub

The sample package for this post can be found here. To see what other properties can be obtained, just use Intellisense on the FileObject object in the code.

Read More

Replacing multiple characters in a Text String in SSIS

Today as part of some ongoing painful text file imports, I had to clean up some phone numbers that had been entered into a free text field. Consequently the data contained this kind of garbage data:

(03) 8765 4321

0387654321

03-87654321 – Bob’s mums number

(03)87654321 – or use 04 87654321 during day

Which didn’t lend itself to being fixed in a Derived Column, as every possible replace character would have to be hardcoded, like so:

REPLACE(REPLACE([PhoneNumberText],”(“,””),”(“,””) – and that’s just to clear the brackets. Doing this for evey possible stray character was out of the question.

What I wanted was just to get rid of any character which wasn’t a numeric. My mind immediately leapt to Regular Expressions, but couldn’t see how to do these easily in T-SQL against the source or at all in a standard SSIS component.

Fortunately Brian Knight has a solution – using Regular Expressions within a script component. So, quoting him almost verbatim:

In the Imports area, add the following line of code:

Imports System.Text.RegularExpressions

Next, replace the Input0_ProcessInputRow subroutine with the following code:

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

If Row.ProductNumber_IsNull = False Or Len(Row.ProductNumber.ToString) <> 0 Then

Dim pattern As String = String.Empty

Dim r As Regex = Nothing

pattern = “[^0-9]”
r = New Regex(pattern, RegexOptions.Compiled)

Row.ProductNumber = Regex.Replace(Row.ProductNumber, pattern, “”)

End If

End Sub

Which worked beautifully and saved me writing some truly awful code. However it does lead me to gripe about an absent feature of SSIS that bugs me. The limited number of functions and inability to create custom functions to use in the Derived Column transformation means it is easy to fall back on to the Script Component to do your dirty work. This annoys me because a) i’m not a VB programmer and b) it reduces the transparency of what is going on in the data flow. Fingers crossed this will improve in 2011 – i’ve added feedback on Connect to request this – please vote it up if you consider it important.

Read More

Upgrade to SQL2005 SP2, you fools!

Hitting one of these 3 messages in SSIS on SQL2005?

  • The task is configured to pre-compile the script, but binary code is not found
  • The script component is configured to pre-compile the script, but binary code is not found
  • The task cannot execute in 64-bit environment because the script is not pre-compiled

Hotfixes are available here for Build 1399 and here for Build 1500

All the above can be fixed by applying SP2. And really, if you haven’t put SP2 on your SQL2005 you had better have a good reason for it, as there’s lots of other little things which will catch you out on your BI projects if you don’t. It was released in March 2007 so it’s not exactly a new patch either.

Read More