Convert Text Stream to String

One of the ongoing challenges with SSIS is its difficulty in handling complex or damaged text files. One approach to dealing with such files is to bring them all in as one wide text column and then split them using code. Sometimes, the file is too wide for that approach, so below is an extension of that method where you import the column as a text stream (DT_TEXT, or Unicode DT_NTEXT) and then split the text stream in a script transformation:

        ‘ Declare variables
        Dim TextStream As Byte()            ‘ To hold Text Stream
        Dim TextStreamAsString As String    ‘ To Hold Text Stream converted to String
        Dim StringArray() As String         ‘ To contain split Text Stream

        ‘ Load Text Stream into variable
        TextStream = Row.TextStreamColumn.GetBlobData(0, CInt(Row.Column0.Length))

        ‘ Convert Text Stream to string
        TextStreamAsString = System.Text.Encoding.ASCII.GetString(TextStream)

        ‘ Split string into array and output
        StringArray = TextStreamAsString.Split(“#”)        

        Row.Column1 = StringArray(1).ToString
        Row.Column2 = StringArray(2).ToString
        Row.Column3 = StringArray(3).ToString  

An important thing to note is that in the step where the Text Stream is converted to a string, the Encoding will depend on the type of text stream you are bringing in – Unicode files will need “Unicode” instead of “ASCII”. Also I have used a hash (“#”) as the column delimiter but that value will vary depending on what type of file you are bringing in.

Read More