How to read a Flat File with data delimited by any wild card character like ‘#’, ‘$’ (SSIS)

Requirement:

Generally we deal with Flat Files which are comma (,) separated i.e. CSV files. And in SSIS by making use of ‘Flat File Source’ component we can read them very easily. These files can contain results/data from various sources like IBM, Mainframe, SAP etc. They may also be manually created or generated through some custom applications. And thus due to some specific requirements or limitations; they may have data which is not comma (,), bar (|), semi colon (;) etc delimited. These are among the few possible options available in Flat File connection manager used while reading the Flat Files in SSIS.

Apart from that there may be the requirement of having data containing comma (,) in their values itself. And thus we cannot just replace all the delimiters like dollar or # with a comma to have a comma separated file.

So to make it possible & read the records from a Flat File which can have any known separator for its column values, we need to do some kind of exercise explained below.

 Solution:

Since we don’t have any inbuilt option to select ‘#’ or ‘$’ etc as a delimiter for Flat File in connection manager, we would take the whole line/record into one column and then would split out all the values as per the present delimiter, using a Script Component.

Let’s take an example:

We have a .txt (Flat) file (say InputFile.txt) with three columns A, B and C. These headers are comma separated but the values under these columns are ‘#’ delimited.

Data into the file:

A,B,C

100#200#300

500#600#700

 Steps for the SSIS Package:

  1. Take a Data Flow Task (DFT).
  2. On the data flow tab, Drag one ‘Flat File Source’, edit it and click on ‘Flat File Connection Manager -> New’.
  • Name it as ‘SourceFileConnectionManager’.
  • Select the ‘Inputfile.txt’ in the File Name column by browsing the file.
  • Select Format as ‘delimited’.
  • Select ‘Header row delimiter’ as ‘Comma {,}’.
  • Select the check box for ‘Column names in the first row’.

1

   

 

 

 

 

 

 

 

 

 

  

 

  • Go to Column tab of this connection manager.
  • Select ‘{CR} {LF}’ as Row delimiter.

Screen would look like as:

2

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

  

As you can see all the data is now coming into a single column named ‘A,B,C’. 

  • Click OK.
  • Go to ‘Columns’ tab of the Flat File Source.
  • Rename the Output column as ‘A’.

3

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

  • Click Ok.

 3. Drag a Script Component & select to use it as a Transformation while adding it.

  • Point the output of Flat File Source to this script component.
  • Edit its properties and select “A” as an input column there.
  • Go to ‘Input and Output’ tab and add three output columns (col1, col2 & col3) to the existing output ‘Output 0’. Select ‘four byte signed integer’ as their data type.

4

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

  • Modify the ‘SyncronizeInputId’ of Output0 to 0. This is to make the script component asyncronized.

5

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

  • Go to ‘Script’ tab and click on ‘Script Designer..’ to edit the script (VB.Net).
  • Select ‘Pre Compile’ as False.
  • Write the following script in the ‘Input0_ProcessInputRow’ event handler:

         Dim values() As String

        values = Row.A.Split(“#”.ToCharArray)

         Output0Buffer.AddRow()

        Output0Buffer.Col1 = Convert.ToInt32(values(0))

        Output0Buffer.Col2 = Convert.ToInt32(values(1))

        Output0Buffer.Col3 = Convert.ToInt32(values(2)) 

  • Close this and Click Ok.
  1. Take one Flat File Destination and with the following shown setting configure its connection manager (DestinationFlatFileConnection).

6

 

 

 

 

 

 

 

 

 

 

 

7

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

  • Go to Advanced tab and modify the name of the columns from col1, col2 & col3 to A, B & C.
  • Go to the ‘Mappings’ tab of the flat File Destination and do the below shown column mappings:

8

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Now execute the package to have the following output in the ‘OutputFile.txt’ file.

 A,B,C

100,200,300

500,600,700

———————————————————————————————–

About these ads
This entry was posted in SQL Server, SSIS, Uncategorized. Bookmark the permalink.

7 Responses to How to read a Flat File with data delimited by any wild card character like ‘#’, ‘$’ (SSIS)

  1. Pawan K.C. says:

    Hi Munish

    I have gone through your blog and like to ask one question with you.I have a requirement that I get the txt files with large no ofdata(sometimes in GB) and have to bulk insert those files in thedatabase.It is easy to do the bulk insert by SSIS or T-SQL Query if iknow the no columns or header information, but the main problem isthat i am unaware of the no of columns in the file(here i know theformat it may be comma delimited,tab delimited or pipe delimited).Soits very much tough for me to solve this problem .Is there any ways tobulk insert the file in DB using SSIS.

    Regards
    ——————————–
    Pawan KC

  2. ip camera says:

    hey. I just got word of this unique webpage and I should really believe that this unique is a nice write-up. Bless you for this kind of awesome info.

  3. Palas says:

    Hi Munish,

    I found your blog to be really very useful. Great going, keep it up.

    I have a scenario where in I would get input files from Business where I would not know what delimiter is used. However it would be any one of the special characters (like you have explained in your example above).

    for Ex:

    it could be like the below one’s:

    Scenario-1 with Comma as delimiter:
    Id,Name,qty,price
    1,abc,765,28

    Scenario-2 with Hash as delimiter – Note :- the number of columns are varying below:
    Id#Qty#price
    2#76#29

    Is there a way to process such varing input files using a single pacakage? as all these files are dropped in a single folder.

    Your thoughts please?

    Regards,
    Pals.

  4. Erlend says:

    ‘#’ or ‘$’ are not available from the drop-down delimiter selector…. but you can type them into the drop-down delimiter selector…

  5. Tanmayi says:

    Hi Munich,
    I have come across a similar problem where in I am using a script component to read the file into a single column. However, since the data is huge, the single column is exceeding 8000 characters and hence is failing to read. Is there any other option to this? Your help would be appreciated.

  6. Nazzer says:

    Hi,

    I am unable to convert ‘SyncronizeInputId’ of Output0 to 0 and there is error in the “Output0Buffer” word in the script.

    Pls help

    Thanks in advance…

  7. Tony says:

    This is very helpfull . are you still using this blog? i have a question similar to the one you worked on but the difference is that i have a colon separating a row and i need to split into two columns. for instance
    boys:101
    girls:102
    kids:6
    i need to make two columns 1 for the words and another column for the numbers .
    thanks in advance

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s