Importing data from Excel having Mixed Data Types in a column (SSIS)

When we use Excel (Excel source in case of excel 2003 & OLEDB Source in case of excel 2007) as our data source in SSIS, we come across the fact that SSIS decides over the data type of a column of excel sheet based upon the contained metadata. Data source scans the metadata of the excel sheet and determine the data type of columns.

But in this process, there are certain other parameters or settings which also play an important role in this and once decided; all other values with other data type in the same column are imported as NULL.

Example: If we have string and numbers values in the same column then either it will take it as string or number while configuring the excel data source. But again if it considers it as number then string values would be imported as NULL and in case no. of string values are more (in first 8 cells) then it would take string data type which would allow both the values to be imported.

And interesting point to be noted here is, even if we forcefully select ‘DT_WSTR’ (string) data type in the first case, then also it would behave the same.

There are the following settings which cause this unwanted output & these are also the ways to configure them to have the correct results:

  • Setting IMEX=1 in the connection of the Excel sheet. This can be set in the connection string in case of Excel Source (Excel 2003) and can be set using ‘Extended Properties’ in case of OLE DB Source (Excel 2007). This setting tells that excel sheet is having mixed data types and thus import it as ‘Text’ values.

Then we can forcefully select ‘DT_NTEXT’ data type in the data source. 

  • Registry setting TypeGuessRows=8: This setting/value tells the connection manager to consider the first 8 rows of a column and decide over its data type based upon the maximum no. of values type in those 8 values. So if we have first 5 rows/cells of a column of type string but rest of the 100 values are numbers then it would import all the numbers as NULL since data type in this would be String (DT_WSTR). This will be applied for all the packages i.e. on server level.

So if we use a higher value for TypeGuessRows (say 20) then it would take it as Integer.

But again here also it would import rest of the string values as NULL. 

  • Using ‘SELECT * FROM [SheetName$]’ query instead of direct table/sheet name. This will allow all type of values to get imported as ‘DT_WSTR’ which is quite acceptable because anyhow it’s getting all the values of a column. This is really a good option to select.

Note: In all the cases where in we are getting string data (along with number data) be means of some or the other setting, we have to have String column in the destination.

About these ads
This entry was posted in SQL Server, SSIS. Bookmark the permalink.

26 Responses to Importing data from Excel having Mixed Data Types in a column (SSIS)

  1. Ruiweerve says:

    Many of folks write about this topic but you said some true words!

  2. Appreciate this. Very informative posting.

  3. dave says:

    Mate, the IMEX=1 solved it for me.
    Thank you muchly, talk about a freakin obscure bug.

  4. Hey thanks a lot, Dave.

  5. Rodrigo Salvaterra says:

    Verry good mate!
    Congratulations

  6. Thanks for your appriciation, Rodrigo !

  7. Matt says:

    Wow, you just saved me a few years of head-scratching.

    – Thanx!

  8. vishal says:

    First of all Thanks a lot
    but it gives me error
    when i try to read the excel data

    The error is “Could not find installable ISAM.”

    My connectionString is
    “Provider=Microsoft.Jet.OLEDB.4.0;Data Source= ” + filePath + “;Extended Properties=Excel 8.0;IMEX=1″

  9. vishal says:

    Thanks for reply….
    But i had managed to solve the problem…..

    The problem is the Connection String…

    By just Doing this

    @”Provider=Microsoft.Jet.OLEDB.4.0; Data Source= ” + FilePath + “;Extended Properties=\”Excel 8.0;HDR=NO;IMEX=1\””;

    The problem is solved…

    Thanks again

  10. Cool Article! My spouse and i had been simply just debating that there’s a whole lot absolutely wrong details at this matter and also you precisely replaced the belief. Many thanks for a marvelous contribute.

  11. Dennis says:

    Thanks very much! This problem had been driving me crazy. My research turned up many solutions referencing IMEX=1 and TypeGuessRows, but this is the first solution I see showing “select * from [Sheet1$]” which sounds like a much better option, so I will try that. Thanks again!

  12. Niyaz says:

    Hi Munish,

    First of all thanks for this awesome article, that gave a real good insight for me into the problem.

    But i am still facing the issue even after doing the following
    >> changed IMEX=1, it didnt work
    >> so changed the TypeGuessRows = 0 and IMEX = 0
    Still didnt work

    Any insight on it will be really appreciated

    FYR : when i did change HDR=NO and debugged my app. Did get all the columns as string, but i do need HDR=YES option in my case. As i dont have control over the Excel

    Also I am using Microsoft.ACE.OLEDB.12.0 since i am working with .xlsx
    string connectionString = @”Provider=Microsoft.ACE.OLEDB.12.0;Data Source=” + Source + “;Extended Properties=\”Excel 12.0;IMEX=0;HDR=YES\””;

    As i couldnt find any such exception for above provider in

    http://www.connectionstrings.com/excel

    Regards
    Niyaz Rasheed

  13. khan says:

    Hi Munish,

    i am using excel source in foreach loop where my excel sheet name change with the change in excel file. so how to modify “(SELECT * FROM [SheetName$])” independent of the sheet name. always select first sheet data.

  14. KiaraFR says:

    Maybe I have missed something, but the last solution ‘SELECT * FROM [SheetName$]‘ does not work with me… SSIS still “chooses itself” the data type and rejects the modifications I want to make in the advanced properties.
    Any idea ?

  15. manojo says:

    how to create dynamic excel file using ssis and and then store that excel into sql server table as binary datatype

  16. Anand says:

    Really very usefull about ur imex concept thanks

  17. Carrie says:

    Hi,

    I’m newbies in SSIS 2008. Much appreciate your sharing.
    what an impressive guidance, however i couldn’t proceed to the rest. specially when i come to Create Data1 in excel Sheet 2007.

    error: [Execute SQL Task] Error: Failed to acquire connection “Book1.xlsx”. Connection may not be configured correctly or you may not have the right permissions on this connection.

    Not too sure setting correct? as per below:

    specially the setting for Registry: TypeGuessRows=20 ?

    this is my scenarious, Export records from MYSQL to Excel 2007.

    Connection Manager:
    Data Source=;User ID=admin;Provider=Microsoft.ACE.OLEDB.12.0;Persist Security Info=False;Mode=ReadWrite;Extended Properties=”excel 12.0;HDR = YES;IMEX=1″;Jet OLEDB:Registry Path=TypeGuessRows=20;

    1. Execute SQL Task – (remove & replace a dummy book1.xlsx)
    2. Execute SQL Task
    CREATE TABLE `Data1` (
    `Copy of CORE_ACCOUNT_TYPE_CD` NVARCHAR(255),
    `Copy of DT_Posting` NVARCHAR(10),
    `Copy of Acct` DOUBLE PRECISION,
    `Copy of cbal` NUMBER(28,2),
    `Copy of Base` NVARCHAR(6)
    )GO
    3. Data Flow Task – (export records from MYSQL to book1.xlsx)

    Really Need your guidance…. Much appreciate it.

    Warm Regards,
    Carrie

  18. sivareddy says:

    can i insert excel sheet data to sql database with out oledb driver…..

  19. rabt2000 says:

    Reblogged this on rabt2000.

  20. sumit gupta says:

    Thank you very much for your post. It gave the huge support for me.

  21. abhijeet desai says:

    Really informative about IMEX.
    my application worked with IMEX,
    can we use Data Conversion to avoid this changes?
    So what is the difference between IMEX and Data Conversion tool?
    Just convert the type of the column before importing is i am right?
    need some explanation…….

  22. Thanks Vishal :)
    @”Provider=Microsoft.Jet.OLEDB.4.0; Data Source= ” + FilePath + “;Extended Properties=\”Excel 8.0;HDR=NO;IMEX=1\””;

    Solved my problem too for The error “Could not find installable ISAM.”

    Finally Thanks Munish for such a nice explanation.

  23. Hitha says:

    Hi, thank you for the post.
    my connection string is here : Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=’Excel 12.0;HDR=YES;IMEX=1;';
    I have a column, with string and numeric values. first 8 rows numeric and rest combination of both. By now I know that data type is determined based on TypeGuessRows which is 8 and this results in DBNull for the rows with string values.
    Only way I could solve this problem was by editing registry value for TypeGuessRows and setting it 0. It seems to be working now. Do you see any harm in this? Any suggestions welcome.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s