When we use Excel (Excel source in case of excel 2003 & OLEDB Source in case of excel 2007) as our data source in SSIS, we come across the fact that SSIS decides over the data type of a column of excel sheet based upon the contained metadata. Data source scans the metadata of the excel sheet and determine the data type of columns.
But in this process, there are certain other parameters or settings which also play an important role in this and once decided; all other values with other data type in the same column are imported as NULL.
Example: If we have string and numbers values in the same column then either it will take it as string or number while configuring the excel data source. But again if it considers it as number then string values would be imported as NULL and in case no. of string values are more (in first 8 cells) then it would take string data type which would allow both the values to be imported.
And interesting point to be noted here is, even if we forcefully select ‘DT_WSTR’ (string) data type in the first case, then also it would behave the same.
There are the following settings which cause this unwanted output & these are also the ways to configure them to have the correct results:
- Setting “IMEX=1” in the connection of the Excel sheet. This can be set in the connection string in case of Excel Source (Excel 2003) and can be set using ‘Extended Properties’ in case of OLE DB Source (Excel 2007). This setting tells that excel sheet is having mixed data types and thus import it as ‘Text’ values.
Then we can forcefully select ‘DT_NTEXT’ data type in the data source.
- Registry setting “TypeGuessRows=8”: This setting/value tells the connection manager to consider the first 8 rows of a column and decide over its data type based upon the maximum no. of values type in those 8 values. So if we have first 5 rows/cells of a column of type string but rest of the 100 values are numbers then it would import all the numbers as NULL since data type in this would be String (DT_WSTR). This will be applied for all the packages i.e. on server level.
So if we use a higher value for TypeGuessRows (say 20) then it would take it as Integer.
But again here also it would import rest of the string values as NULL.
- Using ‘SELECT * FROM [SheetName$]’ query instead of direct table/sheet name. This will allow all type of values to get imported as ‘DT_WSTR’ which is quite acceptable because anyhow it’s getting all the values of a column. This is really a good option to select.
Note: In all the cases where in we are getting string data (along with number data) be means of some or the other setting, we have to have String column in the destination.
Many of folks write about this topic but you said some true words!
Appreciate this. Very informative posting.
Thanks, Ruiweerve.
Mate, the IMEX=1 solved it for me.
Thank you muchly, talk about a freakin obscure bug.
Hey thanks a lot, Dave.
Verry good mate!
Congratulations
Thanks for your appriciation, Rodrigo !
Wow, you just saved me a few years of head-scratching.
- Thanx!
Thanks Matt.
First of all Thanks a lot
but it gives me error
when i try to read the excel data
The error is “Could not find installable ISAM.”
My connectionString is
“Provider=Microsoft.Jet.OLEDB.4.0;Data Source= ” + filePath + “;Extended Properties=Excel 8.0;IMEX=1″
Hi Vishal,
ref: http://support.microsoft.com/kb/209805
Thanks for reply….
But i had managed to solve the problem…..
The problem is the Connection String…
By just Doing this
@”Provider=Microsoft.Jet.OLEDB.4.0; Data Source= ” + FilePath + “;Extended Properties=\”Excel 8.0;HDR=NO;IMEX=1\”";
The problem is solved…
Thanks again
Cool Article! My spouse and i had been simply just debating that there’s a whole lot absolutely wrong details at this matter and also you precisely replaced the belief. Many thanks for a marvelous contribute.
Thanks very much! This problem had been driving me crazy. My research turned up many solutions referencing IMEX=1 and TypeGuessRows, but this is the first solution I see showing “select * from [Sheet1$]” which sounds like a much better option, so I will try that. Thanks again!
Hi Munish,
First of all thanks for this awesome article, that gave a real good insight for me into the problem.
But i am still facing the issue even after doing the following
>> changed IMEX=1, it didnt work
>> so changed the TypeGuessRows = 0 and IMEX = 0
Still didnt work
Any insight on it will be really appreciated
FYR : when i did change HDR=NO and debugged my app. Did get all the columns as string, but i do need HDR=YES option in my case. As i dont have control over the Excel
Also I am using Microsoft.ACE.OLEDB.12.0 since i am working with .xlsx
string connectionString = @”Provider=Microsoft.ACE.OLEDB.12.0;Data Source=” + Source + “;Extended Properties=\”Excel 12.0;IMEX=0;HDR=YES\”";
As i couldnt find any such exception for above provider in
http://www.connectionstrings.com/excel
Regards
Niyaz Rasheed
Hi Munish,
i am using excel source in foreach loop where my excel sheet name change with the change in excel file. so how to modify “(SELECT * FROM [SheetName$])” independent of the sheet name. always select first sheet data.
Maybe I have missed something, but the last solution ‘SELECT * FROM [SheetName$]‘ does not work with me… SSIS still “chooses itself” the data type and rejects the modifications I want to make in the advanced properties.
Any idea ?
how to create dynamic excel file using ssis and and then store that excel into sql server table as binary datatype