First up this only applies to SQL Server connections – other OLE DB connectivity such as Teradata and Oracle will be unaffected.
In the short term, nothing – if you are using OLE DB it will carry on working for a fair few years yet – but if you are planning for a systems longevity – look to using ODBC for all your SQL Server connectivity needs
It’s a slightly confusing move given that ODBC underperforms OLE DB – so it’s bad news for us SSIS people, unless part of the change also includes some significant performance and capability improvements.
In this first post on SSIS components I will be looking at the OLE DB source component. This is likely to be the most commonly used source in SSIS packages as it connects to a wide variety of OLE DB compliant databases, such as Oracle and SQL server.
There is a sample package reviewing the concepts in this post available here. Please make sure you review the guidance notes on using samples from this site here before downloading.
Data Access Mode
The data access mode is the means by which the OLEDB source component determines what query to send through to the underlying database to retrieve data. This can either be hardcoded in the component or populated using a variable. It is specified using the drop down circled in the image below.
There are four are available data access modes:
Table or view
Table or view name variable
SQL Command from variable
You will note that two of the available data access modes are essentially duplicates that allow the data access mode to be populated from a variable. Each mode is demonstrated in the sample packages, but I will only discuss the basic (i.e. non variable populated) modes below.
Data Access Mode: Table or view
The table view data access mode allows you to select the entire contents of any table or view within the database that you have connected to. However, it does not allow for any constraints on the data that is retrieved, and if you wanted to restrict the rows that you processed you would have to do that downstream in the package. From an efficiency point of view this is not ideal unless you genuinely needs to retrieve all the rows from the source database object. If you follow what is sent to the database using SQL profiler you will see that the component sends a ‘SELECT*’ query against the specified object.
Because of the inability to filter what data you retrieve from the database, this data access mode is not commonly used unless shifting data wholesale from one source to another.
Data Access Mode: SQL Command
The a SQL Command data access mode is much more powerful as it allows you to craft custom T-SQL queries to retrieve the specific data you require from the database, including the use of stored procedures to return data. An SQL Command can also be parameterised to allow even greater flexibility by passing package variables through to the query.
It is sensible to avoid having complex SQL in the SQL Command however, as the parsing engine can get confused, and can send incorrect SQL through to the database, or be unable to understand the columns that it is expected to return. When things start getting too complex, it is best to push the complex code back to the database layer as a view or stored procedure.
Under the columns tab of the OLEDB source component, it is possible to restrict which columns are passed from the component downstream by unchecking the column name as shown below:
Unselecting columns however has no effect on the query that is sent through to the database. If you look at example data flow 5 in the sample package you will see that a number of columns have been unselected, but if you track the query that is sent through to the database engine it is still a ‘SELECT *’.
This means that if you only need a few columns from a large table, should restrict this by using an SQL Command which only retrieves the columns you need as otherwise source component will still retrieve all the data in the table.
The OLE DB source is a common component used to access databases. As a rule you should try to use the SQL Command Data Access mode to reduce the amount of data retrieved from the database. Column selection has no impact on what is retrieved from the database, just on what is sent downstream to the package.
I have been fighting all morning to create a function which helps do a mind wrangling calculation for a client, and I win. All I needed to do was get SSIS to use it to update a table and my work was done.
Except SSIS decides that after i’ve been melting my brain all morning, what I need is some frustration.
The function accepts 4 parameters – a varchar(50), a datetime, a decimal (13,4) and another decimal (9,2). I then used the parameters in an OLE DB command as follows:
SET DATETIME_FIELD = FUNCTION(?,?,?,?)
WHERE KEY = ?
And SSIS complains about not being able to convert a decimal to a datetime. To be precise:
An OLE DB record is available. Source “Microsoft SQL Native Client” Hresult 0x80040E14 Description “Implicit conversion from data type datetime to decimal is not allowed. Use the CONVERT function to run this query.”.
I scratch my head here, check data types, but no luck with anything obvious. So I turn to SQL profiler to try and work out what it’s doing begind the scenes. And the OLEDB command is sending this to the server to validate:
SET FMTONLY ON select DATETIME_FIELD,DATETIME_FIELD,DATETIME_FIELD,DATETIME_FIELD,KEY from TARGET_TABLE where 1=2 SET FMTONLY OFF
So for reasons that confound me, to validate the SQL it is sending through for each parameter of the function it sends the target column of the SET statement. So validation fails, as not every parameter in my function is a datetime.
Unfortunately putting a CONVERT around the offending decimal columns to convert them to decimal (yeah, I know…) only helps temporarily, as when the command builds the SQL to execute the function, it still insists on sending the parameters as datetimes, and thus it stumbles at validation in a slightly different place.
Unfortunately the only workaround I can think of is to move the structure out of the Data Flow and put it in a loop, putting the SQL into a variable that is used by the Execute SQL task. Ugly, but I can’t see how to adjust what SSIS sends to the database engine.
Well here’s a crackingly helpful error message if there ever was one:
[<Component Name>] Error: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E21. An OLE DB record is available. Source: “Microsoft SQL Native Client” Hresult: 0x80040E21 Description: “Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done.”.
What does it all mean? Fortunately the answer is simple – the OLE DB component is trying to deliver a column to the target table and the target database won’t accept the input data type. The component has allowed the mapping because it should / might work but when it tries to execute – blecch! Tortellini!
In my case this was because I tried to push a Text Stream SSIS data type flow into a Varchar SQL Server column. Mapped fine, but failed when trying to run. Apparently a common problem when running data into a different database type to the source, e.g. Teradata to SQL Server.