In this first post on SSIS components I will be looking at the OLE DB source component. This is likely to be the most commonly used source in SSIS packages as it connects to a wide variety of OLE DB compliant databases, such as Oracle and SQL server.
Data Access Mode
The data access mode is the means by which the OLEDB source component determines what query to send through to the underlying database to retrieve data. This can either be hardcoded in the component or populated using a variable. It is specified using the drop down circled in the image below.
There are four are available data access modes:
- Table or view
- Table or view name variable
- SQL Command
- SQL Command from variable
You will note that two of the available data access modes are essentially duplicates that allow the data access mode to be populated from a variable. Each mode is demonstrated in the sample packages, but I will only discuss the basic (i.e. non variable populated) modes below.
Data Access Mode: Table or view
The table view data access mode allows you to select the entire contents of any table or view within the database that you have connected to. However, it does not allow for any constraints on the data that is retrieved, and if you wanted to restrict the rows that you processed you would have to do that downstream in the package. From an efficiency point of view this is not ideal unless you genuinely needs to retrieve all the rows from the source database object. If you follow what is sent to the database using SQL profiler you will see that the component sends a ‘SELECT*’ query against the specified object.
Because of the inability to filter what data you retrieve from the database, this data access mode is not commonly used unless shifting data wholesale from one source to another.
Data Access Mode: SQL Command
The a SQL Command data access mode is much more powerful as it allows you to craft custom T-SQL queries to retrieve the specific data you require from the database, including the use of stored procedures to return data. An SQL Command can also be parameterised to allow even greater flexibility by passing package variables through to the query.
It is sensible to avoid having complex SQL in the SQL Command however, as the parsing engine can get confused, and can send incorrect SQL through to the database, or be unable to understand the columns that it is expected to return. When things start getting too complex, it is best to push the complex code back to the database layer as a view or stored procedure.
Under the columns tab of the OLEDB source component, it is possible to restrict which columns are passed from the component downstream by unchecking the column name as shown below:
Unselecting columns however has no effect on the query that is sent through to the database. If you look at example data flow 5 in the sample package you will see that a number of columns have been unselected, but if you track the query that is sent through to the database engine it is still a ‘SELECT *’.
This means that if you only need a few columns from a large table, should restrict this by using an SQL Command which only retrieves the columns you need as otherwise source component will still retrieve all the data in the table.
The OLE DB source is a common component used to access databases. As a rule you should try to use the SQL Command Data Access mode to reduce the amount of data retrieved from the database. Column selection has no impact on what is retrieved from the database, just on what is sent downstream to the package.