DB2 Integration with SQL Server 2005, Part I: Linked Server Enhancements

DB2
Typography
  • Smaller Small Medium Big Bigger
  • Default Helvetica Segoe Georgia Times

As the old saying goes, "good things come to those who wait." At last, the long-anticipated release of SQL Server 2005 has finally arrived, and there are plenty of good things in it. This article focuses on how SQL Server 2005 makes integration with the DB2 database easier thanks to linked server enhancements.

What Is a Linked Server?

Introduced way back in SQL Server 7.0, linked servers allow SQL Server to query OLE DB and ODBC data sources. This means a data source can be another SQL Server, a DB2 database, a spreadsheet, a text file, or almost any other data store. Once the linked server has been defined, remote data sources can participate in joins with local tables or other linked servers. Depending on the capability of the OLE DB provider, the linked server data source may even be updateable. For instance, SQL Server can be used to update or delete rows in the table of a DB2 database based on records in a local table.

As a brief review of using linked servers in Transact-SQL (T-SQL is SQL Server's dialect of SQL), say we have a linked server called DB2400 that points to a DB2 database on the iSeries. Data can be queried from the linked server by using a four-part naming convention or by using the OpenQuery function:

-- Query DB2 table using four part table name
Select*
  From DB2400.S104X824.QIWS.QCUSTCDT A
  Join CustInfo.dbo.customers B On B.CustomerId=A.CustID

-- Query DB2 table using OpenQuery function
Select*
  FromOpenQuery(DB2400,
       'Select * From QIWS.QCUSTCDT')AsRemote
  Join CustInfo.dbo.customers B On B.CustomerId=Remote.CustID

Of course, more complicated queries can be constructed. For a thorough review of linked servers and their capabilities, see "Running Distributed Queries with SQL/400 and SQL Server 7.0" (AS/400 Network Expert, September/October 2000) and "Patch Those Leaky Interfaces" (MC Mag Online, June 2003).

Setting Up a Linked Server

A linked server must be configured before it can be used. Setup requires two steps:

  1. Install the desired ODBC driver and set up a data source name (or install the desired OLE DB provider). This article uses the ODBC driver that comes with iSeries Access V5R3.
  2. Create the linked server by using T-SQL statements (or by using the new SQL Server Management Studio object explorer graphical interface. Expand the Server Objects node to find the Linked Server definitions. Note: This location for maintaining linked servers is a change from the SQL Server 2005 beta and prior editions of SQL Server, where linked servers were located under the Security node.)

The code below demonstrates how to create a linked server called DB2400 that is based on an ODBC connection with a DSN name called AS400:

--
-- Create Linked Server using ODBC DSN "AS400"
--
sp_addlinkedserver
    @server=N'DB2400',
    @srvproduct=N'DB2 UDB for iSeries',
    @provider=N'MSDASQL',
    @datasrc=N'AS400',
    @provstr='CMT=0;SYSTEM=as400.mycompany.com',
    @catalog='S104X824'
go
--
-- Define the credentials that will be used to
-- access objects hosted by the Linked Server
--
sp_addlinkedsrvlogin @rmtsrvname=N'DB2400',
                     @useself='false',
                     @rmtuser=N'MyUser',
                     @rmtpassword='MyPassword'
go
--
-- RPC option is required for doing EXEC AT
--
EXECsp_serveroption'DB2400','rpc out', true
go

The built-in stored procedure sp_addlinkedserver is used to register a linked server. The provider name in this case is MSDASQL, which is the OLE DB wrapper around the ODBC APIs. The data source (datasrc) parm contains the name of the ODBC DSN, which is configured under administrative tools (on Windows 2000 and above). The provider string (provstr) parameter overrides any of the ODBC DSN's default settings. Finally, the catalog parameter should be the name of the relational database entry for your iSeries (by default, it is the same as the system name.)

Stored procedure sp_addlinkedsrvlogin defines the login credentials for the server. In the example, the rmtUser and rmtPassword entries dictate the credentials to be used when SQL Server attempts to access a resource on the linked server.

Finally, in order for a linked server to use some of the new linked server features, the rpc out setting must be set to true. This can be accomplished programmatically by using sp_serveroption, as shown in the example.

To remove the linked server entry, issue the following commands:

sp_droplinkedsrvlogin @rmtsrvname=N'DB2400',@locallogin=Null
go
sp_dropserver @server=N'DB2400'
go

 

Latest Enhancements

SQL Server's ability to access remote data sources via linked servers often makes the programmer's life easier by easily allowing remote data to be joined and updated using T-SQL. However, linked servers have suffered from a few deficiencies, including the inabilities to use parameter markers, execute stored procedures, and execute Data Definition Language (DDL) statements such as CREATE TABLE. Under these circumstances, Data Transformation Services (DTS) or another tool was used.

These other programming options may no longer be needed because linked servers have increased capability. The EXEC command has been enhanced with an AT clause to specify that an SQL statement should execute on a linked server:

EXEC statement AT linked server.

Keep in mind that the SQL statement submitted with EXEC AT will be in the SQL dialect of the remote server (in this case DB2).

Using the DB2400 linked server (defined above), here is a sample of how to use EXEC AT to retrieve data from the iSeries:

--
-- As with OpenQuery, enhanced EXEC support allows pass 
-- through queries to be executed on a linked server.
--
EXEC('SELECT * FROM DATALIB.OrderHdr WHERE OrderID>1')
AT DB2400

Here is an example of issuing a DDL statement followed by an insert to a remote DB2 table using the four-part table convention:

--
-- EXEC AT can issue DDL statements
--
EXEC('Create Table DATALIB.OrderTemp 
(OrderId Int Not Null, PartId Char(15) Not Null)'
AT DB2400
GO

InsertInto DB2400.S104X824.DATALIB.ORDERTEMP
Values(1,'BICYCLE')
GO

This kind of power allows a developer to write a program capable of complex cross-platform database access entirely in T-SQL.

Further, parameter markers can now be used to create optimized, re-useable statements. Here is an example of how the previous insert statement can be done using the new EXEC AT syntax:

-- Insert Example
Declare @OrderId Int
Declare @PartId  Char(15)
Set @OrderId=2
Set @PartId='MOTORCYCLE'
EXEC('Insert Into DATALIB.OrderTemp Values (?,?) 
       With NC',@OrderId,@PartId) AT DB2400

Update statements are allowable as well:

-- Update example
Declare @OrderId Int
Declare @PartId  Char(15)
Set @OrderId=2
Set @PartId='MOTORCYCLE1'
EXEC('Update DATALIB.OrderTemp 
          Set PartId=? 
        Where OrderId=?',@PartId,@OrderId) AT DB2400

Keep in mind that, while most Data Manipulation Language (DML) statements can also be accomplished with the traditional four-part syntax, EXEC AT allows the additional flexibility of using things like built-in functions and user-defined functions that are defined only on the host. For example, if you need to use a datalink-related DB2 function, you would only be able to invoke this function using EXEC AT because T-SQL knows nothing about the DB2 datalink data type and related functions.

Here's an example of a stored procedure call that accepts a parameter.

Set NoCount On
Declare @OrderID Int
Set @OrderID=10249
--
-- Call Parameterized Stored Procedure
--
Exec('Call DATALIB.GetOrders (?)', @OrderID) AT DB2400

To retrieve the result of an output variable from a stored procedure, specify the OUTPUT keyword after the variable name, as follows:

-- Test Stored Proc with output variable
Set NoCount On
Declare @OrderID Int
Exec('Call DATALIB.GETNEXTORDER (?)', @OrderID OUTPUT) AT DB2400

Select @OrderID

If the stored procedure produces a result set, it will be returned as well. This can be beneficial in a client/server environment for applications using both DB2 and SQL Server calls because both systems can be accessed from a single SQL Server connection. This setup relieves the need to have DB2 connectivity installed and configured on each desktop.

One drawback of EXEC AT is the limited programmatic support for SELECT and stored procedure result sets. While EXEC AT provides the ability to specify complex and parameterized SELECTs that will execute more efficiently on the remote server, EXEC AT does not allow joins or programmatic access to the result set. However, this limitation can be overcome by returning the results from EXEC AT to a temporary table using the INSERT/EXEC statement.

--
-- The following Insert Exec example requires
-- a Distributed Transaction so the MS-DTC
-- service must be running.
--
Set NoCount On
Declare @OrderID Int
Set @OrderID = 10248

IfObject_ID('tempdb..#tmpOrderHdr')IsNotNull
    DropTable #tmpOrderHdr

CreateTable #tmpOrderHdr
(OrderID    IntNotNullPrimaryKey,
 CustomerID VarChar(5)NotNull)

--
-- Insert data into local SQL Server table
-- from parameterized SELECT in DB2
--
InsertInto #tmpOrderHdr
Exec('SELECT OrderID,CustomerID 
         FROM DATALIB.OrderHdr 
        WHERE OrderID=? With NC', @OrderID) AT DB2400
--
-- Do programmatic access to result set 
-- such as cursor processing or JOINs here.
--
Select*
  From #tmpOrderHdr

Alternatively, INSERT/EXEC can be used against a stored procedure instead of a SELECT:

--
-- A DB2 stored procedure can be executed
--
Set @OrderID = 10249
InsertInto #tmpOrderHdr
Exec('Call DATALIB.GetOrders (?)', @OrderID) AT DB2400

Running INSERT/EXEC against a linked server requires a distributed transaction. In a nutshell, a distributed transaction (DT) creates an environment where database consistency and atomicity concepts are broadened across multiple database (even heterogeneous) servers. In a DT, data modifications must be successful on multiple systems in order for the changes to be committed. Whereas transactions in the DB2 world normally require journaling, it is not required in this scenario to run an INSERT/EXEC statement.

To use a distributed transaction, the following setup is required:

  • The MS-DTC (distributed transaction coordinator) service must be running.
  • You must have iSeries Access V5R1 or higher (make sure to use the latest service pack as there have been various bugs).
  • The MS-DTC service should have support for XA transactions enabled. On a Windows XP machine, this can be accomplished by selecting Control Panel -> Administrative Tools -> Component Services. Expand the component services and computers node. Right-click on My Computer and choose Properties. Select the MSDTC tab and choose the Security Configuration button. Next, click on Enable XA Transactions. Figure 1 shows a sample configuration screen shot.


http://www.mcpressonline.com/articles/images/2002/Linked%20Server%20EnhancementsV4--03220600.jpg

Figure 1: Here's an example of what your configuration screen would look like. (Click image to enlarge.)

OLE DB Provider IBMDASQL

Linked servers use Microsoft's OLE DB technology. OLE DB providers are often preferred over ODBC drivers because using ODBC requires an extra programmatic interface layer between the OLE DB and ODBC standards. However, in the case of the iSeries, the ODBC driver offers the better of the two options because it supports more features. Because IBM continues to enhance the OLE DB providers for the iSeries, I thought it worthwhile to test one of them.

iSeries Access V5R3 comes with a new SQL-only OLE DB provider called IBMDASQL. Below is a T-SQL script that defines a linked server called DB2400OLEDB that uses the IBMDASQL provider.

sp_addlinkedserver @server=N'DB2400OLEDB',
                   @srvproduct=N'DB2400 UDB for iSeries',
                   -- IBMDA400/IBMDASQL are OLE DB providers 
                   -- for DB2 UDB for iSeries. IBMDASQL is
                   -- available with iSeries Access V5R3
                   @provider=N'IBMDASQL',
                   -- System Name
                   @datasrc=N'AS400.MYCOMPANY.COM',
                   @catalog='S104X824'

go

sp_addlinkedsrvlogin @rmtsrvname=N'DB2400OLEDB',
                     @useself='false',
                     @rmtuser=N'MyUser',
                     @rmtpassword='MyPassword'
go
--
-- RPC option is required for doing EXEC AT
--
EXEC sp_serveroption 'DB2400OLEDB','rpc out', true

By substituting the DB2400OLEDB linked server name in the prior T-SQL examples, I found a mixture of things that did and didn't work (including differences in what worked in the SQL Server 2005 June CTP beta vs. the released product). Because of these issues, I wouldn't recommend using IBMDASQL at this time.

The Integration Factor

When integrating data from heterogeneous data sources, what used to take loads of time to program now can be done quickly thanks to these linked server enhancements. And if SQL Server 2005's linked servers don't offer enough data access versatility, custom stored procedures and table-valued functions can be written in a .NET programming language such as C# or VB.NET. I'll discuss this concept in Part 2 of this series.

If you're thinking of upgrading to SQL Server 2005, know that there are plenty of new tools to help with data integration challenges, whether with the iSeries edition of DB2, Oracle, or MySQL. Microsoft has done a good job of creating versatile and easy-to-use programming tools to make disparate systems talk seamlessly.

BLOG COMMENTS POWERED BY DISQUS

LATEST COMMENTS

Support MC Press Online

$0.00 Raised:
$