Universal Database DB2: The Sequel

A couple of years ago, the relational database of the AS/400 had no name. Then, IBM referred to it as DB2/400. As of OS/400 V4R4, however, IBM has christened the relational-database-with-no-name as Universal Database (UDB). Find out the history of UDB and what new features will be coming with this new, improved database.

The AS/400’s integrated database in V4R4 has been renamed DB2 Universal Database (DB2 UDB). However, we AS/400 advocates know that DB2/400 by any other name would smell as sweet. So why the new name? Isn’t UDB just the same old DB2? UDB is the same database we have come to know and love but with a number of powerful new features. (These features won’t come immediately with V4R4, though; they will not be available until the third quarter of 1999.) DB2 UDB has become IBM’s cross-platform relational database (RDB). For us, this is a significant point because our good old DB2 is now on multiple hardware and software platforms, including Intel/Windows NT/98, Intel/OS/2, PowerPC/AIX, SPARC/Solaris, and HPPA/HPUX. With the increased interest in multiplatform Internet applications, our knowledge of DB2, and hence UDB, is becoming very marketable.

Get with the SQL Program

If you’re not an SQL proponent, stop reading, because all the enhancements to DB2/400 that come with its new name are available through SQL. But before you sign off, give me a moment to sell you on SQL. SQL is the true cross-platform language; DB2/400 is one of the few RDBs that allows you to create, read, update, delete, and otherwise maintain your data outside of SQL. Speed comparisons of record-level versus SQL access aside, SQL is the best way to describe your data, the best way to perform set-based processing, and the best way to perform ad hoc updates and retrievals. (Many SQL champions will argue even the record-level access issue with you, although they are experts in query optimization.) When RDBs were first invented, the intended method for manipulating data was always relational algebra. Now, you can quit reading, but only if you want to ignore the power and ubiquity of SQL.

A History Lesson

UDB was not developed recently; it has been on many IBM platforms for several years now. When UDB first became available, however, it was not called a universal database but rather an object-relational database (ORDB). A half-dozen years ago, vendors of RDB systems (such as Oracle, IBM, and Computer Associates) came out with ORDBs to compete with a new class of database systems: object-oriented databases (OODBs). OODBs came into vogue with the advent of object-oriented programming languages, initially with the success of C++ but more recently with Java. These OO languages dealt with objects, which, conceptually, are data structures tightly coupled with the functions that manipulate the data of those structures. RDBs are also structures of data that can be mapped to OO languages. The problem, however, was that RDBs did not have the facilities to handle the nontraditional data types that modern applications required, such as images, movie clips, graphics, spreadsheets, and also lengthy character-based documents such as legal records or international shipping specifications.

I worked with an OODB for almost two years, and even though I am an OO champion, I was not an OODB advocate. OODBs had problems. They reflected a completely new technology filled with unforeseen glitches. But the biggest problem I had was that OODBs had no query facilities. There was no way to verify database update code without writing more code, and then who’s to say that the verification code was correct? Also, several years ago, there was no way to create ad hoc reports with OODBs. Today, even though OODBs are making query facilities available, they’re still kludgy. The point is, when I’m developing business applications using C++, Java, RPG, or some other programming language, I want my RDB.

Regardless of what I thought, OODBs were gaining in popularity, and to compete, RDB vendors developed what were initially called ORDBs, which provided BLOBs (binary large objects) and CLOBs (character large objects) to support the nontraditional data as well as user-defined data types and user-defined SQL functions. But ORDBs aren’t really object-oriented; they don’t have inheritance or polymorphism capabilities—hence, in my opinion, the change in name from object-relational database to universal database.

Today’s applications need controlled storage for large pieces of data. Just look at the C drive on your PC, and you’ll see a variety of data types, including spreadsheets, images, and documents. Now, associate these data types with applications. Consider a real estate application, for example, which might require the controlled storage of graphical images of homes along with spreadsheet analyses of cost estimates. In this same package, a mortgage application might have lengthy, character-based legal documents as well as signatures that are not quite as long but, being binary, are nontraditional. Current applications also need data types that are strongly typed. For instance, a data type called Money might be a fixed-point, two-decimal, 10-digit field. The Money data type would have type-safety because you couldn’t add an unrelated integer value, such as a U.S. Postal Service ZIP code, to it. Applications today also need reusable functions that are more tightly coupled with data than with traditional application functions. UDB gives us these capabilities so we don’t have to resort to OODBs (and OO languages) to create modern applications.

Universal Appeal

IBM says its latest RDBs are universal because they have universal access, application support, extensibility, scalability, reliability, and management. They have universal access because UDB is accessible from a wide variety of protocols (see Figure 1). They have universal application support with Domino, Java, Net.Data, and legacy languages. They are universally extensible with their BLOBs, CLOBs, user-defined data types and functions, and datalinks (yet another strategy for supporting nontraditional data that I’ll get into later). These RDBs are universally scalable because of the symmetrical multiprocessing (SMP) and very large database (VLDB) capabilities of IBM platforms (the AS/400, for instance, supports up to 128 terabytes of disk storage for data warehousing). They are universally reliable because of IBM’s history of stable platforms (the AS/400 is known for its 99.9 percent availability rate). They have universal management because UDB can be configured and maintained with a graphical interface that is the same regardless of the UDB platform.

The BLOB Meets Rochester

Going back to my real estate example, see Figure 2 for the SQL syntax required to create a file (table in SQL parlance) called House. It’s pretty much the same old SQL create-table syntax but for the BLOBs and CLOBs. The HouseImage field is defined as a 2-MB BLOB. The Contract field is a 1-MB CLOB. The Signature field is another BLOB field that proves John Doe did, in fact, sign the contract. But what about that SalesPerson field, with its Datalink data type? Often, your LOBs may not reside on the same machine as your UDB, or you simply don’t want LOBs stored in your RDB. In UDB, the datalink data type allows you to specify the URL that refers to a LOB on your local or a remote machine. This allows you to keep LOBs in the Integrated File System (IFS), which is optimized for byte-stream nontraditional data like images. The SalesPerson field can store a URL that points to the location of LOB data on some platform connected via the Internet.
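As a rough illustration of how a datalink value might be stored and read back, the sketch below uses DB2's DLVALUE and DLURLCOMPLETE scalar functions against the House table from Figure 2. The contract number, the names, and the URL are invented for the example, and the URL is kept short enough to fit the 50-byte Datalink column.

```sql
-- Illustrative only: the row values and the URL are assumptions, not from the article.
-- DLVALUE builds a datalink value from a URL string.
INSERT INTO House (ContractNo, LastName, FirstName, MiddleInit, SalesPerson)
  VALUES (1001, 'Doe', 'John', 'Q',
          DLVALUE('http://www.example.com/staff/jsmith.jpg'));

-- DLURLCOMPLETE returns the complete URL held in the datalink column.
SELECT ContractNo, DLURLCOMPLETE(SalesPerson)
  FROM House
  WHERE ContractNo = 1001;
```

The image itself never enters the database; only the reference to it does, which is the point of keeping such LOBs in the IFS or on another server.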

User-defined Data Types

The House file does not contain a field for the sale price. Perhaps I should have created a field with the SQL data type of Decimal with the following:

SalePrice Decimal(10,2)

But remember that I talked about a Money data type, so why don’t I go ahead and create this new data type with the following SQL statement:

CREATE DISTINCT TYPE Money AS Decimal(10,2) WITH COMPARISONS;

Now, I have a new data type called Money. The Decimal data type declaration that follows the AS clause is the source type; it must be one of SQL’s built-in data types. The WITH COMPARISONS clause serves as a reminder that instances of the new distinct type can be compared with each other using six comparison operators: =, <, <=, >, >=, and <>. This clause allows comparisons between fields of the same distinct type. Because they can be compared, so too can they be used by the SELECT statement’s ORDER BY, GROUP BY, and DISTINCT clauses.
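To make the comparison rules concrete, here is a minimal sketch that adds the missing sale price to the House table as a Money column and then filters and sorts on it. The SalePrice column name comes from the discussion above; the 250000.00 threshold is an arbitrary value chosen for illustration.

```sql
-- Add the sale price using the new distinct type instead of a plain Decimal(10,2).
ALTER TABLE House ADD COLUMN SalePrice Money;

-- A Money column can be compared only with another Money value, so the
-- decimal literal is converted with the Money cast function first.
-- Writing "SalePrice > 250000.00" would be rejected as a type mismatch.
SELECT ContractNo, SalePrice
  FROM House
  WHERE SalePrice > Money(250000.00)
  ORDER BY SalePrice;
```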

The problem with user-defined data types is that the type-safety is so strong that you can’t easily perform mathematical operations on dissimilar data types. To add a standard decimal data type field to a type of field such as my new Money data type, for instance, you would need to use a technique known as casting. For example, suppose I want to increase all salaries by 5 percent. If the Salary field is of my Money type, I would have to use the following SQL statement:

UPDATE Employee SET Salary = Money(decimal(Salary) * 1.05);

The function called decimal casts the value of the Salary field to be of the decimal data type so it can be multiplied by 1.05. The Money function takes the result of that calculation (a decimal value) and converts it to the Money data type. These two functions, the decimal function that converts a Money parameter to a decimal and the Money function that converts a decimal parameter into a Money value, were automatically generated when I created the distinct data type Money. You would think that these function calls increase processing, but in practice, they process very efficiently because, after all, the Money data type really is a decimal.
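The same update can also be written with the standard CAST specification rather than the generated cast functions. This is a sketch of the equivalent form, assuming the same Employee table and Salary column as above; which style to use is largely a matter of taste.

```sql
-- Equivalent to Money(decimal(Salary) * 1.05), spelled with CAST.
-- The inner CAST exposes the underlying Decimal(10,2) value;
-- the outer CAST turns the product back into Money.
UPDATE Employee
  SET Salary = CAST(CAST(Salary AS Decimal(10,2)) * 1.05 AS Money);

-- By contrast, "SET Salary = Salary * 1.05" would fail here, because no
-- multiplication function has been defined for the Money type yet.
```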

User-defined Functions

You might expect to be able to add two values typed as my new Money data type by taking the following approach:

SELECT 'Total price: ', StickerPrice + RustProofing
FROM CAR
WHERE Make = 'HONDA' AND Model = 'CIVIC' AND YEAR = '1997';

Not so, however. You need to think of the plus (+) operator as a function, the add function. When you want to add two integers together, SQL already has an add function to do that, but when you want to add two values of a user-defined data type, you need to create an add function. Thankfully, the process is pretty easy:

CREATE FUNCTION "+"(Money, Money)
RETURNS Money
SOURCE "+"(Decimal(), Decimal());

The SOURCE clause says to convert each Money parameter to a decimal value and to use the built-in decimal data type’s plus function to perform the addition operation. Optionally, I could have created a function called add:

CREATE FUNCTION add (Money, Money)
RETURNS Money
SOURCE "+"(Decimal(), Decimal());

But then the SELECT statement above would have to say add (StickerPrice, RustProofing). The use of the plus operator is more intuitive.

If you want to be able to multiply a Money value by an integer, you have to create a multiplication function:

CREATE FUNCTION "*"(Money, Integer)
RETURNS Money
SOURCE "*"(Decimal(), Integer);

And if you further want to perform aggregate functions, such as averaging, you have to explicitly define such functions:

CREATE FUNCTION avg (Money)
RETURNS Money
SOURCE avg (Decimal());
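Once sourced functions like these exist, queries against Money columns should read much like queries against ordinary decimals. The statements below are a sketch that assumes the "+", "*", and avg functions above have been created and can be found through the current SQL path; the Car table and its columns are the ones from the earlier example.

```sql
-- Money + Money now resolves to the user-defined "+" function.
SELECT Make, Model, StickerPrice + RustProofing AS TotalPrice
  FROM Car
  WHERE Make = 'HONDA';

-- Money * Integer resolves to the user-defined "*" function.
SELECT Make, Model, RustProofing * 2 AS DoubleCoat
  FROM Car;

-- The sourced avg function lets the aggregate work on a Money column
-- and returns a Money result.
SELECT Make, avg(StickerPrice) AS AvgSticker
  FROM Car
  GROUP BY Make;
```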

So far, all the functions I have created turn around and ask SQL built-in functions to do the dirty work, but you can also program these functions yourself with your favorite high-level language (HLL). You may want to create functions that define behaviors that are significant to the distinct data type. For performance, you might want to create functions that execute some sort of predicate processing to decrease the resulting set before transferring the data to the client:

CREATE FUNCTION orderSelect (Money, Varchar(30))
RETURNS TABLE (orderNum Integer, customerNum Integer, total Money)
EXTERNAL NAME 'ORD023RG'
LANGUAGE RPG;
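A function that returns a table is invoked in the FROM clause rather than in the select list. The sketch below shows what a call might look like. The article does not say what the two parameters mean, so treating them as a minimum order total and a customer name is purely an assumption, as are the argument values.

```sql
-- Hypothetical call: the meaning of both arguments is assumed, not documented.
-- TABLE(...) turns the function result into a row source, and the
-- correlation name (t) is required.
SELECT t.orderNum, t.customerNum, t.total
  FROM TABLE(orderSelect(Money(500.00), 'ACME SUPPLY')) AS t
  ORDER BY t.total;
```

Because the filtering happens inside the RPG program on the server, only the rows that survive it travel to the client.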

Cross-platform Databases, Applications, and Programmers

A couple of years ago, the AS/400 was considered a has-been. Now, the AS/400 has been reborn with a RISC-based, 64-bit, reliable, and secure operating system and an RDB that is standard across a wide variety of platforms and operating systems. Thanks to this rebirth, our applications and careers seem to have a bright future. DB2 UDB also gives us the vehicle to transport our applications, databases, and careers to other platforms. The only potential caveat is that we’ll have to use SQL to get there, which is a small price to pay for admission.

ODBC       Microsoft Windows Database Access Standard
OLE DB     Microsoft Windows Information Access Standard
ADO        Microsoft Windows Information Access Classes
JDBC       Java-based Database Connectivity
SQLJ       Java-based Embedded SQL
SQL        Standard SQL
DRDA       X/Open Distributed Database Standard
CLI        X/Open Database Access Standard
EDA/SQL    IBI's EDA SQL Standard
DAL        Apple's RDB Standard APIs
Net.Data   Internet Data Access

Figure 1: UDB has what IBM considers universal access with support for a wide variety of protocols.

Create Table House (
  ContractNo   Integer,
  LastName     Char(25),
  FirstName    Char(15),
  MiddleInit   Char(15),
  HouseImage   BLOB(2M),
  Contract     CLOB(1M),
  Signature    BLOB(1M),
  SalesPerson  Datalink(50)
);

Figure 2: The SQL syntax to create a table that contains character and binary large objects is simple.
