Practical SQL: Using Old World Tools with New World Data

Using DDL to define your files provides a wealth of new features, but just which features should you embrace? Some of that depends on the tools you use, and this article explains a couple of pitfalls.

 

The ongoing SQL vs. native I/O debate really consists of two different debates: how to define your data and how to access it. Access is either via native I/O or SQL, but that's not the topic today. Today, I'm focused on the data definition debate, either the older Data Definition Specifications (DDS) or the more modern SQL-based Data Definition Language (DDL). I firmly believe this dispute has been decided in favor of DDL, but DDL is not perfect. Some potential pitfalls exist, and this article will address two of the more onerous ones.

When Is DDL Appropriate?

I'm going to go out on a limb here and say that you should create all of your new tables using DDL. Every one. If you can provide a situation where DDL doesn't do something that DDS does, I can almost guarantee that your application could be rewritten relatively easily in such a way as to no longer need that feature. Of course, the key word "relatively" covers a lot of ground, and there are those diehard DDS fans who point out that DDL just doesn't have all the features of DDS. This is a valid criticism, especially if you like to use select and omit on your logical views and then run them through a program with level breaks. But I'm not going to debate the point; for today's discussion, let's assume you want to use DDL.

What sort of things might you want to look out for? Well, sometimes you'll have to use some non-standard keywords to do things in DDL that are specific to the IBM i. For example, you'll need to know some extra keywords to generate a record format name that's different from the file name, or to put both column headings and descriptive text on a field. Look for tips on those specific features another day.

 

For me, one of the biggest things has been to be careful about the attributes of my fields (or columns, in SQL parlance). You may find yourself wanting to use a field type in SQL that makes sense for the situation but causes some inadvertent side effects. I advise that, whenever you want to implement a new data type, you test it thoroughly first. The problem is determining just how to test it and where you might get bitten. Let me give you two examples.

Bad Dates

Sometimes I just can't help quoting favorite old movies (in case it doesn't ring a bell, that's from Raiders of the Lost Ark). Anyway, my problem wasn't exactly bad dates, but more how to handle uninitialized dates. When you move to date data types (type L in DDS, and type DATE in DDL), you have to consider what will happen with an uninitialized date. In the olden days of CCYYMMDD fields (which in turn replaced MMDDYY fields, but we don't talk about that), I could write a record to a file, and any uninitialized date fields would end up as zero. This was especially true if I used the technique of writing to a logical view that didn't define all the fields; my date fields would end up with all zeros. In SQL, you do the same thing by performing an INSERT that does not include that field: any numeric fields get initialized to zero.

 

But with the date data type, that's no longer an option. The L date field doesn't support a value of zeros. In fact, the low value of a date field is 0001-01-01, January 1st of the year 1. But here's the tricky bit: if you write to a file and don't explicitly initialize the date field, you don't get a low value. You get the current date! That's great when you want the current date, not so much when you want a low value. In SQL, there are two ways to handle an uninitialized date: use the low value of 0001-01-01, or make the field null-capable and leave the value null. I'm going to avoid a long discussion on the pros and cons of null-capable fields and stick to the practical aspects (this is "Practical SQL," after all). The biggest drawback to null-capable fields is that they require a little programming gymnastics to implement properly, whether you use native I/O or embedded SQL. What RPG does works, but it always seems a little unnatural to me. The best example of how to implement null indicators in embedded SQL was written several years ago by one of my longtime programming heroes, Ted Holt.

 

I'm still not a fan of nulls, though, so instead I found a better way to get around the problem. If you define a date field in DDL, a couple of keywords will initialize the field as a low value without having to make it null-capable. Take a look at my DDL definition for the field named CHANGED:

 

CHANGED DATE DEFAULT '0001-01-01' NOT NULL

 

Ta da! Now if you INSERT a record and don't explicitly initialize the CHANGED field, it will get the low value of 0001-01-01. This same technique should be used for TIME and TIMESTAMP fields if you don't want to deal with nulls. The format is crucial. You've just seen the proper format for DATE fields; TIME fields should be initialized with '00.00.00' and TIMESTAMP fields with '0001-01-01-00.00.00'. Be very careful with the punctuation: separate the parts of a date with dashes and the parts of a time with dots, and separate the date from the time in a TIMESTAMP with a dash.
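
To put the three formats side by side, here is a minimal sketch of a complete table definition. Only the CHANGED column comes from the definition above; the table name ORDERLOG and the other column names are made up for illustration, and the sketch assumes the ISO date and time formats used in this article.

```sql
-- Hypothetical table: ORDERLOG, ORDERID, CHGTIME, and CHGSTAMP are
-- illustrative names, not part of any real application.
CREATE TABLE ORDERLOG (
  ORDERID   DECIMAL(9, 0) NOT NULL,
  -- Dashes between the parts of a date
  CHANGED   DATE          DEFAULT '0001-01-01' NOT NULL,
  -- Dots between the parts of a time
  CHGTIME   TIME          DEFAULT '00.00.00' NOT NULL,
  -- Date, then a dash, then the time
  CHGSTAMP  TIMESTAMP     DEFAULT '0001-01-01-00.00.00' NOT NULL
);

-- The INSERT names only ORDERID, so the other three columns
-- take their DEFAULT values instead of the current date and time.
INSERT INTO ORDERLOG (ORDERID) VALUES (1);

-- Check what actually landed in the row.
SELECT ORDERID, CHANGED, CHGTIME, CHGSTAMP
  FROM ORDERLOG;
```

As with any new data type, run something like this on your own system before relying on it; the job's date and time format settings can affect how the defaults are interpreted and displayed.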

My INT Is Bigger Than Your INT

Another problem I ran into was with the data type BIGINT. This is used frequently in SQL tables for counter fields or unique IDs (one day I'll write an entire article on using unique IDs instead of key fields). A BIGINT is a 64-bit signed binary field. Such a field can go up to 2^63 - 1, which is a little over nine quintillion, or roughly 9.2 x 10^18. If you're familiar with integer numbers in RPG, you know they are defined as either I (for signed integer) or U (unsigned integer). When you specify the size of the field, you specify it in decimal digits: 3 (8-bit), 5 (16-bit), 10 (32-bit), or 20 (64-bit). So, at first glance, it would seem that you can create a 64-bit BIGINT field in DDL and then use that field in an RPG program. And for the most part, that's a true statement. But this is a perfect example of why you need to test all the possible scenarios. I created a file with a BIGINT field, and it worked flawlessly, right up until the time I went to look at the data in the table. Not one IBM i database tool would show it to me. WRKDBF failed (which is not a knock on Bill Reger's fantastic utility). DBU failed. Even the standard IBM i utility UPDDTA wouldn't let me modify a file with a BIGINT field.

 

So the upshot of this is that once I put BIGINT into a file, I can only access it via SQL, whether it's through STRSQL on the green screen or through some other standard SQL tool such as SQuirreL SQL. I have to admit that I didn't see that one coming. I did come up with a workaround: you can use a 20-digit numeric field (packed or zoned) in place of the BIGINT. Twenty digits cover the entire BIGINT range with room to spare, and now the DDS-oriented tools can work with the data.
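
A minimal sketch of that workaround follows. The table and column names are hypothetical. The one thing to know is that on DB2 for i, DECIMAL produces a packed field and NUMERIC produces a zoned field, so you can pick whichever your programs prefer.

```sql
-- Hypothetical table: COUNTERS, COUNTERID, and HITS are illustrative names.
CREATE TABLE COUNTERS (
  -- The original choice, readable only through SQL tools in my testing:
  --   COUNTERID BIGINT NOT NULL,
  -- The workaround: a 20-digit packed field with no decimal places
  COUNTERID DECIMAL(20, 0) NOT NULL,
  -- The same idea as a 20-digit zoned field
  HITS      NUMERIC(20, 0) DEFAULT 0 NOT NULL
);
```

The tradeoff is that these are decimal fields rather than binary integers, so an RPG program would define them as packed or zoned rather than as a 20-digit I or U field.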

 

So the moral of the story is that while DDL is the data definition technique of the future, not all features of DDL are equally seamless to us legacy programmers. That should not be an excuse, however, to cling to DDS! Go forth and use DDL! Just be forewarned that you may need to do a little testing before you put it into production.
