Variable-Length Fields


For decades, we've been using fixed-length character fields in RPG to store information. Every one of us has used these fields; it's virtually impossible to write an RPG program without them. Then, about 15 years ago, IBM added a variation on character fields to the database: variable-length fields.

Added to DB2/400 but not to RPG, variable-length fields provide a way to declare very long fields in the database yet allocate only a minimum amount of storage for each one. This is accomplished by placing the additional data in an overflow area of the database object, not directly in the record itself, when a certain minimum length is exceeded.

Previously, you could read a variable-length field from within RPG, but you needed to create a data structure with two subfields and then map the variable-length field to that data structure. Consequently, few people used variable-length fields, until now.

In OS/400 V4, IBM added variable-length field support to the RPG IV language. RPG III programs still have to use that data structure technique, but RPG IV excels at manipulating variable-length fields. In fact, variable-length fields can often be more efficient than traditional fixed-length fields.

Varying-Length Fields or Variable-Length Fields?

In the DB2/400 database, variable-length fields are known as variable-length fields; in RPG IV, they are known as varying-length fields. I'm not sure what that subtle difference means, if anything, other than the RPG IV developers used a different keyword name for RPG IV ("VARYING") than the DDS team did for the database ("VARLEN").

To declare a variable-length field in DDS, you simply declare a traditional character (alpha) field and add the VARLEN keyword. The VARLEN keyword supports one optional parameter: the allocation length. The allocation length is supposed to be used to represent the average or typical length of the data in the field. The declared length of the field is the maximum length. When an allocation length is specified, data up to that length is stored in the database record, which improves performance.

Suppose we want a field named EMAIL in the database to hold an email address. We declare it with a length of 250. However, since most email addresses have fewer than 30 characters, we make it a variable-length field with an allocated length of 40, as follows:

A           EMAIL        250A         VARLEN(40)

This field stores data beyond the 40 bytes in a special area (space) of the database and retrieves it when needed.

To declare this type of field in RPG IV, you declare a regular character field and add the VARYING keyword. To declare our EMAIL field in RPG IV, we'd use the following Definition specification:

D email           S            250A   VARYING                              

For some reason, IBM's developers did not include the VARLEN keyword's allocated length parameter when they implemented the VARYING keyword in RPG IV. It doesn't really apply to RPG IV anyway, which doesn't have the additional compiler support for optimization, so it wouldn't really be of value today.

You can also declare a program-described file's field as a variable-length field using the *VAR "format type" on the Input specification. This value goes in the same positions as the date format of date fields: positions 31 to 34. For example:

I                        *VAR A   51  300  EMAIL

Of course, RPG IV imports externally described files with variable-length fields from the database just fine.

More Efficient Data Manipulation

Using variable-length fields (or varying length fields, if you prefer) can be more efficient than using traditional fixed-length fields. The reason for this is that virtually every operation code that permits the use of a character field has been optimized to work more effectively when a variable-length field is specified.

A variable-length field contains what is referred to as the current length. The current length is supposed to indicate how much data is currently in the field. The current length is stored in two extra bytes in front of the field's data. For example, a 15-byte variable-length character field actually uses 17 bytes of storage. See the table below.

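            Length   Data Storage
Character            B o b   C o z z i
Zone        0 0      C 9 8 4 C 9 A A 8 4 4 4 4 4 4
Digit       0 9      2 6 2 0 3 6 9 9 9 0 0 0 0 0 0

Each column is one byte, shown in EBCDIC hexadecimal as a zone digit over a numeric digit. The 2-byte length prefix, X'0009', indicates that 9 of the 15 data bytes are currently in use.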

You don't really have direct access to the memory where the current length is stored. You may, however, retrieve the current length and change the current length, if necessary, by calling the %LEN() built-in function.

The current length is set automatically whenever a value is copied to a variable-length field, and it is based on the length of the data being copied. If the value being copied is a fixed-length character field that is 25 bytes in length, then the current length of the variable-length field is set to 25. If the variable-length field's declared length is shorter than 25 bytes (as with the 15-byte field in the example above), then the copied data is truncated just as it would be with a traditional fixed-length field, and the current length is set to the declared length.
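Here is a small sketch of that 25-byte case. The field names (FixedName, VarName, LenAll, LenTrim) are made up for illustration, and the %TRIMR() variation is my own addition to show how to avoid carrying the trailing blanks along.

```rpgle
D FixedName       S             25A   INZ('Bob Cozzi')
D VarName         S             50A   VARYING
D LenAll          S             10I 0
D LenTrim         S             10I 0

 * Copying the 25-byte field sets the current length to 25,
 * trailing blanks included.
C                   eval      VarName = FixedName
C                   eval      LenAll = %len(VarName)
 * Trimming first stores only the 9 significant characters.
C                   eval      VarName = %trimr(FixedName)
C                   eval      LenTrim = %len(VarName)
C                   eval      *inlr = *on
```

After the first assignment, LenAll is 25; after the second, LenTrim is 9.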

The code the compiler generates can take advantage of a variable-length field's current length. For example, assume a variable-length field is declared with a length of 1,000 bytes but contains only 10 bytes of data (i.e., the current length is 10). The MOVEL operation code that copies the value to a fixed-length field or to another variable-length field uses the current length to speed up the copy by manipulating only those 10 bytes instead of all 1,000.

To set the current length, the compiler looks at the data being copied to the variable-length field. For example, if a variable-length field named CUSTNAME is assigned "Bob Cozzi," then the current length of the field is set to 9, as follows:

C                   eval      CustName = 'Bob Cozzi'

Since the compiler knows how long the literal value on the right side of the equals sign (=) is, it uses that information to set the current length of the variable-length field and copies only those 9 bytes to the field. That is, it automatically adjusts the variable-length field's current length to the length of the data copied into the field.
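To see this for yourself, you can retrieve the current length right after the assignment. This is only a minimal sketch; the 50-byte maximum for CUSTNAME and the CurLen work field are assumptions of mine.

```rpgle
D CustName        S             50A   VARYING
D CurLen          S             10I 0

C                   eval      CustName = 'Bob Cozzi'
 * %LEN() returns the current length, not the declared length.
C                   eval      CurLen = %len(CustName)
C                   eval      *inlr = *on
```

CurLen ends up as 9, even though CUSTNAME is declared with a maximum of 50.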

As mentioned, %LEN() may be used to retrieve the current length of the variable-length field. In addition, %LEN() may be used to change the current length of a variable-length field. Why would you ever want to do that? Suppose you're working in an application that requires a "buffer" of at least a certain length. You may need to increase the current length of the variable-length field to that minimum required length.
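A sketch of that buffer scenario might look like the following. The field name Buffer, its 1,000-byte maximum, and the 256-byte minimum are all hypothetical values chosen for illustration.

```rpgle
D Buffer          S           1000A   VARYING

C                   eval      Buffer = 'ABC'
 * Current length is 3 here. Grow it to the 256 bytes that the
 * (hypothetical) application requires.
C                   eval      %len(Buffer) = 256
C                   eval      *inlr = *on
```

As I understand it, the positions added between the old and new lengths are set to blanks, but check the ILE RPG reference for your release before relying on that.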

How Long Is It?

A variable-length field can be any length from zero (0) up to the field's declared length. Yes, I said "zero." A zero-length variable-length field is considered to be "empty." You can compare the field's length to zero using %LEN(), or you can compare the field's content to two consecutive apostrophes ('').
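Both tests can be sketched as follows, using the EMAIL field from earlier and a hypothetical IsEmpty indicator.

```rpgle
D email           S            250A   VARYING
D IsEmpty         S               N   INZ(*OFF)

 * Test the current length directly.
C                   if        %len(email) = 0
C                   eval      IsEmpty = *ON
C                   endif
 * Or compare the content to the empty string.
C                   if        email = ''
C                   eval      IsEmpty = *ON
C                   endif
C                   eval      *inlr = *on
```

One caution: as far as I know, RPG pads the shorter operand with blanks when comparing character values, so the second test is also true for a field that contains only blanks. If you need to distinguish "empty" from "all blanks," the %LEN() test is the safer choice.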

When you declare a variable-length field, you declare it with the maximum number of bytes you think you will need. You don't really save any storage over a fixed-length field--in fact, you use up more because of the 2-byte binary-length prefix on the field--but that's virtually irrelevant. The advantage is in the performance of data being copied to and from these fields. In addition, variable-length fields may be used as parameters when calling subprocedures.

To use a variable-length field as a parameter of a procedure, simply write the parameter specification and add the VARYING keyword. That's all there is to it. You can now pass variable-length fields to subprocedures. But it goes further than that....
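Here is a minimal sketch of a subprocedure with a VARYING parameter. The procedure name GetLen, the parameter name text, and the work field n are made up for this example, and the H specification assumes you compile with CRTBNDRPG.

```rpgle
H DFTACTGRP(*NO)
D GetLen          PR            10I 0
D  text                        250A   VARYING

D email           S            250A   VARYING
D n               S             10I 0

C                   eval      email = 'bob@example.com'
C                   eval      n = GetLen(email)
C                   eval      *inlr = *on

 * The subprocedure sees the caller's current length.
P GetLen          B
D GetLen          PI            10I 0
D  text                        250A   VARYING
C                   return    %len(text)
P GetLen          E
```

Here n is set to 15, the current length of EMAIL, rather than the declared 250.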

If you prototype a procedure, you can pass variable-length fields on parameters that are declared as fixed-length fields. The compiler is able to do a little more work for you when you use prototypes than it can when you use the traditional, yet obsolete, CALL/CALLB/PARM syntax.
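The following sketch passes a variable-length field to a fixed-length parameter. Note my assumption: the parameter is declared with CONST, because to my knowledge the compiler performs this conversion only for CONST or VALUE parameters, not for parameters passed by reference. The names NameLen, name, and n are hypothetical.

```rpgle
H DFTACTGRP(*NO)
D NameLen         PR            10I 0
D  name                         50A   CONST

D CustName        S             50A   VARYING
D n               S             10I 0

C                   eval      CustName = 'Bob Cozzi'
 * The compiler copies CustName to a 50-byte, blank-padded
 * temporary and passes that to the procedure.
C                   eval      n = NameLen(CustName)
C                   eval      *inlr = *on

P NameLen         B
D NameLen         PI            10I 0
D  name                         50A   CONST
C                   return    %len(%trimr(name))
P NameLen         E
```

Inside NameLen, the parameter is an ordinary 50-byte fixed-length field, so the procedure trims the trailing blanks before measuring; n is set to 9.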

Variable-length fields are not widely used yet, but they should be. They provide you with an easy solution to several common programming challenges, and they excel at simplifying calls to subprocedures.

Bob Cozzi has been programming in RPG since 1978. Since then, he has written many articles and several books, including The Modern RPG IV Language--the most widely used RPG reference manual in the world. Bob is also a very popular speaker at industry events such as COMMON and RPG World and is the author of his own Web site, www.rpgiv.com, and of the RPG ToolKit, an add-on library for RPG IV programmers. Bob runs his own one-man iSeries consulting and contract programming firm in the Chicago area.
