String Manipulation

In the late 1980s, IBM started enhancing the RPG language for the first time in years. The first of the new enhancements to RPG was the introduction of the so-called "string opcodes." "String" is a word we use in programming to mean a series of characters stored either in a field or as a literal. It is somewhat synonymous with a character field. So the term "string" is often used instead of "character string" or "character field."

The string opcodes introduced back in the 1980s included the following:

  • XLATE--Translate the contents of string
  • CHECK--Search a string for the first character that does not match any characters within a second list of characters
  • CAT--Concatenate two strings to create a third string that consists of the first two strings
  • SCAN--Search a string for a pattern (case sensitive)
  • SUBST--Substring a string

Later, IBM added CHECKR, which mimics the CHECK operation code but starts from the right side of the string and moves left, returning the rightmost position in the string that does not match the second list of characters.

Today in RPG IV, all of these RPG III operation codes have been implemented as built-in functions, allowing you to perform string manipulation within an EVAL or IF operation. But what is still missing from RPG IV is some of the capability of other languages, such as the ability to center a character string within a field or to extract the rightmost characters from a field.
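
As a small illustration of the built-in-function style, the sketch below uses %SCAN and %SUBST inside EVAL and IF to pull the first word out of a field. The field names (Company, FirstWord, nPos) are made up for this example, and the code assumes a compiler release on which both built-in functions are available.

```rpg
     D Company         S             50A   Inz('The Big Company')
     D FirstWord       S             50A
     D nPos            S              5I 0
      ** Find the first blank, then keep everything in front of it.
     C                   Eval      nPos = %Scan(' ' : Company)
     C                   If        nPos > 1
     C                   Eval      FirstWord = %Subst(Company : 1 : nPos - 1)
     C                   EndIf
```

With the value shown, %SCAN returns 4, so FirstWord ends up containing "The".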

Since RPG IV supports procedures, we can do one of three things to provide support for a centering routine and a rightmost extraction routine.

  1. Wait for IBM to provide the support.
  2. Steal the support from another language, such as C or Java.
  3. Write procedures that provide the support ourselves.

If we choose the first option, we have two problems: 1) It might be a year or more before the next release of RPG IV is shipped, and 2) even if it includes these routines, they will not be portable back to previous releases of OS/400 (a major problem in and of itself).

If we select the second option, we could have a routine that performs as well as it does in the other language and be able to implement it quickly. Unfortunately, I don't know REXX (the only iSeries language that has similar routines), so interfacing with it is an issue.

So we'll select the third option and write the thing ourselves. That way, we not only have a set of routines written in RPG IV, but those routines are also portable back to Version 4.

Centering a Value in a Field

Listed in Figure 1 is the procedure source code for the CENTER procedure. It accepts one parameter--a character string (field or literal)--and returns that value centered within the boundaries of the input field. That is, if a 30-position field containing "IBM Computers" is passed, it would center the words "IBM Computers" within that 30-position field.

A field with a length of up to 256 characters may be passed to the CENTER procedure. If you need to use the CENTER procedure with longer fields, then simply increase the procedure's field lengths as necessary.

0001 P Center          B
      ** CENTER - Center a value within a character string.
      ** 
      ** Note: You probably won't need a length longer than
      ** 256, but if you do, simply change the lengths of
      ** the work fields in this procedure.
0002 D Center          PI           256A   Varying
0003 D  InString                    256A   Varying Const

0004 D nOrgLen         S              5I 0
0005 D nDataLen        S              5I 0
0006 D blanks          S            256A   Varying Inz
0007 D szData          S            256A   Varying
      ** Capture length of original input field
0008 C                   Eval      nOrgLen = %Len(InString)
0009 C                   if        nOrgLen < 1
      ** If the input string is empty, return an empty (blank) value.
0010 C                   return    ''
0011 C                   endif
      ** Strip off the leading and trailing blanks from input string
0012 C                   Eval      szData = %Trim(InString)
0013 C                   Eval      nDataLen = %Len(szData)

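      ** Set blank-count to number of leading blanks required.
      ** Adding 1 before halving rounds an odd difference up and
      ** adds no extra blank when the difference is even or zero.
0014 C                   Eval      %Len(blanks) =
0015 C                              %Int((nOrgLen - nDataLen + 1) / 2)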
0016 C                   Return    blanks + szData
0017 P Center          E

Figure 1: A procedure to center text in a field

How Does It Work?

The Center procedure works by using the VARYING keyword combined with the CONST keyword for the input parameter. By using VARYING and CONST together, the system will accept both fixed- and variable-length fields as parameters. When a fixed-length field is passed, it is converted into a variable-length field automatically, with its length set to the length of the field. Then when we extract the length of the field (line 8 in Figure 1), we are actually retrieving the length of the field passed as the parameter.
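
To make that concrete, here is a rough sketch of two calls, assuming the prototype for CENTER is already in the source member. The Title and Heading fields are hypothetical names used only for illustration.

```rpg
     D Title           S             30A   Inz('IBM Computers')
     D Heading         S             30A
      ** Fixed-length field: %Len(InString) inside Center is 30.
     C                   Eval      Heading = Center(Title)
      ** Literal: %Len(InString) inside Center is 13.
     C                   Eval      Heading = Center('IBM Computers')
```

In the first call the compiler builds a 30-position temporary value, so the text is centered within 30 positions. In the second call the temporary is only as long as the literal, so there is no extra room to center within; pass a field when you want the value centered within a given width.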

Line 12 takes care of left-justifying the input string into the local variable. The %trim() built-in function strips off leading and trailing blanks from the value in the field. After all, there's no guarantee that the input value is already left-justified. It may in fact already be centered, so we need to strip any leading blanks. Also, since we are moving the input value into a variable-length field, we want to make sure we don't include the trailing blanks that may be passed with the input string. So we use %trim() instead of the more efficient %trimL() built-in function, which removes only leading blanks. The end result is that the szData field contains just the input data with no leading or trailing blanks.

Once the input data is isolated, you can easily calculate its length using the %len() built-in function (line 13). The only thing left to do now is to apply that simple formula we all learned in our 5th week of programming class in high school or college: Figure out the number of trailing and leading blanks in the original field versus the data in the field, and then divide by 2.

Lines 14 and 15 perform this calculation and set the length of the field named BLANKS to the number of blanks needed. You don't actually need to move blanks into the field, because RPG does that for you. You just need to set this variable-length field to the desired length. Then on line 16, use RPG IV's concatenation operator (+) to put those leading blanks in front of the data, returning the centered value to the caller.

Using the Center Procedure

To use the CENTER procedure, simply create and include a prototype, and then enter the procedure code listed in Figure 1. Then, call CENTER with an EVAL statement, as follows:

     D Company         S             50A   Inz('The Big Company')
     C                   Eval      Company = Center( Company )

The before and after images of the COMPANY field are as follows:

         *...v... 1 ...v... 2 ...v... 3 ...v... 4 ...v... 5
Before:  The Big Company
After:                     The Big Company
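
The prototype itself is not shown above. A minimal sketch, assuming it simply mirrors the procedure interface on lines 2 and 3 of Figure 1 (with PR in place of PI), might look like this:

```rpg
      ** Prototype for the CENTER procedure in Figure 1
     D Center          PR           256A   Varying
     D  InString                    256A   Varying Const
```

Place it ahead of any code that calls CENTER, either directly in the source member or through a /COPY member. If the procedure lives in a separate module, the P specification that begins it would also need the EXPORT keyword, as the RIGHT procedure in Figure 2 has.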

Extracting the Rightmost Characters

Listed in Figure 2 is the procedure source code for the RIGHT procedure. The RIGHT procedure extracts the n rightmost characters from the input string and returns them to the caller.

The first parameter is the character string from which the characters are extracted. The second parameter is the number of characters to extract.

The RIGHT procedure ignores trailing blanks in the input string (parameter one), so there's no need to pass something like %trimr(myfield) to the procedure.

A field with a length of up to 4096 characters may be passed to the RIGHT procedure. If you need to use the RIGHT procedure with longer fields, then simply increase the procedure's field lengths as necessary.

0001 P Right           B                   Export                               
      ** RIGHT - Retrieve the rightmost N characters from a string.
      **
      ** Note: If you work with larger char fields, 
      ** increase the input and return lengths from 4k to
      ** some larger value that is compatible with your needs.
.....DName+++++++++++EUDS.......Length+TDc.Functions+++++++++++++++++
0002 D Right           PI          4096A   VARYING
0003 D InString                    4096A   VARYING VALUE                               
0004 D nCharCnt                       5I 0 Const                                
0005 D nLen            S              5I 0
.....CSRn01Factor1+++++++OpCode(ex)Factor2+++++++Result++++++++
0006 C     ' '           CHECKR    InString      nLen
.....CSRn01..............OpCode(ex)Extended-factor2+++++++++++++++
0007 C                   if        nCharCnt  <= nLen  AND
0008 C                              nLen > 0
0009 C                   return    %Subst(InString  
0010 C                                      : nLen - (nCharCnt-1)
0011 C                                      : nCharCnt)
0012 C                   endif
0013 C                   return    InString
0014 P Right           E

Figure 2: A procedure to extract the rightmost characters from a field

The RIGHT procedure has been optimized as much as possible. For example, in line 6, the CHECKR (check right) opcode is used to determine the length of the input value without its trailing blanks. CHECKR is used instead of the expression %len(%trimr(inString)) for performance reasons; CHECKR is much faster.

The first parameter (line 3) is a VARYING field (variable-length field) passed by value. This allows a local copy of the input string to be accessed and modified without altering the original input data. When a field is passed by value (the VALUE keyword is specified), the compiler creates a copy of the input parameter and that copy is sent to the called procedure. The called procedure can then modify the contents of the copy without ever touching the original data.

Lines 7 and 8 are used to check the input length and the extract length, making certain that they are valid. If not, then the original input string is returned to the caller. You might want to simply return an empty string (see line 10 in Figure 1) if an invalid length is specified.

The %subst() built-in function (lines 9, 10, and 11) is used to extract the rightmost n characters from the input string.
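
Here is the same arithmetic worked through outside the procedure, as a sketch with hypothetical field names (Last3 is not part of Figure 2). It asks for the last three characters of a 50-position field that holds 15 characters of data.

```rpg
     D Company         S             50A   Inz('The Big Company')
     D Last3           S              3A
     D nLen            S              5I 0
      ** nLen = 15: the rightmost non-blank position in Company.
     C     ' '           CHECKR    Company       nLen
      ** Start = 15 - (3 - 1) = 13, so positions 13-15 give 'any'.
     C                   Eval      Last3 = %Subst(Company : nLen - 2 : 3)
```

The start position is always the last non-blank position minus one less than the number of characters requested, which is what line 10 in Figure 2 computes.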

Using the RIGHT Procedure

To use the RIGHT procedure, simply create and include a prototype, and then enter the procedure code listed in Figure 2. Then, call RIGHT with an EVAL statement, as follows:

     D Company         S             50A   Inz('The Big Company')
     C                   Eval      Company = Right( Company : 3 )

The before and after images of the COMPANY field are as follows:

         *...v... 1 ...v... 2 ...v... 3 ...v... 4 ...v... 5 
Before:  The Big Company
After:   any

Note that the RIGHT procedure ignores the trailing blanks (positions 16 through 50) in the COMPANY field.
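
As with CENTER, the prototype is left to the reader. A minimal sketch, assuming it mirrors the procedure interface on lines 2 through 4 of Figure 2, might look like this:

```rpg
      ** Prototype for the RIGHT procedure in Figure 2
     D Right           PR          4096A   Varying
     D  InString                   4096A   Varying Value
     D  nCharCnt                      5I 0 Const
```

If you change the 4096 lengths in the procedure, change them here as well so that the prototype and the procedure interface continue to match.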

There are other "string" routines that need to be written. Some that come to mind include ToUpper, ToLower, InitialCaps, and SentenceCase. I've already published ToUpper and ToLower in previous issues; I'll leave those last two routines up to you.
