RPG Gets XML Boost in V5R4

IBM continued its commitment to RPG IV by announcing native XML support in the language. Although you must upgrade to V5R4 of i5/OS to get it, I consider this the first compelling enhancement to RPG IV since qualified data structures and data structures as arrays were introduced back in V5R1. If you're unsure of what XML is or you need a primer on it, visit the W3C Web site for more information.

Before I review the XML capabilities of RPG IV at V5R4, let's look at the other changes IBM introduced into RPG IV with this release.

SEU Syntax Checking for /Free

While virtually no one else will mention this, I know that most of you continue to love PDM/SEU and use SEU for editing. If you've used /free RPG IV syntax at all, you know that one of the frustrating features is that you have to wait for the compiler to compile the code to give you your syntax errors. That's all changed on V5R4, as SEU now syntax checks everything, even /free syntax.

Data Structure Mapping and EVAL-CORR

IBM finally added to RPG IV what has been in COBOL forever: Move Corresponding. This COBOL functionality is ported to RPG IV in the form of a new opcode named EVAL-CORR. (You didn't read that wrong; yes, there is a hyphen in the middle of the opcode.) It stands for "Eval Corresponding" and allows you to copy one data structure to another on a field-by-field basis.

Think of it this way: The Copy File (CPYF) command has a Format Option (FMTOPT) parameter. In that parameter, you have an option to map the fields of one file to the other. That way, when the copy is performed, the target file does not need to have an identical layout. Specifying FMTOPT(*MAP), or FMTOPT(*MAP *DROP) to be precise, will copy the fields from the original file to the corresponding fields of the target file based on field names, not locations.

RPG IV has needed a data structure mapping option ever since IBM introduced qualified data structures because you can now have data structure subfields with the exact same names in two or more data structures.

Programmers needed a way to copy the subfields from one data structure to another based on subfield name, not just a byte-by-byte copy. People asked for "Eval with mapping." Some old COBOL programmers equated it to COBOL's "MOVE CORRESPONDING" statement, so we got EVAL-CORR. I'm sure the email lists will be filled with syntax-error-related questions on this one.

The correct syntax is as follows:

  eval-corr DS2 = DS1;

I was an advocate of EVALMAP or COPYDS as the opcode name, but it is what it is, and we will all learn to live with it.

How EVAL-CORR Works

EVAL-CORR works by copying the fields from the original data structure to a target data structure based on field names and similar data types. The field name mapping is obvious, but what does "similar data types" mean? It means this: If the subfield named CUSTNO is Pkd(7,0) in DS1 and is Zoned(7,0) in DS2, the copy still works. Similar functionality applies to date fields (different date formats) and character fields (varying to fixed, fixed to varying).

Here's an example. The original data structure, DS1, is represented as follows:

     D DS1             DS                  Qualified
     D  CustNo                        7P 0
     D  Address                      30A
     D  City                         25A
     D  State                         2A
     D  ZipCode                      10A
     D  Phone                        10P 0
     D  BalDue                        7P 2
     D  TotSales                     11P 2

The target of the move is data structure DS2, as follows:

     D DS2             DS                  Qualified
     D  RegionID                      4A
     D  SalePerson                    5I 0
     D  Phone                        10S 0
     D  ZipCode                      10A
     D  CustNo                        7P 0
     D  BalDue                        7P 2
     D  TotSales                     11P 2

A simple EVAL-CORR is all that's necessary to copy the like-named fields from DS1 to DS2, as follows:

     C                   eval-corr DS2 = DS1

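This copies every DS1 subfield that also appears in DS2: Phone, ZipCode, CustNo, BalDue, and TotSales. Phone is converted from packed to zoned along the way. The DS2 subfields that have no counterpart in DS1 (RegionID and SalePerson) are unchanged.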

If a field in DS1 is also in DS2, but the data types are not compatible, then the subfield is not copied. It would have been cool to have an option someplace that allowed you to map the fields from numeric to character or character to numeric or what have you. But that's always an option for a future enhancement.
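To see the "similar data types" rule in action, here is a minimal sketch of my own. The SRC and TGT data structures and their initial values are made up for illustration; only the CustNo and Phone names overlap, and both go from packed to zoned.

```rpgle
     D Src             DS                  Qualified
     D  CustNo                        7P 0 Inz(1234)
     D  Phone                        10P 0 Inz(8005551212)
     D  City                         25A   Inz('Chicago')

     D Tgt             DS                  Qualified
     D  RegionID                      4A   Inz('MIDW')
     D  CustNo                        7S 0
     D  Phone                        10S 0
      /free
        // Copy the like-named subfields; packed values land in zoned fields
        eval-corr Tgt = Src;

        // Src.City has no match in Tgt, and Tgt.RegionID has none in Src
        *inlr = *on;
        return;
      /end-free
```

After the EVAL-CORR, I would expect Tgt.CustNo and Tgt.Phone to hold the values from Src, while Tgt.RegionID keeps its initial value because Src has no subfield by that name.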

DEBUG Keyword

The DEBUG keyword has been enhanced to allow you to embed information about the XML SAX parser and to provide more granular control over INPUT and DUMP options.

The valid options for the DEBUG keyword are as follows:

  • *NO— This value indicates that no debugging aids are to be generated into the module. Specifying DEBUG(*NO) is the same as omitting the DEBUG keyword.
  • *YES—This value is kept for compatibility purposes. Specifying DEBUG(*YES) is the same as specifying DEBUG without parameters or DEBUG(*INPUT : *DUMP).
  • *INPUT—Fields that appear only on input specifications are read into the program fields during input operations.
  • *DUMP—DUMP operations without the (A) extender are performed.
  • *XMLSAX—An array of SAX event names is generated into the module to be used while debugging a SAX event handler.
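For example, a Header specification along these lines should generate the SAX event-name array while keeping DUMP operations active. Treat it as a sketch; which combination you want depends on what you are debugging.

```rpgle
     H DEBUG(*XMLSAX : *DUMP)
```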

NULL Indicator Enhancements

Somebody slap me! I read this and fell asleep. Apparently, there is a use for the %NULLIND() built-in function. Now, with V5R4, IBM supports the so-called null indicator map for qualified data structure subfields.

You can view this thing in debug mode by typing in the following debugger command:

  ==>  eval  _QRNU_NULL_dsname.subfield

Or you can specify only the data structure name, as follows:

  ==>  eval  _QRNU_NULL_dsname

In addition, you have an extra option on the OPTIONS keyword when creating a procedure and prototype. When OPTIONS(*NULLIND) is specified for a prototype/procedure parameter, and a data structure or null indicator field is passed, the null indicator map is passed along.

For example, if you have a qualified data structure containing subfields that support the null indicator, when you pass that data structure as a parameter, the null indicators are carried along with it to the subfield (if it supports them). Previously, they would have been lost.

Also, the new EVAL-CORR opcode copies the null indicators between data structures when ALWNULL(*USRCTL) is specified on the Header specification.
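Here is a small, untested sketch of %NULLIND used with a qualified subfield. CUSTMAST is a hypothetical file that I am assuming contains a null-capable field named BALDUE; substitute a file of your own.

```rpgle
     H ALWNULL(*USRCTL)
      * CUSTMAST is a hypothetical file with a null-capable BALDUE field
     D Cust          E DS                  ExtName(CUSTMAST) Qualified
      /free
        // Mark the subfield as null, then test it by its qualified name
        %nullind(Cust.BalDue) = *on;
        if %nullind(Cust.BalDue);
          dsply 'BALDUE is null';
        endif;
        *inlr = *on;
        return;
      /end-free
```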

XML Support

If you ever tried to use the IBM XML Toolkit for RPG, you will appreciate the powerful and relatively easy XML support in RPG IV at V5R4.

IBM did a reasonably good job of adding read-only XML support (i.e., XML parsing) to the language. Again, there's the issue of my dislike of hyphens in opcode names, but I'll ignore that for now.

There are four pieces to XML parsing in RPG IV:

  • %XML—A built-in function that identifies the XML source code
  • XML-INTO—A new opcode that provides "EVAL XML" support
  • %HANDLER—A new built-in function that identifies a subprocedure to call to process the parsed XML (also the first implementation of a callback parameter in RPG IV)
  • XML-SAX—A new opcode that provides "EVAL XML" support that uses the Simple API for XML (SAX) parser

Let's look at these components individually.

%XML Built-in Function

The %XML built-in function identifies the XML you want to read and the options that control how the XML is parsed. The syntax is as follows:

%XML( xmldocument { : options } ) 

The xmldocument parameter may be one of two things:

  1. Pure XML (markup wrapped around a value such as 1234) in a field or literal. Typically, this would be a field if the XML were read from another system (using something like the free iSockets library) or from a file on your iSeries.
  2. The name of a file on the IFS that contains XML

If a field containing XML is specified, then the options parameter is not necessary. If an IFS file is specified for the xmldocument parameter, then the options parameter must contain, at least, 'doc=file'. For example:

%XML( '/cozzi/mystuff/fedex.xml'  : 'doc=file')

This tells %XML to use the file named FEDEX.XML in the folder /cozzi/mystuff. The options parameter is set to 'doc=file' to tell %XML that a file name is being specified.
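Here is how the two forms might look side by side. This is a sketch, not tested code: the custno field, the xmlText variable, and the IFS path /home/demo/custno.xml are names I made up for illustration.

```rpgle
     D custno          S              7P 0
     D xmlText         S            100A   Varying
      /free
        // 1. XML held in a field: doc=string is the default
        xmlText = '<custno>1234</custno>';
        xml-into custno %xml(xmlText);

        // 2. XML held in an IFS file: doc=file is required
        xml-into custno %xml('/home/demo/custno.xml' : 'doc=file');

        *inlr = *on;
        return;
      /end-free
```

In both cases, the target field name (custno) matches the element name, so no path= option is needed.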

The options parameter has many options. If more than one is specified, each is separated from the previous one by a blank. Here are some of the more popular options. For a complete list, see The Modern RPG IV Language, 4th Edition, coming soon from MC Press.

"Options" Parameter Options

  • doc (doc=string, doc=file): string, the default, tells the %XML built-in function that the xmldocument parameter contains a string or field of XML (i.e., not a file name). file tells the %XML built-in function that the xmldocument parameter is the name of an IFS file that contains the XML.
  • case (case=lower, case=upper, case=any): This specifies how the XML is stored. Specifying case=lower or case=upper is faster than case=any because the parser will convert the XML to all uppercase when case=any is specified.
  • trim (trim=all, trim=none): This controls whether spaces (tabs, linefeeds, excess blanks) should be removed before assigning to variables. It applies only to character fields. Other data types always trim extra spaces.
  • allowmissing (allowmissing=no, allowmissing=yes): When you copy the XML values to a data structure, this option controls whether or not a node/element for each data structure subfield must exist in the XML. You probably want to set this to "yes" in most cases.
  • allowextra (allowextra=no, allowextra=yes): The complement of allowmissing, this option controls whether or not to allow extra nodes/elements in the XML that do not have a corresponding subfield name in the target data structure.
  • path (path=xmlnode/name): This is perhaps the most powerful feature of %XML. It allows you to directly retrieve an element in an XML document. If this option is not specified, the element retrieved is the same as the target field name (for standalone fields) and the names of the subfields (for data structures). For example, to retrieve the salary of an employee in this document, you would specify /employees/employee/salary, like so:

  <employees>
    <employee>
      <name>Bob Cozzi</name>
      <hiredate>2001-02-15</hiredate>
      <salary>250000.00</salary>
    </employee>
  </employees>
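Assuming the employee document above is stored in an IFS file (the path /home/demo/employees.xml is hypothetical), a sketch of pulling out just the salary might look like this. I left the leading slash off the path value to match the style of the path= values used with XML-INTO later in this article.

```rpgle
     D salary          S             11P 2
      /free
        // Go straight to the salary element; nothing else is retrieved
        xml-into salary %xml('/home/demo/employees.xml'
                     : 'doc=file path=employees/employee/salary');

        *inlr = *on;
        return;
      /end-free
```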

XML-INTO Opcode

The XML-INTO opcode is one of the two opcodes that read/parse XML in RPG IV. It allows you to load XML into a field, a data structure, or a data structure array. This opcode, along with XML-SAX, is used instead of the EVAL opcode when processing XML.

The syntax of this opcode is as follows:

XML-INTO  target  %XML( 'xmldocument' { : 'options' } )

Here, target is the name of a field, data structure, or "handler" where the parsed XML is copied.

If target is the name of a field, the path= option should be specified to indicate which XML element is retrieved. If the path= option is not specified, the target name is used to determine the elements to retrieve. If more than one element with the target name or path= name exists, then an array may be specified. The number of elements returned is determined by the number of elements declared for the target array.

If target is a data structure, the data structure subfield names are used to identify which elements are returned. The allowmissing and allowextra options may be necessary, depending on the data in the XML. Nested data structures are permitted and fully supported. In fact, they are required in order to extract nested XML elements.
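A nested target might be sketched like this. The customer and addr_t data structures and the XML text are invented for the example; the point is that the subfield names, including the nested addr structure, mirror the element names in the document.

```rpgle
     D addr_t          DS                  Qualified
     D  city                         25A
     D  state                         2A

     D customer        DS                  Qualified
     D  custno                        7P 0
     D  addr                               LikeDS(addr_t)

     D xmlText         S            200A   Varying
      /free
        xmlText = '<customer><custno>1234</custno>'
                + '<addr><city>Chicago</city><state>IL</state></addr>'
                + '</customer>';

        // Element names are matched to subfield names, level by level
        xml-into customer %xml(xmlText);

        *inlr = *on;
        return;
      /end-free
```

If the XML carried an element that customer does not declare, or lacked one that it does, you would need allowextra=yes or allowmissing=yes on %XML.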

%HANDLER Built-in Function

If target is a handler, things get a bit more complicated. To specify a handler, the %HANDLER built-in function is used, as follows:

XML-INTO %handler(myProc : firstParm) %XML( 'xmldocument' { : 'options' } )

So how does this work? The procedure you pass on the %handler() built-in function is called each time the XML parser fills up a multiple-occurrence data structure (actually, a data structure array). Then, that same procedure needs to do something with the data. Once finished, the procedure returns, and control resumes in the XML parser, filling up another set of elements in the data structure array. Then, the procedure is again called. This process repeats until all of the XML is processed.

The second parameter of %handler is required. It is a user-supplied value passed as the first parameter of the handler procedure.

The handler procedure has two additional parameters: 1) a data structure array containing the data from the parsed XML document and 2) the count of the number of elements populated in the data structure array by the XML parser.

This requires the use of the LIKEDS keyword for the data structure array. For example:

     FCONTACTS  UF A E             DISK    Prefix('CT.')

     D CT            E DS                  ExtName(Contacts)
     D                                     Qualified

     D Contact         DS                  Qualified
     D  CustNo                        7P 0
     D  Address                      30A
     D  City                         25A
     D  State                         2A
     D  ZipCode                      10A
     D  Phone                        10P 0

     D  whyme          S              1A   Inz('?')

      * The handler returns 10I 0; zero tells the parser to keep going
     D ContactHdlr     PR            10I 0
     D  hello                         1A
     D  Contacts                           LikeDS(Contact)
     D                                     Dim(128) Const
     D  nCount                       10U 0 Value

      * Here's the normal RPG IV syntax
     C                   XML-INTO  %Handler(ContactHdlr : whyme)
     C                             %XML(myXMLFile :
     C                             'doc=file path=contacts/contact')

      * Here's the /free version
      /free
         XML-INTO  %Handler(ContactHdlr : whyme)
                   %XML(myXMLFile : 'doc=file path=contacts/contact');
      /end-free


     P ContactHdlr     B                   Export
     D ContactHdlr     PI            10I 0
     D  hello                         1A
     D  Contacts                           LikeDS(Contact)
     D                                     Dim(128) Const
     D  nCount                       10U 0 Value

     D i               S             10I 0

      * CONTACTR is an assumed record format name for the CONTACTS file
     C                   for       i = 1 to nCount
     C                   eval-corr ct = Contacts(i)
     C                   write     CONTACTR
     C                   endfor
     C                   return    0
     P ContactHdlr     E

In this example, the data is loaded from a file whose name is stored in the MYXMLFILE variable (not shown). The field WHYME is passed as the first parameter to the handler procedure named CONTACTHDLR. In addition to the WHYME parameter (which is a user-defined parameter, so it can be anything), two more parameters are passed: 1) the parsed XML in the form of the data structure array declared on the procedure itself and 2) the number of elements that the parser filled in that array.

The handler above copies each element of the CONTACTS array to the CT data structure and then writes it out to the database file.

XML-SAX Opcode

The XML-SAX opcode is the other way to read/parse XML in RPG IV. It starts the Simple API for XML (SAX) parser, which calls the handler for each chunk of XML that it recognizes. A primary difference from the XML-INTO parser is that the SAX parser presents each element in the XML once, in document order, as it is read, whereas XML-INTO lets you pick out the elements you want by name. The syntax of the XML-SAX opcode is illustrated below.

XML-SAX(e)  %handler(myProc : firstParm)  %XML( 'xmldoc' { : 'sax-options' } )
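Below is a minimal sketch of a SAX handler that simply counts start tags. The handler's parameter list (communication area, event code, pointer to the data, data length, exception ID) reflects my understanding of what the SAX parser passes; check it against the ILE RPG reference before relying on it. The names saxHandler and total, and the tiny XML literal, are made up for the example.

```rpgle
     D saxHandler      PR            10I 0
     D  elemCount                    10I 0
     D  event                        10I 0 Value
     D  stringPtr                      *   Value
     D  stringLen                    20I 0 Value
     D  exceptionId                  10I 0 Value

     D total           S             10I 0 Inz(0)
      /free
        // The handler is called once per SAX event; total is its work area
        xml-sax %handler(saxHandler : total)
                %xml('<a><b>1</b><b>2</b></a>');
        *inlr = *on;
        return;
      /end-free

     P saxHandler      B
     D saxHandler      PI            10I 0
     D  elemCount                    10I 0
     D  event                        10I 0 Value
     D  stringPtr                      *   Value
     D  stringLen                    20I 0 Value
     D  exceptionId                  10I 0 Value
      /free
        // Count start tags only; ignore every other event
        if event = *XML_START_ELEMENT;
          elemCount += 1;
        endif;
        return 0;          // zero tells the parser to continue
      /end-free
     P saxHandler      E
```

Because the communication area is passed by reference, the running count ends up in total once the parse completes.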

For more information on the XML SAX parser, see these links on the Web:

http://www.w3.org/DOM/faq.html#SAXandDOM

http://www.saxproject.org/

Long Live RPG

The V5R4 enhancements to RPG IV confirm a positive trend in RPG IV; it is the language to use if you're building applications for today's modern application world.

Bob Cozzi is a programmer/consultant, writer/author, and software developer of the RPG xTools, a popular add-on subprocedure library for RPG IV. His book The Modern RPG Language has been the most widely used RPG programming book for nearly two decades. He, along with others, speaks at and runs the highly-popular RPG World conference for RPG programmers.
