Practical SQL: UDFs and Service Programs, Part I

SQL
Typography
  • Smaller Small Medium Big Bigger
  • Default Helvetica Segoe Georgia Times

In this installment of our continuing series on using SQL in real business applications, we introduce the concept of invoking service programs from within SQL.

 

We continue to provide you with examples of integrating SQL with traditional RPG. In this article, we're going to focus on how we can use a service program to extend SQL and provide an essential feature that SQL is sorely lacking: converting binary data in a character field to numeric values. Note that this is the opposite of something like the HEX function, which returns the hexadecimal representation of a field. This function will take some alphanumeric data, either in packed or binary form, and return the numeric value the data represents. This practical extension to SQL can be used by any shop.

What We're Going to Accomplish

We're going to create a user-defined function (UDF) that can take an alphanumeric field containing numeric data and extract the numeric value. UDFs are very powerful because they allow you to execute arbitrary logic within an SQL SELECT statement (or really anyplace in SQL that an expression can be used), essentially allowing you to extend the SQL syntax. If you're unfamiliar with stored procedures and UDFs and you'd like to know more about how they are implemented and used in the IBM i environment, you can refer to my previous articles on the subject. Click on the links to review Part I and Part II.

 

Why would we need to do this? There are several situations in which you might find binary numeric data embedded within an alphanumeric field, most of them dealing with legacy data. For example, you may have an old multi-format physical file (please, no shrieks of terror; plenty of shops out there still have this case, and you know who you are!). This is one way that we can get SQL to play nicely with those files. Well, not always nicely; we found that, as of V6.1, SQL on the IBM i really doesn't play well with those files anymore. In some cases, you need a really big hammer, such as a Materialized Query Table (MQT); I'll try to write about that another day.

 

But there is a case in which it's actually very common to have numeric data within an alphanumeric field: that would be the JOESD field of the file created in the OUTFILE keyword of the DSPJRN command. Without going through a lot of detail, if you do a DSPJRN to an output file and take a look at the results, one of the fields is the JOESD field. JOESD contains an image of the record in entries for both puts (writes) and the before- and after-image entries for an update.  For example, suppose I have this file:

 

ORDERS

Field Name Type       Length   Bytes  Position

---------- ---------- ------   -----  --------

ORDERNO    BINARY       9  0       4         1

CUSTOMER   PACKED       6  0       6         5

SHIPTO     BINARY       9  0       4        11

ADDRESS1   CHAR           50      50        15

 

Now I add a record: order = 314478, customer = 70021, ship-to = 3, and address = 1234 MAIN ST. Looking at the record with DSPPFM, I see this:

 

*...+....1....+....2....+...

­­  ö>­ ø      ­­­­­­1234 MAIN ST   

00C607010000FFFF4DCCD4EE4444

04CE002F00031234041950230000

 

As you can see, a lot of junk there. If I had journaled the file and done a DSPJRN to an output file, the write would have been recorded in the journal as a PUT: the field JOENTT would have the value PT, the JOOBJ field would have the file name, and the JOESD field would look just like what you see in DSPPFM.

 

Now, let's say I wanted to create a query that showed all the add entries for records with order numbers between 314000 and 315000. I could probably do it with some judicious use of the HEX function and a decent hexadecimal calculator, but it wouldn't be very readable:

 

select * from JRNOUT where JOENTT = 'PT'                       

  and hex(substr(JOESD,1,4)) between '0004CA90' and '004CE78'

 

And if I were doing queries performing calculations of any type on numeric values, it would be impossible. But could we design a simple UDF that would make such a query possible?  Perhaps something like this:

 

select * from JRNOUT where JOENTT = 'PT'                       

  and HEX2NUM(substr(JOESD,1,4),2) between 314000 and 315000

 

Well now, that is something we can do!

The RPG Side

You'll note that the HEX2NUM function has two parameters: the substring of the alphanumeric field that we're trying to convert and then the literal 2. That's because you have to tell the function whether you're converting packed or binary data. Other than that, the function is pretty smart; it determines the length of the incoming variable and uses that to convert the data properly. Let's see how that's done:

 

     h option( *nodebugio: *srcstmt) nomain

     d UDFHEX2NUM      pr            31p 0

     d   input                       16a   varying const

     d   convtype                    10u 0 const

     p UDFHEX2NUM      b                   export

     d                 pi            31p 0

     d   input                       16a   varying const

     d   convtype                    10u 0 const

     d C_PACKED        c                   1

     d C_INTEGER       c                   2

     d C_BADVALUE      c                   -1

     d dsPacked        ds            16

     d  Packed                       31p 0

     d dsInteger       ds             8

     d  Integer                      20i 0

     d Result          s             31p 0

     d xLen            s              3u 0

      /free

       // Default to error, get the length from the input

       result = C_BADVALUE;

       xLen = %len(input);

       monitor;

         if convtype = C_PACKED;

           // Any length > 0 is valid for packed data

           if xLen > 0;

             // Start with 0, stuff the data at the end

             Packed = 0;

             %subst ( dsPacked: 17-xLen) = input;

             result = Packed;

           endif;

         else;

           // Integer data must have a valid integer length

           if xLen = 8 or xLen = 4 or xLen = 2 or xLen = 1;

             // Start with 0, stuff the data at the end

             Integer = 0;

             %subst( dsInteger: 9-xLen) = input;

             result = Integer;

           endif;

         endif;

       on-error;

       endmon;

       return result;

      /end-free

     p                 e

The logic is really pretty simple; I won't spend a ton of time on it. Since I'm creating a service program, I use NOMAIN in the H-spec, although there is still some debate as to when and why one uses NOMAIN. I'm going to create a very simple service program with only one function. The module has the name of the function, and the service program has the same name as the module; it's all very trivial in nature and doesn't take advantage of the power of ILE. We'll delve a lot deeper into service programs and procedures in the second article in this series. Here, I just want to focus on the mechanics of attaching the SQL query to a service program procedure.

 

Back to the RPG program. After the H-spec, I define the function, which in this case is named UDFHEX2NUM. I define both the prototype and the procedure interface with exactly the same signature (the parameters and return value); that's required in RPG. You define the return value first. In this case, I defined the return value as 31p0, which is the largest packed value that you can fit into a 16-character field. It will also hold the largest binary integer field, a field of 8 bytes or 20 digits. I could have made it larger, I suppose, but I rarely run into cases where I need more than 30 digits of precision. That brings me to an important point: you may have wondered about the fact that I defined this value with no decimal positions. That's because decimal positions are implied in binary data; there's no way to tell from the data what the scale is (the scale specifies where the decimal point is meant to be). So for this case, I'm just returning the raw, unscaled value; it would be up to the query to adjust the decimals accordingly.

 

The parameters are easy to explain: one variable alphanumeric field with a length of up to 16 characters (this will hold the largest possible binary value and a packed value up to 31 digits) and one 10u0 unsigned integer field. This will contain the literal conversion type, 1 for packed and 2 for binary. It's an awfully big field for just two possible values, but, as it happens, when you specify a numeric literal on an SQL statement, the size defaults to an SQL field of type INT, which in turn is the same as an RPG field of 10u0. So that's why I defined the value as 10u0. You may note that I used 10u0; that's because I don't ever expect negative values.

 

This defines the procedure's signature. The rest is the code, which is pretty straightforward. I define a couple of constants for the literal values 1 and 2 (2 for packed, 2 for integer). I define the bad value, which is returned for error conditions. ("But what," you ask, "do we do if we really need to convert a value of -1?" Good question, and one that will be addressed in the follow-up article.)  I define a couple of data structures to do the actual conversion work. The data structure is the alphanumeric component (data structures are by definition treated as alpha fields), while the packed or integer data is defined in a subfield. The program will move the input data into the appropriate data structure and then use the numeric subfield to extract the numeric data. After the data structures are a couple of work fields, and then we're on to the program logic itself.

 

The procedure is very simple. It defaults the result to the bad value and gets the length of the input into a work variable. Then, it decides whether this is a packed or binary conversion. If packed, it checks for a positive length. Having a positive length, it initializes the packed field to zero and then inserts the alphanumeric data at the end of the data structure. It then extracts the data from the packed field and places it in the result field. The logic for the integer conversion is almost exactly the same, except the length must be a valid integer length and it uses the integer data structure rather than the packed data structure. You'll note that all of the conversion code is wrapped in a monitor block; if any unexpected error occurs, the default bad value is returned. And that's it for the RPG code.

 

Because I'm using a very trivial case, creating the service program is very simple. It requires two commands:

 

CRTRPGMOD MODULE(MYLIB/UDFHEX2NUM) SRCFILE(MYLIB/MYSRC) SRCMBR(UDFHEX2NUM)

CRTSRVPGM SRVPGM(MYLIB/UDFHEX2NUM) EXPORT(*ALL)

The SQL Side

What we've done at this point is create the service program named UDFHEX2NUM. This service program has a single procedure also named UDFHEX2NUM. Now let's create the UDF. That's done via a single SQL statement:

 

create function HEX2NUM (varchar(16), int)

  returns decimal(31,0)                                  

language rpgle deterministic no sql                      

external name 'MYLIB/UDFHEX2NUM(UDFHEX2NUM)'               

parameter style general                                  

 

There are a lot of parameters here. Some should be pretty self-explanatory (such as LANGUAGE RPG). Others take a little more time to explain than we have here, but hopefully I can address them in more detail in the next article. The important parameter is EXTERNAL NAME; in it, you specify the service program name and library in single quotes and then, within that, you specify the procedure name in parentheses. Case is very important here; the library and service program names must be uppercase, while the procedure name must match the procedure name exported from your program (in my case, I used all uppercase there as well).

 

Once you've successfully created the function, you're ready to call it. Here's an example of the SQL that I was able to run:

select * from JRNOUT where JOENTT = 'PT'

  and HEX2NUM(substr(JOESD,1,4),2) between 314000 and 315000          

 

That's exactly the syntax I was aiming for in the initial paragraphs, and the UDF worked perfectly, showing only adds (PUTs) for records whose order number was between 314000 and 315000.

Recap

Obviously, this wasn't a very complex example. The example only creates a single UDF attached to an RPG procedure in a very basic service program. But it's the archetype for all your other functions, which can be as complex as you need them to be; they can use other files, call other programs, even access other machines if need be. In the follow-up article, I'll go into more detail about the signatures (that is, the types of the parameters and the return value) and the ways we can use them to make this syntax even more powerful. Until then, have fun with this!

 

BLOG COMMENTS POWERED BY DISQUS

LATEST COMMENTS

Support MC Press Online

$0.00 Raised:
$