Let's look back at the various ways we processed data sets using RPG and compare that to the new V6R1 SQL grouping sets capability.
Like many midrangers, I began my career as a computer operator, loading card decks into the Multi-Function Card Mangler (MFCM) attached to an IBM System 3 Model 10 utilizing RPG/2. It was my first job after completing school, and I was eager to become a programmer. In school, we were taught Basic Assembler Language (BAL), and the RPG language was merely a footnote. Each day, I pestered management for a chance to write a program. In my spare time, I read RPG manuals.
Finally, my big break came. I was given the opportunity to modify an existing program, the year-end Customer Sales by Product report. The program read the order history file (which was stored on its own disk pack) after being sorted in product-within-customer order. The program used a primary file and level-break indicators to control when group totals should be printed, and the sales department needed another total.
The program code looked something like the example below, in which I used an output file based on the Display Program References (DSPPGMREF) command:
FDSPPGMRFIP E K DISK
FPRINTER O F 132 PRINTER
*
IDSPPGMR1 01
I WHFUSGL1
I WHRFNML2
I WHFNAML3
C*0N01FACTOR1+++++++OPCODE(E)+FACTOR2+++++++RESULT++++++++LEN++D+HILOEQ..
C 01 1 ADD L1TOT L1TOT 50
CL1 L1TOT ADD L2TOT L2TOT 50
CL2 L2TOT ADD L3TOT L3TOT 50
CL3 L3TOT ADD LRTOT LRTOT 50
In the above program example, I am accumulating a count of programs by usage within format within file. Files, formats, and usage represent supersets of the single set of program reference data, each superset building on top of the other. As each program reference is read, the count is accumulated into the initial total by usage (i.e., add 1 to L1TOT). When a level break occurs, the counts are rolled up into the next level or superset (e.g., add L1TOT to L2TOT).
This process of grouping and rolling up totals is represented in Figure 1:
Figure 1: Here's an example of level-break totals. (Click images to enlarge.)
With RPG/2, the data needed to be ordered to match the level breaks. This was normally accomplished using a sort prior to calling the program. Sometimes, additional files or compile-time tables were required (e.g., a program usage code lookup table) to translate the codes into something readable. Perhaps the most difficult task to perform was the drilldown.
Drilling Down
In RPG/2 shops, the final output of the RPG program was a report (hence the name Report Program Generator). This report usually required multiple copies that were created using multi-part paper, where each part was separated by a sheet of carbon paper. This paper came in boxes, perhaps 1000 pages of continuous-feed paper per box, each box labeled according to the number of copies (i.e., two-part stock, three-part stock, or the dreaded six-part stock). The paper itself was created with colored bars (usually light green and a pleasing off-white) where, if the paper was lined up correctly, you could, depending on your operating skills, get three lines of print within each bar.
The drilldown process was basically taking the printed report and decollating the paper (i.e., separating the carbon from the green-bar stock) and then, thumbing through the printed report (i.e., drilling down) until the page containing the start of a control group was encountered. At that point, the operator would separate the paper at the perforation and deliver each section to the appropriate person.
Now you know why we operators dreaded the six-part stock. It was too thick to tear so we had to run it five times through the decollator and then do six drilldowns. And when the decollator broke, we would take the reports, place them at the start of a hallway, and proceed to flip the report up one end of the hallway and down the other. This usually made for a long night.
Enter Structured Programming and Query Tools
I went from the System/3 to the System/38. The System/38 introduced RPG/3, which provided many new structured programming operation codes (e.g., DO, SELECT, etc.) along with new database operation codes (e.g., READE, etc.) for processing sets of data. In addition to the RPG language, the System/38 introduced a new way of creating database objects, based on the relational database model, and a new reporting tool, System/38 Query, for accessing this database and producing reports.
Thus the days of RPG/2 input primary-file and level-break processing were declared over, and, according to the industry gurus of the time, batch processing would eventually go away. In fact, a big difference in RPG/3 versus RPG/2 was the absence of the blocking factor from the File Description specification. Along this same line, the Read Equal Key (READE) RPG operation, mainly used for processing matching sets of data sequentially by key, was implemented using the same random IO system module used by the Random Retrieval (CHAIN) operation. This was fine for filling an interactive subfile one page at a time, but if you didn't get the memo about batch processing going away and used READE in your overnight batch programming processes, you would find that these processes simply couldn't keep up with a growing business that produced large volumes of batch data.
The good news was that RPG/3, when combined with relational techniques and used in interactive environments, did an excellent job of drilling down through data as long as the data sets were relatively small. This meant that we could put 5250 display terminals at each users' desks and provide an interactive application allowing the users to drill down into the data and print only what they needed. This was even better news for the operators, especially when hallways were being replaced with open cubicles and floor plans that went every which way but straight.
The drilldown concept is shown in Figure 2:
Figure 2: This is the concept of drilling down.
In essence, I implemented drilldown techniques by starting at the end result, selecting an item, and using the Set Lower Limit (SETLL) RPG operation followed by a READE to present the items included in the higher-level set. This is shown in the following code snippet (which may not be syntactically correct):
C *LOVAL Setll FILESUM
1 C If %FOUND
2 C DOU %EOF(FILESUM or first page is full)
C READ FILESUM
3 C If NOT %EOF()
Build the first Subfile page the wait for input
C FILENAME Setll FORMATSUM
4 C If %FOUND
5C DOU %EOF(FORMATSUM or first page is full)
002 C FILENAME READE FORMATSUM
6C If %EOF
Build the next Subfile page then wait for input
C FORMATNAME Setll PGMSUM
4 C If %FOUND
5C DOU %EOF(PGMSUM or first page is full)
002 C FORMATNAME READE FORMATSUM
6C If %EOF
Build the next Subfile page then wait for input
The supersets had to be populated in some fashion. This could have been accomplished by front-ending the program with summary queries using the System/38 Query product or the Open Query File (OPNQRYF) command, which was introduced just prior to the introduction of the AS/400 (subsequently known as iSeries, System i, and now Power Systems).
Another technique was populating the summary tables as part of the overnight batch process (which didn't exist because batch was going away). Worst case, the RPG program had to perform accumulations on the fly. Not a good idea if each superset was the result of summing thousands of rows.
Fast Forward to V6R1 and SQL Grouping Sets
So 20 years later, what's new? The RPG language is now free-format, and you can specify the Block (*Yes) keyword on files opened for input and processed by SETLL and READ; however, the database access layer hasn't changed. In essence, the Block keyword has no impact on the READE operation as it is still using the same random-retrieval operation that was used on the System/38.
The biggest changes have occurred in SQL, which was introduced with the AS/400 in June of 1988. As of V6R1, all SQL statements now use the SQL Query Engine (SQE). Besides a new optimizer with self-learning capabilities, this also includes new system data access primitives (i.e., the mechanisms that return the data) based on the SQL request. This new technology is available only to those products or programs that are utilizing SQL, such as RPG with embedded SQL or the IBM DB2 Web Query product.
Level Breaks in SQL
One of the exciting new features of SQL in V6R1 is the ability to produce subtotals and final totals by using the ROLLUP extension to the GROUP BY clause, along with the ability to simplify the union of subselects via the grouping-sets capability.
For example, the following SQL statement is the equivalent of the RPG/2 code shown at the beginning of this article. The GROUPING SETS clause is the equivalent of a level-break indicator. The ROLLUP clause produces the subtotals and final total.
SELECT WHFNAM "File Name", WHRFNM "Format Name",
WHFUSG "Pgm Usage", COUNT(*) "Total Pgms Using"
FROM DB2_SAMPLE.DSPPGMREFO
WHERE WHFNAM = 'CBFILE0011'
GROUP BY
GROUPING SETS (ROLLUP (WHFNAM, WHRFNM, WHFUSG))
ORDER BY WHFNAM, WHRFNM, WHFUSG
The following table contains the results of the above query:
Table 1: SQL ROLLUP Result Set
This capability has to make the most hardcore SQL opponent sit up and take notice. For those who are developing modern applications, such as enterprise-level dashboards with graph and reporting capability, life has just become easier.
Drilling Down with Grouping Sets
To accomplish the equivalent of my RPG/3 program, I simply remove the ROLLUP from my previous SQL statement and add an ORDER BY clause on the total programs using columns in descending sequence as shown below:
SELECT WHFNAM "File Name", WHRFNM "Format Name",
WHFUSG "Pgm Usage", COUNT(*) "Total Pgms Using"
FROM DB2_SAMPLE.dsppgmrefo
WHERE WHFNAM = 'CBFILE0011'
GROUP BY GROUPING SETS(WHFNAM, WHRFNM, WHFUSG)
ORDER BY "Total Pgms Using" DESC
The following table contains the result set of the above query:
Table 2: Result Set from Grouping Sets Drilldown Example
This one SQL statement has produced the equivalent of the RPG program by using the multiple READE operations to find the record formats associated with file CBFILE0011. You can see the way the formats are being used (Pgm Usage) and a count of the total number of programs using that format.
Notice that the grouping data associated with each set appears only once per set. If I were going to replace the original RPG program with this new SQL capability, I would use dynamic SQL and, as part of my RPG SQL program, add a HAVING clause to the SQL statement, on the fly, to return only the rows where the grouping data IS NOT NULL. This is shown in the modified SQL statement below:
SELECT WHFNAM "File Name", WHRFNM "Format Name",
WHFUSG "Pgm Usage", COUNT(*) "Total Pgms Using"
FROM DB2_SAMPLE.dsppgmrefo
WHERE WHFNAM = 'CBFILE0011'
GROUP BY GROUPING SETS(WHFNAM, WHRFNM, WHFUSG)
HAVING WHFNAM IS NOT NULL OR WHRFNM IS NOT NULL
ORDER BY "Total Pgms Using" DESC
The following table contains the results of the above query:
File Name |
Format Name |
Pgm Usage |
Total Pgms Using |
CBFILE0011 |
|
|
829 |
|
SYRFMT0010 |
|
814 |
|
RFMT0001 |
|
15 |
Table 3: Simulating a Drilldown Process Using the HAVING Clause
Using this technique, I can continue drilling down until I hit the lowest-level details, at which point I am no longer doing grouping.
End of Story
It is important to note that the use of the new GROUPING SETS and ROLLUP functions will not work if you specify a DDS logical file on the FROM clause. You may want to have a strategy in place to move to DDL-based databases if you don't already.
I have barely scratched the surface on the use of this new SQL capability. Nor have I taken the time to present performance data. I can tell you that several customers, after having been enlightened on DB2 for i optimization, are achieving 2- to 4-second response times, processing tens of thousands of rows joined over 10 or more tables. The Visual Explains of these statements would span several pages.
I believe 20 years from now, as IBM moves more and more application function into the database, the process of building and deploying business applications is going to be nothing more than the push of button. All of the heavy lifting will be done at the database layer. Now is the time to move forward and upgrade your skills to use SQL.
To learn more about grouping sets and super groups, go to the IBM DB2 for i Web site and choose the Support tab. You will find links to articles, Redbooks, and the Information Center.
The following is the schedule for the DB2 for i SQL Optimization Workshop:
Rochester Workshop and Enrollments Details
•· September 16-19, 2008
•· October 21-24, 2008
To enroll, please contact Karen Anderson by phone (972.561.6337) or email (This email address is being protected from spambots. You need JavaScript enabled to view it.).
Non-Rochester Workshop Schedule
•· August 26-29, 2008 Atlanta
•· November 4-7, 2008 Dallas
To enroll in these courses, visit IBM Training.
LATEST COMMENTS
MC Press Online