Easily Transform Legacy Printouts into Excel Files

Java
Typography
  • Smaller Small Medium Big Bigger
  • Default Helvetica Segoe Georgia Times

Using the unfairly dreaded POI-HSSF APIs and some easy-to-use procedures, create professional-looking Excel files from your legacy printouts quickly and easily!

 

Modernizing applications is a recurring theme these days. Revamping applications with user-friendly screens, Web pages, or some other form of user interaction and providing apps that are more "up-to-date" than the good old green-screens is widely discussed online and offline. But what about the output produced by your legacy programs? Is it still an ugly printout?

 

This article will help you modernize that output, with a quick and easy approach. I'll describe how to use the POI-HSSF Java APIs to create Excel spreadsheets directly from an RPG program. Don't be scared about the "Java APIs" part; they will be hidden "under the hood," and you won't even notice them.

 

For those of you unfamiliar with POI-HSSF, "HSSF is the POI Project's pure Java implementation of the Excel '97 (-2007) file format," according to the Apache Web site. It's basically a set of APIs that allows you create a personalized spreadsheet with almost as much detail as Excel itself (however, HSSF currently doesn't support the creation of graphs and such). The problem is that this flexibility comes at a high cost: it's not very easy, especially for an RPG programmer, to use the Java APIs and concepts involved; when the spreadsheet that you're building has a lot of columns, it gets tiresome and monotonous to code and maintain.

 

One of my personal references in the System i world, Scott Klement, wrote an RPG-friendly wrapper to these Java APIs that I'll be using from this point on. So, if you don't have HSSF already installed on your System i, you must download and install it. There are two parts: the Java binary files and Klement's HSSFR4 implementation for RPG. Klement wrote quite a bit about this subject. I strongly recommend reading his introductory presentation to it before continuing.

 

Let's examine an example: in a previous article, I discussed a client's debt report divided by debt age (total debt at 30, 60, and 90 days). Assume that this program currently generates a printout with six columns:

  • Client ID
  • Client Name
  • Total Debt—Client's total debt
  • Debt 30 Days—Invoices outstanding up to 30 days
  • Debt 60 Days—Invoices outstanding up to 60 days
  • Debt 90 Days—Invoices outstanding up to 90 days or more

It might be a good idea at this point to browse through POI's Web site, as I won't go into details about every single thing (the explanation would get huge). If you're familiar with HSSF, just skip to the section titled "The Easy Way" a bit further down.

 

Here's how you can do it with HSSF APIs.

The Hard Way

First, you must define the variables required to store and handle the workbook, sheet, rows, and cell styles you'll be working with:

 

      /copy qcpylesrc,hssf_h

 

     D Str             s                   like(jString)

     D Book            s                   like(HSSFWorkbook)

     D Sheet           s                   like(HSSFSheet)

     D row             s                   like(HSSFRow)

     D outfile         s                   like(jFileOutputStream)

     D ColHeading      s                   like(HSSFCellStyle)

     D Dec0            s                   like(HSSFCellStyle)

     D Dec2            s                   like(HSSFCellStyle)

     D rowcount        s             10I 0 inz(0)

     D W_Col           s             10I 0 inz(0)

     D W_PathFile      s             45

     D W_File          s             45

     D W_Count         S             10  0 inz(0)           

 

As you probably noticed, there are some special types of data here (HSSFWorkbook, HSSFSheet, and so on). These are thoroughly described in the HSSF online documentation, and I hope the names are self-explanatory. However, I'd like to explain a few. Str is just a Java string field that will be used to compose the output file IFS path, which will then be stored in outfile for the write-to-disk operation. W_PathFile and W_File are used to receive the file's path and name (as entry parameters or as data being read from a file or data area). I'll explain the rest as I go along.

 

There's a small but extremely important detail. The copy member hssf_h contains the definition of all the APIs and data types related to HSSF. Be sure to always include it! If you do so on an SQLRPGLE source, use /Include instead of /Copy. Otherwise, you'll get a "Nested copy not allowed" error. This IBM document explains the details, if you really want to know.

 

The next step is to create the workbook and sheet to which the data will be written. If you want it to have a better look than the ugly, old, and basic printout, there's work to be done: formatting the columns, creating the necessary styles for each type of data (client ID, name, and amounts), and finally writing the header row with the titles to the file. This is performed in routine IniXls:

 

        BegSr IniXls;

   

           hssf_begin_object_group(100);

   

           book = new_HSSFWorkbook();

           sheet = hssf_newSheet(book: 'MAIN');

 

           FormatColumns(book:sheet);

           CreateCellStyles(book);

   

           rowcount = 0;

   

           // create the header row

           rowcount += 1;

           row = HSSFSheet_createRow(sheet: rowcount);

   

           // write column A

           W_Col = 0;

           hssf_text(row: W_Col: 'Client ID' : ColHeading);

   

           // write column B

           W_Col += 1;

           hssf_text(row: W_Col: 'Client Name' : ColHeading);

   

           // write column C

           W_Col += 1;

           hssf_text(row: W_Col: 'Total Debt' : ColHeading);

 

           // write column D

           W_Col += 1;

           hssf_text(row: W_Col: 'Debt 30 Days' : ColHeading);

 

           // write column E

           W_Col += 1;

           hssf_text(row: W_Col: 'Debt 60 Days' : ColHeading);

 

           // write column F

           W_Col += 1;

           hssf_text(row: W_Col: 'Debt 90 Days' : ColHeading);

 

        Endsr;

 

Lets analyze this routine: hssf_begin_object_group(100) creates a group of objects able to handle up to 100 object references. In this example, I'll be creating only one, but this is a good practice; the error that you get when you try to create more objects than you're ready for is not very user-friendly.

 

Then, these lines create the workbook and sheet in memory:

 

book = new_HSSFWorkbook();

sheet = hssf_newSheet(book: 'MAIN');

 

These are plain and unformatted, the same as you get when you create a new workbook in Excel, so they need to be embellished to make a good impression. That's what the next two lines of code are for:

 

FormatColumns(book:sheet);

CreateCellStyles(book);

 

Procedure FormatColumns will format the column's width and some other details, like the rows that are printed at the top of each new page:

 

      *++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

      * FormatColumns():  Set the column widths & merged cells

      *    in a given worksheet.

      *

      *   sheet = (input) sheet to set the column widths in

      *++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

     P FormatColumns   B

     D FormatColumns   PI

     D   book                              like(HSSFWorkbook)

     D   sheet                             like(HSSFSheet)

      /free

   

        // Set repeating rows at top of each page

        HSSFWorkbook_setRepeatingRowsAndColumns(book : 0 : 0 : 5 : 0 : 0);

        //

        // The column width setting is in units that are approx

        //   1/256 of a character.

        //

        //Column A

        HSSFSheet_setColumnWidth( sheet:  0: 10 * 256);   

 

        //Column B

        HSSFSheet_setColumnWidth( sheet:  1: 40 * 256);

 

        //Column C

        HSSFSheet_setColumnWidth( sheet:  2: 15 * 256);

 

        //Column D

        HSSFSheet_setColumnWidth( sheet:  3: 15 * 256);

 

        //Column E

        HSSFSheet_setColumnWidth( sheet:  4: 15 * 256);

 

        //Column F

        HSSFSheet_setColumnWidth( sheet:  5: 15 * 256);              

 

      /end-free

     P                 E

 

Next are the cell styles. Since each cell will have a header row and several types of data, you have to create a cell style for each. This is achieved in procedure CreateCellStyles:

 

      *++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

      * CreateCellStyles(): Create the different display styles

      *    used for cells in this Excel workbook.

      *

      *++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

     P CreateCellStyles...

     P                 B

     D CreateCellStyles...

     D                 PI

     D Book                                like(HSSFWorkBook)

   

     D ChFont          s                   like(HSSFFont)

     D TempStr         s                   like(jString)

         

      /free

   

         //

         // Create a cell style for the column headings.

         // These are bold and have a border line at the bottom

         //

   

         ColHeading = HSSFWorkbook_createCellStyle(book);

   

         ChFont = HSSFWorkbook_createFont(book);

         HSSFFont_setFontHeightInPoints(ChFont: 11);

         TempStr = new_String('Calibri');

         HSSFFont_setFontName(ChFont: TempStr);

         HSSFCellStyle_setFont(ColHeading: ChFont);

   

         HSSFCellStyle_setFillForegroundColor( ColHeading

                                            : COLOR_LIGHT_ORANGE);

         HSSFCellStyle_setFillPattern(ColHeading: SOLID_FOREGROUND);

         HSSFCellStyle_setAlignment(ColHeading: ALIGN_CENTER);

         HSSFCellStyle_setWrapText(ColHeading: *On);

         HSSFCellStyle_setBorderBottom(ColHeading: BORDER_THIN);

         HSSFCellStyle_setBorderTop(ColHeading: BORDER_THIN);

         HSSFCellStyle_setBorderRight(ColHeading: BORDER_THIN);

         HSSFCellStyle_setBorderLeft(ColHeading: BORDER_THIN);

   

         // Create a cell style for numeric 0-dec

         // right-aligned and the number is formatted nicely.

         //

         Dec0 = HSSFWorkbook_createCellStyle(book);

   

         ChFont = HSSFWorkbook_createFont(book);

         TempStr = new_String('Arial');

         HSSFFont_setFontName(ChFont: TempStr);

         HSSFCellStyle_setFont(Dec0: ChFont);

         HSSFCellStyle_setBorderBottom(Dec0: BORDER_THIN);

         HSSFCellStyle_setBorderTop(Dec0: BORDER_THIN);

         HSSFCellStyle_setBorderRight(Dec0: BORDER_THIN);

         HSSFCellStyle_setBorderLeft(Dec0: BORDER_THIN);

   

         HSSFCellStyle_setDataFormat(Dec0: 0);

   

         // Create a cell style for numeric 2-dec

         // right-aligned and the number is formatted nicely.

         //

         Dec2 = HSSFWorkbook_createCellStyle(book);

   

         ChFont = HSSFWorkbook_createFont(book);

         TempStr = new_String('Arial');

         HSSFFont_setFontName(ChFont: TempStr);

         HSSFCellStyle_setFont(Dec2: ChFont);

         HSSFCellStyle_setBorderBottom(Dec2: BORDER_THIN);

         HSSFCellStyle_setBorderTop(Dec2: BORDER_THIN);

         HSSFCellStyle_setBorderRight(Dec2: BORDER_THIN);

         HSSFCellStyle_setBorderLeft(Dec2: BORDER_THIN);

   

         HSSFCellStyle_setDataFormat(Dec2: 2);

 

         //?Create a cell style for text so that it's

         //? left-aligned

         //

   

         Text = HSSFWorkbook_createCellStyle(book);

   

         ChFont = HSSFWorkbook_createFont(book);

         TempStr = new_String('Arial');

         HSSFFont_setFontName(ChFont: TempStr);

         HSSFCellStyle_setFont(Text: ChFont);

         HSSFCellStyle_setBorderBottom(Text: BORDER_THIN);

         HSSFCellStyle_setBorderTop(Text: BORDER_THIN);

         HSSFCellStyle_setBorderRight(Text: BORDER_THIN);

         HSSFCellStyle_setBorderLeft(Text: BORDER_THIN);

   

         HSSFCellStyle_setAlignment(Text: ALIGN_LEFT);

   

      /end-free

     P                 E

 

Here, four cell styles are created:

  • ColHeading—Calibri Font, size 11, centered text, a light orange background, and thin borders. This style will be used for the Header row.
  • Dec0—Arial font with default size and alignment, without a background and with thin borders. The numbers will be displayed without decimals. This style will be applied to the client ID.
  • Dec2—Basically the same as Dec0, but it holds two decimals. This style will be used for the debt amounts.
  • Text—A rather simple cell style: Arial font with default size, left aligned, and thin borders.

Again, I will not explain every HSSF API in detail, so if the names and parameters are not self-explanatory, please refer to the online documentation.

 

Going back to the routine IniXls where we left off, the book and sheet are created (in memory only), the columns are properly formatted, and the necessary cell styles have been made available. The next step is to write the header row. Here's the first shortcoming of HSSF: you need to write a cell at a time, specifying the row, column, data type, and cell style every time. Since I'm writing the header row (simple text), I'll use the HSSF_Text API. Later, you'll see other APIs being used to write other types of data.

 

The rowcount variable indicates the number of the row to create. Be sure to set it to zero when you start to work on a new sheet. If for some reason you forget to increment it, the HSSFSheet_createrow will overwrite the last row written, so it's a good practice to have these two lines together at all times:

 

           // create the header row

           rowcount += 1;

           row = HSSFSheet_createRow(sheet: rowcount);

 

The next (tedious) step is to write each column title, one by one:

 

           // write column A

           W_Col = 0;

           hssf_text(row: W_Col: 'Client ID' : ColHeading);

   

           // write column B

           W_Col += 1;

           hssf_text(row: W_Col: 'Client Name' : ColHeading);

   

           // write column C

           W_Col += 1;

           hssf_text(row: W_Col: 'Total Debt' : ColHeading);

 

         // write column D

           W_Col += 1;

           hssf_text(row: W_Col: 'Debt 30 Days' : ColHeading);

 

         // write column E

           W_Col += 1;

           hssf_text(row: W_Col: 'Debt 60 Days' : ColHeading);

 

         // write column F

           W_Col += 1;

           hssf_text(row: W_Col: 'Debt 90 Days' : ColHeading);

 

The caution note about the rowcount variable is equally applicable to the W_Col variable; if you neglect to increment it, you'll be overwriting the same cell over and over again.

 

The header is done! Now, let's write some actual data to the spreadsheet. My example program gathers the necessary data from the files/business rules and places it on temporary variables. Then, routine WrtXls writes the data to the spreadsheet, again cell by cell:

 

       BegSr WrtXls;

   

         If (W_Count = 0);

           // If this is the first write, initialize the spreadsheet

           ExSr IniXls;

         EndIf;

   

           W_Count += 1;

           rowcount += 1;

           row = HSSFSheet_createRow(Sheet: rowcount);

   

           W_Col = 0;

   

           // Write column A

           hssf_num(row: W_Col: W_ClientID: Dec0);

   

           // Write column B

           W_Col += 1;

           hssf_text(row: W_Col: %Trim(W_ClientName) : Text);

   

           // Write column C

           W_Col += 1;

           hssf_text(row: W_Col: W_TotalDebt :Dec2);

 

           // Write column D

           W_Col += 1;

           hssf_num(row: W_Col: W_Debt30 : Dec2);

   

           // Write column E

           W_Col += 1;

           hssf_num(row: W_Col: W_Debt60: Dec2);

   

           // Write column F

           W_Col += 1;

           hssf_num(row: W_Col: W_Debt90 : Dec2);

 

   

       EndSr;

 

In this routine, I'm determining whether the spreadsheet was already created, via the W_Count variable, and creating it if necessary through routine IniXls, described earlier. Then, I'm creating a new row and writing the data to it. Here, I'm using the hssf_text or hssf_num API, depending on the data type. Also, notice that the last parameter of each API call is the cell style previously defined.

 

This routine will be called for each client in the database; it writes a new row for each, directly replacing the WRITE <format name> of the printer file being replaced.

 

At the end of the program, you need to actually write the spreadsheet to the IFS (it only exists in memory, remember?). This is achieved with routine EndXls:

 

        BegSr EndXls;

   

        // Prepare output file path

          W_File     = 'Debt_Report.xls';

          W_PathFile = 'AccountReports' + %Trim(W_File);

          Str = new_String(%Trim(W_PathFile));

          outfile = new_FileOutputStream(Str);

 

        // Write output file

 

          HSSFWorkbook_write(Book: outfile);

          FileOutputStream_close(outfile);

 

        // Close object group to free memory

         hssf_end_object_group();

 

   

       EndSr;

 

This process requires three steps: 1) preparing the file's name and path on the IFS (I'm using a fixed name and path, but it could easily have variables instead); 2) writing the file to the IFS by indicating which workbook (our book variable in this case) I want to write to disk and closing the output file. This is relevant if you're working on more than one sheet. You can handle multiple HSSFSheet variables at the same time and write them all to the same workbook, but I'm trying to keep it simple here; and 3) closing the object group to free the allocated memory. This is another good practice; it not only frees memory but also frees the IFS file, so be sure to always include this line!

 

In short, our legacy program will have a few new routines to handle the spreadsheet output (IniXls, WrtXls, and EndXls) and two new procedures (FormatColumns and CreateCellStyles). These last two have to be added to the program instead of a service program because they are specific to the spreadsheet we're creating in this program.

 

Routines IniXls and WrtXls replace the WRITE of the print file header and detail record formats, respectively. If you imagine that the printer file would have the USROPEN keyword (not very common but possible), routine EndXls would be its replacement.

 

The two procedures replace the printer file keywords related to the appearance of the fields.

 

Well, that wraps up "The Hard Way." As you can see, it's not very hard to create a spreadsheet this way, but it can get rather confusing and tedious when you have a lot of columns.

 

Recently, I had to create an Excel file with 70 columns. As you might imagine, I was dreading the task! So I decided to create an easier way to do it.

The Easy Way

The starting point is the fact that there are three basic stages to creating a spreadsheet:

 

  1. Creating the workbook and sheet in memory (this includes creating the header and all the formatting)
  2. Writing records to it
  3. Outputting it to an IFS folder

These stages match routines IniXls, WrtXls, and EndXls on "The Hard Way" section, just in case you didn't read that section.

 

Instead of creating the workbook and sheet within the program, why not use an existing spreadsheet as a starting point? That's much easier (no coding involved) and much, much quicker (using Excel, you see exactly what the result will be and can quickly adjust any formatting detail).

 

The template workbook can contain one or more sheets, each with its own set of columns. It's also important to include one row of dummy data, properly formatted (font name, size, color, background, and so on), just after the header row(s) in order to allow the procedures that use the template to recognize the formatting to use on the real data rows.

 

I've created procedure XlsCrtBook, which creates a workbook in memory and returns a pointer to it, using a template location and name supplied as entry parameters. You might ask, "Why not create the sheet at the same time, as in routine IniXls?" It's simple: this way, you can have different template sheets within the same workbook and create only the one(s) you need.

 

This means that another procedure is required to create the sheet. That's procedure XlsCrtSheet. This procedure returns a pointer to the sheet indicated on the entry parameter. I'll explain the details later on.

 

This takes care of the first stage.

 

For the second part (writing the actual data rows), which is the most tedious part if the spreadsheet has a lot of columns, the approach had to be a bit different, because it had to be flexible and couldn't depend on the type of value (numeric or alphanumeric) of each column to work. So it uses a string to store the row of data to be written and a separator character to distinguish the columns. I know this might sound a bit confusing now, but it will be clearer with an example!

 

Finally, writing the spreadsheet to the IFS also had to be easy and flexible. All the necessary API calls were condensed into a simple procedure named XlsWrtBook that receives the pointers to the workbook and sheet I'm writing to disk and the name and location of the IFS file.

 

Let's clarify this with an example. I'll use the debt report program again.

 

Instead of routine IniXls, assume I've already created a template with the name DebtRepTpl.XLS and uploaded it to IFS folder XLS.

 

Here's how the workbook is created:

 

  If XlsCrtBook('XLS': ' DebtRepTpl.XLS': P_Book) = *Off;

 EndIf;              

 

Now, you must indicate which sheet within the template you want to use. Let's use sheet DebtSummary:

 

    If XlsCrtSheet(P_Book: 'DebtSummary': 1: 6:                            

                  P_Sheet: P_Style: P_Type) = *Off;                    

    EndIf;

 

You probably noticed some additional parameters that were not previously mentioned: after the sheet name, there's a 1 and a 6 and some others that I'll explain later. The first two are the numbers of rows and columns of the header, respectively. This is used to avoid overwriting the header with data and, most important, to find the first row of data. In this case, row two is the first row of data and contains the cell types and styles (equivalent to the formatting that was created in procedure CreateCellStyles in section "The Hard Way"). These cell types and styles are returned in parameters P_Type and P_Style, respectively, which are arrays with one position for each column used. In this case, only the first six positions will be used, because I've indicated that the number of columns is six. These parameters will be used to write the data rows with the proper formatting. This is achieved through procedure XlsWrtLin:

                                                                                                                                                

    P_Row = '123§Dummy Client§2000§200§800§1000';                                      

                                                                       

    If XlsWrtLin(P_Book: P_Sheet: P_Style: P_Type: 2:  P_Row: '§') = *Off;

    EndIf;

 

Here, I'm hard-coding values to the P_Row variable, just to try to make things clearer. The call to procedure XlsWrtLin will write the following information to the second row of the sheet I've previously created: the client named Dummy Client, which has an internal ID of 123, a total debt of $2,000, $200 of outstanding invoices due in 30 days or less, $800 under 60 days, and $1,000 up to or over 90 days. In a real program, the P_row variable would contain the concatenation of database fields, properly transformed into character values (when needed) in order to form a string.

 

There's another detail I haven't talked about yet: the separation character. You'll notice that the pilcrow, aka paragraph symbol, (§) character is used in P_Row and as a parameter of the XlsWrtLin procedure. This is because you might need to choose another separator character if your data contains "§". You can use whichever character you want, as long as you use it in the composition of the P_row variable and the call to XlsWrtLin. I strongly recommend defining a constant (C_Sep, for instance) and using that instead of repeating the character everywhere.

 

Finally, I'm ready to output my spreadsheet to disk:

 

  If XlsWrtBook('XLS': 'DebtReport.xls': P_Book) = *Off;   

  EndIf;                                                  

 

This call to procedure XlsWrtBook writes my P_Book spreadsheet to the IFS folder XLS with the name DebtReport.XLS.

 

In short, with four simple calls that you can easily include in any program, you have all the functionality of the HSSF APIs without its complexity!

 

Next, I'll go through the inner workings of each procedure.

 

The procedure XlsCrtBook is quite simple:

 

     P XlsCrtBook      B                   EXPORT

     D XlsCrtBook      PI              N

 

      * Input Parms

     D P_Folder                     250A   Value

     D P_File                        50A   Value

 

      * Output Parms

     D P_Book                              Like(HSSFWorkbook)

 

      *   Main Code                                                            

      /FREE

 

             // If the folder or file names weren't supplied => error!

              If (P_Folder = *Blanks) or (P_File   = *Blanks);

                Return *On;

              EndIf;

 

             // create space for 100 or more object references

             // in the object group.

               Hssf_Begin_Object_Group(100);

 

             // Read the template file into memory

               P_Book = Hssf_Open('/' + %Trim(P_Folder) +

                                '/' + %Trim(P_File) );

 

             // If the file was not found => error!

               If P_Book = *Null;

                Return *On;

               EndIf;

 

         Return    *Off;

 

      /END-FREE

     P XlsCrtBook      E

 

This procedure simply opens an existing file, using the name and location provided as entry parameters, and returns a pointer to the workbook. Hssf_Open is usually used to open files for reading, but it can also be used for writing. The procedure returns an indicator (*ON in case of error) so that it can be used on an If statement to provide some degree of control over the events.

 

The next procedure, XlrCrtSheet, is slightly more complex:

 

     P XlsCrtSheet     B                   EXPORT

     D XlsCrtSheet     PI              N

 

      * Input Parms

     D P_Book                              Like(HSSFWorkbook) Value

     D P_SheetName                   50A   Value

     D P_CabRow                       3  0 Value                               

     D P_CabCol                       5  0 Value                                

 

      * Output Parms

     D P_Sheet                             Like(HSSFSheet)

     D P_Style                             Like(HSSFCellStyle) Dim(250)

     D P_Type                              Like(jInt) Dim(250)

 

     D Row             S                   Like(HSSFRow)

     D Cell            S                   Like(HSSFCell)

     D W_Col           S              5I 0

 

      *   Main Code                                                           

      /FREE

 

         ExSr      IniPgm;

         ExSr      Procc;

         ExSr      EndPgm;

         Return    *Off;

 

         //  Sub Routines

 

         //  Obtain the style and type of each column

           BegSr  Procc;

 

              W_Col = 1;

 

              Dow (W_Col <= P_CabCol);

                Cell = HssfRow_GetCell(Row: W_col-1);

 

                // If the cell is not filled in, set P_Type to *HiVal

                // in order to prevent this column from being written by XlsWrtLin

                If (Cell = *Null);

                  P_Type(W_col)  = *HiVal;

 

         //  Obtain the style and type of each column from the template's dummy row

                Else;

                  P_Style(W_col)   = HssfCell_GetCellStyle(Cell);

                  P_Type(W_col)    = HssfCell_GetCellType(Cell);

 

                  // Clear the dummy content

                  If P_Type(W_Col) <> CELL_TYPE_FORMULA and

                     P_Type(W_Col) <> CELL_TYPE_BOOLEAN and

                     P_Type(W_Col) <> CELL_TYPE_ERROR;

                    Hssf_Text(Row: W_Col-1: '': P_Style(W_col));

                  EndIf;

                EndIf;

 

                W_Col += 1;

              EndDo;

 

           EndSr;

 

         //  Initialization                                                   

           BegSr  IniPgm;

 

             // If the workbook or sheet are not supplied => Error

              If (P_Book = *Null) or (P_SheetName = *Blanks) or

                 (P_CabCol = *Zeros);

                Return *On;

              EndIf;

 

             // Read sheet from the template

               P_Sheet = Hssf_GetSheet(P_Book: %Trim(P_SheetName));

               If (P_Sheet = *Null);

                 Return *On;

               EndIf;

 

             // Read the first row AFTER the header to get formatting and cell types

               Row = HssfSheet_GetRow(P_Sheet: P_CabRow);

               If (Row = *Null);

                 Return *On;

               EndIf;

 

           EndSr;

 

         //  End procedure                                                               

           BegSr  EndPgm;

 

           EndSr;

 

      /END-FREE

     P XlsCrtSheet     E

 

The code is divided into three parts: initialization, processing, and termination routines. The initialization simply loads the proper sheet of the template workbook into memory and reads the dummy data row. Afterward, the processing routine reads one cell at a time and stores its type (see CELL_TYPE constants on the POI-HSSF documentation for the available types) and style (in other words, its formatting) in the P_Type and P_Style arrays, respectively. I'll get back to these two in the explanation of the next procedure! Finally, the termination procedure is empty (at least for now; it might be necessary on a later version), but it was included just for coherence. If everything went OK, an *OFF indicator is returned. An *ON is returned if something went wrong reading the template sheet or the dummy data row.

 

I've mentioned several times already arrays P_Type and P_Style without a proper explanation for their existence. I hope their purpose is clear after the explanation of the next procedure:

 

     P XlsWrtLin       B                   EXPORT

     D XlsWrtLin       PI              N

 

      * Input Parms

     D P_Book                              Like(HSSFWorkbook) Value

     D P_Sheet                             Like(HSSFSheet) Value

     D P_Style                             Like(HSSFCellStyle) Dim(250) Value

     D P_Type                              Like(jInt) Dim(250) Value

     D P_NumRow                      10  0 Value                               

     D P_Row                      32767    Varying Value                       

     D P_Sep                          1A   Value                               

 

     D Row             S                   Like(HSSFRow)

     D Cell            S                   Like(HSSFCell)

     D W_Col           S              5I 0

     D W_PosB          S              5I 0                                      

     D W_PosE          S              5I 0                                      

     D W_Cell          S           5000A   Varying                             

 

      *   Main Code

      /FREE

 

         ExSr      IniPgm;

         ExSr      Procc;

         ExSr      EndPgm;

         Return    *Off;

 

         //  Sub Routines

 

         //  Write the data row the sheet

           BegSr  Procc;

 

              W_Col  = 0;

              W_PosB = 1;

              W_PosE = %Scan(P_Sep: P_Row);

 

              Dow (W_PosE <> *Zeros);

                W_Col += 1;

                If (W_PosE - W_PosB > *Zeros);

                  W_Cell = %SubSt(P_Row: W_PosB: W_PosE - W_PosB);

                Else;

                  W_Cell = *Blanks;

                EndIf;

 

                Select;

 

                // If the cell is numeric

                When P_Type(W_Col) = CELL_TYPE_NUMERIC;

                  If (W_Cell = *Blanks);

                    W_Cell = *Zeros;

                  EndIf;

                  Hssf_Num(Row: W_Col-1: %Dec(W_Cell: 63: 20) : P_Style(W_Col));

 

                // If the cell is character

                When P_Type(W_Col) = CELL_TYPE_STRING;

                  Hssf_Text(Row: W_Col-1: %Trim(W_Cell) : P_Style(W_Col));

 

                // If the cell contains a formula

                When P_Type(W_Col) = CELL_TYPE_FORMULA;

 

                // If the cell is empty

                When P_Type(W_Col) = CELL_TYPE_BLANK;

                  Hssf_Text(Row: W_Col-1: %Trim(W_Cell) : P_Style(W_Col));

 

                // If the cell contains a boolean

                When P_Type(W_Col) = CELL_TYPE_BOOLEAN;

 

                // If the cell has an error

                When P_Type(W_Col) = CELL_TYPE_ERROR;

 

                EndSL;

 

                W_PosB = W_PosE + 1;

                W_PosE = %Scan(P_Sep: P_Row: W_PosB);

              EndDo;

 

           EndSr;

 

         //  Initialization                                                   

           BegSr  IniPgm;

 

             // If there's a problem with the book, sheet or separator character => Error

              If (P_Book = *Null) or (P_Sheet = *Null) or

                 (P_Sep = *Blanks);

                Return *On;

              EndIf;

 

             // Retrieve the row in which to write

              Row = HssfSheet_GetRow(P_Sheet: P_NumRow - 1);

 

             // If it doesn't exist, create it

              If Row = *Null;

                Row = HSSFSheet_createRow(P_Sheet: P_NumRow - 1);

              EndIf;

 

           EndSr;

 

         //  End Procedure

           BegSr  EndPgm;

 

           EndSr;

 

      /END-FREE

     P XlsWrtLin       E

 

Again, the code is divided into three parts: initialization, processing, and termination routines. The first part simply obtains the pointer to the row in which to write data. If I'm writing the first line (the one that contains dummy data on the template) or, for any reason, the template has data in more than one row, it uses the row indicated on the entry parameter. Otherwise (and this will be the most common situation), it creates a new row.

 

Now comes the tricky part: writing the data. Here's where the P_Type and P_Style arrays are (finally) put to good use. In order to choose which API to use to write (hssf_num, hssf_text, or hssf_formula), the procedure needs to know which is the type of data to write. For instance, if you try to write a string variable using hssf_num, you'll get a very ugly and non-user-friendly Java error (the type of thing that gives HSSF APIs a bad rep). To circumvent this, P_Type is used; it contains the type of cell for the one being currently written. So, if it's a number, hssf_num will be used; if it's a string, hssf_text will be used; and so on. Whichever API is used to write the cell, it requires a cell style (if you read "The Hard Way" section, you certainly remember the cell styles defined in the CreateCellStyles procedure). Now it's array P_Style's time to be used; it provides the formatting previously defined on the template sheet for the column currently being written.

 

Let's go back a bit and analyze the Do loop that encloses the writing part. The most programmer-friendly way I've found of passing the row to write is by stuffing it into a huge string, using a separator character to segregate the values. The loop is used for reading that string, jumping from separator character to separator character, and writing a cell with each value.

 

Again, the termination procedure is empty (at least for now; it might be necessary on a later version), but it was included just for coherence. If everything went OK, an *OFF indicator is returned. An *ON is returned if something went wrong reading the template sheet or if the separation character was not supplied.

 

The data writing is done; let's move on to the file writing on the IFS:

 

     P XlsWrtBook      B                   EXPORT

     D XlsWrtBook      PI              N

 

      * Input Parms

     D P_Folder                     250A   Value

     D P_File                        50A   Value

     D P_Book                              Like(HSSFWorkbook)

 

     D Str             S                   Like(jString)

     D OutFile         S                   Like(jFileOutputStream)

 

      *   Main Code                                                           

      /FREE

 

         ExSr      IniPgm;

         ExSr      Procc;

         ExSr      EndPgm;

         Return    *Off;

 

         //  Sub Routines

 

         //  Write the file to the IFS

           BegSr  Procc;

 

              Str = New_String('/' + %Trim(P_Folder) +

                               '/' + %Trim(P_File) );

              OutFile = New_FileOutputStream(Str);

              HssfWorkBook_write(P_Book: OutFile);

 

              FileOutputStream_Close(OutFile);

              Hssf_End_Object_Group();

 

           EndSr;

 

         //  Initialization                                                   

           BegSr  IniPgm;

 

             // If not all the necessary information was supplied => Error

              If (P_Folder = *Blanks) or (P_File   = *Blanks) or

                 (P_Book = *Null);

                Return *On;

              EndIf;

 

           EndSr;

 

         //  End procedure

           BegSr  EndPgm;

 

           EndSr;

 

      /END-FREE

     P XlsWrtBook      E

 

This procedure is basically the same as routine EndXls, handling all the necessary API calls to write the workbook to disk and free the memory.

 

To finalize, I'll try to consolidate all of this with a two-sheet example. Assume that I have a template workbook with two sheets, one for the debt summary (as previously explained) and another one for the invoice list that details that debt. This sheet has the following columns: invoice number, invoice date, invoice due date, and invoice amount.

 

I'll write a row on the first sheet with the client's debt and a few rows (just for you to get an idea) on the second sheet with invoice information:

 

     FINVOICES  IF   E           K Disk    

 

      /copy qcpylesrc,EASYXLS_PR

 

      *   Variables                                                           

      * Work Fields

     D P_Book          s                   like(HSSFWorkbook)

     D P_Sheet         s                   like(HSSFSheet)

     D P_style         s                   like(HSSFCellStyle) dim(250)

     D P_Type          s                   Like(jInt) dim(250)

     D P_Row           S           5000                                         

     D P_Sheet2        s                   like(HSSFSheet)

     D P_style2        s                   like(HSSFCellStyle) dim(250)

     D P_Type2         s                   Like(jInt) dim(250)

 

      *   Main Code                                                           

 

      /FREE

 

          ExSr Init;

 

        // Create Book

          If XlsCrtBook('XLS': 'DebtRepTpl.xls': P_Book) = *Off;

 

          // Create first Sheet (summary)

            If XlsCrtSheet(P_Book: 'DebtSummary': 1: 6:

                          P_Sheet: P_Style: P_Type) = *Off;

            EndIf;

 

          // Create second Sheet (detail)

            If XlsCrtSheet(P_Book: 'DebtDetail': 1: 4:

                          P_Sheet2: P_Style2: P_Type2) = *Off;

            EndIf;

 

          // Write a line for the client's debt on the first sheet

             P_Row = '555§Another Client§1000§800§200§0';                                      

                                                                       

    If XlsWrtLin(P_Book: P_Sheet: P_Style: P_Type: 2:  P_Row: '§') = *Off;

    EndIf;

 

          // Write invoice data in sheet "DebtDetail"

            P_Row = 'Inv. Nbr. 12§20010101§20100131§200';

      

            If XlsWrtLin(P_Book: P_Sheet2: P_Style2: P_Type2: 2:

                                                             P_Row: '§') = *Off;

            EndIf;

      

          // Write a second invoice to sheet "DebtDetail"

            P_Row = 'Inv. Nbr. 25§20010601§20100630§800';

      

            If XlsWrtLin(P_Book: P_Sheet2: P_Style2: P_Type2: 3:

                                                             P_Row: '§') = *Off;

            EndIf;

 

 

          // Write Book to IFS

            If XlsWrtBook('XLS': 'DebtReport.xls': P_Book) = *Off;

            EndIf;

 

          EndIf;

 

          Exsr End_Of_Program;

 

         //  Sub Routines

 

         //  Initialization                                                   

           BegSr  Init;

 

 

           EndSr;

 

         //End of the program                                                             BegSr  End_Of_Program;

 

           *InLr = *On;

 

           EndSr;

 

I hope this slightly more complex example helped! What I've discussed so far helps to replace a printout with relative ease, but what if you have an output file instead of a printer file? It's a fairly common technique: write the output data to a temporary physical file and then transfer the data to an IFS folder (via CPYTOIMPF) or to a PC folder (via the iSeries Access Data Transfer program, for instance) in CSV or XLS format afterward. Well, I also have a solution for that. Keep on reading.…

The Even Easier Way

For the situations in which there's already an output file, it makes no sense to develop code to read it again and write it line by line. So here's another procedure to help you. It's quite simple to use actually. You just add the USROPN keyword to the output file, close it once it's ready to transfer, and include these lines of code:

 

        // Write physical file to IFS spreadsheet

          If FileToXls('OUTF': 'XLS': 'MyTemplate.xls': 'MySheet': 1: 4: '§':

                       'XLS': 'MyOutputFile.xls') = *Off;

            EndIf;

 

Sounds simple enough? Let's look at the parameters: OUTF is the name of the physical file. XLS, MyTemplate.xls, and MySheet indicate the folder, workbook, and sheet to use as templates, respectively. 1 indicates that there's only one header row. 4 indicates that the template has four columns. § is the separator character. The second XLS indicates the output IFS folder. And finally, MyOutputFile.xls indicates the name of the output file.

 

This procedure relies on the previously described ones to function, as you might have gathered from the parameters. However, it uses a simple trick devised to provide the necessary flexibility to process almost any file, as you'll see next:

 

     P FileToXls       B                   EXPORT

     D FileToXls       PI              N

 

      * Input Parms

     D P_File                        10A   Value

     D P_FolderT                    250A   Value                                

     D P_FileXlsT                    50A   Value                               

     D P_SheetName                   50A   Value

     D P_CabRow                       3  0 Value                               

     D P_CabCol                       5  0 Value                               

     D P_Sep                          1A   Value                               

     D P_FolderOut                  250A   Value                               

     D P_FileXlsOut                  50A   Value                                

 

     D W_Lin           S              5I 0

     D W_Book          S                   Like(HSSFWorkbook)

     D W_Row           S          32767A                                        

     D W_Sheet         S                   Like(HSSFSheet)

     D W_Style         S                   Like(HSSFCellStyle) Dim(250)

     D W_Type          S                   Like(jInt) Dim(250)

     D P_Cmdlin        S            350A

     D P_Cmdlen        S             15  5

 

      * Programs Call

     D $QCMDEXC        PR                  ExtPgm('QCMDEXC')

     D P_Cmdlin                     350A

     D P_Cmdlen                      15  5

 

      *   Main Code                                                           

      /FREE

 

       ExSr      IniPgm;

       ExSr      Procc;

       ExSr      EndPgm;

       Return    *Off;

 

       //  Sub Routines

 

       //  Read temporary physical file and write it to IFS

       BegSr  Procc;

 

         //?Create Book

         If XlsCrtBook(P_FolderT: P_FileXlsT: W_Book) = *On;

           Return *On;

         EndIf;

 

         // Create Sheet

         If XlsCrtSheet(W_Book: P_SheetName: P_CabRow: P_CabCol:

                                        W_Sheet: W_Style: W_Type) = *On;

           Return *On;

         EndIf;

 

 

         // Read intermediate physical file

         Open PFXLS;

         Read PFXLS;

         W_Lin = P_CabRow;

 

         // Write lines (records) to spreadsheet

         Dow Not %Eof(PFXLS);

           W_Row = %Trim(XlsTxt) + P_Sep;

           W_Lin += 1;

 

           If XlsWrtLin(W_Book: W_Sheet: W_Style: W_Type: W_Lin:

                                                            W_Row: P_Sep) = *On;

             Close PFXLS;

             Return *On;

           EndIf;

 

           Read PFXLS;

         EndDo;

 

         // Write the workbook to IFS

           If XlsWrtBook(P_FolderOut: P_FileXlsOut: W_Book) = *On;

             Close PFXLS;

             Return *On;

           EndIf;

 

       EndSr;

 

         //  Initialization                                                     

       BegSr  IniPgm;

 

         // If a parameter is blank => Error

         If (P_File = *Blanks) or (P_FolderT = *Blanks) or

            (P_FileXlsT = *Blanks) or (P_SheetName = *Blanks) or

            (P_CabRow = *Zeros) or (P_CabCol = *Zeros) or

            (P_Sep = *Blanks) or (P_FolderOut = *Blanks) or

            (P_FileXlsOut = *Blanks);

           Return *On;

         EndIf;

 

         // Create the intermediate file in QTEMP

         P_Cmdlin = 'CRTDUPOBJ OBJ(PFXLS) FROMLIB(*LIBL) ' +

                    'OBJTYPE(*FILE) TOLIB(QTEMP)';

         P_Cmdlen = %LEN(%TRIM(P_CmdLin));

 

         CallP(E) $QCMDEXC(P_Cmdlin: P_Cmdlen);

 

         // If the file already exists, clear it

         If %Error;

           P_Cmdlin = 'CLRPFM FILE(QTEMP/PFXLS)';

           P_Cmdlen = %LEN(%TRIM(P_CmdLin));

           CallP(E) $QCMDEXC(P_Cmdlin: P_Cmdlen);

 

           If %Error;

             Return *On;

           EndIf;

         EndIf;

 

 

         // Copy the physical file to the intermediate file,

         // using the separator character to segregate fields

         P_Cmdlin = 'CPYTOIMPF FROMFILE(' + %Trim(P_File) +

                    ') TOFILE(QTEMP/PFXLS) MBROPT(*REPLACE) RCDDLM(*EOR) ' +

                    'STRDLM(*NONE) FLDDLM(''' + %Trim(P_Sep) + ''') ' +

                    'DECPNT(*COMMA)';

         P_Cmdlen = %LEN(%TRIM(P_CmdLin));

 

         CallP(E) $QCMDEXC(P_Cmdlin: P_Cmdlen);

 

         // If the copy ended in error

         If %Error;

           Return *On;

         EndIf;

 

       EndSr;

 

       //  End Procedure

       BegSr  EndPgm;

 

         Close PFXLS;

 

       EndSr;

 

      /END-FREE

     P FileToXls       E

 

The initialization procedure uses the trick I mentioned before: since the XlsWrtLin reads a string that contains the cell values segregated by a special character, the first step is to copy the physical file to an intermediate file that has only one field, basically a big string. This is achieved with the CPYTOIMPF command, using the separator character to segregate the fields. Once the copy is done, file PFXLS will contain records similar to the P_Row variable on the previous examples.

 

From this point on, it's easy! The main processing is just a matter of reading the intermediate file, writing each line to the spreadsheet, closing it, and writing it to the IFS. This time, the termination routine is actually doing something: closing the intermediate file so that no locks occur.

Easy, But…

All these procedures designed to build spreadsheets in an easier way have (I hope) advantages, but they also have some limitations worth discussing. Since they're Java-based, speed is not their strongest attribute; if you're transforming a report program that uses a printer file into another that generates an Excel file, you'll notice mild to severe performance degradation. It's hard to predict exactly how slow it will get, but as a rule of thumb, estimate at least twice the time. It depends on a lot of factors, such as number of columns and data types.

 

Another limitation is the usage of date and timestamp fields when using the FileToXls procedure. The corresponding spreadsheet column has to be formatted as text, instead of date, because System i and Excel have different notions of what a date is! You change either the physical file field type or the spreadsheet column formatting, but I suppose changing the latter is easier.

 

It's really important to choose the right separator character; otherwise, you might get indecipherable errors and odd results. This can lead to a lot of frustration and wasted time.

 

You can download the file named EASYXLS CHEAT SHEET.DOC as a quick reference guide for these procedures; it provides parameter information and the examples mentioned in this article. You can also download the save file containing the service program EASYXLS, saved in library EASYXLSLIB.

Easy, Easy, Easy

I hope that you find these procedures useful and that the great tool created and maintained by Scott Klement (and my humble procedures with it) are used by programmers to modernize the part of the applications that is mostly left in the shade.

 

Author's note: I'd like to thank Paulo Ferreira, an analyst/developer at LeasePlan Portugal with more than 10 years of experience on System i, for his invaluable help with the procedures, especially with the last one, which he conceived and implemented.

BLOG COMMENTS POWERED BY DISQUS

LATEST COMMENTS

Support MC Press Online

$0.00 Raised:
$