Scripting Languages on i: The Good, the Bad, the Ugly


Contrary to what you might think, scripting languages have a long history on the i, and a wide variety of options exist on the platform. This article compares and contrasts those options.

 

What's a scripting language? Within the context of this discussion, a scripting language is a programming tool that allows the developer to craft dynamic Web pages, especially pages using business data. Even more specifically, we're talking about server-side scripting languages: languages that run on the host to generate Web pages. This is an important distinction, because only server-side languages depend on the host operating system; client-side scripting runs in the browser and therefore is independent of the host. So this article is going to examine the various types of server-side scripting languages available for the IBM i.

How Do Scripting Languages Work?

In its simplest form, a server-side scripting language is a language whose primary goal is to output HTML. Remember that the simplest browser request is one in which a URL is sent to the Web server and the server returns the contents of a static Web page. Simple and effective, this sort of interaction drove the first generation of Web sites.

 

Scripting languages, however, are used for dynamic Web sites. A dynamic Web page is one in which the data displayed to the user is a mixture of static information (such as logos and basic formatting) and dynamic business data retrieved from the host at run time. So scripting languages are tools used to dynamically generate the HTML for a Web page from data on the host. The industry term for calling a scripting program is CGI, or Common Gateway Interface. CGI defines a specific protocol by which the HTTP server invokes the CGI program to generate the Web page. And even this definition has two primary subcategories: macro-based languages and general-purpose languages.
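To make the CGI hand-off a little more concrete, here is a minimal sketch in Python (chosen only for brevity; it is not one of the languages discussed in this article). Under CGI, the HTTP server passes request details to the program through environment variables such as QUERY_STRING, and the program answers with header lines, a blank line, and then the page. The function name handle_request and the "name" parameter are hypothetical, invented for this illustration.

```python
import html
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a complete CGI response (headers, blank line, body) as one string."""
    # CGI passes the part of the URL after '?' in the QUERY_STRING variable.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical parameter chosen for this sketch.
    name = params.get("name", ["World"])[0]
    body = (
        "<html><head><title>Hello</title></head>"
        f"<body>Hello {html.escape(name)}!</body></html>"
    )
    # A CGI response starts with header lines, then one blank line, then the page.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    import os
    import sys

    # When run by a Web server as a CGI program, the response goes to STDOUT.
    sys.stdout.write(handle_request(os.environ))
```

Keeping the response builder separate from the write to STDOUT is only a convenience for trying the function out with a plain dictionary in place of the real environment.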

 

Macro-Based Languages

 

Macro languages work by using embedded special instructions within a standard Web page; these embedded instructions use a syntactical shorthand to influence the way the data will be output. Those of us who hearken back to the dawn of the computer age may remember something called a "macro assembler." Assembly language programs are written in a very low-level language; it might take 10 operation codes (opcodes) to execute a simple task, such as comparing two strings. In order to make programming more efficient, the macro assembler allowed a programmer to combine multiple opcodes into a single construct called a macro instruction, or macro, and then invoke that macro with a single line of code.

 

The macro-based scripting languages for Web applications work similarly. The macro language instructions are embedded within a standard HTML page. In practice, the developer designs a standard Web page—often using a static Web design tool such as Dreamweaver or one of the many Eclipse-based tools—and then inserts (or "embeds") the scripting language elements. These scripting elements have a wide variety of capabilities, but two major categories include data acquisition and flow control. The data acquisition portion encompasses the various ways that business data can be retrieved into the Web page. The most common is ODBC access to relational data, although most scripting languages also allow a more flexible interface via direct calls to programs running on the host. Using these features, programmers can gather data from the host into simple variables or more complex variable types, such as arrays and lists. Simple variables are easily output to the HTML page, while the more complex variables usually require some flow control: for example, looping through an array or dictionary.

 

Here's a typical example:

<html>
<head><title>Hello World</title></head>
<body>
<cfset message = "Hello World!">
<cfoutput>#message#</cfoutput>
</body>
</html>

 

The majority of the code above is standard HTML. Only two lines do not contain standard HTML tags: the cfset and cfoutput tags. The cfset tag sets the variable message to the value "Hello World!", and the cfoutput tag displays it.

 

Simple, isn't it? This example happens to use one of the earliest macro languages, ColdFusion, and it shows a standard technique: using custom tags to invoke features specific to the scripting language. Another way to do things is to separate the macro language portion of the page from the HTML by using a special "mode" tag.

 

<html>
<head><title>Hello World</title></head>
<body>
<?php

 $message = "Hello World!";

 echo $message;

?>
</body>
</html>

 

This listing creates the same page, but all of the syntactical elements of the scripting language are enclosed within the special mode tag, which has the format <?php ... ?>. You may have guessed that this example is written in PHP, a very popular scripting language. It's different from ColdFusion in that, between the beginning and end of the special PHP tag, the PHP code is written much like a normal programming language. Statements are ended with semicolons and so on, to the point where you can write complete PHP programs right in your Web page. The echo statement is used to get variable data from the PHP code into the HTML page. This is syntactically a little different from the use of a special output tag like the cfoutput tag above, but it accomplishes the same purpose.

 

I realize that in both these examples I've hard-coded the "dynamic" part of the page (the "Hello World!" text); in a real-world application, some scripting code would have been invoked to get that data from the host, either from a program call or through database access.
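As a rough idea of what that real-world version involves, the sketch below (in Python, purely for illustration) combines the two categories mentioned earlier: data acquisition and flow control. It assumes a hypothetical items table and uses an in-memory SQLite database as a stand-in for the host database; on a real system the query would go through whatever database access the scripting language provides.

```python
import html
import sqlite3


def render_items(conn):
    """Return an HTML page listing every row of the (hypothetical) items table."""
    # Data acquisition: pull business data into a simple list of rows.
    rows = conn.execute("SELECT name, price FROM items ORDER BY name").fetchall()
    parts = ["<html>", "<body>", "<ul>"]
    # Flow control: loop over the rows, emitting one list element per row.
    for name, price in rows:
        parts.append(f"<li>{html.escape(name)}: {price:.2f}</li>")
    parts += ["</ul>", "</body>", "</html>"]
    return "\n".join(parts)


# In-memory stand-in for the host database, used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, price REAL)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [("Widget", 9.5), ("Bolt", 0.25)])
page = render_items(conn)
```

The static tags play the part of the hard-coded HTML, and the loop is the kind of flow control that complex variables such as arrays and lists usually require.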

 

General-Purpose Languages

 

General-purpose languages do things a little differently. Instead of embedding directives into an HTML page, they output the entire Web page. The way most of these languages work is that they write directly to a "stream" called STDOUT, which in turn sends the data to the browser. In programming terms, a stream is a file that is treated as a stream of bytes. This works well for things like text files and Web pages, and is the fundamental file access method for UNIX systems. In fact, in most UNIX-based programming languages, such as C and C++, STDOUT is the primary device used to communicate with the user. Most write or print opcodes typically go to that device. (Incidentally, that's why PHP can use the echo command to send data to the browser; echo sends the data to STDOUT, and the data is then redirected to the browser by the Web server.) Some languages, such as RPG and COBOL, require a little more help; they usually have a special helper API that provides access to the STDOUT stream.

 

Here's an example of the RPG code for the same page:

 

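// Set the dynamic part of the page; in a real program this
// could come from a CHAIN or a call to another program.
wMessage = 'Hello World!';

// Write the static HTML, then the dynamic message, to STDOUT.
writeStdout('<html>');
writeStdout('<head><title>Hello World</title></head>');
writeStdout('<body>');
writeStdout(wMessage);
writeStdout('</body>');
writeStdout('</html>');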

 

Note that the program is a normal RPG program (in this case, free-format RPG). Setting the variable wMessage is done the same way you would in any other program. I could use a CHAIN to a database file or a call to another program—whatever I wanted to do to get the data. Then I execute a series of calls to the procedure writeStdout, which is a wrapper function for the QtmhWrStout API. Note that most of the calls use a literal; this is the equivalent of the hard-coded HTML in the macro-based languages. The only difference is the line that calls writeStdout but passes in the variable wMessage; this is how I insert dynamic data into the page.
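For readers who don't know RPG, the same pattern can be sketched in Python. This is only an illustration of the technique, not a translation of any IBM API: the inner write_line function is a hypothetical stand-in for the writeStdout wrapper, and the output is captured in memory rather than sent to a Web server. Like the RPG listing, it leaves out the Content-Type header line that a complete CGI response sends before the page.

```python
import io
import sys


def write_page(message, out=sys.stdout):
    """Write the Hello World page to a stream, one line per call."""
    def write_line(text):
        # Plays the role of the writeStdout wrapper in the RPG listing.
        out.write(text + "\n")

    write_line("<html>")
    write_line("<head><title>Hello World</title></head>")
    write_line("<body>")
    write_line(message)  # the only dynamic line
    write_line("</body>")
    write_line("</html>")


# Capture the output in memory instead of sending it to a Web server.
buffer = io.StringIO()
write_page("Hello World!", out=buffer)
```

Calling write_page("Hello World!") with no second argument sends the same lines to STDOUT, which is where a CGI program's output is expected to go.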

 

With the exception of the code to write the output to STDOUT, general-purpose languages can be used for anything. They can update databases just as easily as they spit out Web pages. And in fact, since UNIX-based applications tend to talk to one another through STDOUT (the output of one program is sent, or piped, to the input of the next program in the job), most UNIX programmers can write CGI programs.

 

So the primary difference between the two CGI subcategories is that with macro-based scripting languages, each Web response starts life as an actual Web page with some script embedded into it, while general-purpose languages have no such Web page template and must explicitly write out the static portion of the Web page as well as the dynamic part.
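The contrast can be shown side by side in a toy Python sketch. The expand function below is a deliberately tiny, hypothetical macro processor that borrows the #name# placeholder style from the ColdFusion listing; it is not how ColdFusion or any real product is implemented. The write_explicitly function builds the same page the general-purpose way, with no template at all.

```python
import re


def expand(template, variables):
    """Replace each #name# placeholder with the matching variable's value."""
    return re.sub(r"#(\w+)#", lambda m: str(variables[m.group(1)]), template)


def write_explicitly(message):
    """Build the same page with no template: every piece is written by code."""
    return "".join(["<html><body>", message, "</body></html>"])


# Macro style: the page exists first, with a placeholder embedded in it.
TEMPLATE = "<html><body>#message#</body></html>"
from_template = expand(TEMPLATE, {"message": "Hello World!"})

# General-purpose style: the program emits static and dynamic parts alike.
from_code = write_explicitly("Hello World!")
```

Both routes yield the same HTML; what differs is whether the static markup lives in a page that a designer can edit or in program statements.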

Options on the IBM i

You might not realize it, but many options exist for both categories of CGI development. In the macro-based category, PHP is the latest craze, but several other languages have been around for a very long time, including one or two commercial options. For example, ColdFusion is available on the i. But perhaps the best-integrated option was something called Net.Data. Not to be confused with Microsoft's various .NET offerings, Net.Data was a very specific application development tool for the IBM i and z families. It was a little different from some of the other macro languages in that the Net.Data "source" document was broken up into well-defined regions; one section was used to define macros, while the other section defined the Web page, including references to the previously defined macros. Net.Data had great integration with SQL (way ahead of its time, especially for the IBM midrange family), and it was used for a lot of nifty first-generation Web applications for the platform. Unfortunately, like many things unique to IBM (can you say OS/2?), Net.Data just didn't have the following that would justify its ongoing development, and eventually the product was dropped.

 

Probably the largest community for macro-based languages on the IBM i is the JavaServer Pages (JSP) community. JSP and especially JSP Model 2, with or without JavaServer Faces (JSF), is the preeminent technology for all of IBM tooling, starting with WebSphere and continuing on through things like WebSphere Portal and EGL. The biggest problem with Java-based solutions is that they need to run in a Web application server such as Tomcat or WebSphere. The other languages require only the HTTP server to run.

 

Commercial options also exist, such as Zend's PHP and ProData's RPG Server Pages.

 

On the general-purpose side of the coin, by definition any language can be used, as long as it can invoke the appropriate APIs to access STDOUT. However, a relatively robust community has grown up around RPG CGI and especially the CGIDEV2 libraries originally written by Mel Rothman and maintained by Dr. Giovanni Perotti. I haven't heard much about COBOL being used as a CGI language. Other languages include C and Perl, although Perl is only available in the PASE environment.

 

And just to be complete, you can use Java but forgo the JSP route and write your own Java CGI programs (called servlets), which do all the work and explicitly write the entire Web page as well. I'm not sure why you would do this, since JSP and especially JSF are such mature tools; in fact, if you're considering this route, you might want to seek professional help, and I'm not talking about programming!

Server-Side vs. Client-Side Scripting

So I think I've covered both the good and the bad of scripting. The good is that there are lots of options, and the bad is that there are lots of options. That is, so many options make it hard to make a decision. But regardless of that choice, it's time to address the ugly part of scripting.

 

Remember, at the very beginning of this article I mentioned that client-side scripting is server-independent. In fact, with a little help, it's also multi-host capable (although, because browsers restrict cross-site requests as a security measure, you need a proxy to reach other hosts). Client-side scripting also allows applications to be much more responsive to the end user because many interactions do not require a round-trip to the host that repaints the whole page.

 

The magic tool that provides this functionality is the JavaScript language. JavaScript is a language that runs inside nearly every browser and nowadays is even relatively standardized across browsers. Some discrepancies exist, primarily between the Microsoft and Mozilla browser families, but generally speaking, you can write applications that use JavaScript to create very slick and powerful user experiences. From social networking sites to cloud-based applications to online games, everything now uses JavaScript to some degree.

 

Let's take a standard shopping cart application. In a traditional page-at-a-time Web page (also known as Web 1.0, though the term is somewhat loosely defined), you might see a list of items with thumbnail images. In order to see a full-size image of the item, you would have to hit a Submit button, which would in turn bring up another page. With a Web 2.0 application, you can assign a JavaScript function to a "mouseover" event so that, when the user moves the cursor over the thumbnail, a full-size image displays, hovering over the page until the user moves off the thumbnail.

 

In a more business-oriented case, changing the quantity of an order line on a Web 1.0 application would not be reflected in the totals until the user hits some sort of Submit button, at which point the data would be posted back to the server and the entire page would be updated. With Web 2.0, you could assign a JavaScript function to the "onblur" event of the quantity field; this would trigger a small, self-contained message to the host using Asynchronous JavaScript and XML (AJAX) technology. This message would return the updated order information, which could be used to update the appropriate fields on the screen without repainting the entire page.
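The host side of that exchange might look something like the following Python sketch. Everything here is an assumption made for illustration: the order structure, the function name update_quantity, the field names in the reply, and the use of JSON rather than XML as the message format. The point is only that the reply carries the few changed figures, not a page.

```python
import json


def update_quantity(order, line_no, quantity):
    """Apply a new quantity to one order line and return the totals as JSON."""
    line = order["lines"][line_no]
    line["quantity"] = quantity
    line_total = round(line["quantity"] * line["price"], 2)
    order_total = round(sum(l["quantity"] * l["price"] for l in order["lines"]), 2)
    # Only the changed figures go back, not a whole page.
    return json.dumps({"line": line_no, "lineTotal": line_total, "orderTotal": order_total})


# Hypothetical two-line order; the user changes line 0 from quantity 1 to 4.
order = {"lines": [{"price": 2.5, "quantity": 1}, {"price": 10.0, "quantity": 3}]}
reply = json.loads(update_quantity(order, 0, 4))
```

The JavaScript function attached to the "onblur" event would then read these values from the reply and update the matching fields on the screen.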

 

Two different problems make this particularly ugly. First, you have to learn JavaScript. So, if you choose something like PHP or Perl, you're going to have to learn that syntax, and then you're also going to have to learn JavaScript syntax. I can tell you that if you wanted to learn a single syntax, Java is probably closer to JavaScript than anything, although some features of PHP (such as typeless variables) are actually more aligned with JavaScript. The point remains: you have to learn two languages.

 

But even that pales in comparison to the real problem, which is that the JavaScript code is embedded in the Web page, and since you're dynamically generating some of the HTML, you may (and probably will) find yourself having to generate the JavaScript code as well. For example, if you want to create a function that reacts to a click on a row in a table, you'll have to figure out how to dynamically attach the function to each row. This can get really interesting when the size of the table changes at run time and even more fun when the columns change. I've spent a good part of my career writing code that generates code, and I can tell you from experience that it's probably one of the toughest jobs a programmer can do.
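A small Python sketch gives the flavor of code that writes code. The server-side function below generates an HTML table in which every row carries a generated JavaScript call. The handler name rowClicked is hypothetical, and the function it refers to is assumed to be defined elsewhere in the page; this is one simple way to attach the behavior, not the only one.

```python
import html


def render_table(columns, rows, handler="rowClicked"):
    """Return an HTML table whose rows each call a JavaScript handler on click."""
    out = ["<table>"]
    out.append("<tr>" + "".join(f"<th>{html.escape(c)}</th>" for c in columns) + "</tr>")
    for index, row in enumerate(rows):
        cells = "".join(f"<td>{html.escape(str(row[c]))}</td>" for c in columns)
        # The generated JavaScript call carries the row index, so one
        # handler function can serve any number of rows.
        out.append(f'<tr onclick="{handler}({index})">{cells}</tr>')
    out.append("</table>")
    return "\n".join(out)


# The table size and the column list are both decided at run time.
table = render_table(["item", "qty"], [{"item": "Bolt", "qty": 4}, {"item": "Nut", "qty": 7}])
```

Even in this tiny case, two languages are interleaved in a single string, and a quoting mistake in the server code only shows up as a JavaScript error in the browser.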

 

So there's the ugly part. You'll need to learn a couple of languages, and you'll probably have to use one to write code in the other.

 

It's the end of the article, so it's as good a time as any to insert my obligatory plug for EGL: with EGL, you learn a single syntax and the EGL tool generates the appropriate code, be it HTML for the thin-client pages or Java for the server-side code or JavaScript for the client-side. It will help for you to have an understanding of the syntax of the various generated languages, but you won't have to write that code yourself, much less use one language to write another. It's a huge benefit, and you can check it out by downloading the free EGL Community Edition.
