
The REXX Files - by Dr. Dirk Terrell


I had a discussion with someone this month about using REXX to composite HTML files instead of using Server-Side Includes (SSI) on an HTTP server. Basically, SSI works by scanning an HTML file for special commands before sending it to the requesting web browser. SSI provides some very useful capabilities, and in some cases it is the only way to accomplish a given task. However, it can greatly decrease the performance of a server, since each file has to be scanned before it is sent.

One common example of using SSI is the inclusion of header and footer text on all pages of a web site. In this particular case, it might make more sense to use REXX to composite the HTML files together before placing them on the server. Let's look at an implementation of an HTML preprocessor written in REXX. (Of course, this could be used for any text file, not just HTML.)

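Basically what we want to do is embed special markers in a text file that tell our preprocessor what to do. For HTML, an obvious choice would be to use HTML comment tags with the preprocessor commands inside them. An HTML comment begins with <!-- and ends with -->. The markers used here are written with three hyphens on each side, which still begins with <!-- and ends with -->, so browsers treat them like any other comment: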

<!--- some text --->
So let's use HTML comments to embed a preprocessor command to include a file at the location of the command. For clarity, let's call the file with the preprocessor commands in it the source file and the processed file to be created the target file. The source file might look like this:
<HTML>
<HEAD>
<TITLE>Sample HTML File</TITLE>
</HEAD>
<BODY> 
<!--- include header.src --->
On this page you will find links to other web sites containing OS/2 information...
<!--- include footer.src ---> 
</BODY>
</HTML>
In this file, we want the line
<!--- include header.src --->
to be replaced with the contents of the file header.src and the line
<!--- include footer.src --->
to be replaced with the contents of the file footer.src.

The first thing to do is read the contents of our source file into a variable. Obviously, you don't want to do that with very large (multi-megabyte) files, but HTML files are generally small enough that reading the entire contents of the file into memory poses no problems. Let's pass the source and target file names to the program on the command line and use PARSE ARG to retrieve them:

Parse Arg SourceFile TargetFile
Now that we know the name of the source file, we read its contents using the CHARIN function:
NChars=Chars(SourceFile)
SourceText=Charin(SourceFile,1,NChars)
rc=Stream(SourceFile,"C","Close")
The variable NChars is the number of bytes contained in the source file, which is retrieved with the CHARS function. We need to know the number of bytes to tell the CHARIN function to read, and since we want to read the entire file, we use the result of CHARS as input to CHARIN.

The second parameter in the CHARIN function tells it where in the file (which byte number) to begin reading. We use 1 to start at the beginning of the file.

The third line closes the source file since we are now finished with it. It is always a good habit to close a file as soon as possible to free up its file handle since a limited number of file handles are available. If you find yourself reading lots of files in a REXX program and it is mysteriously crashing after the first few, chances are you are forgetting to close files and thus running out of file handles.
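
The read-then-close pattern above can be wrapped in a small routine that also checks whether the file exists. This is only a sketch: ReadFile is a hypothetical helper name, and readfile.tmp and missing.tmp are made-up file names (the first is assumed not to exist beforehand, the second is assumed never to exist). It relies on the "Query Exists" stream command, which returns an empty string when the file cannot be found.

```rexx
/* Sketch: read a whole file, but only if it exists.                       */
/* ReadFile, readfile.tmp and missing.tmp are made-up names for this demo. */
rc=Charout("readfile.tmp","some text",1) /* Create a small file to read back */
rc=Stream("readfile.tmp","C","Close")
Found=ReadFile("readfile.tmp")
Missing=ReadFile("missing.tmp")          /* Assumed not to exist */
Say "Found:" Found
Say "Missing file gives an empty string:" (Missing=="")
Exit 0

ReadFile: Procedure
   Parse Arg Name
   If Stream(Name,"C","Query Exists")=="" then /* "" means no such file */
      Return ""
   Contents=Charin(Name,1,Chars(Name))         /* Read every byte */
   rc=Stream(Name,"C","Close")                 /* Free the file handle right away */
   Return Contents
```

Note that an existing but empty file also comes back as an empty string, so a real program might want to report the two cases separately.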

So now we have the contents of the source file in memory in the variable SourceText. The next step is to scan the contents for preprocessor commands. The REXX function for this job is POS which returns the location of one string within another. The calling form is:

POS(needle,haystack,start)
where needle is the string we are looking for, haystack is the string that may or may not contain needle, and start is the position in haystack at which to begin the search. Since there may be multiple preprocessor commands in the file, we will have to call POS repeatedly until we have found them all. When we begin, we want start to be 1 (i.e., start at the beginning of haystack). POS returns either 0 (needle was not found in haystack) or the position of the byte where needle first appears at or after start. Here is the loop that searches for all occurrences of a preprocessor include command:
crlf="0D0A"x /* Carriage return + line feed, needed for the test below */
N=Pos("<!--- include",SourceText,1)
Do While N<>0
   TagEnd=Pos(">",SourceText,N) /* The end of the include tag */
   If SubStr(SourceText,TagEnd+1,2)==crlf then /* Eliminates extra CR/LF */
      TagEnd=TagEnd+2
   FirstPart=SubStr(SourceText,1,N-1) /* The text up to the include tag */
   LastPart=SubStr(SourceText,TagEnd+1)  /* The text after the include tag */
   IncludeTag=SubStr(SourceText,N,TagEnd-N+1)
   Parse Var IncludeTag . "include" IncludeFile "--->"
   IncludeFile=Strip(IncludeFile) /* Strip any leading/trailing spaces */
   NChars=Chars(IncludeFile)
   IncludeText=Charin(IncludeFile,1,NChars) /* Read in the include file */
   rc=Stream(IncludeFile,"C","Close")
   SourceText=FirstPart||IncludeText||LastPart /* Put all of the pieces together */
   Drop FirstPart LastPart IncludeText /* Don't need these anymore so clear them */
   N=Pos("<!--- include",SourceText,1) /* Search again from the top, so nested includes are found too */
End
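
To see what POS and PARSE VAR do to a single include tag, here is a small sketch that works on a string in memory rather than a file. The sample string and the variable names (Text, FirstGT, TagStart, TagEnd, NextTag) are made up for the illustration.

```rexx
/* Sketch: how POS and PARSE VAR pick apart a single include tag */
Text="<P><!--- include header.src ---><P>"
FirstGT=Pos(">",Text,1)                    /* 3: the ">" that closes <P> */
TagStart=Pos("<!--- include",Text,1)       /* 4: where the include tag begins */
TagEnd=Pos(">",Text,TagStart)              /* 32: first ">" at or after TagStart */
IncludeTag=SubStr(Text,TagStart,TagEnd-TagStart+1)
Parse Var IncludeTag . "include" IncludeFile "--->"
IncludeFile=Strip(IncludeFile)             /* "header.src" */
NextTag=Pos("<!--- include",Text,TagEnd+1) /* 0: no more include tags */
Say FirstGT TagStart TagEnd NextTag "["IncludeFile"]" /* 3 4 32 0 [header.src] */
```

Searching for ">" from position 1 finds the wrong one (position 3), which is why the loop passes the position of the tag as the third argument.
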
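Now all that's left to do is write out the processed file. SysFileDelete belongs to the RexxUtil library, so we register RexxUtil first in case it is not already loaded:
Call RxFuncAdd "SysLoadFuncs","RexxUtil","SysLoadFuncs" /* Register the RexxUtil loader */
Call SysLoadFuncs /* Load SysFileDelete and the other RexxUtil functions */
rc=SysFileDelete(TargetFile) /* Make sure that the output file does not exist; Charout would otherwise append to it */
rc=Charout(TargetFile,SourceText) /* Write out the output file */
rc=Stream(TargetFile,"C","Close") /* Close the output file */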
As usual, there are many things that could be added, such as error checking (does the indicated include or source file exist?), but this is enough to get you started. A nice generalization of the program would be to search for a list of commands rather than just include. Another would be to process all source files in a directory (e.g., search for all files ending in .src and process them).
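
As a starting point for the directory idea, here is a rough sketch that uses SysFileTree from RexxUtil to find every .src file in the current directory and hand each one to the preprocessor. It assumes the program above is saved as rexpp.cmd somewhere REXX can find it as an external routine, that RexxUtil is available, and that file names contain no spaces. The .htm naming rule and the variable names are my own choices for the illustration.

```rexx
/* Sketch: run the preprocessor on every .src file in the current directory. */
/* Assumes rexpp.cmd can be found by REXX and that RexxUtil is available.    */
Call RxFuncAdd "SysLoadFuncs","RexxUtil","SysLoadFuncs"
Call SysLoadFuncs
rc=SysFileTree("*.src","Files","FO") /* F = files only, O = names only */
If rc<>0 then do
   Say "SysFileTree failed with return code" rc
   Exit 1
End
Do i=1 to Files.0                    /* Files.0 holds the number of matches */
   Source=Files.i
   Target=Left(Source,LastPos(".",Source))||"htm" /* Swap .src for .htm */
   Say "Processing" Source "->" Target
   Call rexpp Source Target          /* Run rexpp as an external routine */
End
```

Include files such as header.src and footer.src also match *.src, so they would be processed as well. Giving include files a different extension is one way around that.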

The sample file (ZIP, 2k) includes the above code and some small text files to test it on. Run it by typing:

rexpp source.src source.htm
and it will process the file source.src, inserting header.src and footer.src and creating the output file source.htm. Let me know if you create any useful variations of this program.
Dr. Dirk Terrell is an astronomer at the University of Florida specializing in interacting binary stars. His hobbies include cave diving, martial arts, painting and writing OS/2 software such as HTML Wizard.



Copyright © 1997 - Falcon Networking