Lief Clennon is a computer hobbyist and Team OS/2 member currently residing in Albuquerque, NM. He can usually be found badgering his friends on IRC.
Summary: Speed up your daily browsing and save money by staying offline while you read your favorite sites, all with the help of these web mirroring and caching utilities.
There are a surprising number of entries in the OS/2 web-mirroring market and, even more surprisingly, many of them are commercial software packages. Out of the group, I found four worthy of review: Sslurp, WebMirror/2, Templeton, and Wget.
Sslurp, while not the strongest in its feature set, is pure OS/2 software; WebMirror and Templeton are developed concurrently for Windows and Unix, respectively, and Wget is GNU free software. Sslurp is copyrighted freeware, and has previously gone by the names "spider" and "wsuck".
Sslurp's interface (GIF, 5.2k) is fairly straightforward; you type a URL (the "http://" or equivalent prefix is required) into the entry bar, and press the Start! button. It uses only one download thread, so everything's done in order; if it's getting a file you don't want it to, press skip, and if you want it to stop, there's a button for that too. The URLs of the files being retrieved and their current status are displayed in the window below.
Sslurp allows you to set a specific path for the files to be retrieved to, but under that directory it creates a separate directory for each server (e.g., www.os2ezine.com) and for each subdirectory on those servers, no matter how deep your starting directory was. This is fairly standard behavior for mirroring utilities, but most of them have an option to disable it; Sslurp does not. Because of this it requires an HPFS partition to support the long file and directory names it will try to create.
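To make that directory layout concrete, here is a minimal Python sketch of the kind of URL-to-path mapping described above: one directory per server, then one per remote subdirectory. It is not Sslurp's actual code; the function name `local_path_for`, the `mirror` root, the sample URL, and the choice of `index.html` for directory URLs are all assumptions made for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def local_path_for(url: str, root: str = "mirror") -> str:
    """Map a URL to a path under `root`: server name first, then remote subdirectories."""
    parts = urlsplit(url)
    # Assumption: a URL with no path, or one ending in "/", is saved as index.html.
    path = parts.path or "/"
    if path.endswith("/"):
        path += "index.html"
    # Forward slashes are used here for readability; a real tool would use native separators.
    return str(PurePosixPath(root, parts.hostname or "", path.lstrip("/")))


print(local_path_for("http://www.os2ezine.com/v3n12/mirror.htm"))
```

Names like `www.os2ezine.com` do not fit the 8.3 limit of a FAT partition, which is why a layout like this needs HPFS.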
You may also specify an exclusive list of extensions that will be downloaded (for instance, htm and html to only download pages and not any other files they link to). A nice feature is that control over whether or not inline images are downloaded is completely separate from your extension list, meaning that you don't need to explicitly include (or not include) extensions like GIF and JPG.
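The decision described here can be sketched in a few lines of Python. This is only an illustration of the logic, not how Sslurp is written; `should_download` and its parameters are hypothetical names, and the example URLs are placeholders.

```python
import posixpath
from urllib.parse import urlsplit


def should_download(url, allowed_exts, is_inline_image, fetch_inline_images):
    """Decide whether to fetch `url`, keeping the inline-image switch apart from the extension list."""
    if is_inline_image:
        # Inline images obey their own switch and never consult the extension list.
        return fetch_inline_images
    # Everything else must carry one of the listed extensions (compared case-insensitively).
    ext = posixpath.splitext(urlsplit(url).path)[1].lstrip(".").lower()
    return ext in allowed_exts


pages_only = {"htm", "html"}
print(should_download("http://example.com/review.html", pages_only, False, True))
print(should_download("http://example.com/logo.gif", pages_only, True, True))
print(should_download("http://example.com/archive.zip", pages_only, False, True))
```

With the list set to `htm` and `html`, the page and its inline logo are fetched while the linked archive is skipped, without GIF ever appearing in the list.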
WebMirror (GIF, 6.5k) is commercial shareware also available for Windows, and is replete with user-friendly efficiency blockades. Its shareware limitation is that only one level of mirroring is done, which makes the unregistered version useless except as a demo. It has very little in the way of control over what is mirrored; you can only specify how many levels deep to go, and whether to follow links off the server. However, it may be worthwhile for many, especially in a business environment, because of its special features.
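Those two controls, a depth limit and an off-server switch, are enough to define a mirroring run. The Python sketch below shows one way they could interact, using a hand-written table of links in place of real downloads. It is not WebMirror's algorithm; `plan_mirror`, the sample URLs, and the convention that the starting page counts as level 0 are my assumptions.

```python
from collections import deque
from urllib.parse import urlsplit


def plan_mirror(start, links, max_depth, follow_offsite=False):
    """Return the URLs to fetch, breadth-first, given `links` (a dict of page -> linked URLs)."""
    start_host = urlsplit(start).hostname
    seen = {start}
    order = []
    queue = deque([(start, 0)])  # the starting page is treated as level 0
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # deep enough: fetch this page but do not follow its links
        for target in links.get(url, []):
            if target in seen:
                continue
            if not follow_offsite and urlsplit(target).hostname != start_host:
                continue  # link leaves the starting server
            seen.add(target)
            queue.append((target, depth + 1))
    return order


# Hypothetical link table standing in for pages that would normally be downloaded and parsed.
links = {
    "http://a.example/": ["http://a.example/p1", "http://b.example/x"],
    "http://a.example/p1": ["http://a.example/p2"],
}
print(plan_mirror("http://a.example/", links, max_depth=1))
```

Under these assumptions a limit of one level fetches only the starting page and the pages it links to directly, which shows how little a one-level demo can mirror.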
First, you can specify that pages are regularly updated automatically. Not only that, but you can configure WebMirror to automatically dial into your provider for this purpose. And, perhaps most useful, WebMirror includes a proxy server. This means that you can store your most frequently accessed sites locally, with frequent automatic updates. With the additional option to completely disallow access to sites not locally cached, this allows a LAN administrator -- or for that matter, a concerned parent -- to keep tight control over what gets accessed.
Wget and Templeton
The last two web mirroring utilities are command-line applications, and being Unix ports, both require the EMX runtime environment. Templeton has an interactive mode if run with no parameters, but this only gives access to the most basic of features; Wget, in the usual manner of Unix software, requires everything to be either on the command line or in the configuration file. However, these two are the most powerful and feature-rich selections. The basic rule of thumb for these two is that if you can think of it, so did the authors. A full feature-by-feature comparison of the two would probably be about as big as their combined help files (300k or so), so I'll only mention a couple of the more distinctive features.
Templeton will rewrite the HTML it downloads on the fly. Not only will it perform minor corrections on HTML grammar, and fix a few of the easier-to-detect typos, but it also has the capability to rewrite links to point to mirrored local files, instead of to files on remote servers. This isn't quite the proxy feature of WebMirror, but with a little bit of scripting it can serve the same purpose, and in a much more dynamic fashion. Templeton has a shareware timeout; it will only run for ten minutes, so if your mirroring takes longer than that, you'll have to register.
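For a rough idea of what link rewriting involves, here is a small Python sketch. It is not Templeton's implementation; it uses a simple regular expression rather than a real HTML parser, and `rewrite_links`, the `mirrored` table, and the example URLs are all hypothetical.

```python
import re
from urllib.parse import urljoin

# Matches quoted href="..." and src="..." attributes; unquoted values are left alone.
ATTR = re.compile(
    r'''(?P<attr>\b(?:href|src)\s*=\s*)(?P<q>["'])(?P<url>.*?)(?P=q)''',
    re.IGNORECASE,
)


def rewrite_links(html, base_url, mirrored):
    """Point links at local copies when `mirrored` (absolute URL -> local path) has one."""
    def swap(match):
        # Resolve relative links against the page's own URL before looking them up.
        absolute = urljoin(base_url, match.group("url"))
        local = mirrored.get(absolute)
        if local is None:
            return match.group(0)  # not mirrored: keep the remote link as it was
        return f'{match.group("attr")}{match.group("q")}{local}{match.group("q")}'
    return ATTR.sub(swap, html)


page = '<a href="/docs/index.html">Docs</a> <a href="http://other.example/">Elsewhere</a>'
mirrored = {"http://www.example.com/docs/index.html": "index.html"}
print(rewrite_links(page, "http://www.example.com/docs/page.htm", mirrored))
```

Only the link with a local copy is changed; the link to the other server still points at the remote site.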
Wget is distributed under the GNU General Public License, meaning that not only do you get the software, you get the source code as well. It has two features that set it apart: first, it will mirror FTP sites as well as web sites, including the ability to process links in HTML files accessed through FTP. Second, it supports the resume functions of both HTTP and FTP servers, and will automatically attempt to continue a file whose transfer was cut short.
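On the HTTP side, resuming comes down to asking the server for only the bytes you are missing. The Python sketch below shows the general mechanism using the standard `Range` request header and the `206 Partial Content` status; it is not taken from Wget's source, and the helper names `resume_headers` and `file_mode_for` are hypothetical.

```python
def resume_headers(bytes_already_saved):
    """Build the extra request headers needed to continue a partial download."""
    if bytes_already_saved <= 0:
        return {}  # nothing saved yet: make an ordinary request
    # "bytes=N-" asks for everything from offset N to the end of the file.
    return {"Range": f"bytes={bytes_already_saved}-"}


def file_mode_for(status_code):
    """Pick how to open the local file based on the server's reply."""
    # 206 means the server sent only the requested range, so append to what we have.
    # Any other success code (typically 200) means the whole file is coming, so start over.
    return "ab" if status_code == 206 else "wb"


print(resume_headers(52_000))
print(file_mode_for(206), file_mode_for(200))
```

The second helper matters because not every server honors range requests; one that ignores the header simply sends the whole file again.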
Each of the utilities reviewed here has its strong points. Sslurp is far and away the easiest to use; WebMirror has its proxy server and Templeton has a power-user's equivalent; Wget is free, surprisingly user-friendly for a command-line app, and does FTP and resumes. All of them have a place, and it's up to the individual as to which best fits their personal needs.
Copyright © 1998 - Falcon Networking | ISSN 1203-5696 | October 1, 1998