WebMaven

WebMaven is a commercial computer program currently under development by Dick Goran, President of CFSNevada, Inc. Its general purpose is to serve as an off-line imaging and management facility for the Internet or intranets.

WebMaven captures and copies all or portions of publicly available material from locations on the World Wide Web (Web sites), or from a private intranet, to the user's local computer. At the same time, WebMaven "localizes" all of the HTML links to their respective values in the copied files. WebMaven is available in both a consumer edition and an enterprise edition. A FAQ sheet and the technical specifications for WebMaven are available for review.

WebMaven captures a remote Web site based on the site's "tree" structure. HTML links which are deemed to be out of tree are replaced with a link to a page indicating the out-of-tree reference. WebMaven does not leave remote links in the localized pages. Therefore, unlike the results from similar products, there are no broken links in the WebMaven localized files.
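The out-of-tree rule can be illustrated with a short Python sketch. It is not WebMaven's code: the function names, the placeholder file name out_of_tree.htm, and the assumption that the "tree" is everything at or below the directory of the starting URL are all hypothetical choices made for illustration.

```python
from urllib.parse import urljoin, urlsplit


def is_in_tree(root_url: str, link: str, page_url: str) -> bool:
    """Return True if `link`, as found on `page_url`, lies inside the root's tree."""
    absolute = urlsplit(urljoin(page_url, link))
    root = urlsplit(root_url)
    # Assumption: the tree is everything at or below the root URL's directory.
    root_dir = root.path.rsplit("/", 1)[0] + "/"
    return (
        absolute.scheme == root.scheme
        and absolute.netloc.lower() == root.netloc.lower()
        and absolute.path.startswith(root_dir)
    )


def localize_link(root_url: str, link: str, page_url: str,
                  placeholder: str = "out_of_tree.htm") -> str:
    """Keep in-tree links; point out-of-tree links at a local placeholder page."""
    return link if is_in_tree(root_url, link, page_url) else placeholder
```

With a root of http://example.com/docs/index.htm, a link to chapter1.htm is kept, while a link to ../other/x.htm or to another host is replaced by the placeholder page, so no remote link remains in the local copy.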

WebMaven has a built-in JavaScript pre-processor and a Java virtual machine interpreter so Web sites containing either are handled by WebMaven without regard to your operating system software or your preferred Web browser.

WebMaven has no browser dependencies. You use your browser of choice to view the reports created while processing a Web site.

The enterprise edition of WebMaven serves as a powerful management tool for anyone responsible for the maintenance of a Web site, whether the site is large or small. WebMaven produces numerous management reports. A summary of these reports, along with sample copies of the individual reports, is available for review.

An uncrippled but limited-feature version of WebMaven will be available via anonymous download when the product is released.

WebMaven Advantages
There are drawbacks to reading material which is published electronically on the Internet. The user must be connected to the Internet via a modem or some other kind of connection in order to retrieve the material, and that connection must be maintained for as long as the user is reading the material.

Similarly, downloading of Internet documents is single-threaded, except for the overlapped retrieval of included images provided by the common browsers of today (Netscape Navigator / Communicator, Microsoft Internet Explorer, etc.).

WebMaven overcomes this by allowing up to nine download processes to run concurrently, thus optimizing the throughput between the user's computer and the Internet. WebMaven keeps the "pipe" full at whatever speed you are connected to the Internet. This is of particular advantage to users who are billed for connect time based on hours used.
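The idea of keeping several transfers in flight at once can be sketched in Python as follows. This is only an illustration: it uses a thread pool rather than separate processes, and the names fetch and download_all are hypothetical. The limit of nine comes from the text above.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

MAX_DOWNLOADS = 9  # the concurrency limit stated in the text


def fetch(url: str) -> bytes:
    """Retrieve one document and return its raw bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def download_all(urls, fetch_one=fetch, workers=MAX_DOWNLOADS):
    """Fetch every URL with up to `workers` transfers running at once.

    Returns a dict mapping each URL to the bytes retrieved for it.
    """
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        bodies = pool.map(fetch_one, urls)  # results come back in input order
        return dict(zip(urls, bodies))
```

While one server is slow to respond, the other workers continue to transfer data, which is the effect the paragraph above describes.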

There can also be delays in retrieving subsequent pages after the user has finished with the current page. Internet users find this delay especially frustrating, and it causes them to abandon many Web sites.

Since multiple instances of WebMaven can run concurrently, if a server for a particular site is temporarily slow in responding, the other instances of WebMaven will keep Internet transmission going at maximum speed.

WebMaven as a Webmaster's Tool
The enterprise edition of WebMaven is a powerful tool for anyone who has the responsibility of maintaining a Web site.

WebMaven is the Webmaster's dream when it is necessary to move a Web site from one domain to another. With WebMaven's localize function (described below), Web pages containing complete URIs within a specified domain will be converted to relative URIs. Localize can be set to leave full URIs on other domains intact.
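A minimal Python sketch of this kind of conversion is shown below. The function name to_relative and its parameters are hypothetical, and the sketch assumes that "within a specified domain" means an exact host-name match; WebMaven's actual rules are not documented here.

```python
import posixpath
from urllib.parse import urlsplit


def to_relative(link: str, page_url: str, domain: str) -> str:
    """Rewrite `link` relative to `page_url` if it is a full URI on `domain`."""
    target = urlsplit(link)
    if target.netloc.lower() != domain.lower():
        return link  # other domains (and already-relative links) stay intact
    # Directory of the page that contains the link.
    page_dir = posixpath.dirname(urlsplit(page_url).path) or "/"
    relative = posixpath.relpath(target.path or "/", start=page_dir)
    if target.query:
        relative += "?" + target.query
    if target.fragment:
        relative += "#" + target.fragment
    return relative
```

For a page at http://www.example.com/docs/index.htm, the link http://www.example.com/img/logo.gif becomes ../img/logo.gif, so the page keeps working after the site moves to a new domain.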

WebMaven produces numerous reports (samples) that assist the Webmaster in maintaining a Web site.

In addition to the [/web/20061110172524/http://www.toward.com/cfsrexx/WebMaven/SampleReports/ReportDomainException.htm Unresolved Domain Names], [/web/20061110172524/http://www.toward.com/cfsrexx/WebMaven/SampleReports/ReportLinksException.htm Links Exception Report], and [/web/20061110172524/http://www.toward.com/cfsrexx/WebMaven/SampleReports/ReportSyntaxErrors.htm Syntax Error Report] produced by all editions of WebMaven, the enterprise edition provides the reports shown in the following table to assist the Webmaster in managing his or her site. All of the reports are shown in the [/web/20061110172524/http://www.toward.com/cfsrexx/WebMaven/SampleReports/ReportSummary.htm WebMaven Summary Report] with footnotes (though they appear on the side) about the contents of each report.

Since some of these reports can be excessively large, the content of the large reports is also produced as delimited ASCII files which can be imported into programs of the user's choice.

The [/web/20061110172524/http://www.toward.com/cfsrexx/WebMaven/SampleReports/ReportDomainException.htm Unresolved Domain Names] and [/web/20061110172524/http://www.toward.com/cfsrexx/WebMaven/SampleReports/ReportLinksException.htm Links Exception Report], which are the only reports produced by an unlicensed copy of WebMaven, contain a link to e-mail the content of each respective report to the Webmaster of the site that was retrieved. If "Webmaster" does not appear as an e-mail address within the retrieved files, the most commonly used e-mail address in the retrieved files at the domain will be used as the mailto: address for the reports. Consistent with the spirit of Internet privacy, both Netscape Navigator and Internet Explorer will prompt for permission before actually sending the report.
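The address selection described above might look like the following Python sketch. The function name report_address, the simplified e-mail pattern, and the reading that only addresses at the retrieved site's domain are counted are assumptions made for illustration.

```python
import re
from collections import Counter

# Simplified pattern for e-mail addresses; an assumption, not a full parser.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def report_address(pages, domain):
    """Pick the mailto: target for a report from the text of the retrieved pages."""
    found = [a.lower() for text in pages for a in EMAIL.findall(text)]
    at_domain = [a for a in found if a.endswith("@" + domain.lower())]
    # Prefer a "webmaster" address if one appears anywhere in the files.
    for address in at_domain:
        if address.split("@")[0] == "webmaster":
            return address
    if not at_domain:
        return None
    # Otherwise fall back to the most commonly used address at the domain.
    return Counter(at_domain).most_common(1)[0][0]
```

The returned address would then be placed in the report's mailto: link, and the browser asks the user for permission before anything is sent.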

The included reports are from an actual site retrieved with WebMaven. All of the files from the site are also included, but localized to their current domain and path. Some of the report files are large and may be slow to download. The [/web/20061110172524/http://www.toward.com/cfsrexx/WebMaven/SampleReports/SampleReports.zip SampleReports.zip file] (5.2 MB) contains all of the files included in the SampleReports and subordinate directories and can be downloaded.

Functions
A [/web/20061110172524/http://www.toward.com/cfsrexx/WebMaven/faq_01.htm Q & A FAQ sheet, along with the WebMaven technical specifications], is available for review.

Localize
The "localize" capability is one of the unique features of WebMaven. If a Web site page is copied to the user's computer with currently available means, the graphics, sound, and text images that are referenced on the Web page must also be copied; otherwise, the local copy is rendered with portions omitted (broken links). While these related images or files can also generally be copied, they must be retrieved one by one.

Even after they are copied to the local computer, they may not be rendered properly by the user's Internet browser since the references to them (HTML links) in the original document may be described in a way that references their original location on the Internet rather than the local copy. The localize capability of WebMaven overcomes this. All HTML links in the local copies of the retrieved documents are converted, where necessary, so that they are rendered properly when the local copy of the Web page is viewed.

Another unique capability of WebMaven is its ability to retrieve the operative portions (class files) of Java-based Web applications. WebMaven not only downloads all class files explicitly referenced on the Web site, it also retrieves all of the other class files required by a Java applet or object. Where possible, any Java-related data files are also retrieved.

The localized documents / files, including the related images, can be compressed (zipped) by the user for local archiving. When the archive is decompressed (unzipped), the integrity of the retrieved documents remains intact regardless of the status, or the presence or absence, of the original Web site documents.
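Because the localized links are relative, the whole directory can be archived with any ordinary zip tool. The Python sketch below shows one way to do that with the standard zipfile module; the function name archive_site is hypothetical, and the archive should be written outside the directory being archived so that it does not include itself.

```python
import zipfile
from pathlib import Path


def archive_site(site_dir, archive_path):
    """Zip every file under `site_dir`, storing paths relative to it.

    Returns the number of files written to the archive.
    """
    site_dir = Path(site_dir)
    count = 0
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in sorted(site_dir.rglob("*")):
            if file.is_file():
                # Relative names keep the tree intact when it is unzipped.
                archive.write(file, file.relative_to(site_dir).as_posix())
                count += 1
    return count
```

Unzipping the archive anywhere recreates the same relative layout, so the links between the pages and their images continue to resolve.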

Chron
One of the purposes of the Internet is to provide timely dissemination of information, and some Web sites are updated at frequent intervals. Some are even built dynamically for each individual request. Such sites must be viewed on a regular basis, or the user risks missing changing information which is not archived by the Web site owner.

The "chron" capability of WebMaven allows the program to log onto the Internet at regularly or irregularly scheduled dates and times and retrieve specified Web site documents. This feature allows dynamically changing Web sites to be copied to the local computer in the user's absence. These Web documents can then be viewed at the reader's leisure.
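Scheduling at both regular and irregular times can be sketched in Python as a function that returns the next retrieval time. The name next_run and its parameters are hypothetical, and the sketch only computes the time; it does not dial up or download anything.

```python
from datetime import datetime, timedelta


def next_run(now, irregular=(), start=None, every=None):
    """Return the earliest scheduled time after `now`, or None if there is none.

    `irregular` is a list of one-off datetimes; `start` and `every` describe
    a regular schedule (first run and the interval between runs).
    """
    candidates = [t for t in irregular if t > now]
    if start is not None and every is not None:
        if start > now:
            candidates.append(start)
        else:
            elapsed = (now - start) // every  # whole intervals already past
            candidates.append(start + (elapsed + 1) * every)
    return min(candidates) if candidates else None
```

A program running unattended could sleep until the returned time, retrieve the specified documents, and then ask for the next time again.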

One of the best examples of a use for WebMaven is the retrieval of a multi-page document from the Internet, such as a white paper or book. These documents are typically published in a tree-like structure with the first page branching off into one or more subordinate branches. WebMaven is able to retrieve all of the subordinate pages within the tree structure that comprise the complete document.

Another feature of WebMaven is the ability to download desired information from a Web site and store it on a computer's disk. Portable computers such as laptops and notebooks can be used to view this information without being connected to the Internet.

There are many other day-to-day uses for this program (many of which the author hasn't thought of yet).

Inherent in all of the applicable uses of the program is its ability to optimize the transfer speed of the material from the Internet to the user's computer. Furthermore, it runs unattended so that the material can be gathered (downloaded) while the user is away from his or her computer.

Operating Environment
WebMaven is suitable for both the individual Internet surfer and the corporate Internet user. It will run on any personal computer running Windows 95 or Windows NT from Microsoft, or OS/2 from IBM.

WebMaven is both CPU and memory intensive when retrieving large Web sites. A 133 MHz Pentium CPU is the recommended minimum for both the personal and enterprise versions of WebMaven, and faster CPUs are preferable.

The personal version of WebMaven requires a minimum of 32 MB of memory with 64 MB recommended. The enterprise version requires a minimum of 64 MB of memory with 128 or 256 MB recommended.

Hard drive requirements are dependent on the Web sites being retrieved.

Competition
There are no known programs available that contain all of the functions included in WebMaven. While there are other programs whose purpose is to capture or mirror Web sites, they do not contain many of the technological capabilities of WebMaven. WebMaven maintains the integrity of the path (tree) structure from the original Web site along with the original names unless they conflict with the client file system. WebMaven does not encode the files it downloads, which makes the material viewable by any browser and portable without conversion or "export".
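Preserving the original tree and names on the local disk can be pictured with the Python sketch below. The function name local_path, the default file name index.htm, and the set of characters treated as conflicting with the client file system are assumptions; the replacement of those characters with an underscore is likewise only one possible rule.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit

# Characters assumed to conflict with the client file system.
INVALID = '<>:"|?*\\'


def _safe(name: str) -> str:
    """Replace characters the local file system would reject."""
    return "".join("_" if c in INVALID else c for c in name)


def local_path(url: str, default_name: str = "index.htm") -> str:
    """Map a URL to a local relative path that mirrors the site's tree."""
    parts = urlsplit(url)
    path = unquote(parts.path)
    if path == "" or path.endswith("/"):
        path += default_name  # directory URLs need a file name locally
    segments = [s for s in path.split("/") if s not in ("", ".", "..")]
    return str(PurePosixPath(_safe(parts.netloc.lower()), *map(_safe, segments)))
```

Names are changed only where they contain a conflicting character, so most files keep exactly the name and directory they had on the original site.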

WebMaven processes Web sites created with CGI scripting as well as sites that use redirection, scripting, and Java applets. WebMaven can process each Java class as the Java virtual machine would process it and is therefore able to retrieve all associated classes and most data files referenced by these classes.

The only restriction on what WebMaven can process is HTML links that are dynamically generated through scripting.

Distribution
Initial product introduction and distribution are planned via Internet and catalog mail order sites.

A trial version of the program will be made available via the Internet for worldwide distribution (anonymous download) at no cost to the user. The trial version will have some of its [/web/20061110172524/http://www.toward.com/cfsrexx/WebMaven/techspecs.htm#FeatureByLevel features limited] in either scope or size.

Full functionality of the trial version can be enabled when the user purchases and installs an electronic "key" for the program. The key will be available from established distribution sources.