Perl (Practical Extraction and Report Language) is a computer language for text processing. Developed and maintained by Larry Wall, its origins are said to date back to when Larry needed utilities to aid in the administration of several UNIX computer systems.
As a computer language, Perl shares characteristics with C, the C shell, awk, and sed. It excels at text recognition and pattern matching. To put this in perspective, a task that depends on these capabilities can often be written in, say, 100 lines of Perl where another language would require 1,000.
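As a rough illustration of that pattern-matching style, the following sketch counts the start tags in a small SGML fragment. The fragment, its element names, and the `count_start_tags` helper are all made up for this example; it is not a substitute for a real SGML parser, which would also have to handle comments, marked sections, and tag minimization.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count start tags by element name.
# The pattern matches "<" followed by an SGML-style name. End tags are
# skipped because "/" cannot begin a name.
sub count_start_tags {
    my ($text) = @_;
    my %count;
    $count{ lc $1 }++ while $text =~ /<([A-Za-z][A-Za-z0-9.-]*)/g;
    return \%count;
}

# Made-up fragment, inlined so the sketch runs on its own.
my $sgml = <<'END';
<chapter><title>Pumps</title>
<para>See <xref target="fig1"> and <xref target="fig2">.</para>
<para>Replace the seal.</para>
</chapter>
END

# Report element names in alphabetical order with their counts.
my $count = count_start_tags($sgml);
printf "%-8s %d\n", $_, $count->{$_} for sort keys %$count;
```

The whole scan is a single regular expression applied repeatedly with the `/g` modifier, which is the kind of economy the comparison above has in mind.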
From the SGML perspective, Perl is extremely useful in several respects. By itself, it can be used to write programs for data conversion and other tasks. Its usefulness in this regard cannot be overstated.
For example, in a rather complicated SGML production system that I and others have developed, Perl programs are used in a variety of ways. These tasks include those listed in table 28.1.
Task | Description |
---|---|
Document Conversion | Converts Interleaf source documents into SGML |
File Rename | Renames large numbers of scanned art files into SGML system format |
SGML Post Processing | Adds floating content tags into SGML document files |
Graphics File Manager | Builds document association files that bind graphics files with associated hotspot layers |
Document Conversion Error Scanning | Scans document conversion output files for error messages and reports to user |
Source File Art Identification | Scans source document files and identifies names of all referenced art files |
Document Comparison | Compares the contents of a parts list document with the corresponding bill of materials |
Library Management | Processes the intermediate art library collection periodically; when the disk is full, deletes files not accessed for 30 days |
As you can see, Perl programs are used to perform a number of tasks in this SGML system. Perl's flexibility in textual pattern recognition makes it a valuable tool in SGML system conversions and data manipulation.
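To give a feel for one of these tasks, here is a minimal sketch of scanning document conversion output for error messages. It is not the production program described above: the `scan_for_errors` helper, the sample log lines, and the ERROR/FATAL keywords are assumptions for illustration, and a real converter's log format would dictate the actual pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return [line number, text] pairs for lines that
# look like error reports. The keywords are assumed, not taken from any
# particular conversion tool.
sub scan_for_errors {
    my (@lines) = @_;
    my @hits;
    my $n = 0;
    for my $line (@lines) {
        $n++;
        chomp(my $text = $line);
        push @hits, [ $n, $text ] if $text =~ /\b(?:ERROR|FATAL)\b/i;
    }
    return @hits;
}

# Made-up converter output, inlined so the sketch runs on its own.
my @log = (
    "Converting chap01.doc\n",
    "ERROR: unknown component 'sidebar' at line 212\n",
    "Converting chap02.doc\n",
    "Fatal: cannot open art/fig07.tif\n",
    "Done: 2 files processed\n",
);

# Report each suspicious line with its position in the log.
printf "line %d: %s\n", @$_ for scan_for_errors(@log);
```

In practice the lines would be read from the conversion output file rather than from an inline list, and the report might be mailed or summarized for the user.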
There are a number of sources for Perl on the Internet. Among these is the University of Florida Perl Archive (see fig. 28.1).
Fig. 28.1 The University of Florida Perl Archive contains a wide variety of Perl software versions and other related information.
Perl's flexibility and utility are becoming increasingly recognized in the software development community. As a result, many users have built tools and utilities with Perl that perform a number of functions, including those related to SGML processing.
A number of useful Perl-based tools for SGML processing are available, usually through the Internet. As more people come to appreciate Perl's utility, the number of available tools seems likely to keep growing.
perlSGML. Written as a collection of Perl programs and utility libraries, perlSGML supports SGML document processing. It provides a number of functions for manipulating SGML document instances and DTDs.
The functions supported by perlSGML include:
DynaText is a browser for viewing SGML documents electronically. Through a compilation process, DynaText indexes SGML documents into electronic book collections.
The DynaText browser is particularly noteworthy for its ability to support any DTD (rather than a particular set of standard DTDs). It includes support for hypertext navigation and context-sensitive full-text search. Native graphics support includes TIFF and CALS raster formats. CGM vector graphics support is available as an option. The display of complex tables is also supported.
DynaText is especially well suited to very large documents (see fig. 28.2). Unlike other SGML viewers, it can handle them without a major downturn in performance. Its ability to reformat very large documents on the fly sets it apart from the other SGML viewers currently available.
Fig. 28.2 The DynaText electronic book browser is a powerful tool for viewing SGML documents. It is particularly noteworthy for its ability to handle very large documents.
The output formatting capabilities of DynaText are highly flexible. DynaText's output formatting is similar to the FOSI output specification in its use of SGML syntax, and it will shortly support the more powerful DSSSL Lite subset of the Document Style Semantics and Specification Language (ISO/IEC standard 10179).
Note:
A sample version of the DynaText book browser is included along with sample books.
DynaText is part of a family of products from Electronic Book Technologies that supports SGML document systems. The company's full range of SGML-related products is shown in table 28.2.
Product | Description |
---|---|
DynaText | SGML document viewer |
DynaTag | Document conversion tool |
DynaBase | SGML data repository and document management system |
DynaWeb | SGML-based World Wide Web server |
CADLeaf Batch | Batch graphics extraction and conversion |