Programs Available


The programs currently supported are:


Note: The link given above for awk is to the original AT&T version. On many systems awk is an alias for gawk or nawk. The link given for nawk is for the Heirloom Toolchest version, which is actually better than the original AT&T version in that it supports Unicode. Note also that the links for grep, egrep, and fgrep are all to GNU grep. The GNU implementation consists of a single program that emulates the behavior of each of the AT&T programs when given the appropriate command line options. Standard UNIX tools often exist in many different versions; in such cases I have generally given a link to the GNU version.

Support for Java is for the Sun Java J2EE 1.4 SDK. The compiler must be called javac and the byte-code interpreter java. Only recent versions will work, since java.util.regex is provided only in recent releases (it first appeared in Java 1.4). Individual searches run reasonably fast, but each execution of Java involves a fairly large startup overhead, so feature testing for Java is slow.

The version of JavaScript with which Redet has been tested is the standalone JavaScript shell. This has some capabilities that versions intended for CGI use do not, in particular the ability to read and write files. As far as I know, the regular expression matching facilities of this version of JavaScript are the same as those of the CGI versions.

There are a number of variants of sed. Most versions, including GNU sed and BSD sed, are always named sed. Minised installs by default as minised but on some systems is the only version of sed and is named sed. Supersed is sometimes called ssed and sometimes sed. Redet attempts to determine which version of sed is called sed and to behave appropriately. This determination may not be foolproof since some seds do not identify themselves unequivocally.
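The kind of check described above can be sketched roughly as follows. This is only an illustration, not Redet's actual detection code: the function names are hypothetical, and the banner substrings are assumptions about what GNU sed and super-sed typically print for `--version`. A sed that does not accept `--version`, or that prints something unrecognized, is reported as unknown, which is exactly the case in which the determination cannot be made with certainty.

```python
import subprocess


def classify_sed(version_output: str) -> str:
    """Guess the sed variant from the text printed by `sed --version`.

    The substrings checked here are assumptions about typical banners,
    not an exhaustive or guaranteed list.
    """
    text = version_output.lower()
    # Check super-sed first: its banner may also mention GNU sed.
    if "super-sed" in text:
        return "ssed"
    if "gnu sed" in text:
        return "gnu"
    # Anything else (including an "illegal option" complaint) is unknown.
    return "unknown"


def detect_sed(command: str = "sed") -> str:
    """Run `command --version` and classify whatever it prints."""
    try:
        result = subprocess.run(
            [command, "--version"],
            capture_output=True,
            text=True,
            timeout=5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "unknown"
    # Some programs report version or usage text on stderr, so use both.
    return classify_sed(result.stdout + result.stderr)
```

Keeping the classification separate from the subprocess call makes it possible to try the logic on sample banners without having each sed variant installed.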

The Tcl language has two regular expression facilities. One is a general-purpose string matching facility, denoted above as tcl. The other is a file-globbing facility, denoted above as tclglob.

For programs like grep whose very purpose is the matching of patterns, it is obvious what it means to use that program to execute a regular expression. In the case of editors such as sed, execution of a regular expression consists of using it to match input lines. In the case of programming languages such as Python, regular expressions are given as arguments to the regular expression matching function provided by the language. In the case of shells, executing a regular expression means using that regular expression as a file globbing pattern.
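As a small illustration of the programming-language case, here is a sketch in Python of what "executing" a regular expression against input lines amounts to. The helper name matching_lines and the sample data are made up for this example, and the pattern uses Python's own re syntax rather than that of grep or sed.

```python
import re


def matching_lines(pattern: str, lines: list[str]) -> list[str]:
    """Return the lines in which `pattern` matches somewhere, grep-style."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


sample = ["alpha 12", "beta", "gamma 7"]
print(matching_lines(r"[0-9]+", sample))  # prints ['alpha 12', 'gamma 7']
```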

The original, built-in regular expression facility of PHP provides POSIX regular expressions for ASCII using functions with names like ereg. This is denoted php-posix. Two extensions provide alternative facilities. These are not always available. One, denoted php-mb, allows for multibyte character sets, such as Unicode. The other, denoted php-pcre, provides Perl-style regular expressions.

The Tcl language provides both regular expression matching on strings and file name globbing, each with its own pattern syntax, so both are provided for. If the program selected is tcl, regular expression matching on strings is performed. If the program selected is tclglob, file name globbing is performed.
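Why the two facilities have to be treated separately can be seen by giving the same pattern to a glob matcher and to a regular expression matcher. The sketch below uses Python's fnmatch and re modules as stand-ins. It is an analogy only and does not reproduce Tcl's glob or regexp syntax in detail.

```python
import fnmatch
import re

pattern = "a*"
name = "abc"

# As a glob, * stands for any run of characters, so "a*" covers "abc".
glob_hit = fnmatch.fnmatchcase(name, pattern)

# As a regular expression, * repeats the preceding item, so "a*" only
# covers strings made up entirely of the letter a.
regex_hit = re.fullmatch(pattern, name) is not None

print(glob_hit, regex_hit)  # prints True False
```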

Rebol is only partially supported because its pattern matching system is not a typical regular expression matching system. Some aspects of its pattern matching system are procedural. In order to make full use of Rebol pattern matching, one must write procedural programs; much of the system lies outside of a declarative system like that used for regular expressions. The declarative part of the system is the parse function. Only this part is supported by Redet.

Another limitation concerns character classes. Rebol provides a system for defining character classes comparable to that commonly used for regular expressions. Unfortunately, in Rebol character classes may only be defined outside of the parse function. It is therefore not possible to treat them like the character class definitions of regular expressions. For the present, at least, Redet does not support user-defined character classes in Rebol. However, to make things a bit easier, several character classes have been predefined. These are:

ASCII lower case letters
ASCII upper case letters
ASCII letters
The decimal digits
The union of the ASCII letters and decimal digits
The hexadecimal digits
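For reference, the contents of these six classes can be written out as plain sets of characters. The sketch below does so in Python. The variable names are hypothetical, since the names Redet actually predefines for Rebol are not reproduced here.

```python
import string

# Hypothetical names; only the membership of each class is the point.
lower = set(string.ascii_lowercase)      # ASCII lower case letters
upper = set(string.ascii_uppercase)      # ASCII upper case letters
letters = lower | upper                  # ASCII letters
digits = set(string.digits)              # the decimal digits
alphanumeric = letters | digits          # letters and decimal digits
hex_digits = set(string.hexdigits)       # 0-9, a-f, and A-F
```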

