 
 
The Regular-expressions library exports the Regular-expressions module, which contains various functions that deal with regular expressions (regexps). The module is based on Perl (version 4) and has the same semantics unless otherwise noted. The syntax for Perl-style regular expressions can be found on page 103 of Programming Perl by Larry Wall and Randal L. Schwartz. There are some differences in the way the Regular-expressions library handles regular expressions. The biggest difference is that regular expressions in Dylan are case insensitive by default. Also, when given an unparsable regexp, the Regular-expressions library produces undefined behavior, whereas Perl gives an error message.
A regular expression that is grammatically correct may still be illegal if it contains an infinitely quantified sub-regexp that may match the empty string. That is, if R is a regexp that can match the empty string, then any regexp containing R*, R+, or R{n,} is illegal. In this case, the Regular-expressions library will signal an <illegal-regexp> error when the regexp is parsed. Note: Perl also has this restriction, although it isn't mentioned in Programming Perl.
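The rule above can be checked at run time by catching the error. The following is a minimal sketch, not part of the library: legal-regexp? is a hypothetical helper, and the code assumes it lives in a module that uses the Regular-expressions module.

```dylan
// Hypothetical helper (not exported by the library): returns #f if
// parsing the pattern signals <illegal-regexp>, and #t otherwise.
define method legal-regexp? (pattern :: <string>) => (legal? :: <boolean>)
  block ()
    // Searching any string, even an empty one, forces the pattern
    // to be parsed.
    regexp-position("", pattern);
    #t
  exception (<illegal-regexp>)
    #f
  end block
end method legal-regexp?;

// a* can match the empty string, so by the rule above (a*)* is illegal.
legal-regexp?("(a*)*");
// a+ cannot match the empty string, so (a+)* should be accepted.
legal-regexp?("(a+)*");
```

This only covers grammatically correct patterns. For an unparsable pattern the behavior is undefined, so the handler should not be relied on to catch every bad input.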
In previous versions of the regular-expressions library, each basic function had a companion function that would pre-compute some information needed to use the regular expression. By using the companion function, one could avoid recomputing the same information. In the present version, the regular-expressions library caches this information, so the companion functions are no longer necessary and should be considered obsolete. However, they have been kept for backwards compatibility.
Companion functions differ in details, but they all essentially return curried versions of their corresponding basic function. For example, the following two pieces of code yield the same result:
            regexp-position("This is a string", "is");
            let is-finder = make-regexp-positioner("is");
            is-finder("This is a string");
Both pieces of code should have roughly the same performance, even if the code is inside a loop.
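As an illustration of the companion-function style inside a loop, here is a minimal sketch. count-matching-lines is a hypothetical function written for this example, and the code assumes that a positioner returns #f when there is no match, as regexp-position does.

```dylan
// Hypothetical example function: counts the strings in `lines`
// that contain a match for the regexp "is".
define method count-matching-lines (lines :: <sequence>)
 => (count :: <integer>)
  // Build the positioner once, outside the loop.
  let is-finder = make-regexp-positioner("is");
  let count = 0;
  for (line in lines)
    // Only the first returned value is tested here; it is assumed
    // to be #f when the regexp is not found.
    if (is-finder(line))
      count := count + 1;
    end if;
  end for;
  count
end method count-matching-lines;
```

Calling regexp-position(line, "is") inside the loop instead should perform about the same, since the library caches the information computed for each regexp.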
 
The following names are exported by the Regular-expressions module of the Regular-expressions library:
 
regexp-position [Function]
(big-string, regexp, #key start, end, case-sensitive)
=> variable-number-of-marks-or-#f
            regexp-position("This is a string", "is");
            regexp-position("This is a string", "(is)(.*)ing");
            regexp-position("This is a string", "(not found)(.*)ing");
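Because regexp-position returns either #f or several values, callers normally bind the results with a multiple-value let. The sketch below is illustrative only: first-match is a hypothetical function, and it assumes that the first two values returned are the start and end positions of the whole match, with the end position exclusive.

```dylan
// Hypothetical example function: returns the text of the first
// match of `pattern` in `big-string`, or #f if there is none.
define method first-match (big-string :: <string>, pattern :: <string>)
 => (matched :: <object>)
  // Assumption: the first two marks delimit the entire match.
  // Any further marks are ignored by this binding. If the result
  // is the single value #f, both variables are bound to #f.
  let (match-start, match-end) = regexp-position(big-string, pattern);
  if (match-start)
    copy-sequence(big-string, start: match-start, end: match-end)
  else
    #f
  end if
end method first-match;
```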
 
make-regexp-positioner [Function]
(regexp, #key byte-characters-only, need-marks, maximum-compile, case-sensitive)
=> an anonymous positioner 
 method (big-string, #key start, end)
 
regexp-replace [Function]
(big-string, search-for-regexp, replace-with-string, #key count, case-sensitive, start, end)
=> new-string
            regexp-replace("The rain in Spain and some other text",
                           "the (.*) in (\\w*\\b)", "\\2 has its \\1")
            regexp-replace("Hi there", "Hi there(, Bert)?", 
                           "What do you think\\1?")
 
make-regexp-replacer [Function]
(regexp, #key replace-with, case-sensitive)
=> an anonymous replacer function that is either
 method (big-string, #key count, start, end)
or 
 method (big-string, replace-string, #key count, start, end)
 
translate [Generic Function]
(big-string, from-string, to-string, #key delete, start, end) 
=> new-string
            translate("any string", "a-z", "A-Z")
            translate("any string", "a-z", "z-a")
            translate("any string", ".aeiou", ",", delete: #t)
            translate("any string", ",./:;[]{}()", " ");
 
translate [G.F. Method]
(big-byte-string, from-byte-string, to-byte-string, #key delete, start, end) 
=> new-string
 
make-translator [Generic Function]
(from-string, to-string, #key delete) 
=> an anonymous translator
 method (big-string, #key start, end) => new-string
 
make-translator [G.F. Method]
(from-byte-string, to-byte-string, #key delete) 
=> an anonymous translator
 method (big-string, #key start, end) => new-byte-string
 
split [Function]
(regexp, big-string, #key count, remove-empty-items, case-sensitive, start, end)
=> a variable number of strings
            split("-", "long-dylan-identifier")
            split("-", "long--with--multiple-dashes")
            split("-", "really-long-dylan-identifier", count: 3)
            split("-", "really-long-dylan-identifier", start: 8)
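Since split returns a variable number of strings as multiple values, one way to collect them all is a #rest binding. This is a minimal sketch, with identifier-words as a hypothetical function name.

```dylan
// Hypothetical example function: returns the dash-separated pieces
// of `identifier` as a single sequence.
define method identifier-words (identifier :: <string>)
 => (words :: <sequence>)
  // #rest gathers every value returned by split into one sequence.
  let (#rest words) = split("-", identifier);
  words
end method identifier-words;
```

When only the first few pieces are needed, a binding such as let (first, second) = split("-", identifier) keeps those values and discards the rest.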
 
make-splitter [Function]
(pattern :: <string>, #key case-sensitive)
=> an anonymous splitter 
 method (big-string, #key count, remove-empty-items, start, end) => buncha-strings
 
join [Function]
(delimiter :: <string>, #rest strings) => big-string
            join(":", word1, word2, word3)
            // is equivalent to
            concatenate(word1, ":", word2, ":", word3)
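When the strings to be joined are already held in a sequence, join can be called through apply, which spreads the sequence across the #rest parameter. A minimal sketch, with colon-separated as a hypothetical function name:

```dylan
// Hypothetical example function: joins a sequence of strings
// with ":" as the delimiter.
define method colon-separated (words :: <sequence>)
 => (joined :: <string>)
  // apply passes each element of `words` as a separate argument,
  // so this is the same as join(":", words[0], words[1], ...).
  apply(join, ":", words)
end method colon-separated;
```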
 
<illegal-regexp> [Class]
 
The regular expression parser does a very poor job with syntactically invalid regular expressions. Depending on the expression, the parser may signal an error, improperly parse it, or simply crash.
A regular expression that matches a large enough substring can produce a stack overflow. This can happen much more easily under d2c than under Mindy: under d2c for Windows, a match spanning as few as two dozen lines of 80-column text is enough.
Copyright 1994, 1995, 1996, 1997 Carnegie Mellon University. All rights reserved.