source: trunk/j/doc/regexp.html @ 2

Last change on this file since 2 was 2, checked in by piso, 18 years ago

Initial checkin.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

<html>

<head>
<title>J User's Guide - Regular Expressions</title>
<LINK REL="stylesheet" HREF="j.css" TYPE="text/css">
</head>

<body>

<a href="contents.html">Top</a>

<hr>

<h1>Regular Expressions</h1>

<hr>

<h2>Background</h2>

A regular expression is a character string where some characters are given
special meaning, so that the pattern as a whole denotes a possibly infinite
class of alternative strings to match.
<p>
J uses the <a href="http://www.cacas.org/~wes/java">gnu.regexp</a> package.


<h2>Supported Syntax</h2>

Within a regular expression, the following characters have special meaning:

<ul>
<li>
Positional Operators
<blockquote>
<code>^</code> matches the beginning of a line<br>
<code>$</code> matches the end of a line<br>
</blockquote>
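<p>
The positional operators can be tried out with a minimal sketch like the one
below. It assumes <code>java.util.regex</code> from the Java standard library as
a stand-in; J itself uses gnu.regexp, whose behavior may differ in details. The
class name <code>AnchorDemo</code> and its <code>count</code> helper are
hypothetical.
<pre>
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorDemo {
    // Counts the matches of regex in text, with ^ and $ anchored per line.
    static int count(String regex, String text) {
        // Assumption: java.util.regex needs MULTILINE for per-line anchors.
        Matcher m = Pattern.compile(regex, Pattern.MULTILINE).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "foo bar\nbar foo\nfoo";
        System.out.println(count("^foo", text));  // prints 2 (lines 1 and 3)
        System.out.println(count("foo$", text));  // prints 2 (lines 2 and 3)
        System.out.println(count("^foo$", text)); // prints 1 (line 3 only)
    }
}
```
</pre>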

<li>
One-Character Operators
<blockquote>
<code>.</code> matches any single character<br>
<code>\d</code> matches any decimal digit<br>
<code>\D</code> matches any non-digit<br>
<code>\n</code> matches a newline character<br>
<code>\r</code> matches a return character<br>
<code>\s</code> matches any whitespace character<br>
<code>\S</code> matches any non-whitespace character<br>
<code>\t</code> matches a tab character<br>
<code>\w</code> matches any word (alphanumeric) character<br>
<code>\W</code> matches any non-word (non-alphanumeric) character<br>
<p>
Otherwise, <code>\c</code> matches the character <i>c</i>.
</blockquote>
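<p>
A short sketch of the one-character operators follows. It assumes
<code>java.util.regex</code> as a stand-in for gnu.regexp, so details may
differ from what J does; <code>EscapeDemo</code> and <code>matches</code> are
hypothetical names.
<pre>
```java
import java.util.regex.Pattern;

class EscapeDemo {
    // True if the whole input matches the regex (java.util.regex semantics).
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // In Java source code each regex backslash is written twice.
        System.out.println(matches("\\d\\d", "42"));     // prints true
        System.out.println(matches("\\D", "4"));         // prints false
        System.out.println(matches("\\w\\s\\W", "a !")); // prints true
        System.out.println(matches("a.c", "abc"));       // prints true
        // An escaped dot matches only a literal dot.
        System.out.println(matches("a\\.c", "abc"));     // prints false
        System.out.println(matches("a\\.c", "a.c"));     // prints true
    }
}
```
</pre>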

<li>
Character Classes
<blockquote>
<code>[abc]</code> matches any character in the set <i>a</i>, <i>b</i> or <i>c</i><br>
<code>[^abc]</code> matches any character not in the set <i>a</i>, <i>b</i> or <i>c</i><br>
<code>[a-z]</code> matches any character in the range <i>a</i> to <i>z</i> (inclusive)<br>
<p>
A leading or trailing dash is interpreted literally.<br>
</blockquote>
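<p>
The character class rules, including the literal dash, can be checked with a
sketch along these lines. It assumes <code>java.util.regex</code>, which
happens to treat these cases the same way; gnu.regexp is not exercised here,
and <code>CharClassDemo</code> is a hypothetical name.
<pre>
```java
import java.util.regex.Pattern;

class CharClassDemo {
    // True if the whole input matches the regex (java.util.regex semantics).
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[abc]", "b"));  // prints true
        System.out.println(matches("[^abc]", "b")); // prints false
        System.out.println(matches("[a-z]", "q"));  // prints true
        System.out.println(matches("[a-z]", "Q"));  // prints false
        // A leading or trailing dash stands for itself.
        System.out.println(matches("[-a]", "-"));   // prints true
        System.out.println(matches("[a-]", "-"));   // prints true
    }
}
```
</pre>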

<li>
Subexpressions and Backreferences
<blockquote>
<code>(abc)</code> matches whatever the expression <i>abc</i> would match, and saves it as a subexpression<br>
<code>\<i>n</i></code> where 1 &lt;= <i>n</i> &lt;= 9, matches the same thing the <i>n</i>th subexpression matched<br>
<p>
Parentheses can also be used for grouping.
<p>
Parentheses used for grouping or to record matched subexpressions should not be escaped.
<p>
Backreferences may also be used in replacement strings; see <a href="commands.html#replace">replace</a>.
</blockquote>
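<p>
As a rough illustration of a backreference inside a pattern, the sketch below
looks for a doubled lowercase letter. It assumes <code>java.util.regex</code>
rather than gnu.regexp, and <code>BackrefDemo</code> and
<code>hasDoubledLetter</code> are hypothetical names. It does not cover
replacement strings.
<pre>
```java
import java.util.regex.Pattern;

class BackrefDemo {
    // ([a-z]) saves one letter as subexpression 1;
    // \1 then has to match that same letter again.
    static final Pattern DOUBLED = Pattern.compile("([a-z])\\1");

    // True if the input contains the same lowercase letter twice in a row.
    static boolean hasDoubledLetter(String input) {
        return DOUBLED.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(hasDoubledLetter("letter")); // prints true ("tt")
        System.out.println(hasDoubledLetter("later"));  // prints false
    }
}
```
</pre>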

<li>
Branching (Alternation) Operator
<blockquote>
<code>a|b</code> matches whatever the expression <i>a</i> would match, or whatever the expression <i>b</i> would match.<br>
</blockquote>
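<p>
Alternation, alone and combined with grouping parentheses, might look like
this in practice. The sketch assumes <code>java.util.regex</code> as a stand-in
for gnu.regexp; <code>AlternationDemo</code> is a hypothetical name.
<pre>
```java
import java.util.regex.Pattern;

class AlternationDemo {
    // True if the whole input matches the regex (java.util.regex semantics).
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("cat|dog", "dog"));     // prints true
        System.out.println(matches("cat|dog", "catdog"));  // prints false
        // Parentheses limit the alternation to part of the pattern.
        System.out.println(matches("gr(a|e)y", "grey"));   // prints true
        System.out.println(matches("gr(a|e)y", "gray"));   // prints true
        System.out.println(matches("gr(a|e)y", "groy"));   // prints false
    }
}
```
</pre>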

<li>
Repeating Operators
<blockquote>
<code>?</code> matches zero or one occurrence of the preceding expression (that is, the expression or the null string)<br>
<code>*</code> matches zero or more occurrences of the preceding expression<br>
<code>+</code> matches one or more occurrences of the preceding expression<br>
<code>{m}</code> matches exactly <i>m</i> occurrences of the preceding expression<br>
<code>{m,n}</code> matches between <i>m</i> and <i>n</i> occurrences of the preceding expression (inclusive)<br>
<code>{m,}</code> matches <i>m</i> or more occurrences of the preceding expression<br>
<p>
The repeating operators apply to the preceding atomic expression: a single character, a character class, or a parenthesized subexpression.<br>
</blockquote>
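<p>
The repeating operators, and the way they bind to the preceding atomic
expression, can be sketched as follows. This assumes
<code>java.util.regex</code> as a stand-in for gnu.regexp;
<code>RepeatDemo</code> is a hypothetical name.
<pre>
```java
import java.util.regex.Pattern;

class RepeatDemo {
    // True if the whole input matches the regex (java.util.regex semantics).
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("ab?c", "ac"));      // prints true
        System.out.println(matches("ab*c", "abbbc"));   // prints true
        System.out.println(matches("ab+c", "ac"));      // prints false
        System.out.println(matches("a{3}", "aaa"));     // prints true
        System.out.println(matches("a{2,3}", "aaaa"));  // prints false
        System.out.println(matches("a{2,}", "aaaa"));   // prints true
        // + repeats only the preceding atom: b alone, or the group (ab).
        System.out.println(matches("ab+", "abab"));     // prints false
        System.out.println(matches("(ab)+", "abab"));   // prints true
    }
}
```
</pre>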

<li>
Stingy (Minimal) Matching
<blockquote>
If a repeating operator is immediately followed by a <code>?</code>, it stops
at the smallest number of repetitions that allows the rest of the pattern to
match.
</blockquote>
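<p>
The difference between ordinary and stingy matching shows up when a pattern
could end at more than one place. The sketch below assumes
<code>java.util.regex</code>, which calls this form "reluctant", as a stand-in
for gnu.regexp; <code>StingyDemo</code> and <code>firstMatch</code> are
hypothetical names.
<pre>
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StingyDemo {
    // Returns the first substring of input matched by regex, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "[a] [b]";
        // Plain * takes as much as it can and runs to the last ].
        System.out.println(firstMatch("\\[.*\\]", s));  // prints [a] [b]
        // *? stops at the first ] that lets the match complete.
        System.out.println(firstMatch("\\[.*?\\]", s)); // prints [a]
    }
}
```
</pre>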

</ul>

</body>

</html>