public class RegExp
extends java.lang.Object
Regular Expression extension to Automaton.
Regular expressions are built from the following abstract syntax:
| Nonterminal | | Production | Meaning | Flag |
|---|---|---|---|---|
| regexp | ::= | unionexp | | |
| unionexp | ::= | interexp \| unionexp | (union) | |
| | \| | interexp | | |
| interexp | ::= | concatexp & interexp | (intersection) | [OPTIONAL] |
| | \| | concatexp | | |
| concatexp | ::= | repeatexp concatexp | (concatenation) | |
| | \| | repeatexp | | |
| repeatexp | ::= | repeatexp ? | (zero or one occurrence) | |
| | \| | repeatexp * | (zero or more occurrences) | |
| | \| | repeatexp + | (one or more occurrences) | |
| | \| | repeatexp {n} | (n occurrences) | |
| | \| | repeatexp {n,} | (n or more occurrences) | |
| | \| | repeatexp {n,m} | (n to m occurrences, including both) | |
| | \| | complexp | | |
| complexp | ::= | ~ complexp | (complement) | [OPTIONAL] |
| | \| | charclassexp | | |
| charclassexp | ::= | [ charclasses ] | (character class) | |
| | \| | [^ charclasses ] | (negated character class) | |
| | \| | simpleexp | | |
| charclasses | ::= | charclass charclasses | | |
| | \| | charclass | | |
| charclass | ::= | charexp - charexp | (character range, including end-points) | |
| | \| | charexp | | |
| simpleexp | ::= | charexp | | |
| | \| | . | (any single character) | |
| | \| | # | (the empty language) | [OPTIONAL] |
| | \| | @ | (any string) | [OPTIONAL] |
| | \| | " <Unicode string without double-quotes> " | (a string) | |
| | \| | ( ) | (the empty string) | |
| | \| | ( unionexp ) | (precedence override) | |
| | \| | < <identifier> > | (named automaton) | [OPTIONAL] |
| | \| | <n-m> | (numerical interval) | [OPTIONAL] |
| charexp | ::= | <Unicode character> | (a single non-reserved character) | |
| | \| | \ <Unicode character> | (a single character) | |
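Because the charexp production \ <Unicode character> always denotes a single literal character, arbitrary text can be embedded in a regexp by backslash-escaping its reserved characters. The following is a minimal, standalone sketch of such a helper. The class name RegExpEscape, the method escape, and the RESERVED character set (read off the grammar above, with all optional syntax assumed enabled) are illustrative assumptions and not part of the library.

```java
public class RegExpEscape {
    // Assumed set of reserved characters, read off the grammar above with all
    // optional syntax enabled; it is not a list taken from the library itself.
    private static final String RESERVED = "|&?*+{}~[]^-.#@\"()<>\\";

    /** Prefixes every reserved character in literal with a backslash. */
    public static String escape(String literal) {
        StringBuilder sb = new StringBuilder(literal.length() * 2);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (RESERVED.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a.b*c"));  // prints a\.b\*c
        System.out.println(escape("<1-9>"));  // prints \<1\-9\>
    }
}
```

Escaping a character that is not reserved in the current context (for example ^ outside a character class) is harmless under this grammar, since a backslash followed by any character stands for that character.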
The productions marked [OPTIONAL] are only allowed if
specified by the syntax flags passed to the RegExp
constructor.
The reserved characters used in the (enabled) syntax must be escaped with
backslash (\) or double-quotes ("..."). (Unlike in other regexp syntaxes,
this is also required inside character classes.) Be aware that dash (-) has
a special meaning in charclass expressions. An identifier is a string not
containing right angle bracket (>) or dash (-). Numerical intervals are
specified by non-negative decimal integers and include both end points. If
n and m have the same number of digits, then the conforming strings must
have that length (i.e. shorter numbers are prefixed by 0's).
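The interval rule can be made concrete with a small standalone checker. This is only a sketch of the semantics described above and does not call the library. The class IntervalSketch and the method conforms are hypothetical names, n is assumed not to exceed m, and the behaviour when n and m differ in digit count (no length restriction) is an assumption, because the text above only specifies the equal-length case.

```java
import java.math.BigInteger;

public class IntervalSketch {
    /**
     * Returns true if candidate conforms to the numerical interval <n-m>.
     * n and m are passed as written in the regexp, since their digit counts matter.
     */
    public static boolean conforms(String n, String m, String candidate) {
        if (!isDecimal(n) || !isDecimal(m) || !isDecimal(candidate)) {
            return false;
        }
        BigInteger lo = new BigInteger(n);
        BigInteger hi = new BigInteger(m);
        BigInteger value = new BigInteger(candidate);
        // Both end points are included.
        if (value.compareTo(lo) < 0 || value.compareTo(hi) > 0) {
            return false;
        }
        if (n.length() == m.length()) {
            // Same number of digits: the candidate must have exactly that length.
            return candidate.length() == n.length();
        }
        // Assumption: with differing digit counts no particular length is enforced.
        return true;
    }

    /** True if s is a non-empty string of ASCII decimal digits. */
    private static boolean isDecimal(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(conforms("01", "10", "07")); // prints true
        System.out.println(conforms("01", "10", "7"));  // prints false
    }
}
```

In the second call the value 7 lies in the range, but <01-10> has two-digit end points, so only the zero-padded form 07 conforms.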
Modifier and Type | Field and Description |
---|---|
static int | ALL - Syntax flag, enables all optional regexp syntax. |
static int | ANYSTRING - Syntax flag, enables anystring (@). |
static int | AUTOMATON - Syntax flag, enables named automata (<identifier>). |
static int | COMPLEMENT - Syntax flag, enables complement (~). |
static int | EMPTY - Syntax flag, enables empty language (#). |
static int | INTERSECTION - Syntax flag, enables intersection (&). |
static int | INTERVAL - Syntax flag, enables numerical intervals (<n-m>). |
static int | NONE - Syntax flag, enables no optional regexp syntax. |
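The syntax flags are int constants intended to be combined with a bitwise or before being passed to the constructor. The standalone sketch below illustrates that pattern without calling the library. The class SyntaxFlagsSketch and the method isEnabled are hypothetical, and the numeric values are assumptions (one bit per construct) chosen for illustration; real code should refer to the RegExp constants by name rather than by value.

```java
public class SyntaxFlagsSketch {
    // Stand-in constants: one distinct bit per optional construct (values assumed).
    public static final int INTERSECTION = 0x0001;
    public static final int COMPLEMENT   = 0x0002;
    public static final int EMPTY        = 0x0004;
    public static final int ANYSTRING    = 0x0008;
    public static final int AUTOMATON    = 0x0010;
    public static final int INTERVAL     = 0x0020;
    public static final int ALL          = 0xffff;
    public static final int NONE         = 0x0000;

    /** Returns true if the given optional construct is enabled in syntaxFlags. */
    public static boolean isEnabled(int syntaxFlags, int flag) {
        return (syntaxFlags & flag) != 0;
    }

    public static void main(String[] args) {
        // Enable intersection (&) and complement (~), nothing else.
        int flags = INTERSECTION | COMPLEMENT;
        System.out.println(isEnabled(flags, COMPLEMENT)); // prints true
        System.out.println(isEnabled(flags, EMPTY));      // prints false
        System.out.println(isEnabled(NONE, INTERVAL));    // prints false
    }
}
```

With the library on the classpath, the corresponding constructor call would be along the lines of new RegExp(s, RegExp.INTERSECTION | RegExp.COMPLEMENT).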
Constructor and Description |
---|
RegExp(java.lang.String s) - Constructs new RegExp from a string. |
RegExp(java.lang.String s, int syntax_flags) - Constructs new RegExp from a string. |
Modifier and Type | Method and Description |
---|---|
java.util.Set<java.lang.String> | getIdentifiers() - Returns set of automaton identifiers that occur in this regular expression. |
boolean | setAllowMutate(boolean flag) - Sets or resets allow mutate flag. |
Automaton | toAutomaton() - Constructs new Automaton from this RegExp. |
Automaton | toAutomaton(AutomatonProvider automaton_provider) - Constructs new Automaton from this RegExp. |
Automaton | toAutomaton(AutomatonProvider automaton_provider, boolean minimize) - Constructs new Automaton from this RegExp. |
Automaton | toAutomaton(boolean minimize) - Constructs new Automaton from this RegExp. |
Automaton | toAutomaton(java.util.Map<java.lang.String,Automaton> automata) - Constructs new Automaton from this RegExp. |
Automaton | toAutomaton(java.util.Map<java.lang.String,Automaton> automata, boolean minimize) - Constructs new Automaton from this RegExp. |
java.lang.String | toString() - Constructs string from parsed regular expression. |
public static final int INTERSECTION
public static final int COMPLEMENT
public static final int EMPTY
public static final int ANYSTRING
public static final int AUTOMATON
public static final int INTERVAL
public static final int ALL
public static final int NONE
public RegExp(java.lang.String s) throws java.lang.IllegalArgumentException
Constructs new RegExp from a string. Same as RegExp(s, ALL).
Parameters:
s - regexp string
Throws:
java.lang.IllegalArgumentException - if an error occurred while parsing the regular expression

public RegExp(java.lang.String s, int syntax_flags) throws java.lang.IllegalArgumentException
Constructs new RegExp from a string.
Parameters:
s - regexp string
syntax_flags - bitwise 'or' of optional syntax constructs to be enabled
Throws:
java.lang.IllegalArgumentException - if an error occurred while parsing the regular expression

public Automaton toAutomaton()
Constructs new Automaton from this RegExp. Same as toAutomaton(null) (empty automaton map).

public Automaton toAutomaton(boolean minimize)
Constructs new Automaton from this RegExp. Same as toAutomaton(null, minimize) (empty automaton map).

public Automaton toAutomaton(AutomatonProvider automaton_provider) throws java.lang.IllegalArgumentException
Constructs new Automaton from this RegExp. The constructed automaton is minimal and deterministic and has no transitions to dead states.
Parameters:
automaton_provider - provider of automata for named identifiers
Throws:
java.lang.IllegalArgumentException - if this regular expression uses a named identifier that is not available from the automaton provider

public Automaton toAutomaton(AutomatonProvider automaton_provider, boolean minimize) throws java.lang.IllegalArgumentException
Constructs new Automaton from this RegExp. The constructed automaton has no transitions to dead states.
Parameters:
automaton_provider - provider of automata for named identifiers
minimize - if set, the automaton is minimized and determinized
Throws:
java.lang.IllegalArgumentException - if this regular expression uses a named identifier that is not available from the automaton provider

public Automaton toAutomaton(java.util.Map<java.lang.String,Automaton> automata) throws java.lang.IllegalArgumentException
Constructs new Automaton from this RegExp. The constructed automaton is minimal and deterministic and has no transitions to dead states.
Parameters:
automata - a map from automaton identifiers to automata (of type Automaton)
Throws:
java.lang.IllegalArgumentException - if this regular expression uses a named identifier that does not occur in the automaton map

public Automaton toAutomaton(java.util.Map<java.lang.String,Automaton> automata, boolean minimize) throws java.lang.IllegalArgumentException
Constructs new Automaton from this RegExp. The constructed automaton has no transitions to dead states.
Parameters:
automata - a map from automaton identifiers to automata (of type Automaton)
minimize - if set, the automaton is minimized and determinized
Throws:
java.lang.IllegalArgumentException - if this regular expression uses a named identifier that does not occur in the automaton map

public boolean setAllowMutate(boolean flag)
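Sets or resets allow mutate flag.
Parameters:
flag - if true, the flag is set

public java.lang.String toString()
Constructs string from parsed regular expression.
Overrides:
toString in class java.lang.Object

public java.util.Set<java.lang.String> getIdentifiers()
Returns set of automaton identifiers that occur in this regular expression.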
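Since toAutomaton(automata) throws IllegalArgumentException when the expression uses an identifier that does not occur in the map, a caller may want to compare the result of getIdentifiers() with the map keys beforehand. The following standalone sketch shows one way to do that. The class IdentifierCheck and the method missing are hypothetical helpers, not part of the library, and the map's value type is left generic so the sketch compiles without the library.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class IdentifierCheck {
    /**
     * Returns, in sorted order, the identifiers that have no entry in the
     * automaton map. A null map is treated as an empty map.
     */
    public static Set<String> missing(Set<String> identifiers, Map<String, ?> automata) {
        Set<String> result = new TreeSet<>(identifiers);
        Set<String> known = (automata == null) ? Collections.<String>emptySet() : automata.keySet();
        result.removeAll(known);
        return result;
    }

    public static void main(String[] args) {
        // Identifiers as they might be returned by getIdentifiers().
        Set<String> used = new HashSet<>(Arrays.asList("digit", "word"));
        // Placeholder values stand in for Automaton instances.
        Map<String, Object> automata = new HashMap<>();
        automata.put("digit", new Object());
        System.out.println(missing(used, automata)); // prints [word]
    }
}
```

An empty result means every named automaton used by the expression can be resolved from the map.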