description
『patt』 pattern
「patt」 alternative
〔patt 1║patt 2║…〕 alternative group
【name】 see description "name"
〈…〉 -
expression
『【delimiter】【pattern】【delimiter】【modifiers】』
delimiter
A delimiter can be any non-alphanumeric, non-whitespace character, but ...
Often used delimiters are forward slashes (/), hash signs (#) and tilde...
Delimiters in order of their statistical frequency of use: /#~!@%°=&
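The expression layout above (delimiter, pattern, delimiter, modifiers) can be sketched in Python. This is only an illustration: split_expression is a hypothetical helper, not part of the library described here. It assumes the closing delimiter is the last occurrence of the opening one, and it ignores any further restriction stated in the truncated delimiter rule.

```python
def split_expression(expr: str) -> tuple[str, str]:
    """Split '<delim><pattern><delim><modifiers>' into (pattern, modifiers)."""
    if not expr:
        raise ValueError("empty expression")
    delim = expr[0]
    # A delimiter must be non-alphanumeric and non-whitespace.
    if delim.isalnum() or delim.isspace():
        raise ValueError(f"invalid delimiter: {delim!r}")
    # Assumption: the modifiers follow the last occurrence of the delimiter.
    end = expr.rfind(delim)
    if end == 0:
        raise ValueError("missing closing delimiter")
    return expr[1:end], expr[end + 1:]


pattern, modifiers = split_expression("#^https?://#i")
print(pattern, modifiers)
```

Choosing # as the delimiter here avoids having to delimit the forward slashes inside the pattern.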
modifiers
『「【set】」「-【reset】」』
Values for 【set】 and 【reset】 are groups of the following characters:
i remCaseLess Do case-insensitive pattern matching.
m remMultiLine Treat string as multiple lines. That is, ...
A (2) remAnchored *
D (2) remDollarEndOnly *(ignored if modifier "m" is set)
s remSingleLine Treat string as single line. That is, cha...
S (1) *improve execution speed
U remUngreedy *suppress greediness
x remExtended Extend your pattern's legibility by permi...
u (1) *interpret as UTF-8
p (1) (preserve) Preserve the string matched such that ${^...
g (1) (global) Global matching
1) not supported
2) not allowed in extended groups
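As a rough illustration of the 【set】-【reset】 notation, the following Python sketch maps four of the modifier letters onto the flags of Python's re module. The mapping and the helper flags_from_modifiers are assumptions made for this example; the other modifiers (A, D, S, U, u, p, g) have no direct counterpart in Python's re and are left out.

```python
import re

# Hypothetical mapping from modifier letters to Python's re flags.
MODIFIER_FLAGS = {
    "i": re.IGNORECASE,  # remCaseLess
    "m": re.MULTILINE,   # remMultiLine
    "s": re.DOTALL,      # remSingleLine
    "x": re.VERBOSE,     # remExtended
}


def flags_from_modifiers(modifiers: str) -> re.RegexFlag:
    """Turn a '<set>-<reset>' modifier string into a combined flag value."""
    set_part, _, reset_part = modifiers.partition("-")
    flags = re.RegexFlag(0)
    for letter in set_part:
        flags |= MODIFIER_FLAGS[letter]   # KeyError for unmapped letters
    for letter in reset_part:
        flags &= ~MODIFIER_FLAGS[letter]
    return flags


caseless = re.search("hello", "HELLO", flags_from_modifiers("i"))
print(caseless is not None)
```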
pattern syntax - meta-characters:
『\…』 general escape character with several uses
『(…)』 subpattern
『…|…』 alternative patterns
『.』 match any character except newline (by default)
『^』 assert start of subject (or line, in multiline mode)
『$』 assert end of subject (or line, in multiline mode)
『[…]』 character class
『…?』 0 or 1 quantifier (or quantifier minimizer)
『…*』 0 or more quantifier
『…+』 1 or more quantifier
『…{…}』 min/max quantifier
『#…』 comment - only if modifier "x" is set
To use these characters literally, they must be delimited (escaped with a backslash).
meta-characters in character classes:
『\…』 general escape character
『^』 negate the class, but only if it is the first character
『-』 indicates character range
『[:…:]』 POSIX character class
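A short Python sketch of the negation and range rules inside a character class. Python's re module is used only as a stand-in here; that a trailing - is taken literally is Python's behaviour and is assumed, not confirmed, for the library described in this reference.

```python
import re

text = "a-b^c"

# ^ as the first character negates the class; a-c is a range.
negated = re.findall(r"[^a-c]", text)

# ^ anywhere else is a literal; a trailing - is a literal in Python.
literal = re.findall(r"[a^-]", text)

print(negated, literal)
```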
delimited characters and classes
\0 null or Octal character code
\1 to \9 back reference
\a bell (alert)
\A text start
\b \B word boundary
\c control character
\C single character
\d \D decimal digit
\e escape
\E end of quote (\Q, \L and \U)
\f form feed
\g back reference
\G matches start
\h \H horizontal space characters
\k named back reference
\K keep the left stuff
\l \L lowercase characters
\n new line
\N named unicode character
\p \P named property
\Q quote
\r carriage return
\R newline sequence
\s \S space
\t tabulator
\u \U uppercase characters
\v \V vertical space characters
\w \W word characters
\x heXadecimal character code
\X eXtended unicode sequence
\z text end
\Z text end or end of last line
\< start of word
\> end of word
The following characters must be delimited (escaped with a backslash) if they are to be used literally.
\ ( ) | . ^ $ [ ? * + {
# (if modifier "x" is set)
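The effect of delimiting can be shown with Python's re.escape, which puts a backslash in front of such characters. This is a sketch using Python's re module, which may escape more characters than the list above requires.

```python
import re

literal = "1+1=2 (really?)"

# Used as a pattern without escaping, +, ( ) and ? act as meta-characters,
# so the text does not match itself.
unescaped = re.fullmatch(literal, literal)

# After escaping, every character stands for itself.
escaped_pattern = re.escape(literal)
escaped = re.fullmatch(escaped_pattern, literal)

print(unescaped is None, escaped is not None)
```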
characters
『\0【digit】』 octal character code
『\x【x-digit】【x-digit】』 heXadecimal character code (ANSI)
『\x{【x-digits】}』 heXadecimal character code (Unicode)
『\c【character】』 control char
『\N{【name】}』 named unicode character
supported names
U+xxxx hexadecimal character code
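For comparison, a few of these character escapes as Python's re module accepts them. The spellings differ in places and are assumptions relative to this reference: Python writes \u20ac where the syntax above uses \x{20ac}, it has no \c escape, and \N{...} takes official Unicode character names (Python 3.8 or later) rather than the U+xxxx form.

```python
import re

# Two-digit hexadecimal code: \x41 is "A".
hex_match = re.fullmatch(r"\x41", "A")

# Octal code with a leading zero: \012 is the line feed (#10).
octal_match = re.fullmatch(r"\012", "\n")

# Unicode code point, written \u20ac in Python.
unicode_match = re.fullmatch(r"\u20ac", "€")

# Named unicode character (Python 3.8 or later).
named_match = re.fullmatch(r"\N{EURO SIGN}", "€")

print(all(m is not None for m in (hex_match, octal_match, unicode_match, named_match)))
```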
named character class (named unicode properties)
『\p【character】』 for names of only one letter
『\p{【name】}』
『\P【character】』 any character that does not have this property
『\P{【name】}』 any character that does not have this property
supported classes
IsCntrl, IsSpace, IsSpacePerl, IsDigit, IsXDigit, IsUpper, IsLower,
IsAlpha, IsAlnum, IsWord, IsPunct, IsGraph, IsPrint, IsASCII
supported scripts
Arabic, Armenian, Balinese, Bengali, Bopomofo, Braille, Buginese, Bu...
Canadian_Aboriginal, Cherokee, Common, Coptic, Cuneiform, Cypriot, C...
Deseret, Devanagari, Ethiopic, Georgian, Glagolitic, Gothic, Greek, ...
Gurmukhi, Han, Hangul, Hanunoo, Hebrew, Hiragana, Inherited, Kannada...
Kharoshthi, Khmer, Lao, Latin, Limbu, Linear_B, Malayalam, Mongolian...
New_Tai_Lue, Nko, Ogham, Old_Italic, Old_Persian, Oriya, Osmanya, Ph...
Phoenician, Runic, Shavian, Sinhala, Syloti_Nagri, Syriac, Tagalog, ...
Tai_Le, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh, Ugaritic, Yi
supported general category property codes
C other
Cc control
Cf format
Cn unassigned
Co private use
Cs surrogate
L letter
Ll lower case letter - specifying caseless matching does not affe...
Lm modifier letter
Lo other letter
Lt title case letter - specifying caseless matching does not affe...
Lu upper case letter - specifying caseless matching does not affe...
M mark
Mc spacing mark
Me enclosing mark
Mn non-spacing mark
N number
Nd decimal number
Nl letter number
No other number
P punctuation
Pc connector punctuation
Pd dash punctuation
Pe close punctuation
Pf final punctuation
Pi initial punctuation
Po other punctuation
Ps open punctuation
S symbol
Sc currency symbol
Sk modifier symbol
Sm mathematical symbol
So other symbol
Z separator
Zl line separator
Zp paragraph separator
Zs space separator
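The two-letter codes above are the standard Unicode general categories. Python's re module does not support \p, but the category of a single character can be looked up with the standard unicodedata module, which helps when deciding which property code to use. A minimal sketch:

```python
import unicodedata

samples = ["a", "A", "5", "_", " ", "€", "+"]

# Map each sample character to its two-letter general category code.
categories = {ch: unicodedata.category(ch) for ch in samples}

for ch, code in categories.items():
    print(repr(ch), code)
```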
character class
『[「^」【character list】「【character list】…」]』
character list
『【character】』 single character or delimited char...
『【character】-【character】』 range of characters
『\【class】』 delimited class
『[:【POSIX】:]』 POSIX character class
^ inverts the class
POSIX character class
『[:「^」【name】:]』
this can be used only inside a character class ( […] )
supported classes
cntrl, space, blank, digit, xdigit, upper, lower,
alpha, alnum, punct, graph, print
group
『(【pattern】)』
named group
『(?「P」<【name】>【pattern】)』
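A small named-group example, written for Python's re module as a stand-in. Python requires the P in (?P<name>…), whereas the syntax above marks it as optional.

```python
import re

match = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "released 2024-05")

# Named groups can be read by name or by number.
year = match.group("year")
month = match.group(2)
print(year, month)
```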
modifier change (extended group)
『(?【modifiers】)』
extended group
『(?「【modifiers】」:【pattern】)』
look-ahead
『(?「【modifiers】」=【pattern】)』
negative look-ahead
『(?「【modifiers】」!【pattern】)』
look-behind
『(?「【modifiers】」<=【pattern】)』
negative look-behind
『(?「【modifiers】」<!【pattern】)』
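The four look-around forms can be tried out in Python's re module, which is used here only as an illustration. Python does not accept modifiers inside a look-around, so that optional part of the syntax above is omitted.

```python
import re

# look-ahead: a name only if ".txt" follows
ahead = re.findall(r"\w+(?=\.txt)", "a.txt b.doc c.txt")

# negative look-ahead: a number only if no "%" follows
not_ahead = re.findall(r"\b\d+\b(?!%)", "5% 10 15%")

# look-behind: digits only if "$" precedes them
behind = re.findall(r"(?<=\$)\d+", "$10 20 $30 40")

# negative look-behind: a number only if no "$" precedes it
not_behind = re.findall(r"(?<!\$)\b\d+", "$10 20 $30 40")

print(ahead, not_ahead, behind, not_behind)
```

In each case the asserted text is checked but not included in the match.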
recursive subpattern
『(?「-║+」【number】)』
『(?R)』
『(?P>【name】)』
『(?P&【name】)』
clones the pattern (not the result) of a previous group
(?R) = (?0)
conditional subpattern
『(?(【condition】)【yes-pattern】「|【no-pattern】」)』
condition
『「-║+」【number】』
『R』
『{【name】}』
『【pattern】』
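A conditional subpattern with a group number as its condition, sketched in Python's re module. Python supports only a group number or group name as the condition; the R, {name} and pattern conditions listed above are not available there.

```python
import re

# If group 1 (an opening parenthesis) matched, require a closing one.
balanced = re.compile(r"^(\()?\d+(?(1)\))$")

results = {s: balanced.match(s) is not None for s in ["42", "(42)", "(42", "42)"]}
print(results)
```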
back references
『\【digit】』 for the references 1 to 9
『\g【digit】』
『\g{「-║+」【number】}』
『\g【character】』 for names of only one letter
『\g{【name】}』
named back references
『\k<【name】>』
『\k'【name】'』
『\k{【name】}』
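Back references match the text captured by an earlier group again. The sketch below uses Python's re module, which supports \【digit】 but spells a named back reference (?P=name) instead of the \k and \g forms listed above.

```python
import re

# Numbered back reference: find doubled words.
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test test")

# Named back reference: the closing quote must equal the opening quote.
quoted = [m.group(0)
          for m in re.finditer(r"(?P<q>['\"]).*?(?P=q)", "say \"hi\" or 'yo'")]

print(doubled, quoted)
```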
comments
『(?#【text】)』
『#【text】([\r\n]|$)』 (1)
not in character sets
1) only if modifier "x" is set
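Both comment forms exist in Python's re module as well, which serves as a stand-in in this sketch. The # form needs the re.VERBOSE flag, taken here as the counterpart of the extended modifier.

```python
import re

# Inline comments work without any modifier.
inline = re.compile(r"\d{4}(?#year)-\d{2}(?#month)")

# Comments from # to the end of the line need the extended mode.
extended = re.compile(r"""
    \d{4}   # year
    -
    \d{2}   # month
""", re.VERBOSE)

# Inside a character set, # stays a literal even in extended mode.
hash_literal = re.fullmatch(r"[#]", "#", re.VERBOSE)

print(bool(inline.fullmatch("2024-05")), bool(extended.fullmatch("2024-05")))
```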
quantifier
『【pattern】?「?║+」』 once or not at all equivalent to ...
『【pattern】*「?║+」』 zero or more times equivalent to ...
『【pattern】+「?║+」』 at least once equivalent to ...
『【pattern】{n}「?║+」』 exactly n times
『【pattern】{n,}「?║+」』 at least n times
『【pattern】{n,m}「?║+」』 n to m times
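The difference between a plain (greedy) quantifier and one followed by ? can be seen in this Python sketch. Python's re module stands in for the library here; the possessive + suffix is left out because Python accepts it only from version 3.11.

```python
import re

html = "<b>bold</b>"

# Greedy: + takes as much as possible, so one long match results.
greedy = re.findall(r"<.+>", html)

# Minimized: +? takes as little as possible, so each tag matches alone.
lazy = re.findall(r"<.+?>", html)

# Min/max quantifier: two to four digits.
in_range = [s for s in ["1", "12", "1234", "12345"] if re.fullmatch(r"\d{2,4}", s)]

print(greedy, lazy, in_range)
```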
characters and character classes:
. any character - if multiple lines are not activated then doesn't m...
\0 null character
\a bell (alert #7)
\n new line (#10)
\f form feed (#12)
\e escape (#27)
\t tabulator (#9)
\h horizontal space characters
\v vertical space characters
\r carriage return (#13)
\R newline sequence
\d decimal digit
\w word character
\s space
\X eXtended unicode sequence
\C single char - one character or a part of surrogate pairs
\H any character but a horizontal space character
\V any character but a vertical space character
\D any character but a decimal digit
\W any character but a word character
\S any character but a space
control classes:
^ line start
$ line end
\A text start
\G matches start
\z text end
\Z text end or end of last line
\b word boundary
\B not a word boundary
\< start of word
\> end of word
\l lowercase next char
\u uppercase next char
\L lowercase till \E
\U uppercase till \E
\Q quote (disable) pattern metacharacters till \E
\E end of quote (\Q, \L and \U)
\K keep the stuff left of the \K, don't include it in result
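A few of these control classes, tried in Python's re module as an illustration. Only ^, $, \A, \b and \B carry over unchanged; Python has no \<, \>, \G, \K, \Q…\E or case-changing escapes, so they are not shown.

```python
import re

text = "first line\nsecond line"

# Without the multiline modifier, ^ matches only at the text start.
single = re.findall(r"^\w+", text)

# With it, ^ also matches at the start of every line.
multi = re.findall(r"^\w+", text, re.MULTILINE)

# \A always means text start, whatever the modifiers are.
text_start = re.findall(r"\A\w+", text, re.MULTILINE)

# \b requires a word boundary, \B forbids one.
whole_word = re.findall(r"\bline\b", "line inline lines")
inside_word = re.findall(r"\Bline", "line inline lines")

print(single, multi, text_start, whole_word, inside_word)
```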
options
reoSplitNoEmpty If this flag is set, then from SPLIT ...
reoSplitDelimCapture If this flag is set, then be parenthe...
reoOffsetCapture If this flag is set, then returned wi...
reoSplitSetCapture Orders results so that $array[0] an a...
default (no reoSplitSetCapture) Orders results so that $array[0] an a...
reoCustomizeLinebreaks
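The option descriptions above are truncated, so the following Python sketch is only a guess at their intent. It assumes that reoSplitNoEmpty drops empty pieces from a split, that reoSplitDelimCapture also returns the text matched by parenthesized groups in the delimiter, and that reoOffsetCapture returns the position along with each result. Python's re module is used as a stand-in.

```python
import re

text = "a, b,, c"

# Plain split: the empty piece between the two commas is kept.
plain = re.split(r",\s*", text)

# Assumed effect of reoSplitNoEmpty: empty pieces are dropped.
no_empty = [piece for piece in plain if piece]

# Assumed effect of reoSplitDelimCapture: parenthesized parts of the
# delimiter are returned as well.
with_delims = re.split(r"(,)\s*", text)

# Assumed effect of reoOffsetCapture: each result comes with its offset.
with_offsets = [(m.group(), m.start()) for m in re.finditer(r"\w+", text)]

print(plain, no_empty, with_delims, with_offsets)
```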
related character classes and sets
DESCRIPTION POSIX PERL FN PERL PERL ...
...
--------------- ----------- --------------- -- ----------------...
any char . [^\n\r]
control [:cntrl:] \p{IsCntrl} [\x00-\x1F\x7F] ...
white space+tab [:blank:] \p{IsSpace} [ \t] ...
whitespace \p{IsSpace} \s [ \f\t\v]
whitespacePerl [:space:] \p{IsSpacePerl} [ \f\n\r\t\v] ...
punctuation [:punct:] \p{IsPunct} [!-/:-@[-`{-~] ...
decimal digit [:digit:] \p{IsDigit} \d [0-9] ...
hexadecimal [:xdigit:] \p{IsXDigit} [0-9A-Fa-f] ...
upper [:upper:] \p{IsUpper} \u [A-Z] ...
lower [:lower:] \p{IsLower} \l [a-z] ...
upper+lower [:alpha:] \p{IsAlpha} [A-Za-z] ...
alphanumeric [:alnum:] \p{IsAlnum} [A-Za-z0-9] ...
alphanumeric+_ [:word:] \p{IsWord} \w [A-Za-z0-9_] ...
printable [:graph:] \p{IsGraph} [!-~] ...
printable+space [:print:] \p{IsPrint} [ -~] ...
any ASCII [:ascii:] \p{IsASCII} [\x00-\xFF] ...
any Unicode [\x00-\x{FFFF}]
[:punct:] []!"#$%&\'()*+,./:;<=>?@\\^_`{|}~[-]
[:xdigit:] [[:digit:]A-Fa-f]
[:alpha:] [[:upper:][:lower:]]
[:alnum:] [[:alpha:][:digit:]]
[:word:] [[:alnum:]_]
[:graph:] [[:word:][:punct:]]
[:print:] [ [:graph:]]
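The last two equivalences can be checked over the ASCII range with a short Python sketch. Python's re module has no POSIX classes, so the explicit character sets from the table above are used instead; members is a hypothetical helper written for this check.

```python
import re

ascii_chars = [chr(code) for code in range(128)]


def members(char_class: str) -> set[str]:
    """Return the ASCII characters matched by a character class."""
    return {ch for ch in ascii_chars if re.fullmatch(char_class, ch)}


word = members(r"[A-Za-z0-9_]")
punct = members(r"[!-/:-@\[-`{-~]")
graph = members(r"[!-~]")
printable = members(r"[ -~]")

# [:graph:] is [:word:] plus [:punct:]; [:print:] adds the space.
print(word | punct == graph, graph | {" "} == printable)
```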