Reference - What does this regex mean?
What is this?
This is a collection of common Q&A. It is also a Community Wiki, so everyone is invited to participate in maintaining it.
Why is this?
The regex tag suffers from "give me ze code" type questions and poor answers with no explanation. This reference is meant to provide links to quality Q&A.
What's the scope?
This reference is meant for the following languages: PHP, Perl, JavaScript, Python, Ruby, Java, .NET.
This might be too broad, but these languages share largely the same syntax. For flavor-specific features, the tag of the language follows the entry, for example:
- What are regular expression balancing groups? (.NET)
The Stack Overflow Regular Expressions FAQ
Quantifiers
- Zero-or-more: `*`: greedy, `*?`: reluctant, `*+`: possessive
- One-or-more: `+`: greedy, `+?`: reluctant, `++`: possessive
- `?`: optional (zero-or-one)
- Min/max ranges (all inclusive): `{n,m}`: between n and m, `{n,}`: n-or-more, `{n}`: exactly n
- Differences between greedy, reluctant (a.k.a. "lazy", "ungreedy") and possessive quantifiers:
  - Greedy vs. reluctant vs. possessive quantifiers
  - In-depth discussion on the differences between greedy versus non-greedy
  - What's the difference between `{n}` and `{n}?`
  - Can someone explain possessive quantifiers to me? (PHP, Perl, Java, Ruby)
  - Emulating possessive quantifiers (.NET)
  - Non-Stack Overflow references: Oracle, regular-expressions.info
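As a rough illustration of the greedy and reluctant entries above, here is a minimal Python sketch. The sample strings and variable names are invented for the example. Possessive quantifiers are left out because Python's `re` module only accepts them from version 3.11 onward.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* takes as much as it can, then backtracks to the LAST ">".
greedy = re.findall(r"<.*>", text)

# Reluctant (lazy): .*? takes as little as it can, stopping at the FIRST ">".
reluctant = re.findall(r"<.*?>", text)

# Ranges are inclusive: {2,3} consumes two or three digits per match.
ranged = re.findall(r"\d{2,3}", "1 22 333 4444")

print(greedy)     # one match spanning the whole string
print(reluctant)  # one match per tag
print(ranged)
```

The same pattern with only a `?` added changes one long match into four short ones, which is usually the first thing to check when a regex "matches too much".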
Character classes
- What is the difference between square brackets and parentheses?
- `[...]`: any one character, `[^...]`: negated/any character but
- `[^]` matches any one character including newlines (JavaScript)
- `[\w-[\d]]` / `[a-z-[qz]]`: set subtraction (.NET, XML Schema, XPath, JGSoft)
- `[\w&&[^\d]]`: set intersection (Java, Ruby 1.9+)
- `[[:alpha:]]`: POSIX character classes
- Why do `[^\\D2]`, `[^[^0-9]2]`, `[^2[^0-9]]` get different results in Java? (Java)
- Shorthand:
  - Digit: `\d`: digit, `\D`: non-digit
  - Word character (letter, digit, underscore): `\w`: word character, `\W`: non-word character
  - Whitespace: `\s`: whitespace, `\S`: non-whitespace
- Unicode categories (`\p{L}`, `\P{L}`, etc.)
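The shorthand and negated classes above can be tried out with a short Python sketch. The sample string is made up. Python's `re` has no set subtraction or intersection syntax, so the last line uses a double negation, `[^\W\d]`, as one possible way to approximate "word characters except digits".

```python
import re

sample = "Room 42, floor_3!"

# \d matches digits; its uppercase twin \D would match everything else.
digits = re.findall(r"\d", sample)

# A negated class: any character that is neither a word character nor whitespace.
punctuation = re.findall(r"[^\w\s]", sample)

# Approximating set subtraction: "not a non-word character and not a digit",
# i.e. letters and underscore only.
letters_only = re.findall(r"[^\W\d]+", sample)

print(digits)
print(punctuation)
print(letters_only)
```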
Escape sequences
- Horizontal whitespace: `\h`: space-or-tab, `\t`: tab
- Newlines:
- Negated whitespace sequences: `\H`: non-horizontal whitespace character, `\V`: non-vertical whitespace character, `\N`: non-line-feed character (PCRE, PHP 5, Java 8)
- Other: `\v`: vertical tab, `\e`: the escape character
Anchors
- `^`: start of line/input, `\b`: word boundary, `\B`: non-word boundary, `$`: end of line/input
- `\A`: start of input, `\Z`: end of input (PHP, Perl, Python, Ruby)
- `\G`: start of match (PHP, Perl, Ruby)
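A small Python sketch of how the anchors listed above differ in practice. The text and variable names are illustrative only. `\G` is omitted because Python's `re` does not support it.

```python
import re

text = "cat concat\ncatalog cat"

# ^ with MULTILINE matches at the start of every line.
line_starts = re.findall(r"^cat", text, flags=re.MULTILINE)

# \A matches only at the very start of the input, even with MULTILINE.
input_start = re.findall(r"\Acat", text, flags=re.MULTILINE)

# \b requires a word boundary on each side: standalone "cat" only.
whole_words = re.findall(r"\bcat\b", text)

# \B requires the absence of a boundary: "cat" glued to a preceding letter.
inside_words = re.findall(r"\Bcat", text)

print(len(line_starts), len(input_start), len(whole_words), len(inside_words))
```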
(Also see "Flavor-specific information → Java → functions in `Matcher`")
Groups
- `(...)`: capture group, `(?:)`: non-capture group
- `\1`: backreference and capture-group reference, `$1`: capture group reference
- What does a subpattern `(?i:regex)` mean?
- What does the 'P' in `(?P<group_name>regexp)` mean?
- `(?>)`: atomic group or independent group, `(?|)`: branch reset
- Named capture groups:
  - Java: `(?<groupname>regex)`: overview and naming rules (non-Stack Overflow links)
  - Other languages: `(?P<groupname>regex)` (Python), `(?<groupname>regex)` (.NET), `(?<groupname>regex)` (Perl), `(?P<groupname>regex)` and `(?<groupname>regex)` (PHP)
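To make the group syntax above concrete, here is a hedged Python sketch covering capture groups, a named group, a non-capturing group and backreferences. All sample strings are invented. Note that Python writes group references in replacement strings as `\1` rather than `$1`.

```python
import re

# Named and numbered capture groups; (?P<name>...) is the Python spelling.
date = re.search(r"(?P<year>\d{4})-(\d{2})", "released 2014-04")
year = date.group("year")   # access by name
month = date.group(2)       # access by number

# Non-capturing group: groups the alternation without creating a capture.
colours = re.findall(r"col(?:o|ou)r", "color colour")

# Backreference \1 inside the pattern: find an immediately repeated word.
repeated = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# Group references in the replacement string.
swapped = re.sub(r"(\w+)@(\w+)", r"\2 at \1", "user@example")

print(year, month, colours, repeated, swapped)
```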
Lookarounds
- Lookaheads: `(?=...)`: positive, `(?!...)`: negative
- Lookbehinds: `(?<=...)`: positive, `(?<!...)`: negative (not supported in JavaScript before ES2018)
- Lookbehind limits in:
- Lookbehind alternatives:
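The four lookaround forms above can be sketched in Python as follows. The price string is a made-up example. The negative lookahead also excludes a following digit, because otherwise backtracking would let `\d+` match just the `3` of `30kg`.

```python
import re

prices = "cost: $30, weight: 30kg, discount: $5"

# Positive lookbehind: digits that are preceded by "$".
dollars = re.findall(r"(?<=\$)\d+", prices)

# Positive lookahead: digits that are followed by "kg".
kilos = re.findall(r"\d+(?=kg)", prices)

# Negative lookahead: digits NOT followed by another digit or by "k".
not_kilos = re.findall(r"\d+(?![\dk])", prices)

# Negative lookbehind: digits NOT preceded by "$" or by another digit.
not_dollars = re.findall(r"(?<![$\d])\d+", prices)

print(dollars, kilos, not_kilos, not_dollars)
```

Lookarounds assert a condition without consuming characters, which is why the `$` and `kg` never appear in the results.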
Modifiers
- Most flavors: `g`: global, `i`: case-insensitive, `u`: unicode, `x`: whitespace-extended
- `c`: current position (Perl)
- `e`: expression (PHP, Perl)
- `o`: once (Ruby)
- `m`: multiline (PHP, Perl, Python, JavaScript, .NET, Java), `m`: (non)multiline (Ruby)
- `s`: single line (not supported in Ruby, or in JavaScript before ES2018), `s` workaround (JavaScript)
- `S`: study (PHP)
- `U`: ungreedy (PHP, R)
- How to convert `preg_replace` `e` to `preg_replace_callback`?
- What are inline modifiers?
- What is '?-mix' in a Ruby regular expression?
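In Python the modifiers above are passed as flags or written inline, and there is no `g` flag because `re.findall` is already global. A minimal sketch, with invented sample text and a hypothetical phone-number pattern:

```python
import re

text = "First line\nsecond LINE"

# i: case-insensitive matching.
any_case = re.findall(r"line", text, flags=re.IGNORECASE)

# m: ^ and $ match at every line break, not just at the ends of the input.
first_words = re.findall(r"^\w+", text, flags=re.MULTILINE)

# s: without DOTALL the dot stops at a newline; with it, the dot crosses lines.
no_dotall = re.search(r"First.*LINE", text)
with_dotall = re.search(r"First.*LINE", text, flags=re.DOTALL)

# x: whitespace and comments inside the pattern are ignored.
phone = re.compile(r"""
    (\d{3})   # prefix
    -         # separator
    (\d{4})   # line number
""", re.VERBOSE)
parts = phone.search("call 555-0199").groups()

# Inline modifier: (?i) at the start of the pattern acts like re.IGNORECASE.
inline = re.findall(r"(?i)line", text)

print(any_case, first_words, no_dotall, bool(with_dotall), parts, inline)
```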
Other:
- `|`: alternation (OR) operator, `.`: any character, `[.]`: literal dot character
- What special characters must be escaped?
- Control verbs (PHP and Perl): `(*PRUNE)`, `(*SKIP)`, `(*FAIL)` and `(*F)`
  - PHP only: `(*BSR_ANYCRLF)`
- Recursion (PHP and Perl): `(?R)`, `(?0)`, `(?1)`, `(?-1)`, `(?&groupname)`
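A brief Python sketch of alternation, the dot, the literal dot and escaping, using made-up inputs. Control verbs and recursion are not shown because Python's built-in `re` module does not support them.

```python
import re

# | tries the left alternative first, then the right one.
either = re.findall(r"cat|dog", "cat, dog, cow")

# An unescaped dot matches any character (except a newline by default).
any_char = re.findall(r"a.c", "abc a.c")

# Inside a character class the dot is literal, same as writing \.
literal_dot = re.findall(r"a[.]c", "abc a.c")

# re.escape() backslash-escapes special characters so that arbitrary text
# can be matched literally.
raw = "1+1=2?"
escaped = re.escape(raw)
matches_itself = re.fullmatch(escaped, raw) is not None

print(either, any_char, literal_dot, matches_itself)
```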
Common tasks
- Get a string between two curly braces: `{...}`
- Match (or replace) a pattern except in situations s1, s2, s3...
- How do I find all YouTube video IDs in a string using a regex?
- Validation:
  - Internet: email addresses, URLs (host/port: regex and non-regex alternatives), passwords
  - Numeric: a number, min-max ranges (such as 1-31), phone numbers, date
- Parsing HTML with regex: see "General information > When not to use regex"
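Two of the tasks above, extracting text between curly braces and validating a 1-31 range, might look like this in Python. Both helpers (`between_braces`, `is_day_of_month`) are hypothetical names for this sketch, and the approach assumes non-nested braces and no leading zeros.

```python
import re

def between_braces(text):
    """Return the contents of every non-nested {...} pair in text."""
    # [^{}]* keeps each match inside a single pair of braces.
    return re.findall(r"\{([^{}]*)\}", text)

# 1-9, or 10-29, or 30-31; fullmatch() anchors the pattern at both ends.
DAY_OF_MONTH = re.compile(r"[1-9]|[12]\d|3[01]")

def is_day_of_month(value):
    """Return True if value is the string form of an integer from 1 to 31."""
    return DAY_OF_MONTH.fullmatch(value) is not None

print(between_braces("Hello {name}, you are {age}"))
print([d for d in ["0", "1", "19", "31", "32"] if is_day_of_month(d)])
```

Numeric ranges are spelled out digit by digit because a regex sees characters, not numbers; for anything beyond small ranges, parsing the integer and comparing it is often simpler.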
Advanced regex-fu
- Strings and numbers:
- Other:
  - How can we match a^n b^n with Java regex?
  - Match nested brackets
  - "Vertical" regex matching in an ASCII "image"
  - List of highly up-voted regex questions on Code Golf
  - How to make two quantifiers repeat the same number of times?
  - An impossible-to-match regular expression: `(?!a)a`
  - Match/delete/replace `this` except in contexts A, B and C
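The impossible-to-match expression listed above is easy to check: the negative lookahead says "the next character is not `a`", and the pattern then demands exactly that character. A small Python sketch with arbitrary sample inputs:

```python
import re

# (?!a) asserts the next character is not "a"; the following "a" then
# requires it to be "a". Both cannot hold at the same position.
never = re.compile(r"(?!a)a")

samples = ["a", "aaa", "banana", ""]
results = [never.search(s) for s in samples]

print(results)
```

A pattern like this can be handy as a placeholder that is guaranteed to match nothing.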
Flavor-specific information
(Except for those marked with *, this section contains non-Stack Overflow links.)
- Java
  - Official documentation: Pattern Javadoc, Oracle's regular expressions tutorial
  - The differences between functions in `java.util.regex.Matcher`:
    - `matches()`: the match must be anchored to both input-start and -end
    - `find()`: a match may be anywhere in the input string (substrings)
    - `lookingAt()`: the match must be anchored to input-start only
    - (For anchors in general, see the section "Anchors")
  - The `java.lang.String` functions that accept regular expressions: `matches(s)`, `replaceAll(s,s)`, `replaceFirst(s,s)`, `split(s)`, `split(s,i)`
  - *An (opinionated and) detailed discussion of the disadvantages of and missing features in `java.util.regex`
- .NET
- Official documentation:
  - Boost regex engine: general syntax, Perl syntax (used by TextPad, Sublime Text, UltraEdit, ...???)
  - JavaScript 1.5 general info and RegExp object
  - .NET, MySQL, Oracle, Perl5 version 18.2
  - PHP: pattern syntax, `preg_match`
  - Python: regular expression operations, `search` vs `match`, how-to
  - Splunk: regex terminology and syntax, and regex command
  - Tcl: regex syntax, manpage, `regexp` command
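The anchoring differences between the three `Matcher` functions listed under Java have close counterparts in Python, which is also what the `search` vs `match` entry is about. The mapping below is a rough analogy rather than an exact equivalence, and the sample inputs are invented.

```python
import re

pattern = re.compile(r"\d+")
text = "abc 123"

# Roughly like Matcher.matches(): the whole input must match.
whole = pattern.fullmatch(text)

# Roughly like Matcher.find(): the match may start anywhere in the input.
anywhere = pattern.search(text)

# Roughly like Matcher.lookingAt(): anchored at the start, free at the end.
at_start = pattern.match(text)
at_start_ok = pattern.match("123 abc")

print(whole, anywhere.group(), at_start, at_start_ok.group())
```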
General information
(Links marked with * are non-Stack Overflow links.)
- Other general documentation resources: Learning Regular Expressions, *Regular-expressions.info, *Wikipedia entry, *RexEgg, Open-Directory Project
- DFA versus NFA
- Generating strings matching regex
- Books: Jeffrey Friedl's Mastering Regular Expressions
- When to not use regular expressions:
  - Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems. (blog post written by Stack Overflow's founder)*
  - Do not use regex to parse HTML:
    - Don't. Please, just don't
    - Well, maybe...if you're really determined (other answers in this question are also good)
- Examples of regex that can cause the regex engine to fail
Tools: testers and explainers
(This section contains non-Stack Overflow links.)
Online (* includes replacement tester, + includes split tester):
- Debuggex (also has a repository of useful regexes) (JavaScript, Python, PCRE)
- *Regular Expressions 101 (PHP, PCRE, Python, JavaScript)
- Regex Pal, regular-expressions.info (JavaScript)
- Rubular (Ruby)
- RegExr
- Regex Hero (.NET)
- *+ regexstorm.net (.NET)
- *RegexPlanet: Java, Go, Haskell, JavaScript, .NET, Perl, PHP PCRE, Python, Ruby, XRegExp
- freeformatter.com (XRegExp)
- *+ regex.larsolavtorvik.com (PHP PCRE and POSIX, JavaScript)
- Refiddle (JavaScript, Ruby, .NET)
Offline:
- Microsoft Windows: RegexBuddy (analysis), RegexMagic (creation), Expresso (analysis, creation, free)