php - How to stop regex match if word is found?
I have text like this:
text 786 opq rts appendix a title text 123 abc efg appendix b text 456 hij klm
and
text 786 opq rts appendix a title text 123 abc efg text 456 hij klm
I'm trying to use a regex to extract the text starting from appendix a up to appendix b if appendix b is present, otherwise from appendix a until the end (i.e., hij klm). Also, appendix a must appear within 15 words before title. I've come this far:
(\b(?:appendix)(?:.){0,15}(?:title)(?:.*)(?:appendix){0,1})/is
The problem is that the capture does not stop at appendix b: if appendix b is there, it still captures until the end.
One way is to use alternation for the optional part:
perl -0777 -wlne' @m = /(appendix .{0,15} title (?: .*?appendix\s\w+ | .*) )/xsig; print for @m ' input.txt
With /g it matches all sections delimited by appendix markers.
Or capture multiple groups, one of which is for the optional item, then test it and use the captures accordingly:
perl -0777 -wne' @m = /(appendix .{0,15} title) (.*? appendix\s\w+)? (.*)/xsi; print join "", ($m[1] ? @m[0,1] : @m[0,2]) ' input.txt
This works because $2 is created for the second ( even if it matches nothing, so the list returned by the match always has three elements.
With yet more capture groups you can filter out the unmatched ones instead of using the ternary, e.g. grep { defined } @m. If there may be multiple appendix sections, it is better to use while and the $N variables in this approach:
while (/(appendix.{0,15}title)(.*?appendix\s\w+)?(.*)/sig) { $appx_section = ($2) ? $1.$2 : $1.$3; ... }
since with one big @m the captures need a little analysis.
All of these print the desired output in both cases, including with multiple appendix sections.
I've wrapped them in one-liners ready for testing. The code works in a Perl script as it stands.