php - How to stop regex match if word is found?
I have text like this:
    text 786 opq rts appendix a title text 123 abc efg appendix b text 456 hij klm
and
    text 786 opq rts appendix a title text 123 abc efg text 456 hij klm
I'm trying to use a regex to extract the text starting at appendix a up to appendix b if appendix b is present, otherwise from appendix a until the end (i.e., hij klm). Also, appendix a must appear within 15 words before title. I've come this far:
    (\b(?:appendix)(?:.){0,15}(?:title)(?:.*)(?:appendix){0,1})/is
The problem is that the capture does not stop at appendix b if appendix b is there; it captures until the end.
One way is to use alternation for the optional part:
    perl -0777 -wlne '@m = /(appendix .{0,15} title (?: .*?appendix\s\w+ | .*) )/xsig; print "$_\n" for @m' input.txt
With /g it matches all sections within appendix markers.
Or capture multiple groups, one of them an optional item, then test it and use the result accordingly:
    perl -0777 -wne '@m = /(appendix .{0,15} title) (.*? appendix\s\w+)? (.*)/xsi; print join "", ($m[1] ? @m[0,1] : @m[0,2])' input.txt
This works because $2 is always returned in second position of the list, as undef if that group did not match.
With yet more capture groups you can filter the returned list in the second case with grep { defined } @m. If there may be multiple appendix-sections it is better to use a while loop and the $N variables in this approach:
    while (/(appendix.{0,15}title)(.*?appendix\s\w+)?(.*)/sig) {
        $appx_section = ($2) ? $1.$2 : $1.$3;
        ...
    }
since one big @m holding all the captures would need a little analysis.
All of these print the desired output in both cases, including with multiple appendix-sections.
I've wrapped them in one-liners ready for testing. The code works in a Perl script as it stands.