c# - How to format Elasticsearch highlight data that contains HTML?
I'm using Elasticsearch 1.7 highlighting in a C# web app. The data that's being highlighted has HTML in it, so I'm stripping out the HTML with a regex:
Regex.Replace(rawHighlight, "<.*?>", string.Empty)
The problem comes in when the highlight fragment does not finish on a complete HTML tag. For example, if the pre and post tags are @highlight-- and --highlight@, the highlight might result in this:
<div>this @highlight--example--highlight@ </d
So the regex removes the first div, but not the one that is not complete, i.e. </d
So this is sort of two questions in one. Is there a regex that will remove the malformed HTML left after the first regex has run (the part at the end or start of the string), or is there a better way of using Elasticsearch's highlighting so that I won't have to parse the string?
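For the first option, here is a minimal sketch of a cleanup pass, assuming the highlight markers themselves never contain < or > (true for @highlight-- and --highlight@). The class and method names (HighlightCleaner, Clean) are made up for illustration. It strips complete tags first, then a tag that was cut off at the end of the fragment, then the remainder of a tag whose opening < was cut off at the start.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any Elasticsearch client library.
public static class HighlightCleaner
{
    // Complete tags such as <div> or </div>.
    private static readonly Regex CompleteTag = new Regex("<[^>]*>");

    // A tag that was opened but cut off before its closing '>', e.g. "</d".
    private static readonly Regex TruncatedTail = new Regex("<[^>]*$");

    // The rest of a tag whose '<' was cut off at the start, e.g. "iv class=\"x\">".
    private static readonly Regex TruncatedHead = new Regex("^[^<>]*>");

    public static string Clean(string rawHighlight)
    {
        if (string.IsNullOrEmpty(rawHighlight))
        {
            return string.Empty;
        }

        // Order matters: remove whole tags first so only fragments remain.
        string text = CompleteTag.Replace(rawHighlight, string.Empty);
        text = TruncatedTail.Replace(text, string.Empty);
        text = TruncatedHead.Replace(text, string.Empty);
        return text;
    }
}
```

Called on the example above, HighlightCleaner.Clean("<div>this @highlight--example--highlight@ </d") drops both the complete <div> and the dangling </d. One caveat: the leading-fragment rule cannot tell a cut-off tag from ordinary text that happens to contain a literal > (for example "a > b" at the start of a fragment), so it may remove too much in that case.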
Maybe you didn't ask for this, but for completeness: why do you have HTML in the search index at all? And if you have to, add a second (scripted) field that contains the parsed HTML (without tags), and use that field for highlighting.
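One way to read that suggestion, sketched here on the C# side rather than as a scripted field: build the tag-free copy when the document is created and index both fields, then request highlighting only on the plain-text one. The type and property names (ArticleDocument, BodyHtml, BodyText) are hypothetical, and the regex-based stripping is only an assumption that works for reasonably well-formed markup; a real HTML parser would be more robust.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical document shape: the original HTML plus a tag-free copy.
public class ArticleDocument
{
    public string BodyHtml { get; set; }  // stored for display
    public string BodyText { get; set; }  // searched and highlighted

    public static ArticleDocument FromHtml(string html)
    {
        // Replace each tag with a space so adjacent words do not run together.
        string withoutTags = Regex.Replace(html ?? string.Empty, "<[^>]*>", " ");

        // Turn entities such as &amp; back into plain characters.
        string decoded = WebUtility.HtmlDecode(withoutTags);

        // Collapse the whitespace left behind by the removed tags.
        string normalized = Regex.Replace(decoded, @"\s+", " ").Trim();

        return new ArticleDocument { BodyHtml = html, BodyText = normalized };
    }
}
```

Because the full document is complete at index time, no tag can be cut in half, so the highlight fragments that come back from BodyText need no further parsing.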