Regex to read URL from ASPX File PowerShell


I'm writing a PowerShell script that extracts URLs from ASPX files and tests whether the HTTP status code equals 200.

I found the following regex for URLs:

$regex = "(http[s]?|[s]?ftp[s]?)(:\/\/)([^\s,]+)"
Select-String -Path $path -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value }

But the result looks like this:

https://code.jquery.com/ui/1.9.0/themes/base/jquery-ui.css"/>
https://code.jquery.com/ui/1.11.4/jquery-ui.min.js"></script>

As you can see, it doesn't trim the trailing HTML tag characters.

How can I edit the regex to get the URL without the HTML tags at the end?

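If you look at the [^\s,] negated character class, you will see that it matches any char other than whitespace and ,. If you look at the input you have, you will notice that ", < and > can all be matched by [^\s,], which is why the match runs on past the end of the URL.

A fix for the current situation is to add the <>" chars to the negated character class, so that the regex engine "stops" when it comes across a >, < or " char.

Note that since you extract whole matches, you may refactor the pattern a bit: remove the unnecessary groupings and turn the first one into a non-capturing group: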

$regex = '(?:http|s?ftp)s?://[^\s,<>"]+' 

Mind that in .NET patterns, / does not need to be escaped (it is not a special regex metacharacter/operator).
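
As a quick check of the pattern above, here is a minimal sketch that runs it against in-memory strings instead of a file. The $sample lines are an assumption: they are modelled on the output shown in the question, with plausible surrounding tags filled in, and are not taken from a real ASPX file.

```powershell
# Hypothetical sample lines; the tag markup around the URLs is assumed.
$sample = @(
    '<link rel="stylesheet" href="https://code.jquery.com/ui/1.9.0/themes/base/jquery-ui.css"/>'
    '<script src="https://code.jquery.com/ui/1.11.4/jquery-ui.min.js"></script>'
)

# The refactored pattern: the match ends at whitespace, a comma, <, > or ".
$regex = '(?:http|s?ftp)s?://[^\s,<>"]+'

# Select-String accepts pipeline input as well as -Path, so no file is needed here.
$urls = $sample |
    Select-String -Pattern $regex -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Value }

# Print the extracted URLs, one per line.
$urls
```

With these two sample lines, each URL should now end at the closing quote of its attribute. If the files quote attribute values with ' instead of ", that char would probably need to be added to the negated character class as well (doubled as '' inside a single-quoted PowerShell string).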

