java - Need to scrape a URL from a web page
I need to scrape a URL from a website. The URL is located within JavaScript code.
<script type="text/javascript">
(function() {
    // somewhere..
    $.get("http://someurl.com?q=34343&b=343434&c=343434")...
});
</script>
I know the URL starts with http://someurl.com?q= and needs to have at least a second query parameter (&b=) inside; the rest of the content is unknown.
I tried jsoup, but it's not suitable for this task. Manually fetching the page and applying a regex pattern to it is not a preferable option, since the page is huge. How can I get the URL in a quick and safe way?
You can use a regex:
/\$\.get\("(http:\/\/someurl\.com\?q=[\w.\-%#\/]*&b=[\w.\-%&=\/]*)"\)/g
This regex searches directly for the string:
$.get("http://someurl.com?q=
It allows any number of valid URL characters (excluding & and =) to occur as the value of q.
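It then matches &b= and, again, any number of valid characters (this time including & and =, so further parameters are captured too), followed by the closing quotation mark and parenthesis. Tested with: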
match - $.get("http://someurl.com?q=34343&b=343434&c=343434")
match - $.get("http://someurl.com?q=34343&b=13a43&k=343434&c2=something")
fail - $.get("http://someurl.com?q=34343&c=343434&b=343434")
fail - $.get("http://someurl.com?a=34343&b=343434=343434")
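Since the question is tagged java, here is a rough sketch of how the same pattern could be used with java.util.regex. The class name SomeUrlRegexDemo and the helper findAll are made up for illustration. Java has no /.../g literal syntax, so the forward slashes need no escaping, and looping on find() takes the place of the global flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is not from the original answer.
class SomeUrlRegexDemo {
    // Java translation of the regex above. Group 1 captures the URL itself.
    static final Pattern GET_URL = Pattern.compile(
        "\\$\\.get\\(\"(http://someurl\\.com\\?q=[\\w.\\-%#/]*&b=[\\w.\\-%&=/]*)\"\\)");

    // Collects every captured URL; repeated find() calls act like the g flag.
    static List<String> findAll(String source) {
        List<String> urls = new ArrayList<>();
        Matcher m = GET_URL.matcher(source);
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        // The four strings from the test list above.
        String[] samples = {
            "$.get(\"http://someurl.com?q=34343&b=343434&c=343434\")",
            "$.get(\"http://someurl.com?q=34343&b=13a43&k=343434&c2=something\")",
            "$.get(\"http://someurl.com?q=34343&c=343434&b=343434\")",
            "$.get(\"http://someurl.com?a=34343&b=343434=343434\")"
        };
        for (String s : samples) {
            List<String> found = findAll(s);
            System.out.println((found.isEmpty() ? "fail  - " : "match - ") + s);
        }
    }
}
```

The first two samples should yield one URL each and the last two an empty list, matching the results listed above.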
If you only want to return the first result, you can remove the global flag from the end of the regex:
/\$\.get\("(http:\/\/someurl\.com\?q=[\w.\-%#\/]*&b=[\w.\-%&=\/]*)"\)/
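For the "page is huge" concern, one possible approach in Java (a sketch, not something from the original answer) is to let java.util.Scanner search the input as it is read and stop at the first match, so nothing past the match has to be read. The class FirstUrlScanner and the method findFirst are hypothetical names. In real use the Reader would wrap the HTTP response stream; a StringReader stands in for it here. Note that with a horizon of 0 the Scanner may end up buffering the whole input if the pattern never matches.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not from the original answer.
class FirstUrlScanner {
    // Same pattern as above, without any "global" behaviour.
    static final Pattern GET_URL = Pattern.compile(
        "\\$\\.get\\(\"(http://someurl\\.com\\?q=[\\w.\\-%#/]*&b=[\\w.\\-%&=/]*)\"\\)");

    // Returns the first captured URL, or null if the input contains none.
    static String findFirst(Reader page) {
        try (Scanner scanner = new Scanner(page)) {
            // Horizon 0 means no limit: search until a match or end of input.
            if (scanner.findWithinHorizon(GET_URL, 0) == null) {
                return null;
            }
            // match() exposes the groups of the last successful search.
            return scanner.match().group(1);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the real page; assumed content for illustration only.
        String html = "<script type=\"text/javascript\">(function() { "
            + "$.get(\"http://someurl.com?q=34343&b=343434&c=343434\") });</script>";
        System.out.println(findFirst(new StringReader(html)));
    }
}
```

Closing the Scanner also closes the underlying Reader, so the connection is released as soon as the first URL is found.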