I'm writing some code to find the absolute URLs on a single webpage:
http://explore.bfi.org.uk/4ce2b69ea7ef3
So far I get all the links on that page and print their absolute URLs.
Here is part of the code:
Elements hyperLinks = htmlDoc.select("a[href]");
for (Element link : hyperLinks) {
    // "abs:href" asks Jsoup to resolve the href against the document's base URI
    System.out.println(link.attr("abs:href"));
}
This prints out a lot of URLs just like the one above. However, it seems to skip a few URLs as well, and the ones it skips are the ones I actually need.
This is one of the a[href] elements it's not turning into an absolute URL:
<div class="title"><a href="/4ce2b69ea7ef3">Royal Review</a><br /></div>
If I just print "link", this element is printed, but when I use link.attr("abs:href") it just prints a blank line.
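For reference, this is the absolute URL I would expect for that element. It is only a minimal sketch using java.net.URI from the standard library rather than Jsoup, and it assumes the page URL above is the base the link should be resolved against. The class name ExpectedAbsoluteUrl and the resolve helper are hypothetical names used for illustration only.

```java
import java.net.URI;

// Hypothetical helper, not part of Jsoup: shows the absolute URL expected
// when a relative href is resolved against the page URL.
final class ExpectedAbsoluteUrl {

    // Resolves href against base using standard URI resolution rules.
    static String resolve(String base, String href) {
        return URI.create(base).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Assumption: the page URL from the question is the base URI.
        String base = "http://explore.bfi.org.uk/4ce2b69ea7ef3";
        String href = "/4ce2b69ea7ef3"; // href of the skipped <a> element
        System.out.println(resolve(base, href));
        // prints http://explore.bfi.org.uk/4ce2b69ea7ef3
    }
}
```

So the skipped link has a perfectly resolvable href as long as there is a base URL to resolve it against.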
I am new to Java and appreciate any feedback!