78
String urlString = "http://www.nbc.com/Heroes/novels/downloads/Heroes_novel_001.pdf";
URL url = new URL(urlString);
if(/* Url does not return 404 */) {
    System.out.println("exists");
} else {
    System.out.println("does not exists");
}
urlString = "http://www.nbc.com/Heroes/novels/downloads/Heroes_novel_190.pdf";
url = new URL(urlString);
if(/* Url does not return 404 */) {
    System.out.println("exists");
} else {
    System.out.println("does not exists");
}

This should print

exists
does not exists

TEST

public static String URL = "http://www.nbc.com/Heroes/novels/downloads/";

public static int getResponseCode(String urlString) throws MalformedURLException, IOException {
    URL u = new URL(urlString); 
    HttpURLConnection huc =  (HttpURLConnection)  u.openConnection(); 
    huc.setRequestMethod("GET"); 
    huc.connect(); 
    return huc.getResponseCode();
}

System.out.println(getResponseCode(URL + "Heroes_novel_001.pdf")); 
System.out.println(getResponseCode(URL + "Heroes_novel_190.pdf"));   
System.out.println(getResponseCode("http://www.example.com")); 
System.out.println(getResponseCode("http://www.example.com/junk"));           

Output

200
200
200
404

SOLUTION

Add the next line before .connect() and the output would be 200, 404, 200, 404

huc.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)");
5
  • I don't see the problem in your test. In my browser I don't get content for the second result, but I don't get a 404 Commented Sep 4, 2009 at 10:01
  • In fact I appear to get a largely empty HTML page Commented Sep 4, 2009 at 10:02
  • 1
    That website seems to give valid content for most anything. e.g. www.nbc.com/junk. Try with example.com/junk.html Commented Sep 4, 2009 at 10:13
  • The URL nbc.com/Heroes/novels/downloads/Heroes_novel_190.pdf gives me a completely blank page (not even <html> tag), but with a 404 header. Not very nice to users, but technically correct. Commented Sep 4, 2009 at 10:41
  • 1
    You should have separated the solution into an answer so I can upvote that too!. Commented Jun 16, 2013 at 0:29

5 Answers 5

60

You may want to add

HttpURLConnection.setFollowRedirects(false);
// note : or
//        huc.setInstanceFollowRedirects(false)

if you don't want to follow redirection (3XX)

Instead of doing a "GET", a "HEAD" is all you need.

huc.setRequestMethod("HEAD");
return (huc.getResponseCode() == HttpURLConnection.HTTP_OK);
Sign up to request clarification or add additional context in comments.

2 Comments

+1 for the HEAD, people forget how HTTP works every now and then and it's good some people still remember :)
Dealing with HTTPS urls is more tricky right ?? Have to manage the certificates...
44

this worked for me:

URL u = new URL ( "http://www.example.com/");
HttpURLConnection huc =  ( HttpURLConnection )  u.openConnection (); 
huc.setRequestMethod ("GET");  //OR  huc.setRequestMethod ("HEAD"); 
huc.connect () ; 
int code = huc.getResponseCode() ;
System.out.println(code);

thanks for the suggestions above.

Comments

24

Use HttpUrlConnection by calling openConnection() on your URL object.

getResponseCode() will give you the HTTP response once you've read from the connection.

e.g.

   URL u = new URL("http://www.example.com/"); 
   HttpURLConnection huc = (HttpURLConnection)u.openConnection(); 
   huc.setRequestMethod("GET"); 
   huc.connect() ; 
   OutputStream os = huc.getOutputStream(); 
   int code = huc.getResponseCode(); 

(not tested)

Comments

14

Based on the given answers and information in the question, this is the code you should use:

public static boolean doesURLExist(URL url) throws IOException
{
    // We want to check the current URL
    HttpURLConnection.setFollowRedirects(false);

    HttpURLConnection httpURLConnection = (HttpURLConnection) url.openConnection();

    // We don't need to get data
    httpURLConnection.setRequestMethod("HEAD");

    // Some websites don't like programmatic access so pretend to be a browser
    httpURLConnection.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)");
    int responseCode = httpURLConnection.getResponseCode();

    // We only accept response code 200
    return responseCode == HttpURLConnection.HTTP_OK;
}

Of course tested and working.

1 Comment

I recommend you set timeouts to avoid unbounded length calls. Ex: httpURLConnection.setConnectTimeout(10 * 1000); httpURLConnection.setReadTimeout(10 * 1000);
13

There is nothing wrong with your code. It's the NBC.com doing tricks on you. When NBC.com decides that your browser is not capable of displaying PDF, it simply sends back a webpage regardless what you are requesting, even if it doesn't exist.

You need to trick it back by telling it your browser is capable, something like,

conn.setRequestProperty("User-Agent",
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.13) Gecko/2009073021 Firefox/3.0.13");

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.