I'm trying to fetch data from a URL (urlA), but I keep getting redirected from urlA to urlB.

I receive response code 301 (Moved Permanently), but in any browser (e.g. Chrome, Firefox, or even Internet Explorer) I can still open urlA without being redirected. So urlA still exists (the browser does not load it from any kind of cache), and a web browser is not automatically redirected to urlB.

How can I force my Java program, which uses HttpURLConnection, to fetch the original urlA that still exists?

private StringBuffer getHTMLCode(String urlA) throws IOException {
    URL url = new URL(urlA);
    final String userAgent = "Mozilla/5.0";

    HttpURLConnection con = (HttpURLConnection) url.openConnection();
    con.setInstanceFollowRedirects(false);  // NO REDIRECT, if I set TRUE I will be redirected to urlB
    con.setRequestMethod("GET");
    con.setRequestProperty("User-Agent", userAgent);

    int responseCode = con.getResponseCode(); 
    System.out.println("\nSending 'GET' request to URL : " + url); // shows my original urlA
    System.out.println("Response Code : " + responseCode);  // <-- 301 moved permanently        

    BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
    StringBuffer htmlCode = new StringBuffer();
    String inputLine;    

    while ((inputLine = in.readLine()) != null) {
        htmlCode.append(inputLine);
    }

    System.out.println(htmlCode);  // <head><title>Document Moved</title></head><body><h1>Object Moved</h1>This document may be found <a HREF="urlB">here</a></body>

    in.close();     
    return htmlCode;
}
  • If you got 301 you weren't redirected. Unclear what you're asking. Commented Jan 29, 2016 at 18:05
  • OK, but I still don't get the original HTML from urlA; instead, htmlCode contains different HTML with text saying "This document may be found..." (see the last println in my code above). That's why I thought I was being redirected. My question is: how can I get the HTML of urlA? Commented Jan 30, 2016 at 22:19

1 Answer

Possibly this is the code you are looking for; it uses setInstanceFollowRedirects:

private StringBuffer getHTMLCode(String urlA) throws IOException {
    URL url = new URL(urlA);
    final String userAgent = "Mozilla/5.0";

    HttpURLConnection con = (HttpURLConnection) url.openConnection();
    con.setInstanceFollowRedirects(false);  // do not follow redirects; with true the connection follows the 301 to urlB
    con.setRequestMethod("GET");
    con.setRequestProperty("User-Agent", userAgent);

    int responseCode = con.getResponseCode(); 
    System.out.println("\nSending 'GET' request to URL : " + url); // shows my original urlA
    System.out.println("Response Code : " + responseCode);  // <-- 301 moved permanently        

    BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
    StringBuffer htmlCode = new StringBuffer();
    String inputLine;    

    while ((inputLine = in.readLine()) != null) {
        htmlCode.append(inputLine);
    }

    System.out.println(htmlCode);  // <head><title>Document Moved</title></head><body><h1>Object Moved</h1>This document may be found <a HREF="urlB">here</a></body>

    in.close();     
    return htmlCode;
}
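
As the first comment notes, with setInstanceFollowRedirects(false) the 301 is the final response, so the body that gets read is the server's short "Object Moved" page rather than the content of urlA. One way to see what the server is actually asking for is to print the status code together with the Location header. The following is only a minimal diagnostic sketch, not a fix: the class name RedirectInspector, its Result holder and the inspect method are hypothetical names introduced here, and only the HttpURLConnection calls come from the JDK.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class RedirectInspector {

    /** Status code plus the raw Location header (null when the server sent none). */
    public static final class Result {
        public final int status;
        public final String location;

        Result(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    /** Sends one GET without following redirects and reports what came back. */
    public static Result inspect(String address) throws IOException {
        HttpURLConnection con =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        con.setInstanceFollowRedirects(false);   // keep the 3xx response visible
        con.setRequestMethod("GET");
        con.setRequestProperty("User-Agent", "Mozilla/5.0");
        try {
            int status = con.getResponseCode();
            // getHeaderField returns null if the header is absent
            String location = con.getHeaderField("Location");
            return new Result(status, location);
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java RedirectInspector <url>");
            return;
        }
        Result r = inspect(args[0]);
        System.out.println("Response Code : " + r.status);
        System.out.println("Location      : " + r.location);
    }
}
```

If a browser receives a 200 for the same address, the server is presumably treating the two requests differently. Cookies or other request headers are common causes, though that is a guess without knowing urlA; comparing the request headers shown in the browser's developer tools with those sent here would be a reasonable next step.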