I'm having trouble fetching XML from this webpage. A browser displays it correctly, but fetching it from Java fails.

I've tried two methods, both of which resulted in an exception.

// Method 1 - Using Java's URL
URL url = new URL(/* mentioned link */);
String rawXML = new String(url.openStream().readAllBytes(), StandardCharsets.UTF_8); // java.io.IOException: Invalid Http response
// Method 2 - Using Apache's HTTP client
HttpGet httpGet = new HttpGet(/* mentioned link */);
String rawXML = EntityUtils.toString(HttpClients.createDefault().execute(httpGet).getEntity()); // org.apache.http.ProtocolException: The server failed to respond with a valid HTTP response

Downloading the page with wget and the --content-on-error argument works, but that is not a reliable solution because wget is not available on every system (Windows, for example).

1 Answer

The response does not contain any headers, so Java rejects it. wget reports the same thing:

wget "https://www.strava.cz/foxisapi/foxisapi.dll/istravne.istravne.process?xmljidelnickyA&zarizeni=3148" -O so-69226464.html
--2021-09-17 13:44:29--  https://www.strava.cz/foxisapi/foxisapi.dll/istravne.istravne.process?xmljidelnickyA&zarizeni=3148
Resolving www.strava.cz (www.strava.cz)... 82.99.180.77
Connecting to www.strava.cz (www.strava.cz)|82.99.180.77|:443... connected.
HTTP request sent, awaiting response... 200 No headers, assuming HTTP/0.9
Length: unspecified
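
To make "no headers" concrete: a regular HTTP/1.x response starts with a status line such as HTTP/1.1 200 OK, followed by header lines and a blank line, while this server sends the body straight away. The following is only a rough sketch of how raw response bytes could be told apart. The class and method names are made up for illustration, and it ignores details such as chunked transfer encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ResponseSniffer {
    /** True if the raw response starts with an HTTP/1.x status line such as "HTTP/1.1 200 OK". */
    static boolean hasStatusLine(byte[] response) {
        byte[] prefix = "HTTP/".getBytes(StandardCharsets.US_ASCII);
        return response.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(response, prefix.length), prefix);
    }

    /** Returns the bytes after the blank line that ends the headers, or everything if there are no headers. */
    static byte[] body(byte[] response) {
        if (!hasStatusLine(response)) {
            return response; // headerless (HTTP/0.9 style): the whole response is the body
        }
        for (int i = 0; i + 3 < response.length; i++) {
            if (response[i] == '\r' && response[i + 1] == '\n'
                    && response[i + 2] == '\r' && response[i + 3] == '\n') {
                return Arrays.copyOfRange(response, i + 4, response.length);
            }
        }
        return new byte[0]; // headers were never terminated, so there is no body
    }

    public static void main(String[] args) {
        byte[] normal = "HTTP/1.1 200 OK\r\nContent-Type: text/xml\r\n\r\n<a/>"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] headerless = "<a/>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(hasStatusLine(normal));      // true
        System.out.println(hasStatusLine(headerless));  // false
        System.out.println(new String(body(normal), StandardCharsets.US_ASCII)); // <a/>
    }
}
```

Java's URL and Apache's HTTP client both expect the first form, which is why they fail on a response of the second form.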

This Java class makes a raw HTTP GET request over a socket and is able to get the contents. It is based on this page.
The request sent is:

GET /foxisapi/foxisapi.dll/istravne.istravne.process?xmljidelnickyA&zarizeni=3148 HTTP/1.1\r\n
User-Agent: RawHttpGet\r\n
Accept: */*\r\n
Host: www.strava.cz\r\n
\r\n

Java code:

import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.net.Socket;

import javax.net.ssl.SSLSocketFactory;

public class RawHttpGet {
    private static final String HOSTNAME = "www.strava.cz";
    // Encoding used for this request; for a UTF-8 server use "UTF-8" instead
    private static final String ENCODING = "Cp1250";

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the socket, and with it both of its streams
        try (Socket socket = SSLSocketFactory.getDefault().createSocket(HOSTNAME, 443)) {
            BufferedWriter out = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream(), ENCODING));

            StringBuilder request = new StringBuilder("GET /foxisapi/foxisapi.dll/istravne.istravne.process?xmljidelnickyA&zarizeni=3148 HTTP/1.1\r\n");
            request.append("User-Agent: RawHttpGet\r\n");
            request.append("Accept: */*\r\n");
            request.append("Host: ").append(HOSTNAME).append("\r\n");
            request.append("\r\n"); // the blank line ends the request
            System.out.println(" * Request");
            System.out.println(request);

            // send the request
            out.write(request.toString());
            out.flush();

            // read the response until the server closes the connection
            System.out.println(" * Response");
            System.out.println(new String(socket.getInputStream().readAllBytes(), ENCODING));
        }
    }
}
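
If the XML is going to be parsed rather than printed, it may be safer to keep the response as bytes and let the XML parser pick the encoding from the XML declaration, instead of decoding it to a String by hand. A minimal sketch of that idea follows. It assumes the response starts with an XML declaration that names its encoding; the sample document, the element name jidlo and the class name are made up, and the byte array stands in for socket.getInputStream().readAllBytes().

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;

public class XmlFromBytes {
    /** Parses XML straight from a byte stream; the parser reads the encoding from the declaration. */
    static Document parse(InputStream in) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample standing in for the bytes read from the socket
        byte[] raw = "<?xml version=\"1.0\" encoding=\"windows-1250\"?><jidlo>pol\u00e9vka</jidlo>"
                .getBytes(Charset.forName("windows-1250"));
        Document doc = parse(new ByteArrayInputStream(raw));
        // The text comes back correctly decoded without naming the charset in this code
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

With a live connection, the socket's InputStream could be passed to parse directly, provided the server closes the connection at the end of the document.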
7 Comments

Works exactly how I expected it to. Thanks for your answer!
@Mayuna I fixed the code to use SSL on port 443. You can still use port 80 by commenting/uncommenting appropriately.
Don't convert XML to characters/Strings/Readers; that is a great way to destroy it. Keep it as bytes and Input/OutputStreams.
Use InputStream, OutputStream and byte[] to copy the bytes between them.
Thanks, I'll update my code.