
I have a lab assignment to write a crawler using the BSD socket API, so I need to send multiple HTTP requests and read every response. I tried to do this over a single socket connection, but I only get a response to the first request; for every later request the response is empty. Here is my code. What are the possible solutions?

Socket socket = new Socket("fucking-great-advice.ru", 80);

BufferedReader input = new BufferedReader(new InputStreamReader(socket.getInputStream()));
PrintWriter output = new PrintWriter(socket.getOutputStream());

for (int numberAdvice = 1; numberAdvice < 100; numberAdvice++) {
    output.write("GET /advice/" + numberAdvice + " HTTP/1.0\r\n\r\n");
    output.flush();

    StringBuilder sb = new StringBuilder();
    int ch = 0;
    while ((ch = input.read()) != -1) {
        sb.append((char) ch);
    }
    String response = sb.toString().split("\r\n\r\n")[1];
    System.out.println(response);
}

input.close();
output.close();
socket.close();
    Have you tried with keep-alive? Commented May 14, 2016 at 15:54

1 Answer


There are several problems in your current code:

  1. You don't send a Host header with your request, so the server answers with a 404 error.
  2. You keep reading the InputStream until you get -1, which means you implicitly wait for the end of the stream (the connection being closed). That is not what you want, since you intend to keep querying the server on the same connection.
  3. You need to add the header Connection: keep-alive to ask the server not to close the connection once it has answered. (HTTP/1.1 connections are persistent by default, but with HTTP/1.0, as in your code, the server closes the connection after each response unless keep-alive is negotiated.)
  4. This website sends its responses with chunked transfer encoding, so the code has to handle that by reading the response line by line and checking for the start and end of the chunked body.
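Points 2 and 4 both come down to finding the end of one response without waiting for the stream to close. As a rough sketch (the class and method names here are made up for illustration and belong to no library), the head of a response can be read up to its empty line, and the headers then say how the body is delimited: Content-Length gives an exact byte count, while Transfer-Encoding: chunked means the body has to be decoded chunk by chunk. The sketch reads bytes directly from the InputStream and assumes header text can be treated as ISO-8859-1:

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class ResponseFraming {

    // Reads bytes up to '\n' and returns them as text, dropping any '\r'.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                break;
            }
            if (b != '\r') {
                buf.write(b);
            }
        }
        if (b == -1 && buf.size() == 0) {
            throw new EOFException("connection closed before a line was read");
        }
        return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Reads the status line and the headers, stopping at the empty line that ends the head.
    // Header names are lower-cased; the status line is stored under the key ":status".
    static Map<String, String> readHead(InputStream in) throws IOException {
        Map<String, String> head = new LinkedHashMap<>();
        head.put(":status", readLine(in));
        String line;
        while (!(line = readLine(in)).isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
                head.put(name, line.substring(colon + 1).trim());
            }
        }
        return head;
    }

    // Reads exactly 'length' body bytes, leaving the next response on the stream untouched.
    static byte[] readFixedBody(InputStream in, int length) throws IOException {
        byte[] body = new byte[length];
        int off = 0;
        while (off < length) {
            int n = in.read(body, off, length - off);
            if (n == -1) {
                throw new EOFException("connection closed in the middle of the body");
            }
            off += n;
        }
        return body;
    }
}
```

With a head parsed this way, the loop can look at "content-length" or "transfer-encoding" in the returned map and read only the bytes that belong to the current response, so the socket stays usable for the next request.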

The request is then:

output.write(
    String.format(
        "GET /advice/%d HTTP/1.1\r\nHost: fucking-great-advice.ru\r\nConnection: keep-alive\r\n\r\n",
        numberAdvice
    )
);
output.flush();

Here is how you can read and display the responses:

if (numberAdvice > 1) {
    // Skip the empty line that terminates the previous chunked response
    input.readLine();
}
StringBuilder sb = new StringBuilder();
String line;
boolean started = false;
while ((line = input.readLine()) != null) {
    if (!started) {
        // Still in the header: an empty line marks its end
        if (line.isEmpty()) {
            // The body starts here
            started = true;
            // Skip the chunk-size line that opens the chunk
            input.readLine();
        }
        continue;
    }
    if ("0".equals(line)) {
        // A chunk size of 0 marks the end of the chunked body
        break;
    }
    sb.append(line);
}
System.out.println(sb);

NB: This code is not meant to be optimal or perfect; it only shows the general idea.
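For a body that arrives in several chunks, or that contains a line consisting only of "0", line-based reading is not enough, because each chunk is announced by its size in hexadecimal and is that many bytes long. Here is a slightly more careful sketch of the decoding step, again only as an illustration: the class and method names are hypothetical, it works on the raw InputStream rather than a BufferedReader, and it ignores chunk extensions and trailer fields instead of interpreting them.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class ChunkedBody {

    // Reads bytes up to '\n' and returns them as text, dropping any '\r'.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                break;
            }
            if (b != '\r') {
                buf.write(b);
            }
        }
        if (b == -1 && buf.size() == 0) {
            throw new EOFException("connection closed before a line was read");
        }
        return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Decodes one chunked body and stops right after it, so the stream is
    // positioned at the start of the next response.
    static byte[] readChunkedBody(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            // The size may be followed by ";extension", which this sketch ignores.
            int semi = sizeLine.indexOf(';');
            String hex = (semi >= 0 ? sizeLine.substring(0, semi) : sizeLine).trim();
            int size = Integer.parseInt(hex, 16);
            if (size == 0) {
                break; // the last chunk has size 0
            }
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(chunk, off, size - off);
                if (n == -1) {
                    throw new EOFException("connection closed in the middle of a chunk");
                }
                off += n;
            }
            body.write(chunk, 0, size);
            readLine(in); // consume the CRLF that follows the chunk data
        }
        // After the last chunk come optional trailer fields, then an empty line.
        while (!readLine(in).isEmpty()) {
            // skip trailer fields
        }
        return body.toByteArray();
    }
}
```

If you go this way, read the header lines from the same InputStream as well (wrapping it once in a BufferedInputStream is fine). A BufferedReader may read ahead and swallow bytes that belong to the body or to the next response.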


1 Comment

I thought more carefully about your answer and it solved my problem, thanks a lot.
