I'm not clear about how to calculate the `Content-Length' header in HTTP.

Take an example,

HEADER
...
Content-Type: text/html
(blank line `\r\n')
<html></html>
(blank line `\r\n')

This is a working HTTP request sending an empty HTML page (correct me if there is any problem :-)). What should the content length be: 15, or 17 (taking the blank line between the headers and the entity into account)?

Thanks in advance. Best regards.

2 Answers

According to the W3C, Content-Length is defined as follows:

The Content-Length entity-header field indicates the size of the entity-body, in decimal number of OCTETs, sent to the recipient or, in the case of the HEAD method, the size of the entity-body that would have been sent had the request been a GET.

As far as I understand it, you count everything after the blank line that separates the headers from the entity body, and nothing before it. My answer to your question would therefore be 15.
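The count can be checked with a small sketch. The status line and the `Content-Length: 15` header below are assumptions added to make the message complete; the question only shows `Content-Type` and the body.

```python
# Hypothetical message modelled on the example in the question.
message = (
    b"HTTP/1.1 200 OK\r\n"           # assumed status line
    b"Content-Type: text/html\r\n"
    b"Content-Length: 15\r\n"        # assumed header, the value under test
    b"\r\n"                          # blank line ending the headers (not counted)
    b"<html></html>\r\n"             # entity body, including its trailing CRLF
)

# Everything after the first CRLF CRLF is the entity body.
head, entity = message.split(b"\r\n\r\n", 1)
content_length = len(entity)
print(content_length)
```

The 13 bytes of `<html></html>` plus the 2 bytes of the trailing `\r\n` give the total; the separator consumed by the split is not part of it.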

2 Comments

Thanks for the fast reply. Since I'm now receiving data from a keep-alive connection, I think I'd better use the Content-Length value as a counter and read that many bytes starting from the entity. Unfortunately, when the stream ends, the counter is 2 instead of 0. I can't figure it out. I think the additional 2 is for the blank line between the headers and the entity, but I can't find any document that confirms my assumption.
You should definitely NOT be hard-coding an offset. Read the headers, skip the blank line (the line break following the headers), then read however many bytes the Content-Length header says to read. Also keep in mind that some responses may use a Transfer-Encoding: chunked header instead of a Content-Length header, so be prepared for that, as well as for responses that signal the end of the entity by disconnecting instead of using either header. Read RFC 2616; it explains how to handle an entity length correctly.
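A minimal sketch of that reading order, assuming a blocking binary stream such as the object returned by `socket.makefile("rb")`. The function name `read_entity` is hypothetical, and the sketch handles only the Content-Length case; chunked transfer encoding and close-delimited bodies are deliberately left out.

```python
import io


def read_entity(stream):
    """Read one HTTP message from a binary stream; return (headers, body)."""
    stream.readline()  # start line (status or request line), not parsed here
    headers = {}
    while True:
        line = stream.readline()
        if line in (b"\r\n", b"\n", b""):
            break  # the blank line ends the headers and is consumed here
        name, _, value = line.decode("iso-8859-1").partition(":")
        headers[name.strip().lower()] = value.strip()

    if "content-length" not in headers:
        raise ValueError("no Content-Length; chunked/close-delimited not handled")

    remaining = int(headers["content-length"])
    chunks = []
    while remaining > 0:
        chunk = stream.read(remaining)  # may return fewer bytes than requested
        if not chunk:
            raise EOFError("stream ended before the full entity arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return headers, b"".join(chunks)


# Two messages back to back, as on a keep-alive connection (made-up data).
wire = io.BytesIO(
    b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n"
    b"Content-Length: 15\r\n\r\n<html></html>\r\n"
    b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok"
)
first_headers, first_body = read_entity(wire)
second_headers, second_body = read_entity(wire)
```

Because the blank line is consumed while reading the headers, the counter reaches 0 exactly at the end of each entity, and the next message starts at the right position.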

15 is the correct answer. That counts the line break at the END of the entity data, which means that line break is part of the entity, not of the HTTP protocol. DO NOT count the line break between the headers and the entity.

3 Comments

Good explanation! The first line break is part of the HTTP specification, therefore do not count it.
Sorry, \r\n (the one at the end of the entity body) counts as 2 bytes, right? If I analyse the request body with software like Wireshark, \r\n counts as two bytes, 0d 0a in hex, but if I export those bytes into a file I see a ^M instead of the \r\n characters, and it counts as 1 byte only, so how should I handle this?
Yes, \r\n is 2 bytes, 0d 0a. ^M is just how some text editors display 0d when it is by itself without a trailing 0a. If you see 2 bytes in the capture, but only 1 byte is being exported, then the export is faulty. That has nothing to do with the HTTP protocol itself.
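The byte counts are easy to confirm outside of any capture tool. This is only an illustrative check; the variable names are made up, and `bytes.hex` with a separator assumes Python 3.8 or later.

```python
line_break = b"\r\n"
byte_count = len(line_break)      # CR and LF are one byte each
hex_dump = line_break.hex(" ")    # hex view, as a capture tool would show it

lone_cr = b"\r"                   # what an editor typically renders as ^M
lone_cr_count = len(lone_cr)

print(byte_count, hex_dump, lone_cr_count)
```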
