
I'm coding a Python script that connects to a remote server and parses the returned response. For some odd reason, 9 out of 10 times, once the header is read, the script continues and returns before getting the body of the response. I'm no expert at Python, but I'm certain that my code is correct on the Python side of things. Here is my code:

import socket

CRLF = "\r\n"  # line terminator used by HTTP and similar protocols


class miniclient:
    "Client support class for simple Internet protocols."

    def __init__(self, host, port):
        "Connect to an Internet server."

        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.sock.settimeout(30)

        try:
            self.sock.connect((host, port))
            self.file = self.sock.makefile("rb")

        except socket.error, e:

            #if e[0]    == 111:
            #   print   "Connection refused by server %s on port %d" % (host,port)
            raise

    def writeline(self, line):
        "Send a line to the server."

        try:
            # Updated to sendall to resolve partial data transfer errors
            self.sock.sendall(line + CRLF) # unbuffered write

        except socket.timeout:
            # Handle the more specific timeout case before socket.error
            self.sock.close() # mutual close
            self.sock = None
            raise

        except socket.error, e:
            if e[0] == 32: # broken pipe
                self.sock.close() # mutual close
                self.sock = None

            raise # bare raise keeps the original traceback

    def readline(self):
        "Read a line from the server.  Strip trailing CR and/or LF."

        s = self.file.readline()

        if not s:
            raise EOFError

        if s[-2:] == CRLF:
            s = s[:-2]

        elif s[-1:] in CRLF:
            s = s[:-1]

        return s

    def read(self, maxbytes = None):
        "Read data from server."

        if maxbytes is None:
            return self.file.read()

        else:
            return self.file.read(maxbytes)

    def shutdown(self):
        "Shut down the sending side of the connection."

        if self.sock:
            self.sock.shutdown(1) # 1 == socket.SHUT_WR

    def close(self):
        "Close the socket."

        if self.sock:
            self.sock.close()
            self.sock = None

I use the readline() method to read through the headers until I reach the empty line (the delimiter between headers and body). From there, my objects just call the read() method to read the body. As stated before, 9 out of 10 times, read() returns nothing or just partial data.

Example use:

try:
    http = miniclient(host, port)

except Exception, e:

    if e[0] == 111:
        print   "Connection refused by server %s on port %d" % (host,port)

    raise

http.writeline("GET %s HTTP/1.1" % str(document))
http.writeline("Host: %s" % host)
http.writeline("Connection: close") # do not keep-alive
http.writeline("")
http.shutdown() # be nice, tell the http server we're done sending the request

# Determine Status
statusCode = 0
status = string.split(http.readline())
if status[0] != "HTTP/1.1":
    print "MiniClient: Unknown status response (%s)" % str(status[0])

try:
    statusCode = string.atoi(status[1])
except ValueError:
    print "MiniClient: Non-numeric status code (%s)" % str(status[1])

#Extract Headers
headers = []
while 1:
    line = http.readline()
    if not line:
        break
    headers.append(line)

http.close() # all done

#Check we got a valid HTTP response
if statusCode == 200:
    return http.read()
else:
    return "E\nH\terr\nD\tHTTP Error %s \"%s\"\n$\tERR\t$" % (str(statusCode), str(status[2]))
  • What, precisely, is your question? Commented Jul 31, 2013 at 4:36
  • My question is why I sometimes get a full read and other times I don't. Commented Jul 31, 2013 at 4:38

1 Answer


You call http.close() before you call http.read(). Delay the call to http.close() until after you have read all of the data.
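The suggested ordering (status line, headers, body, and only then close) can be sketched as below. This is a minimal illustration, not the asker's code: it assumes Python 3 rather than the Python 2 used in the question, the helper name `read_response` is hypothetical, and an `io.BytesIO` object stands in for the file object that `sock.makefile("rb")` would return.

```python
import io


def read_response(fileobj):
    """Return (status_code, headers, body) from a binary file-like object."""
    status_line = fileobj.readline().decode("iso-8859-1").rstrip("\r\n")
    if not status_line:
        raise EOFError("connection closed before a status line arrived")
    parts = status_line.split(None, 2)  # version, code, reason phrase
    status_code = int(parts[1])

    headers = []
    while True:
        line = fileobj.readline().decode("iso-8859-1").rstrip("\r\n")
        if not line:  # a blank line (or EOF) ends the header block
            break
        headers.append(line)

    # With "Connection: close", the body runs until the peer closes (EOF).
    body = fileobj.read()
    return status_code, headers, body


# Stand-in for the object returned by sock.makefile("rb").
fake = io.BytesIO(b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhello")
status_code, headers, body = read_response(fake)
fake.close()  # close only after the body has been read
```

Keeping all reads inside one function makes it harder to close the connection by accident between the header loop and the body read.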


6 Comments

After moving http.close() to after http.read(), I still get partial reads.
I can't reproduce the behavior you see, using www.google.com/, stackoverflow.net/, or linux.die.net/. Can you test against a publicly-visible web server and tell me the name of a web server that it fails against?
I actually just found the error, but I still don't know how to fix it: Exception (10054, 'Connection reset by peer'). I'm testing against my local HttpListener (.NET 4.0).
I'm wrong about the http.close() ordering. From the doc: "The file object references a dup()ped version of the socket file descriptor, so the file object and socket object may be closed or garbage-collected independently."
Another clue from the socket.makefile() doc: "The socket must be in blocking mode (it can not have a timeout)."
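Both documentation clues can be explored locally with a rough sketch like the one below. It assumes Python 3, where the file object holds a reference to the socket instead of a dup()ped descriptor, so behavior under the asker's Python 2 setup may differ. The helper names `recv_until_eof` and `read_via_file_after_close` are hypothetical, and `socket.socketpair()` stands in for a real server connection. The first helper avoids combining `makefile()` with a timeout by calling `recv()` directly; the second reads through a `makefile()` object after `sock.close()` has been called.

```python
import socket


def recv_until_eof(sock, timeout=30.0, bufsize=4096):
    """Collect bytes with recv() until the peer closes the connection."""
    sock.settimeout(timeout)  # the timeout applies to each recv() call
    chunks = []
    while True:
        chunk = sock.recv(bufsize)  # raises socket.timeout if the peer stalls
        if not chunk:  # b"" means the peer closed its side
            break
        chunks.append(chunk)
    return b"".join(chunks)


def read_via_file_after_close(sock):
    """Read through a makefile() object after sock.close() has been called."""
    f = sock.makefile("rb")  # socket left in blocking mode, no timeout set
    sock.close()  # the file object keeps the connection usable
    try:
        return f.read()
    finally:
        f.close()


# Local demonstration with connected socket pairs (no network needed).
server, client = socket.socketpair()
server.sendall(b"HTTP/1.1 200 OK\r\n\r\nbody")
server.close()
raw = recv_until_eof(client)
client.close()
head, _, body = raw.partition(b"\r\n\r\n")

server2, client2 = socket.socketpair()
server2.sendall(b"still readable")
server2.close()
via_file = read_via_file_after_close(client2)
```

If a timeout is needed, the plain `recv()` loop is the variant that stays within what the quoted documentation allows; the file-object variant is shown only with a blocking socket.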
