In Python socket programming, why does recv()-ing data directly from the sockets only give you the first message?

Question

I'm playing around with socket programming in Python 2 and trying out select() for the server script. When I have the following code for the server:

print('Server started in port {}.'.format(self.port))
server_socket = socket.socket()
server_socket.bind((self.address, self.port))
server_socket.listen(5)

client_sockets = [server_socket]
while True:
    for s in client_sockets:
        if s is server_socket:
            client_socket, address = server_socket.accept()
            client_sockets.append(client_socket)

            print('Connection received.')
        else:
            data = s.recv(200)
            if data:
                print('Received: {}'.format(data.decode().strip()))
            else:
                client_sockets.remove(s)
                s.close()

The server only receives the first message from the client. The second and later messages are only received when the client is restarted. This baffles me (which I attribute to my inadequate knowledge of networking). The data seems to be buffered. Why does this happen?

I did try this:

client_sockets = [server_socket]
while True:
    readable, writable, errored = select.select(client_sockets, [], [])
    for s in readable:
        if s is server_socket:
...

And finally, the server can now receive the second and later messages from the client.

Here is the code for the client:

import socket
import sys


class BasicClient(object):

    def __init__(self, name, address, port):
        self.name = name
        self.address = address
        self.port = int(port)
        self.socket = socket.socket()

    def connect(self):
        self.socket.connect((self.address, self.port))
        self.socket.send(self.name)

    def send_message(self, message):
        self.socket.send(message.encode())


args = sys.argv
if len(args) != 4:
    print "Please supply a name, server address, and port."
    sys.exit()

client = BasicClient(args[1], args[2], args[3])
client.connect()
while True:
    message = raw_input('Message: ')

    # Pad the message with spaces to exactly 200 characters, since we
    # only expect messages to be 200 characters long. Note that ljust()
    # takes the total width, not the number of padding characters.
    message = message.ljust(200, ' ')

    client.send_message(message)

TCP sockets are streaming, with no fixed "packets" or message boundaries; there is only a stream of bytes. If you need to send discrete messages, you need to implement your own protocol on top of TCP, which could include message lengths or specific end-of-message sequences. Because TCP is a stream of bytes, a single send call can be split up so that the receiver needs multiple calls to recv to receive it all, or more than one send can be received in a single recv, and the last piece received may be a partial "message".
– Some programmer dude, Commented Feb 27, 2019 at 13:04
I don't know how much the above applies to your problem, but without more context and a proper minimal reproducible example (of both the server and client), it's going to be very hard to help you further.
– Some programmer dude, Commented Feb 27, 2019 at 13:06

Hannu · Accepted Answer · 2019-02-27 13:27:39Z

2

There are many problems here.

The key problem is that socket.accept() is a blocking call. You accept your client connection and then read from it, but then your for loop goes back to the server socket and your code is stuck in accept(). Only when a client connects again does the server code move forward, hence it appears you only receive one message.
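For comparison, here is a minimal sketch of one pass of a select()-based loop, which calls accept() and recv() only on sockets that select() has reported as ready, so neither call blocks. The function name serve_once and the timeout parameter are hypothetical, introduced only for illustration, and the code is written for Python 3 rather than the Python 2 used in the question.

```python
import select
import socket


def serve_once(server_socket, client_sockets, timeout=1.0):
    """Run one pass of the loop and return the chunks of data received.

    serve_once and timeout are illustrative names, not from the question.
    """
    received = []
    # Only sockets that are ready to be read end up in `readable`.
    readable, _, _ = select.select(
        [server_socket] + client_sockets, [], [], timeout)
    for s in readable:
        if s is server_socket:
            # A connection is already pending, so accept() returns at once.
            client_socket, _address = server_socket.accept()
            client_sockets.append(client_socket)
        else:
            # Data (or a close) is pending, so recv() returns at once.
            data = s.recv(200)
            if data:
                received.append(data)
            else:
                # An empty result means the client closed the connection.
                client_sockets.remove(s)
                s.close()
    return received
```

A server would call this in a while True loop, keeping client_sockets between calls. The chunks returned are raw pieces of the byte stream, not necessarily whole messages.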

The better way to do this is to use threads. In your main thread wait for connections (always waiting in accept). When a connection appears, accept it then spawn a thread and handle all client traffic there. The thread then blocks waiting only for the client socket and eventually terminates on connection closed. Much simpler and less prone to errors.
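A rough sketch of that thread-per-connection layout follows, assuming Python 3. The names handle_client, accept_forever and the on_data callback are hypothetical and only serve the illustration.

```python
import socket
import threading


def handle_client(client_socket, on_data):
    """Worker thread: block on this one client until it disconnects."""
    try:
        while True:
            data = client_socket.recv(200)
            if not data:  # empty result: the peer closed the connection
                break
            on_data(data)
    finally:
        client_socket.close()


def accept_forever(server_socket, on_data):
    """Main loop: always waiting in accept(), one thread per connection."""
    while True:
        try:
            client_socket, _address = server_socket.accept()
        except socket.error:
            break  # the listening socket failed or was closed
        worker = threading.Thread(
            target=handle_client, args=(client_socket, on_data))
        worker.daemon = True  # do not keep the process alive for workers
        worker.start()
```

With a bound and listening server_socket, calling accept_forever(server_socket, print) would print each chunk as it arrives. A slow or silent client then only stalls its own thread.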

There is also the problem of the concept of "message", as TCP sockets do not know anything about your message structure. They just transmit a stream of data, whatever that is. When you send a "message" and read from a TCP socket, one of the following happens on the server side:

1. There is no data and the read call blocks
2. There is exactly one message and you receive that
3. The server was slower than your client, there are several messages and you receive them all in one go
4. The server jumped the gun and decided to read when only a partial message was available
5. Combination of 3 and 4: You receive several full messages and one partial

You need to always cater for cases 3-5 (or use zeromq or something else that understands the concept of a message if you do not want to implement this yourself), not only 1 and 2.

When reading data from a TCP socket, it is always necessary to validate it. Have a "process" buffer and append what you read to that. Then start parsing from the beginning and process as many complete messages as you can find, delete these from the buffer and leave the remainder there. The assumption then is the partial message will eventually be completed.
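The buffering approach can be sketched as follows, assuming the fixed 200-byte messages from the question and Python 3. The names MESSAGE_SIZE and extract_messages are hypothetical, and the chunks list only simulates what successive recv() calls might return.

```python
MESSAGE_SIZE = 200  # assumed fixed message length, from the question


def extract_messages(buffer):
    """Split complete fixed-size messages off the front of buffer.

    Returns (messages, remainder), where remainder is the partial
    message (possibly empty) left over for the next read.
    """
    messages = []
    while len(buffer) >= MESSAGE_SIZE:
        messages.append(buffer[:MESSAGE_SIZE])
        buffer = buffer[MESSAGE_SIZE:]
    return messages, buffer


# Simulate recv() returning awkward chunks: one and a half messages,
# then the rest of the second message.
stream = b"hello".ljust(MESSAGE_SIZE) + b"world".ljust(MESSAGE_SIZE)
chunks = [stream[:300], stream[300:]]

pending = b""   # the "process" buffer for one client
received = []
for chunk in chunks:
    pending += chunk
    complete, pending = extract_messages(pending)
    received.extend(m.decode().strip() for m in complete)

print(received)
```

In a real server each client socket would need its own pending buffer, for example in a dictionary keyed by socket.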

edited Feb 27, 2019 at 13:27

answered Feb 27, 2019 at 13:22


5 Comments

Sean Francis N. Ballais Over a year ago

I did an investigation and indeed, it gets stuck in accept(). This helped me finally understand why I have to use the readable variable when using select.select(). Thanks!

Hannu Over a year ago

It is almost always better to process clients in threads. Not only is it simpler, it also keeps your program responsive. A misbehaving or malicious client would not be able to block the whole program by doing something you do not expect. For example, if you expect the client to send 200 characters, a client sending 199 and never sending the last one would block your entire program forever. Using one thread per client connection would contain many such problems.

Sean Francis N. Ballais Over a year ago

Ah! That's quite a nice insight. Thanks! It reminds me of the way Chrome works. Unfortunately, since what I'm doing with socket programming is a university class work and the specs forbid us from using threads, I'm stuck with a single-threaded server.

Hannu Over a year ago

I sort of guessed this might be a school project as it seemed almost pointless to do it this way and many school projects tend to be on the pointless side... Good luck with that. For better marks you might want to add some kind of failsafe checks to your client reader to make sure it always proceeds at some point even if the client is not doing the right thing.

Sean Francis N. Ballais Over a year ago

The project is just to create a limited version of an IRC server/client, with messages limited to 200 characters, so my failsafe checks will probably be about making sure the client sends the correct IRC commands and checking whether the client is still connected (unless you have something to suggest on this). Anyway, thanks!
