What is the correct way of passing data through Unix sockets?

Question

I'm working on a personal project to better understand inter-process communication on Unix. I have two binaries that I compiled from C, and I am attempting to pass data from one process to the other using Unix sockets.

I wanted to make my send/receive function as generic as possible to be able to pass ANY TYPE of data (int, char, complex structures) using the same message structure :

enum DataType
{
    INT_TYPE,
    FLOAT_TYPE,
    CHAR_TYPE,
    STRUCT_TYPE,
};

struct Message
{
    int identifier;
    enum DataType data_type;
    void* data;
    size_t data_length;
};

This is the send function I came up with :

ssize_t Send_message(const int pSocket, struct Message pMessage)
{
    // Send the message over the socket
    ssize_t bytes_sent = send(pSocket, &pMessage, sizeof(struct Message), 0);

    if (bytes_sent == -1) 
    {
        perror("Error in ipc.c, Send_message: Error sending message");
        return -1;
    }

    if (bytes_sent != sizeof(struct Message)) 
    {
        fprintf(stderr, "Error in ipc.c, Send_message: Incomplete message sent\n");
        return -1;
    }

    if(pMessage.data_length > 0 && pMessage.data != NULL)
    {
        bytes_sent += send(pSocket, pMessage.data, pMessage.data_length, 0);

        if (bytes_sent == -1) 
        {
            perror("Error in ipc.c, Send_message: Error sending message");
            return -1;
        }

        if (bytes_sent != pMessage.data_length + sizeof(struct Message)) 
        {
            fprintf(stderr, "Error in ipc.c, Send_message: Incomplete message sent\n");
            return -1;
        }
    }
    
    printf("\nSent message with Request Type : %d, Identifier : %d, Data Length : %zu\n", pMessage.request_type, pMessage.identifier, pMessage.data_length);

    return bytes_sent;
}

I thought the best way to be as generic as possible was to cast the data I want to pass to a void* and then cast it back to the correct type on the receiving end. Example from the sending process:

struct Message response;
// ** Input here response.identifier
// ** Input here response.data_type 
// ** Input here response.data_length

char *string_val = "HELLO WORLD";
int int_val = 42; 
if(received_message.data_type == CHAR_TYPE)
{
  response.data = (void*)string_val;                
}
if(received_message.data_type == INT_TYPE)
{
  response.data = (void*)&int_val ;                
}

Send_message(pSocket, response);

This works perfectly for basic types. But if I want to pass complex structures like:

typedef struct {
int subparam1;
float subparam2;
char * subparam3;
} SubConfiguration;

SubConfiguration subconf;
// ** Fill in the struct

response.data = (void*)&subconf;

Send_message(pSocket, response);

-- EDIT Adding Receive_message to receive the message

ssize_t Receive_message(const int pSocket, struct Message *pMessage)
{
    // Receive the message into the buffer
    ssize_t bytes_received = recv(pSocket, pMessage, sizeof(struct Message), 0);

    if (bytes_received != sizeof(struct Message)) 
    {
        perror("\n Error in ipc.c, Receive_message: Error receiving message");
        return -1;
    }

    if(pMessage->data_length > 0 )
    {
        pMessage->data = malloc(pMessage->data_length);
        bytes_received += recv(pSocket, pMessage->data, pMessage->data_length, 0);

        if (bytes_received != pMessage->data_length + sizeof(struct Message)) 
        {
            perror("\n Error in ipc.c, Receive_message: Error receiving message");
            return -1;
        }
    }

    printf("\nReceived message with Request Type : %d, Identifier : %d, Data Length : %zu\n", pMessage->request_type, pMessage->identifier, pMessage->data_length);

    return bytes_received;
}

Now all I get on the receiving end are the int and float values of the structure. The char* I put in can't be accessed.

My questions are: is it possible to do what I am trying to do, and what am I doing wrong? I have started to think about integrating a serialization library such as Protocol Buffers (protobuf) to serialize and deserialize my data correctly. Is that necessary in my case?

send(pSocket, &pMessage, sizeof(struct Message), 0) sends a struct Message, which includes the void* data; member. What is the receiver to do with that pointer?
– chux, Commented Nov 7, 2023 at 10:18
"The char* I put in can't be accessed." --> The receiving end can access the pointer's value; it is just that dereferencing the pointer to reach some char is not possible. You seem to think that a char * and a string are the same thing. The first is a pointer, the second is more like an array. Arrays are not pointers. Pointers are not arrays.
– chux, Commented Nov 7, 2023 at 10:24
To send a SubConfiguration, code will need to send the STRUCT_TYPE and then the types of the members: INT_TYPE, FLOAT_TYPE, STRING_TYPE. IMO, consider perfecting the simple cases first (float, int) with working code, then attempt a string, then the struct.
– chux, Commented Nov 7, 2023 at 10:27
Thanks for the comment, I edited the question to add the Receive_message function.
– BigBurger, Commented Nov 7, 2023 at 12:54

12431234123412341234123 · Accepted Answer · 2023-11-14 09:43:05Z

1

"Modern" systems (basically every system developed in the last 40 years or so that is more than just a microcontroller) have virtual memory. That means every process has its own virtual address space, independent of other processes.

If a process, let's call it process A, needs memory, it has to request it from the kernel (on Unix, the mmap() syscall can be used). The kernel then (or later, if lazy allocation is used) reserves physical memory for process A. Let's say that physical memory starts at address 0x12345600. Process A does not access it through a pointer to 0x12345600 but through a virtual address, say 0xABCDEC00. The CPU automatically translates the virtual address 0xABCDEC00 to the physical address 0x12345600 for process A.

Now suppose process A sends the pointer value 0xABCDEC00 to process B. When process B dereferences 0xABCDEC00, one of two things happens. Either nothing is mapped at that address in process B, which causes a segmentation fault, or process B has mapped something else there, and that unrelated memory is accessed instead of physical address 0x12345600. The result is unpredictable, which is why dereferencing such a pointer is undefined behaviour in C.

This is why void* data; in the receiver points either nowhere or at some unrelated data. This cannot work.

You may want to read up on virtual memory, address translation and the MMU (memory management unit).

How to avoid this:

You could either write the data into the socket. That means all the data you want to transmit is included in the write() or send() calls, rather than a pointer to it.

Or you could reserve shared memory (also with mmap()). If you do it correctly, you can then send pointers to that shared memory to process B and process B can access it.

I wanted to make my send/receive function as generic as possible to be able to pass ANY TYPE of data (int, char, complex structures) using the same message structure :

That is probably not the best idea, since it adds a huge amount of complexity that you could avoid. The exception is if you mean a plain stream of bytes (which is essentially what sockets, pipes and files are). That is very generic, but then you don't have to write any new functions, since the existing ones already do it.

edited Nov 14, 2023 at 9:43

answered Nov 7, 2023 at 10:57

12431234123412341234123


7 Comments

BigBurger Over a year ago

Thanks for your answer, I get that passing complex structures is impossible using my solution, have you heard about protocol buffers and would they help in my case?

James Kanze Over a year ago

Even with shared memory, unless precautions are taken so that the mmap maps the memory to the same address in both processes, the addresses will be different. One solution here would be to pass the offset of the data in shared memory, adding or subtracting the start address of shared memory as needed.

12431234123412341234123 Over a year ago

@JamesKanze That is why I said "If you do it correctly". One way to do it is to create the mapping in process A and then fork process B from it; now both A and B can access the same memory at the same address. I am not sure whether this works through exec(), and I am not sure what the best way is for already running processes.

James Kanze Over a year ago

If you fork and then don't exec, there's no problem, since the task images are identical. If you exec, the entire memory map is replaced -- you'll have to re-mmap the file. Pointers in the mmapped memory can then be tricky (but it's doable). You can even have a mutex in the mmapped memory, shared between the two processes, although the error handling code in this case is a bit complex.

12431234123412341234123 Over a year ago

@bazza I don't think performance is a concern here. From the OP: »...a personal project to try to better understand inter-process communications...« which does not sound like he should be worrying about performance. And »...as generic as possible...« is also contrary to a high-performance design (or to a simple design).
