I have a multi-processor system with a "shared" memory region used for communication. In some cases one of the processors needs to process some considerably large data in the shared memory and place the result back. Normally I would access the shared memory through volatile pointers to ensure that data is actually written to and read from it when instructed, and that the correct data is visible to all processors (assume the region is not cached). Unfortunately, the processing has to be done with library functions, and the library isn't written with any special considerations for shared memory (i.e. it doesn't take volatile pointers, for one). Here is an artificial example for illustration:
struct request {
    size_t data_size;
    uint8_t data[MAX_REQ];
};
struct response {
    size_t data_size;
    uint8_t some_other_data[MAX_RESP];
};
volatile struct request req __attribute__((section(".shared_mem")));
volatile struct response resp __attribute__((section(".shared_mem")));
// The function that is provided by a library I have no control over
extern void library_function(uint8_t* data_in, size_t size_in, uint8_t* data_out, size_t size_out);
// The required usage
library_function(req.data, req.data_size, resp.some_other_data, resp.data_size);
......
This will obviously give a warning similar to
expected 'uint8_t *' {aka 'unsigned char *'} but argument is of type 'volatile uint8_t *' {aka 'volatile unsigned char *'}
20 | extern void library_function(uint8_t* data_in, size_t size_in, uint8_t* data_out, size_t size_out);
and rightfully so.
First of all, I am not completely getting the picture of what could go wrong with this code. Since the function lives in an external library, the compiler has no way of knowing what it does with the passed buffers, so it will likely never optimize the call itself away. Optimization on the library side is presumably not a concern. Is LTO the only potential troublemaker here?
Second, if there is indeed a potential issue, is there any way around it (not only around the warning, but around the memory not actually being accessed as expected) besides copying the data back and forth between the shared memory and a local buffer before and after processing, and letting the function work on the local memory instead? (Disabling optimizations globally isn't a good solution either, and besides, I am not sure it gives any guarantee.)
Note, the question is not about synchronization between the processors or potential race conditions when both processors might be accessing the same memory at the same time during the processing, but purely about what the compiler might do to "screw up" the expected logic.
volatile accesses do not really have anything to do with cache coherency. [...] neither buff nor vbuff is ever read/written, because memset is discarding volatile, and the compiler is assuming it is already set to zero. But memset is a special case, as it is a standard function that the compiler might be making special assumptions about.
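A minimal illustration of the scenario described above (the vbuff identifier follows the discussion; whether the stores are actually elided depends on the compiler and optimization level, so no particular output is claimed here):

```c
#include <stdint.h>
#include <string.h>

volatile uint8_t vbuff[16];

void clear(void)
{
    /* The cast silences the volatile-discarding warning, but it also
     * really does discard the qualifier. Because memset is a standard
     * function the compiler "understands", it may assume the object's
     * resulting value (all zeros) and, if it believes nothing else
     * observes the object, drop the stores entirely under optimization. */
    memset((void *)vbuff, 0, sizeof vbuff);
}
```

This is exactly the kind of hazard the warning exists for: the cast makes the diagnostic go away without making the access volatile-qualified again.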