If there is a buffer that is supposed to pack 3 integer values, and you want to increment the one in the middle, the following code works as expected:

#include <iostream>
#include <cstring>

int main()
{
    char buffer[] = {'\0','\0','\0','\0','A','\0','\0','\0','\0','\0','\0','\0'};
    
    int tmp;

    std::memcpy(&tmp, buffer + 4, sizeof tmp); // unpack bytes 4..7 of buffer into tmp
    std::cout << buffer[4];                    // prints A

    tmp++;
    std::memcpy(buffer + 4, &tmp, sizeof tmp); // pack tmp back into bytes 4..7 of buffer
    std::cout << buffer[4];                    // prints B (on a little-endian machine)

    return 0;
}

To me, this looks like too many operations for the simple task of modifying some data in a buffer: pushing a new variable onto the stack, copying the relevant region of the buffer into it, incrementing it, and then copying it back to the buffer.

I was wondering whether it's possible to cast the 5:8 range from the byte array to an int* variable and increment it, for example:

  int *tmp = reinterpret_cast < int *>(buffer[5:8]);
  (*tmp)++;

This seems more efficient, since there is no need for the two memcpy calls.

2 Answers

2

The latter approach is technically undefined, though it's likely to work on any sane implementation. Your syntax is slightly off, but something like this will probably work:

int* tmp = reinterpret_cast<int*>(buffer + 4);
(*tmp)++;

The problem is that it runs afoul of C++'s strict aliasing rules. Essentially, you're allowed to treat any object as an array of char, but you're not allowed to treat an array of char as anything else. Thus to be fully compliant you need to take the approach you did in the first snippet: treat an int as an array of char (which is allowed) and copy the bytes from the array into it, manipulate it as desired, and then copy back.
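
As a minimal sketch of that compliant pattern, the copy-modify-copy steps can be wrapped in small helpers. The names load_int, store_int and increment_int_at are hypothetical, not from the question. The memcpy calls only read and write the bytes of an int object, which is the direction the aliasing rule permits; no int access is ever made through the char array itself.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: read the int stored at a byte offset in buf.
int load_int(const char* buf, std::size_t offset)
{
    int value;
    std::memcpy(&value, buf + offset, sizeof value); // copy bytes into a real int object
    return value;
}

// Hypothetical helper: write an int at a byte offset in buf.
void store_int(char* buf, std::size_t offset, int value)
{
    std::memcpy(buf + offset, &value, sizeof value); // copy the int's bytes back out
}

// Increment the int packed at `offset`.
// The caller must ensure offset + sizeof(int) does not exceed the buffer size.
void increment_int_at(char* buf, std::size_t offset)
{
    store_int(buf, offset, load_int(buf, offset) + 1);
}
```

With the buffer from the question, increment_int_at(buffer, 4) would do the same work as the first snippet, assuming a 4-byte int.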


I would note that if you're concerned with runtime efficiency, you probably shouldn't be. Compilers are very good at optimizing these sorts of things, and will likely end up just manipulating the bytes in place. For instance, clang with -O2 compiles your first snippet (with std::cout replaced with printf to avoid stream I/O overhead) down to:

mov     edi, 65
call    putchar
mov     edi, 66
call    putchar

Demo

Remember, when writing C++ you are describing the behavior of the program you want the compiler to write, not writing the instructions the machine will execute.

1 Comment

Nice explanation with the demo; compilers optimize a lot of code these days. I am still slightly confused about the strict aliasing rule. Isn't that rule technically broken when I memcpy a section of the char array to an int variable? It looks like I am treating 4 chars as an int in the first approach too.
1

Simply change buffer[5:8] to buffer + 4, just like in your memcpy() calls, and then it will likely work the way you want:

int *tmp = reinterpret_cast<int*>(buffer + 4 /* or: &buffer[4] */);
(*tmp)++;

Alternatively, you can use a reference instead of a pointer:

int &tmp = reinterpret_cast<int&>(buffer[4] /* or: *(buffer+4) */);
tmp++;

However, note that either approach is technically undefined behavior, as accessing the array like this violates the strict aliasing rule. The memcpy() approach is the safe and standard way to go, and compilers are very good at optimizing memcpy() calls.

But, the reinterpret_cast approach will likely work nonetheless, depending on your compiler.

5 Comments

If the array is aligned, the int object would have been implicitly created, so there is no strict aliasing issue.
"the int object would have been implicitly created" - technically, only in C++20 and later, not in earlier versions.
@Artyer "If the array is aligned": in the example, nothing guarantees that the array is aligned, though.
@Artyer By "array is aligned", do you mean something like alignas(int) char buf[] = {...}?
Update: just tried both with alignas(int) and without (1-byte aligned). After the reinterpret_cast to int at buffer[4], the returned value is an integer that interprets the 4 bytes starting from buffer[4]. This is probably UB, because with 1-byte alignment it should have interpreted only the byte at address buffer[4], without the next 3.
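
The alignment question raised in these comments can be checked directly. The sketch below is only an illustration: is_aligned_for_int and aligned_buffer are hypothetical names, and the pointer-to-integer conversion is implementation-defined, although on mainstream platforms it yields the address. It checks alignment only; whether the cast is otherwise well-defined is the separate question discussed above.

```cpp
#include <cstdint>

// Hypothetical helper: true if p is suitably aligned for an int on this platform.
// Assumes the implementation maps pointers to their numeric address.
bool is_aligned_for_int(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(int) == 0;
}

// alignas(int) guarantees that the first byte of the array is int-aligned.
// Because sizeof(int) is a multiple of alignof(int), every offset that is a
// multiple of sizeof(int) is then int-aligned as well.
alignas(int) char aligned_buffer[3 * sizeof(int)] = {};
```

A plain char array declared without alignas carries no such guarantee, even if it happens to land on an int-aligned address in a particular build.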
