initial or terminal malloc buffer possible?

Question

Suppose I do something as follows:

size_t length = 1000;
char* p = malloc(length);

and then I want to loop over the elements, so most basic would be:

for (size_t i = 0; i < length; ++i) {
  p[i] = ...; // or p[length - 1 - i] = ...
}

but also possible is

char* q = p;
for (size_t i = 0; i < length; ++i) {
  *q = ...;
  ++q;
}

or in reverse

char* q = p + (length - 1);
for (size_t i = 0; i < length; ++i) {
  *q = ...;
  --q;
}

My question is, what if I want to avoid the i and do something as follows:

char* const final = p + (length - 1);
for (char* q = p; q <= final; ++q) {
  *q = ...;
}

or in reverse:

char* const final = p + (length - 1);
for (char* q = final; q >= p; --q) {
  *q = ...;
}

It seems that there is a very tiny chance of erroneous behavior in those loops avoiding i; for the first loop, what if p + length == 0, i.e. we have a system where we were allocated memory just at the very end of the possible size_t limit and overflow happened... For the second loop, what if p == 0, i.e. we have a system where we were allocated memory just at the beginning of memory... In both these scenarios the loop will not end when needed...

Probably those do not really happen, but if this is undefined behavior then maybe it is better to loop with the i although it looks slightly less elegant..

Edit: Following Fe2O3's comment, I recalled that indeed I wanted to ask it a bit differently. Namely, I would like not an array of chars, but an array of elements of some struct type, so the struct is potentially relatively big, of size 3000, say. Then it is enough for p to be < 3000 in order for the second loop to fail, it is not necessary for it to be 0. Also, it is enough for final to be at the maximum size minus 3000... Of course, 3000 can be even bigger...

0 is, in most implementations, equivalent to (void*)0, or NULL. That's the "special indication" that allocation failed. So it cannot be used to simultaneously indicate a successful allocation. At the other end, the 'heap' boundary cannot bleed into the stack that is located at the most high addresses. In other words, this is not a concern... — user17592432
– user17592432, Commented Oct 7, 2023 at 10:32
p + length is always correct according to the Standard. For the 2nd situation use do ... while(q-- > p); — pmg
– pmg, Commented Oct 7, 2023 at 10:38
Actually, your 4th snippet goes into UB... Better for (char* q = final +1; q-- > p; )... It is UB to decrement q to a value less than p, and then compare the two pointers... — user17592432
– user17592432, Commented Oct 7, 2023 at 10:38
@pmg Thanks! The second suggestion is in fact excellent, didn't think about that for some reason. Regarding the standard, good to know, seems it also solves the first "problem" then... — Sasha
– Sasha, Commented Oct 7, 2023 at 10:42
p + length is always correct as a pointer (eg for pointer comparisons) ... not correct if you dereference it: *(p + length) is invalid! — pmg
– pmg, Commented Oct 7, 2023 at 10:45

Chris Dodd · Accepted Answer · 2023-10-07 11:15:52Z

1

TL;DR: The incrementing pointer version is ok, but the decrementing one is undefined.

The C standard defines pointer arithmetic within an array to be valid as long as the resulting pointer points at an element of the array or "one past the end". In that special case you get a valid pointer that can't be dereferenced (it is undefined to do so), but that will always compare greater than a pointer to any element of the array.

6.5.6.8 When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

So when you're incrementing the pointer and you get past the end of the array, you get this special "one past the end" pointer, which compares greater than the pointer to the last element, and the loop terminates. With the decrementing loop, however, after the first element is reached, you'll decrement the pointer again and "underflow", giving undefined behavior.
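
As a minimal sketch of the two cases, assuming a plain char buffer and a non-null p: the helper names fill_forward and fill_backward are hypothetical and not from the question. The forward loop compares against the one-past-the-end pointer, and the backward loop tests before it decrements, so no pointer below p is ever formed.

```c
#include <stddef.h>

/* Forward: `end` is the one-past-the-end pointer. It may be formed and
   compared, but it is never dereferenced. Also fine for length == 0. */
void fill_forward(char *p, size_t length, char value)
{
    char *const end = p + length;
    for (char *q = p; q < end; ++q) {
        *q = value;
    }
}

/* Backward: start one past the end and test BEFORE decrementing,
   so q never moves below p. Also fine for length == 0. */
void fill_backward(char *p, size_t length, char value)
{
    for (char *q = p + length; q > p; ) {
        --q;
        *q = value;
    }
}
```

Both helpers do nothing when length is 0, so no separate check for an empty range is needed as long as p itself is a valid pointer.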

answered Oct 7, 2023 at 11:15

Chris Dodd

5 Comments

cafce25 Over a year ago

This is almost true, except OP's forward version is using p + (length - 1) for final which is UB in case length == 0.

Sasha Over a year ago

Thanks, except that @pmg above gave a solution also for the decrementing loop; write do ... while(q-- > p). However, that solution is problematic if length == 0, so should check for that separately, I guess...

Sasha Over a year ago

@cafce25 right.. So I guess for both cases, increment and decrement, better to check for length == 0 first...

cafce25 Over a year ago

@Sasha, no, you can make the condition q < final and initialize with final = p + length. Also, Fe2O3 provides a version that works for the backward case without explicitly checking length != 0, even for types other than char.

user17592432 Over a year ago

@Sasha When thinking about things, it's good to have a picture in your mind. Q: What is, for instance, a standard egg carton that holds 0 eggs? A: "air"... If you have an array, then you have at least 1 element.

Sasha · Accepted Answer · 2023-10-07 21:39:27Z

1

To summarize what people said (thanks!), the following should always work (I think):

size_t length = ...;
type_t* p = malloc(length * sizeof(type_t));
type_t* const q = p + length;

/* looping incrementally */
for (type_t* r = p; r < q; ++r) {
  *r = ...;
}

/* looping decrementally */
for (type_t* r = q; r > p;) {
  *--r = ...;
}
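
For a concrete instance of the pattern above, here is a sketch with a large element type, as in the edit to the question. The type record_t and the functions make_records, number_forward and number_backward are made-up names for illustration, and the 3000-byte payload is only an assumption taken from the question's example size.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical element type, roughly as large as the struct in the question. */
typedef struct {
    size_t id;
    char payload[3000];
} record_t;

/* Allocates `length` records. Returns NULL for length == 0, if the byte
   count would overflow size_t, or if malloc fails. */
record_t *make_records(size_t length)
{
    if (length == 0 || length > SIZE_MAX / sizeof(record_t)) {
        return NULL;
    }
    return malloc(length * sizeof(record_t));
}

/* Numbers the records 0, 1, 2, ... from the first element to the last. */
void number_forward(record_t *p, size_t length)
{
    record_t *const end = p + length;   /* one past the end */
    size_t next = 0;
    for (record_t *r = p; r < end; ++r) {
        r->id = next++;
    }
}

/* Numbers the records 0, 1, 2, ... from the last element to the first. */
void number_backward(record_t *p, size_t length)
{
    size_t next = 0;
    for (record_t *r = p + length; r > p; ) {
        (--r)->id = next++;             /* decrement only after r > p held */
    }
}
```

The element size does not matter here: each ++r or --r moves by sizeof(record_t) bytes, but r always stays between p and p + length, which is the range the standard permits.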

edited Oct 7, 2023 at 21:39

user17592432

answered Oct 7, 2023 at 11:52

Sasha


5 Comments

Chris Dodd Over a year ago

For the 3rd one, you need for (type_t* r = q; r > p; ) { *--r=... to be strictly correct. Otherwise you'll incorrectly decrement the pointer even when r == p which is undefined.

Sasha Over a year ago

@ChrisDodd If I understand correctly, I will decrement it, but the loop will end, so that I will not care about the value at r any more...

pmg Over a year ago

for the decremental one ... make sure length >= 1 (why would it ever be 0? lol) and stick with do ... while loop :-)

Eric Postpischil Over a year ago

@Sasha: Re “I will decrement it, but the loop will end”: No, you cannot be sure that you will decrement it, based only on the C standard. You attempt to decrement it with r-- > p, but what does r-- do? When r is already p, so that r-- would make r point to before p, the C standard says the behavior is not defined. So you do not know that r-- will decrement r. It could abort the program. Or the compiler could make deductions based on the fact that r-- is only defined if r remains in the proper range and optimize your program weirdly based on those deductions.

Sasha Over a year ago

@EricPostpischil OK, thank you all, I appreciate it. I didn't realize that merely making the pointer "out of domain" should not be done. So if I understood correctly now, the code char* p = malloc(10); --p; ++p; is UB, can't be counted on as being the same as char* p = malloc(10);. But char* p = malloc(10); p += 10; p -= 10; is the same as the second...
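
The distinction in the last comment can be sketched as follows. The function name round_trip_within_bounds is hypothetical; the code only performs the round trip that the standard defines and mentions the undefined one in a comment.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Moves a pointer to one past the end of an n-byte allocation and back.
   Every intermediate value is within [p, p + n], so this is well defined.
   Returns false if the allocation fails. */
bool round_trip_within_bounds(size_t n)
{
    char *p = malloc(n);
    if (p == NULL) {
        return false;
    }

    char *q = p;
    q += n;   /* one past the end: may be formed, not dereferenced */
    q -= n;   /* back to the first element */
    bool same = (q == p);

    /* By contrast, `--q; ++q;` at this point would be undefined:
       q - 1 would point before the start of the allocation. */

    free(p);
    return same;
}
```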
