
Suppose I do something as follows:

size_t length = 1000;
char* p = malloc(length);

and then I want to loop over the elements, so most basic would be:

for (size_t i = 0; i < length; ++i) {
  p[i] = ...; // or p[length - 1 - i] = ...
}

but also possible is

char* q = p;
for (size_t i = 0; i < length; ++i) {
  *q = ...;
  ++q;
}

or in reverse

char* q = p + (length - 1);
for (size_t i = 0; i < length; ++i) {
  *q = ...;
  --q;
}

My question is, what if I want to avoid the i and do something as follows:

char* const final = p + (length - 1);
for (char* q = p; q <= final; ++q) {
  *q = ...;
}

or in reverse:

char* const final = p + (length - 1);
for (char* q = final; q >= p; --q) {
  *q = ...;
}

It seems there is a very small chance of erroneous behavior in the loops that avoid i. For the first loop, what if p + length == 0, i.e. the memory we were allocated sits at the very top of the range a size_t can address, and the addition wraps around? For the second loop, what if p == 0, i.e. the memory we were allocated sits at the very beginning of memory? In both scenarios the loop will not end when it should.

These situations probably never happen in practice, but if this is undefined behavior then maybe it is better to loop with the i, although it looks slightly less elegant.


Edit: Following Fe2O3's comment, I recalled that I actually wanted to ask this a bit differently. Namely, I want not an array of chars but an array of elements of some struct type, where the struct is potentially relatively big, of size 3000, say. Then it is enough for p to be < 3000 for the second loop to fail; it does not have to be 0. Likewise, it is enough for final to be within 3000 of the maximum address. Of course, 3000 could be even bigger.

  • 1
    0 is, in most implementations, equivalent to (void*)0, or NULL. That's the "special indication" that allocation failed. So it cannot be used to simultaneously indicate a successful allocation. At the other end, the 'heap' boundary cannot bleed into the stack that is located at the most high addresses. In other words, this is not a concern... Commented Oct 7, 2023 at 10:32
  • 1
    p + length is always correct according to the Standard. For the 2nd situation use do ... while(q-- > p); Commented Oct 7, 2023 at 10:38
  • 1
    Actually, your 4th snippet goes into UB... Better for (char* q = final +1; q-- > p; )... It is UB to decrement q to a value less than p, and then compare the two pointers... Commented Oct 7, 2023 at 10:38
  • 1
    @pmg Thanks! The second suggestion is in fact excellent, didn't think about that for some reason. Regarding the standard, good to know, seems it also solves the first "problem" then... Commented Oct 7, 2023 at 10:42
  • 1
    p + length is always correct as a pointer (eg for pointer comparisons) ... not correct if you dereference it: *(p + length) is invalid! Commented Oct 7, 2023 at 10:45

2 Answers

1

TL;DR: The incrementing pointer version is ok, but the decrementing one is undefined.

The C standard defines pointer arithmetic within an array to be valid as long as the resulting pointer points at an element of the array or at "one past the end". In that special case you get a valid pointer that can't be dereferenced (doing so is undefined), but that will always compare greater than a pointer to any element of the array.

6.5.6.8 When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

So when you increment the pointer past the last element of the array, you get this special "one past the end" pointer, which compares greater than the pointer to the last element, and the loop terminates. With the decrementing loop, however, once the first element has been processed you decrement the pointer again and go below the start of the array, which is undefined behavior.
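
As a minimal sketch of the incrementing case: the helper below (fill_forward is a hypothetical name, not from the question) walks a pointer up to the one-past-the-end address and compares against it with <, so the end pointer is formed and compared but never dereferenced. It assumes buf points to at least length valid bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: set buf[0..length) to value using only a pointer.
   Returns the number of bytes written. */
static size_t fill_forward(char *buf, size_t length, char value)
{
    assert(buf != NULL);
    /* One past the last element: valid to form and compare,
       even when length == 0, but never valid to dereference. */
    char *const end = buf + length;
    size_t written = 0;
    for (char *q = buf; q < end; ++q) {
        *q = value;
        ++written;
    }
    return written;
}
```

Because the bound is buf + length rather than buf + (length - 1), this form also stays defined when length is 0: the loop body simply never runs.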


5 Comments

This is almost true, except OP's forward version is using p + (length - 1) for final which is UB in case length == 0.
Thanks, except that @pmg above gave a solution also for the decrementing loop; write do ... while(q-- > p). However, that solution is problematic if length == 0, so should check for that separately, I guess...
@cafce25 right.. So I guess for both cases, increment and decrement, better to check for length == 0 first...
@Sasha, no, you can make the condition q < final and initialize with final = p + length. Also, Fe2O3 provides a version that works for the backward case without explicitly checking length != 0, even for types other than char.
@Sasha When thinking about things, it's good to have a picture in your mind. Q: What is, for instance, a standard egg carton that holds 0 eggs? A: "air"... If you have an array, then you have at least 1 element.
1

To summarize what people said (thanks!), the following should always work (I think):

size_t length = ...;
type_t* p = malloc(length * sizeof(type_t));
type_t* const q = p + length;

/* looping incrementally */
for (type_t* r = p; r < q; ++r) {
  *r = ...;
}

/* looping decrementally */
for (type_t* r = q; r > p;) {
  *--r = ...;
}
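
For the struct case from the question's edit, here is a rough, self-contained sketch of the decrementing loop. The type big_t and the function number_in_reverse are hypothetical stand-ins for type_t and the elided loop body; the sketch assumes length * sizeof(big_t) does not overflow size_t.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element type standing in for type_t; roughly 3000 bytes. */
typedef struct {
    int id;
    char payload[2996];
} big_t;

/* Number the elements from the last one down to the first.
   Returns the id stored in element 0, or -1 if nothing was written
   (length == 0) or the allocation failed. */
static int number_in_reverse(size_t length)
{
    big_t *p = malloc(length * sizeof *p);
    if (p == NULL)
        return -1;              /* malloc(0) is also allowed to return NULL */
    big_t *const q = p + length; /* one past the end */
    int next = 0;
    int first = -1;
    for (big_t *r = q; r > p; ) {
        --r;                    /* decrement only after r > p was checked */
        r->id = next++;
        first = r->id;
    }
    free(p);
    return first;
}
```

The pointer r never goes below p here, no matter how large the element type is, because the comparison happens before each decrement.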

5 Comments

For the 3rd one, you need for (type_t* r = q; r > p; ) { *--r=... to be strictly correct. Otherwise you'll incorrectly decrement the pointer even when r == p which is undefined.
@ChrisDodd If I understand correctly, I will decrement it, but the loop will end, so that I will not care about the value at r any more...
for the decremental one ... make sure length >= 1 (why would it ever be 0? lol) and stick with do ... while loop :-)
@Sasha: Re “I will decrement it, but the loop will end”: No, you cannot be sure that you will decrement it, based only on the C standard. You attempt to decrement it with r-- > p, but what does r-- do? When r is already p, so that r-- would make r point to before p, the C standard says the behavior is not defined. So you do not know that r-- will decrement r. It could abort the program. Or the compiler could make deductions based on the fact that r-- is only defined if r remains in the proper range and optimize your program weirdly based on those deductions.
@EricPostpischil OK, thank you all, I appreciate it. I didn't realize that merely making the pointer "out of domain" should not be done. So if I understood correctly now, the code char* p = malloc(10); --p; ++p; is UB, can't be counted on as being the same as char* p = malloc(10);. But char* p = malloc(10); p += 10; p -= 10; is the same as the second...
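
The distinction in the last comment can be sketched as follows. The function name round_trip_is_identity is hypothetical; the code only exercises the defined case (to one past the end and back) and leaves the undefined one as a comment.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical check: stepping to one past the end and back is defined,
   so the pointer returns to its starting value. */
static bool round_trip_is_identity(void)
{
    char *const start = malloc(10);
    if (start == NULL)
        return false;
    char *p = start;
    p += 10;    /* one past the last element: valid to form, not to dereference */
    p -= 10;    /* back to the first element */
    bool same = (p == start);
    /* By contrast, "--p; ++p;" at this point would be undefined:
       C has no valid "one before the first element" pointer. */
    free(start);
    return same;
}
```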
