59

I'm under the same impression as this answer: that size_t is always guaranteed by the standard to be large enough to hold the size of the largest possible type on a given system.

However, this code fails to compile on gcc/Mingw:

#include <stdint.h>
#include <stddef.h>

typedef uint8_t array_t [SIZE_MAX];

error: size of array 'array_t' is too large

Am I misunderstanding something in the standard here? Is size_t allowed to be too large for a given implementation? Or is this another bug in Mingw?


EDIT: further research shows that

typedef uint8_t array_t [SIZE_MAX/2];   // does compile
typedef uint8_t array_t [SIZE_MAX/2+1]; // does not compile

Which happens to be the same as

#include <limits.h>

typedef uint8_t array_t [LLONG_MAX];           // does compile
typedef uint8_t array_t [LLONG_MAX+(size_t)1]; // does not compile

So I'm now inclined to believe that this is a bug in Mingw, because setting the maximum allowed size based on a signed integer type doesn't make any sense.

13 Comments
  • An array of size SIZE_MAX probably consumes all of memory. Commented Mar 3, 2017 at 9:20
  • @PaulOgilvie Then why did they pick a number which is too large for the given implementation? Commented Mar 3, 2017 at 9:22
  • According to the GCC source code, the limit is enforced by the signed counterpart of sizetype (the INT_MAX in the comment is misleading): index is assigned with c_common_signed_type (sizetype); at line 5933. This probably explains the "half-range" issue. Commented Mar 3, 2017 at 9:51
  • @Lundin: I haven't found any comments explaining why they use a signed type, so it may be a bug. Edit: I think that 2501 is right and it is due to the ptrdiff_t type, which is signed. Commented Mar 3, 2017 at 10:02
  • You'll notice that nothing in the standard implies that the compiler must allow objects of any size up to SIZE_MAX. It only implies that the compiler must not allow objects larger than SIZE_MAX. That's true even if you don't actually create the object, since sizeof can be applied to types too. Commented Mar 3, 2017 at 21:37

4 Answers

69

The limit SIZE_MAX / 2 comes from the definitions of size_t and ptrdiff_t on your implementation, which give the two types the same width.

The C Standard mandates [1] that type size_t is unsigned and type ptrdiff_t is signed.

The result of subtracting two pointers always [2] has the type ptrdiff_t. This means that, on your implementation, the size of an object must be limited to PTRDIFF_MAX; otherwise a valid difference of two pointers could not be represented in ptrdiff_t, leading to undefined behavior.

Thus the value SIZE_MAX / 2 equals the value PTRDIFF_MAX. If the implementation chose to allow objects as large as SIZE_MAX, the width of ptrdiff_t would have to be increased. It is much easier to limit the maximum object size to SIZE_MAX / 2 than to give ptrdiff_t a positive range greater than or equal to that of size_t.
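
As a small illustration (my addition, not part of the original answer), the two limits can be printed and compared directly; on an implementation like the one in the question, where size_t and ptrdiff_t have the same width, both lines are expected to show the same value:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* SIZE_MAX / 2 and PTRDIFF_MAX coincide when size_t and ptrdiff_t
       have the same width, as they do on the questioner's implementation. */
    printf("SIZE_MAX / 2 = %ju\n", (uintmax_t)(SIZE_MAX / 2));
    printf("PTRDIFF_MAX  = %ju\n", (uintmax_t)PTRDIFF_MAX);
    return 0;
}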

The Standard offers these comments [3] [4] on the topic.


(Quoted from ISO/IEC 9899:201x)

[1] (7.19 Common definitions 2)
The types are
ptrdiff_t
which is the signed integer type of the result of subtracting two pointers;
size_t
which is the unsigned integer type of the result of the sizeof operator;

[2] (6.5.6 Additive operators 9)
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements. The size of the result is implementation-defined, and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header. If the result is not representable in an object of that type, the behavior is undefined.

[3] (K.3.4 Integer types 3)
Extremely large object sizes are frequently a sign that an object’s size was calculated incorrectly. For example, negative numbers appear as very large positive numbers when converted to an unsigned type like size_t. Also, some implementations do not support objects as large as the maximum value that can be represented by type size_t.

[4] (K.3.4 Integer types 4)
For those reasons, it is sometimes beneficial to restrict the range of object sizes to detect programming errors. For implementations targeting machines with large address spaces, it is recommended that RSIZE_MAX be defined as the smaller of the size of the largest object supported or (SIZE_MAX >> 1), even if this limit is smaller than the size of some legitimate, but very large, objects. Implementations targeting machines with small address spaces may wish to define RSIZE_MAX as SIZE_MAX, which means that there is no object size that is considered a runtime-constraint violation.


6 Comments

This makes sense. So it is rather a defect in the C standard then? Meaning that SIZE_MAX can never be used in meaningful ways, but size_t should instead rather be using PTRDIFF_MAX?
@Lundin It is not a defect because SIZE_MAX doesn't represent the value of the maximum allowable size of an object. Even using PTRDIFF_MAX as the limit is not correct, because it could theoretically be larger than SIZE_MAX. I think the correct value is min(SIZE_MAX,PTRDIFF_MAX).
@Lundin: I don't think that would be allowed since the C standard defines SIZE_MAX as the "limit of size_t" (C99 Section 7.20.3).
@Lundin: You seem to be under a misapprehension of the purpose of SIZE_MAX. It is not intended to be "the maximum possible size of an object" at all. It is intended to be "the maximum possible value of an integer of type size_t." While we do use size_t to measure the sizes of objects, there is no requirement that the implementation actually permit the creation of such enormous objects. To be clear, SIZE_MAX / 2 is still an absurdly huge number on a 64-bit system; no sane programmer will ever want to create an array that big even as a static global variable.
The standard allows an implementation to offer objects greater than PTRDIFF_MAX; the only drawback is that subtracting two pointers far enough apart would cause undefined behaviour. (Which is a pretty big drawback and explains why they choose not to do that).
23

The range of size_t is guaranteed to be sufficient to store the size of the largest object supported by the implementation. The reverse is not true: you are not guaranteed to be able to create an object whose size fills the entire range of size_t.

Under such circumstances the question is: what does SIZE_MAX stand for? The largest supported object size? Or the largest value representable in size_t? The answer is: it is the latter, i.e. SIZE_MAX is (size_t) -1. You are not guaranteed to be able to create objects SIZE_MAX bytes large.
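
A small sketch (my addition, assuming a C11 compiler) that verifies this reading at compile time:

#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* SIZE_MAX is the largest value representable in size_t, which is exactly
   what -1 becomes when converted to size_t; it says nothing about the
   largest object the implementation will actually let you create. */
static_assert(SIZE_MAX == (size_t)-1, "SIZE_MAX is just the maximum of size_t");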

The reason behind that is that in addition to size_t, implementations must also provide ptrdiff_t, which is intended (but not guaranteed) to store the difference between two pointers pointing into the same array object. Since type ptrdiff_t is signed, the implementations are faced with the following choices:

  1. Allow array objects of size SIZE_MAX and make ptrdiff_t wider than size_t. It has to be wider by at least one bit. Such ptrdiff_t can accommodate any difference between two pointers pointing into an array of size SIZE_MAX or smaller.

  2. Allow array objects of size SIZE_MAX and use ptrdiff_t of the same width as size_t. Accept the fact that pointer subtraction can overflow and cause undefined behavior, if the pointers are farther than SIZE_MAX / 2 elements apart. The language specification does not prohibit this approach.

  3. Use ptrdiff_t of the same width as size_t and restrict the maximum array object size by SIZE_MAX / 2. Such ptrdiff_t can accommodate any difference between two pointers pointing into an array of size SIZE_MAX / 2 or smaller.

You are simply dealing with an implementation that decided to follow the third approach.
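
As a hedged sketch (not from the original answer), you can probe which choice an implementation made by comparing the limits from <stdint.h>; note that the limits alone cannot distinguish approach 2 from approach 3, which is exactly what the compile error in the question reveals:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
#if PTRDIFF_MAX > SIZE_MAX
    puts("ptrdiff_t is wider than size_t (approach 1)");
#elif PTRDIFF_MAX == SIZE_MAX / 2
    puts("ptrdiff_t and size_t have the same width (approach 2 or 3)");
#else
    puts("some other combination of widths");
#endif
    printf("SIZE_MAX    = %ju\n", (uintmax_t)SIZE_MAX);
    printf("PTRDIFF_MAX = %jd\n", (intmax_t)PTRDIFF_MAX);
    return 0;
}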

6 Comments

The code in the question doesn't attempt to create any array objects though, it only makes a typedef. So is it non-conforming for the implementation to reject the typedef?
I really think this is close to a canonical answer (for all this ptrdiff/size/intptr stuff) and would urge you to merge your other answer into this one, outline the _MAX and _MIN constants for each type in question before you get into explaining them (because references make it easier), and bring the details of the other answers into this one. Good job!
Actually, having your answer and the answer on stackoverflow.com/questions/9386979/… I almost feel like it's a disservice to allow the other one to remain open. Perhaps you could migrate your answer to that question and just dupe-close this entirely.
Not sure if you want an update for more symmetry in your very awesome breakdown; feel free to grab this though, it makes it a lot easier to process (imho): gist.github.com/EvanCarroll/121dfb870fd3da0dc52ba4e7edefe3ee
👌 nailed it. All that's left is to point out how implementations of any programming language are technically limited and yet intended to be somewhat functionally useful... We could probably hack our OS up to give us a 4GB stack, but that'd be mostly wasted, so the decision was to settle somewhere around 1-4MB of call stack space (which is where this allocation will likely end up, unless allocated dynamically), and just because it compiles doesn't mean it functions usefully...
5

It looks very much like implementation-specific behaviour.

I'm running Mac OS here, and with gcc 6.3.0 the biggest size I can compile your definition with is SIZE_MAX/2; with SIZE_MAX/2 + 1 it no longer compiles.

On the other hand, with clang 4.0.0 the biggest one is SIZE_MAX/8, and SIZE_MAX/8 + 1 breaks.

5 Comments

SIZE_MAX/8+1: interesting. What is the error message there? Can you successfully malloc SIZE_MAX/8+1?
The error message is very much the same: error: array is too large (2305843009213693952 elements)
clang 4.0.0: What is the value of RSIZE_MAX and what is the value of PTRDIFF_MAX?
SIZE_MAX: 18446744073709551615, RSIZE_MAX: 9223372036854775807, PTRDIFF_MAX: 9223372036854775807
Indeed. No, I cannot malloc even SIZE_MAX/128/1024: blah-blah, set a breakpoint in malloc_error_break to debug. SIZE_MAX/256/1024 mallocs fine. Adding or subtracting 1 from those doesn't change the behaviour.
0

Just reasoning from scratch: size_t is a type that can hold the size of any object. The size of any object is limited by the width of the address bus (ignoring multiplexing and systems that can handle e.g. both 32- and 64-bit code; call that the "code width"). Analogous to INT_MAX, which is the largest int value, SIZE_MAX is the largest value of size_t. Thus, an object of size SIZE_MAX is all addressable memory. It is reasonable that an implementation flags that as an error; however, I agree that it is an error only in a case where an actual object is allocated, be it on the stack or in global memory. (A call to malloc for that amount will fail anyway.)
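
To illustrate the last point, here is a small probe (my own sketch, not the answerer's code) that simply asks malloc for SIZE_MAX bytes and reports the result; on any realistic system the request is expected to fail:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* No single object can occupy all addressable memory, so a request
       for SIZE_MAX bytes should come back as a null pointer. */
    void *p = malloc(SIZE_MAX);
    printf("malloc(SIZE_MAX) %s\n", p ? "unexpectedly succeeded" : "returned NULL");
    free(p);
    return 0;
}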

Comments
