5

I have initialized a 2D array like a 1D array:

int a[2][3] = {1,2,3,4,5} 

How are the values stored in this array?

13
  • Look up "row major order". Commented Jun 16, 2017 at 19:12
  • 1
    I was very surprised that I could not find a duplicate to this question. Commented Jun 16, 2017 at 19:16
  • Why both a[0][3] and a[1][0] points to 4 ? Commented Jun 16, 2017 at 19:18
  • I am drafting an answer that will include an answer to both questions. Commented Jun 16, 2017 at 19:19
  • 1
    @MadPhysicist-- a 2d array declared int a[2][3] is an array of arrays. Note that in most expressions, a will decay to a pointer to its first element, an array of 3 ints. Commented Jun 16, 2017 at 20:16

3 Answers 3

6

They are assigned as follows:

1 2 3
4 5 0

The zero is because you've allocated an array of size 6, but only specified 5 elements.

This is called "row-major order".

You may wish to formalize your code slightly. Your code is currently:

int a[2][3] = {1,2,3,4,5};

If you compile this with gcc main.c -Wall -pedantic --std=c99, you'll get a few warnings:

temp.c:2:17: warning: missing braces around initializer [-Wmissing-braces]

Resolve this using

int a[2][3] = {{1,2,3,4,5}};

This will give you a new warning:

temp.c:2:25: warning: excess elements in array initializer

Resolve this using:

int a[2][3] = {{1,2,3},{4,5,0}};

This explicitly represents the data as having two rows of three elements each.

Some thoughts on memory layout

int a[2][3] will produce an "array of arrays". This is similar to, but contradistinct from, an "array of pointers to arrays". Both have similar access syntax (e.g. a[1][2]). But only for the "array of arrays" can you reliably access elements using a+y*WIDTH+x.

Some code might clarify:

#include <stdlib.h>
#include <stdio.h>

void PrintArray1D(int* a){
  for(int i=0;i<6;i++)
    printf("%d ",a[i]);
  printf("\n");
}

int main(){
  //Construct a two dimensional array
  int a[2][3] = {{1,2,3},{4,5,6}};

  //Construct an array of arrays
  int* b[2];
  b[0] = calloc(3,sizeof(int));
  b[1] = calloc(3,sizeof(int));

  //Initialize the array of arrays
  for(int y=0;y<2;y++)
  for(int x=0;x<3;x++)
    b[y][x] = a[y][x];

  PrintArray1D(a[0]);
  PrintArray1D(b[0]);
}

When you run this, you get:

1 2 3 4 5 6 
1 2 3 0 0 0 

Printing b gives zeros (on my machine) because it runs into uninitialized memory. The upshot is that using contiguous memory allows you to do handy things, let set all of the values without needing a double loop.

Sign up to request clarification or add additional context in comments.

12 Comments

Just out of curiosity, what warning would be triggered by int a[2][3] = {{1,2},{3,4},{5}}?
"This is not necessarily so." - Please elaborate! "An array which contains to int[3] arrays may be non-contiguous in memory." - That's wrong. int a[2][3] is an array of arrays and guaranteed to be continous in memory! It cannot be anything else. If you refer to something like int *a[3]: that is a completely different datatype and not an array of arrays!
@Olaf: Exactly. int a[2][3] is contiguous in memory and all accesses are valid. However, the a[2][3] access pattern on an array of arrays is not guaranteed to be valid in situations where you allocate the rows separately.
@MadPhysicist: I get warning: excess elements in array initializer pointing that the bracket preceding the 5. It indicates too many rows, rather than too few elements.
So you basically say a different data type is different? Not really a surprise!
|
3

In C, 1D arrays are stored in a single linear buffer in memory in what is called "row major" order. Row major means that the last index varies fastest as you go from element to element. Column major would mean that the first index varies fastest, as it does in MATLAB, for example.

The array you declared is only 2D in the sense that the compiler helps you out by computing the linear address of the elements for you. The address of an element in a 1D array is computed linear[x] = linear + x. Similarly, for your 2D array, a[y][x] = a + 3 * y + x. In general, a[y][x] = a + num_cols * y + x.

You can initialize the array as a single vector of elements, which will first fill the first row, then the second, and so on. Since you have two rows of three elements each, the first row becomes 1, 2, 3 and the second row becomes 4, 5, 0.

Indexing past the end of a row is perfectly valid, as far as the compiler is concerned at least. In the example you give, a[0][3] is accessing the fourth element of the first row in an array that is three elements wide. With wrap-around, you can see that this is just the first element of the second row, which is more explicitly stated as a[1][0].

Because of the lax index checking, you can completely omit the first index in any array as long as you provide an initializer. The formula to compute the linear address does not depend on the first index (because it is row major), and the total number of elements is specified by the initializer itself. A 1D example is int linear[] = {1, 2, 3};.

Keep in mind that the name of the array also refers to the pointer to its first element. These are two different things that can be accessed by the same name.

4 Comments

A general statement "C does not do any bounds checking on pointer arithmetic and array indexing" is actually not true, as even pointer arithmetics beyond array bounds are undefined behaviour; It's even a discussion if violating the bounds of a sub-array is undefined behaviour.
@StephanLechner. Fair enough, although it is true that the compiler specifically allows you to trigger this sort of undefined behavior. I removed the offending lines since, as you pointed out, they were not relevant to the discussion.
Allowing this initialiser is actually a legacy and not related to the layout of an array in memory. Moern compilers will warn about such code.
Technically, the reason a flat list of initializers initializes an array in row-major order is not because of how arrays are stored in memory but because of the C rules for initialization. In the 2011 standard, section 6.7.9, clauses 17 to 20 describe a process of using the list to initialize the declared object, including its subobjects. Hypothetically, those clauses could be written to specify a different order, and then the initialization would occur in that order even though arrays are stored in row-major order.
2

From the definition of how an access to a 2D-array like a[1][2] is interpreted, "It follows from this that arrays are stored in row-major order" (cf, for example, this online C standard comitee draft / array subscripting). This means that for an array int a[ROWS][COLUMNS] for an access a[r][c] the offset in terms of int values is calculated like (r*COLUMNS + c).

So for an array int a[2][3], an access a[0][1] has offset 0*3 + 1 = 1, and an access a[1][0] has an offset 1*3 + 0 = 3. That said, a[0][3] might lead to offset 3, while a[1][0] for sure leads to 3. I wrote "might" because I think that accessing an array int a[2][3] with a[0][3] is undefined behaviour, as the range of the last subscript is 0..2. So according to 6.5.6 (8), expression a[0][3] is addressing the sub-array a[0] out of its bounds as argued, for example, here.

Now to the thing of how int a[2][3] = {1,2,3,4,5} is interpreted. This statement is initialization as defined in section 6.7.9 of this online C standard comitee draft, and paragraphs (20) to (26) describe the things needed here:

(20) If the aggregate or union contains elements or members that are aggregates or unions, these rules apply recursively to the subaggregates or contained unions. If the initializer of a subaggregate or contained union begins with a left brace, the initializers enclosed by that brace and its matching right brace initialize the elements or members of the subaggregate or the contained union. Otherwise, only enough initializers from the list are taken to account for the elements or members of the subaggregate or the first member of the contained union; any remaining initializers are left to initialize the next element or member of the aggregate of which the current subaggregate or contained union is a part.

(21) If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

26 EXAMPLE

(3) The declaration

      int y[4][3] = {
            { 1, 3, 5 },
            { 2, 4, 6 },
            { 3, 5, 7 },
      };

is a definition with a fully bracketed initialization: 1, 3, and 5 initialize the first row of y (the array object y[0]), namely y[0][0], y[0][1], and y[0][2]. Likewise the next two lines initialize y[1] and y[2]. The initializer ends early, so y[3] is initialized with zeros. Precisely the same effect could have been achieved by

      int y[4][3] = {
            1, 3, 5, 2, 4, 6, 3, 5, 7
      };

The initializer for y[0] does not begin with a left brace, so three items from the list are used. Likewise the next three are taken successively for y[1] and y[2].

3 Comments

cppreference is not an authoritative resource. Why not reference the standard, resp. the final draft?
@Olaf: right, cppreference is not normative; But I was just looking for examples, and I found cppreference more brief here. Anyway, do you have a final draft that is online available? I always just find the one I used for citation...
Every body is a critic nowadays - § 6.7.9 Initialization (p21) is key, good citation.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.