What does an array of strings in C look like in memory?

Question

I'm trying to figure out what a 2D char array looks like in memory. For example:

    char   c[][5]={"xa","ccc","bb","j","a","d"};

    printf("TEST: %u %u %u %u \n\n",c[0],*c[0],c[0]+1,*(c[0]+1));

output:

TEST: 3214246874 120 3214246875 97

c[0] = *(c+0) is the string "xa", and it printed as 3214246874, so I guess c[0] is the address of the char array "xa". When I put a * in front of c[0], I got 120, which is 'x' in ASCII.

So I think the first slot in the c array is an address of the char x. After that I tried the same with c[0]+1, and it printed the next address; then I put a * in front and got 97, which is 'a' in ASCII.

so I assumed the array c looks like this:

c[0]                              c[1]
------------------------------------------------------------------
| pointer to x | pointer to a ||| pointer to c | pointer to c | etc ...
----------------------------------------------------------------------

But I searched the web and didn't find any proof for my assumption.

It will be just a contiguous memory region of size 5*6 containing characters. No pointers are going to be stored there.
– Eugene Sh., Commented Jan 9, 2017 at 16:19
Eugene Sh. is right, but just take a look at the memory in a debugger, or print it out to the console for evidence.
– LordWilmore, Commented Jan 9, 2017 at 16:21
Don't change the question once you have an answer addressing the changed information! Rolling back.
– Eugene Sh., Commented Jan 9, 2017 at 16:24
@EugeneSh. it doesn't address the changed information. A pointer is an address, as far as I know.
– Daniel2708, Commented Jan 9, 2017 at 16:32

John Bollinger · Accepted Answer · 2017-01-09 17:22:42Z

3

You are conflating two senses of the term "string" as it is used in C.

Most correctly, a C string is a null-terminated array of char. You have declared an array of char arrays, and initialized it with null-terminated char sequences. It is perfectly reasonable to characterize this as an "array of strings".

Arrays are not at all the same thing as pointers, however. The elements of your array are other arrays, each one (in your case) five chars long. This is where the other sense of the term "string" comes in. C arrays are a bit slippery; if you evaluate a (sub-)expression of array type, it evaluates to a pointer to the first array element. In the case of strings, such a pointer has type char *, and so it is common to refer to pointers into strings as strings themselves. That is a colloquialism, however, and you will get yourself into trouble if you do not recognize the difference between the two related meanings.

Breaking down your example code:

    char   c[][5]={"xa","ccc","bb","j","a","d"};

    printf("TEST: %u %u %u %u \n\n",c[0],*c[0],c[0]+1,*(c[0]+1));

The expression c[0] designates an array of five char. When evaluated in the context of the function call expression, it becomes a pointer to the first element of the array. This value is of type char *, which is not the correct type for the corresponding printf field descriptor, %u. Undefined behavior results. You could correct this by casting the argument to void * and changing the field descriptor to %p.
Given that c[0] evaluates to a pointer to the first char of the first member array, it follows that the expression *c[0] evaluates to the pointed-to char. This value again fails to match the corresponding field descriptor, which should be %c -- you should then expect 'x' to be printed. Alternatively, you could cast the value: (unsigned int)*c[0]. In that case, you would expect the numeric code for 'x' to be printed; that is very likely to be 120. That 120 is in fact the value actually printed is an inconsequential characteristic of the specific manifestation of the undefined behavior of your program.
Again given that c[0] evaluates to a pointer to the first char of the first member array, it follows that c[0] + 1 is a pointer addition, resulting in a pointer to the second char in that array. As with c[0], this does not match the format.
And presumably it will be clear by this point that *(c[0] + 1) evaluates to the second char (at index 1) in array c[0]. The expression is rigorously equivalent to c[0][1]. This again does not match the format.

so I assumed the array c looks like this [...]

Nope. The array looks like this:

| c[0]         | c[1]         | c[2]         | c[3]         | c[4]         | c[5]         |
  x  a \0 \0 \0  c  c  c \0 \0  b  b \0 \0 \0  j \0 \0 \0 \0  a \0 \0 \0 \0  d \0 \0 \0 \0

edited Jan 9, 2017 at 17:22

answered Jan 9, 2017 at 16:50

John Bollinger


5 Comments

Daniel2708 Over a year ago

Thank you! So c[0] in actual memory is 'x', but in C it will be evaluated to the address of the first element of the subarray?

John Bollinger Over a year ago

No, @Daniel2708, the representation of c[0] in memory is not (just) 'x'. c[0] is an array of five chars, which you have initialized with the specific char sequence xa\0\0\0. Those five char values together are its representation in memory. But yes, when C evaluates an expression containing or consisting of c[0], which designates an array, that array is replaced by a pointer to its first element for the purpose of evaluating the expression.

Eugene Sh. Over a year ago

@JohnBollinger "with the specific char sequence xa\0\0\0" - Aren't the last two \0 actually "don't-cares"?

Daniel2708 Over a year ago

I meant that in physical memory, c[0] as one byte is 'x', and not an address? C evaluated it as an address in the example. This is what I understood. Thank you again :)

John Bollinger Over a year ago

@EugeneSh., definitely not. This falls under the partial initialization rule in C2011 6.7.9/21: "If there are [...] fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration." For elements of type char, that means initialization to 0.

melpomene · Accepted Answer · 2017-01-09 16:42:44Z

2

c looks like this internally:

c[0]                   c[1]                  c[2]                  c[3]
|                      |                     |                     |
[0] [1] [2]  [3]  [4]  [0] [1] [2] [3]  [4]  [0] [1] [2]  [3]  [4] 
'x' 'a' '\0' '\0' '\0' 'c' 'c' 'c' '\0' '\0' 'b' 'b' '\0' '\0' '\0' ...

I.e. it's one long sequence of chars. It doesn't store any pointers or addresses.

The compiler knows the sizes of each part, so when you write e.g. c[2][1], it knows to fetch it from offset 2 * 5 + 1 = 11 (from the beginning of c).

answered Jan 9, 2017 at 16:42

melpomene


Grzegorz Szpetkowski · Accepted Answer · 2017-01-09 16:52:58Z

2

This line:

char c[][5] = {"xa", "ccc", "bb", "j", "a", "d"};

can be written more explicitly as:

char c[6][5] = {"xa\0\0\0", "ccc\0\0", "bb\0\0\0", "j\0\0\0\0", "a\0\0\0\0", "d\0\0\0\0"};

Here c is an array of 6 elements, where each element is of type char[5]. Each "subarray" takes 5 bytes (a char always takes one byte), and they are placed next to each other. Thus, the total memory occupied by the array c is 30 bytes.

edited Jan 9, 2017 at 16:52

answered Jan 9, 2017 at 16:45

Grzegorz Szpetkowski


Serge Ballesta · Accepted Answer · 2017-01-09 17:05:53Z

Beware: arrays of pointers and 2D arrays are different animals! Once defined you use them almost the same way, but they are stored differently in memory.

Array of pointers:
```
char   *c[]={"xa","ccc","bb","j","a","d"};
```
This defines an array of 6 pointers. Each of these pointers points to its own string, which is stored elsewhere in memory. A typical representation will be:
```
c -> address_of_x, address_of_c, address_of_ ... (array of pointers)
'x', 'a', '\0', 'c', 'c', 'c', '\0', 'b'... (arrays of chars)
 -               -                    -
```
The whole thing will use (on a 32-bit architecture): 6*4 + 3 + 4 + 3 + 2 + 2 + 2 = 40 bytes.

2D array:
```
char   c[][5]={"xa","ccc","bb","j","a","d"};
```
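This defines a 2D array of 6 rows of 5 columns each (exactly 30 bytes):
```
'x', 'a', '\0', '\0', '\0', 'c', 'c', 'c', '\0', '\0', 'b' ...
```
(The bytes after each string's terminator are not "don't care" values. Under C11 6.7.9/21, when a string literal is shorter than the array it initializes, the remaining elements are initialized the same way as objects with static storage duration, which for char means 0.)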

But whichever definition you use, c[1][2] will be the third character of the second string (here 'c'), and *(c[0] + 1) (which is by definition the same as c[0][1]) is the second char of the first string, that is: 'a'.
