Return a malloc’ed matrix while being able to use subscript notation

Question

I have an exercise where I am supposed to use fixed-size arrays and in/out parameters to operate on matrices (add, scanf, print, etc.), but I’d like to do it on arbitrary-size matrices and return them rather than adding more (in/)out parameters each time (thus possibly allowing a more “functional” style).

Since I want to return them, I suppose I need malloc to keep the array in memory past the function scope. Since I want to use multidimensional subscript notation (mat[x][y] rather than mat[x*len+y] or mat+x*len+y), I guess I should use some kind of VLA or casting… yet it seems casting to an array type is forbidden (but I’m often going to return pointers, so how do I use subscript notation on them if I can’t cast?), and apparently I “may not initialize a variable-sized object”, as the compiler says (even though it’s not directly an array but a pointer to an array), when using this notation:

int *tab[x][y]=malloc(x*y*sizeof(int));

I also get “invalid initializer” if I replace x and y with constant values like 3 by hand.

I spent almost a week searching; maybe it’s impossible and I should just move on… I also found this notation, which to me looks like function-pointer notation, unless it is a way to prioritize the * operator…

int (*tab)[x][y]=malloc(x*y*sizeof(int));

However, I’m not sure I understand this notation, as I then get random values when printing or filling arrays this way.

Previously I tried to use VLAs (variable-length arrays) and the GNU extension for forward-declaring the array-length parameters:

void
printMat (int h, int w; int tab[h][w], int h, int w)
{
   [code using tab[x][y]]
}

but I soon realized I needed to deal with pointers and malloc anyway for an “add” function that adds two matrices and returns a pointer to a new malloc’ed matrix…

In case I wasn’t specific enough, I’d especially like to know how I should declare the arguments and return type so that I can use them as multidimensional arrays without an intermediate variable, while actually passing a pointer (that’s already what passing a normal multidimensional array as a parameter does, right?).

Okay, after many tests and tries, it now works as I intended, even if I’m not sure I’ve understood everything exactly, especially what’s a pointer and what’s not (I may have confused myself by trying to figure this out with gdb; I should probably investigate further whether gdb treats a normal one- or multidimensional array as an address, etc.), and today my sleep and concentration are not at their best.

Now, I’d like a proper answer to the second part of my initial question: how do I return the matrix? Is there a proper generic type (other than a meaningless void*) appropriate for a pointer to a 2-dimensional array (like int(*)[][], but one that would work)? If that is too generic, what’s the proper way to cast the returned pointer so I can use multidimensional subscript notation on it? Is (int(*)[3][3]) correct?

However, if I get nothing satisfactory for this (a sufficiently justified “it’s impossible in C” is fine, I guess), I’ll mark @JohnBod’s current answer as solving the problem, since he confirmed that malloc’ing a multidimensional VLA works, with a complete and explanatory answer on multidimensional arrays that fully answers the first part of the question and gives several pointers toward the second (if there is an answer to it).

#include <stdio.h>
#include <stdlib.h>

void
print_mat (int x, int y; int mat[x][y], int x, int y)
{
  for (int i = 0; i < x; i++)
    {
      for (int j=0; j < y ; j++)
        printf("%d ", mat[i][j]);
      putchar('\n');
    }
  putchar('\n');
}

void*
scan_mat (int x, int y)
{
  int (*mat)[x][y]=malloc(sizeof(*mat));
  for (int i = 0; i < x ; i++)
    for (int j = 0; j < y; j++)
      {
        printf("[%d][%d] = ", i, j);
        scanf("%d", &((*mat)[i][j]));
      }
  return mat;
}

void*
add_mat (int x, int y; int mat1[x][y], int mat2[x][y], int x, int y)
{
  int (*mat)[x][y]=malloc(sizeof(*mat));
  #pragma GCC ivdep
  for (int i = 0; i < x ; i++)
    for (int j = 0; j < y; j++)
      (*mat)[i][j]=mat1[i][j]+mat2[i][j];
  return mat;
}

int
main ()
{
  int mat1[3][3] = {1, 2, 3,
                    4, 5, 6,
                    7, 8, 9},
    (*mat2)[3][3] = scan_mat(3, 3);
  print_mat(mat1, 3, 3);
  print_mat(*mat2, 3, 3);
  print_mat((int(*)[3][3])add_mat(mat1, *mat2, 3, 3), 3, 3); // both appear to work… array decay?
  print_mat(*(int(*)[3][3])add_mat(mat1, *mat2, 3, 3), 3, 3);
  printf("%d\n", (*(int(*)[3][3])add_mat(mat1, *mat2, 3, 3))[2][2]);
  return 0;
}

and the input/output:

[0][0] = 1
[0][1] = 1
[0][2] = 1
[1][0] = 1
[1][1] = 1
[1][2] = 1
[2][0] = 1
[2][1] = 1
[2][2] = 1
1 2 3 
4 5 6 
7 8 9 

1 1 1 
1 1 1 
1 1 1 

2 3 4 
5 6 7 
8 9 10 

2 3 4 
5 6 7 
8 9 10 

10

int *tab[x][y] makes tab an array of x arrays of y pointers to int. int (*tab)[x][y] makes tab a pointer to an array of x arrays of y int elements.
– Some programmer dude, Commented Mar 19, 2018 at 13:53
Possible duplicate: Correctly allocating multi-dimensional arrays.
– Lundin, Commented Mar 19, 2018 at 13:57
@JonathanLeffler: apart from the fact that before C99 they were in fact a GNU extension (in gnu89), I was referring to the forward declaration of parameters, which lets the arguments defining an array's size come after the array in argument order, by declaring them first before a semicolon: fun (int x, int y; array[x][y], int x, int y), which is called like fun (array, 3, 3), instead of the standard form fun (int x, int y, array[x][y]), called like fun (3, 3, array), where the size arguments must be specified before the array.
– galex-713, Commented Mar 19, 2018 at 15:03
@JonathanLeffler: not anymore, no, but I never said VLAs were currently a GNU extension; I said that using that ; to declare later arguments beforehand, so that you can use them in a preceding array type declaration, was.
– galex-713, Commented Mar 19, 2018 at 15:31
Your question is an unusual mixture of naïveté and sophistication. The puzzlement over int *array[x][y]; is on the naïve end; it is an array of pointers to integers and you must use int (*array)[x][y] to get a pointer to an array. That's a non-negotiable consequence of the rules of C type formation and operator precedence. Then you delve into the intricacies of arcane and archaic GNU C extensions. You confused me; I'm sorry.
– Jonathan Leffler, Commented Mar 19, 2018 at 15:36

John Bode · Accepted Answer · 2018-03-19 14:33:31Z

6

If you want to allocate a buffer of type T, the typical procedure is

T *ptr = malloc( sizeof *ptr * N ); // sizeof *ptr == sizeof (T)

You're allocating enough space for N elements of type T.

Now let's replace T with an array type, R [M]:

R (*ptr)[M] = malloc( sizeof *ptr * N  ); // sizeof *ptr == sizeof (R [M])

You're allocating enough space for N elements of type R [M]; in other words, you've just allocated enough space for an N by M array of R. Note that the semantics are exactly the same as for the array of T above; all that's changed is the type of ptr.

Applying that to your example:

int (*tab)[y] = malloc( sizeof *tab * x );

You can then index tab as you would any 2D array:

tab[i][j] = new_value(); // 0 <= i < x, 0 <= j < y

Edit

Answering the comment:

yet, still, I’m not sure I understand: what’s the meaning of the “(*tab)” syntax? It’s not a function pointer, I guess, but why wouldn’t *tab without parentheses work: what’s the actual difference in meaning? Why doesn’t it work, and what changes then?

The subscript [] and function call () operators have higher precedence than unary *, so a declaration like

int *a[N];

is parsed as

int *(a[N]);

and declares a as an array of pointers to int. To declare a pointer to an array, you must explicitly group the * operator with the identifier, like so:

int (*a)[N];

This declares a as a pointer to an array of int. The same rule applies to function declarations. Here's a handy summary:

T *a[N];    // a is an N-element array of pointers to T
T (*a)[N];  // a is a pointer to an N-element array of T
T *f();     // f is a function returning pointer to T
T (*f)();   // f is a pointer to a function returning T

In your code,

int *tab[x][y]=malloc(x*y*sizeof(int));

declares tab as a 2D array of pointers, not as a pointer to a 2D array, and a call to malloc(...) is not a valid initializer for a 2D array object.

The syntax

int (*tab)[x][y]=malloc(x*y*sizeof(int));

declares tab as a pointer to a 2D array, and a call to malloc is a valid initializer for it.

But...

With this declaration, you'll have to explicitly dereference tab before indexing into it, like so:

(*tab)[i][j] = some_value();

You're not indexing into tab, you're indexing into what tab points to.

Remember that in C, declaration mimics use - the structure of a declarator in a declaration matches how it will look in the executable code. If you have a pointer to an int and you want to access the pointed-to value, you use the unary * operator:

x = *ptr;

The type of the expression *ptr is int, so the declaration of ptr is written

int *ptr;

Same thing for arrays: if the ith element of an array has type int, then the expression arr[i] has type int, and thus the declaration of arr is written as

int arr[N];

Thus, if you declare tab as

int (*tab)[x][y] = ...;

then to index into it, you must write

(*tab)[i][j] = ...;

The method I showed avoids this. Remember that the array subscript operation a[i] is defined as *(a + i) - given an address a, offset i elements (not bytes!) from a and dereference the result. Thus, the following relationship holds:

*a == *(a + 0) == a[0]

This is why you can use the [] operator on a pointer expression as well as an array expression. If you allocate a buffer as

T *p = malloc( sizeof *p * N );

you can access each element as p[i].

So, given a declaration like

T (*a)[M];

we have the relationship

 (*a)[i] == (*(a + 0))[i] == (a[0])[i] == a[0][i];

Thus, if we allocate the array as

T (*a)[M] = malloc( sizeof *a * N );

then we can index each element of a as

a[i][j] = some_value();

edited Mar 19, 2018 at 14:33

answered Mar 19, 2018 at 13:53

John Bode

21 Comments

galex-713 Over a year ago

Why omit the parentheses after sizeof? That confuses me a little, as I’m not good at remembering operator precedence and I considered sizeof a macro/function for a long time…

Lundin Over a year ago

sizeof *tab * x looks a bit cryptic, especially to beginners. It might be easier to read the code if you use this alternative style instead: int (*tab)[y] = malloc( sizeof(int[x][y]) );

Lundin Over a year ago

@galex-713 When sizeof is applied to an expression, rather than a pure type such as int, you can omit the parentheses (or you can keep them, it does no harm). For an alternative style, see my comment above this one.

cmaster - reinstate monica Over a year ago

@galex-713 int (*tab)[y] is indeed a pointer. However, you need y to be in scope when you write down its type. This is why it's not possible to return it directly from a function. The usual way is to use an output parameter, which is a pointer to the pointer that you want to return: void foo(int x, int y, int (**outTab)[y]) { ... *outTab = tab = malloc(x*sizeof*tab); } Note the two stars at the outTab argument: You want the caller to pass the address where your function can return the pointer. Call it with int (*tab)[y]; foo(x, y, &tab);

Jonathan Leffler Over a year ago

@galex-713: Since parentheses are always permitted with sizeof, I always use them. The 'sometimes not required' rule is real but just complicates life wholly unnecessarily — in my opinion. There are those who vehemently disagree with me on the topic.
