Multidimensional C-array and std algorithms

Question

I have the following use case:

int data[Y][X]{};

// I hope to do the following things:
int* begin = &data[0][0];
int* end = begin + Y * X;
std::fill(begin, end, 1);
std::all_of(begin, end, ...);

My questions:

Is begin + Y * X well-defined here?
If 1 -> not, are there well-defined ways to flatten 2D array to an iterator pair? It does not matter which C++ standard is used.
If 2 -> no, are there "zero-cost" ways to apply std::copy etc. to data? That is, can I apply these algorithms in a way as efficient as if I apply to a real int data[Y * X]?

According to Multidimensional array indexing using pointer to elements, it seems UB to "flatten" the 2D array to an iterator pair, at least begin + Y * X is not well-defined. However, in What is the safe way to fill multidimensional array using std::fill?, all answers take the begin + Y * X approach. So I'm really confused.

IMHO it is UB to "flatten" a 2d array like that. What you can do is write a matrix class that uses a 1d array/vector as to storage and then pretend that is 2d with its interface. This lets you write to the whole array at once and gives you the semantics you wish to have. — NathanOliver
– NathanOliver, Commented Jul 10, 2024 at 17:31
C++ doesn't really have multi-dimensional arrays. What it has are arrays of arrays. Marching a pointer off the end of an array is verboten, but what you're doing falls into the category of Usually Works the Way You Expect. I'm paranoid, so if I have to do stuff like this I make one big 1D array, wrap it in a class, and handle the indexing myself with the function call operator. Sort-of example intended to be a dynamic array. — user4581301
– user4581301, Commented Jul 10, 2024 at 17:31
Why use C-style arrays at all? Why not use std::array or std::vector? — Jesper Juhl
– Jesper Juhl, Commented Jul 10, 2024 at 17:32
C++ doesn't really have multidimensional arrays, but have a look at std::mdspan this allows you to very efficiently create a multidmensional view on 1D data (like std::array, std::vector). Or you can create a wrapper class with a 1D array inside it and do the offset calculations yourself. — Pepijn Kramer
– Pepijn Kramer, Commented Jul 10, 2024 at 17:35
std::array<int, X*Y> flat; auto f2d=flat | std::views::chunk(Y); f2d[i][j]; — Red.Wave
– Red.Wave, Commented Jul 10, 2024 at 17:45

Yakk - Adam Nevraumont · Accepted Answer · 2024-07-10 19:36:11Z

2

In practice, you can treat a N-dimensional array allocated like that as a one-dimensional array. While the standard will tell you not to, there is far far far too much code in production that does this for it to ever not be supported in an actual shipping commercial general purpose compiler.

At some point the wording of the standard will be modified to permit it, I'd be willing to bet good money on that, if it already hasn't been slipped in. But the last time I looked (and it was a while ago) it was still undefined behavior.

You do have to be careful and ensure you don't have a "jagged" array, ie an array of pointers to arrays. But your example is not an example of that.

Due to my dislike of using UB, I personally end up writing code to manage a flat buffer and provide a n-dimensional view of it that works just like it was a tiered buffer. As a toy 2 dimensional example:

template<class T>
struct as_const {
  using type=T const;
};
template<class T>
using as_const_t = typename as_const<T>::type;

template<class T, std::size_t N, class Ref=T&, std::size_t stride=1>
class view_helper {
public:
  T* buffer = nullptr;

  template<class, std::size_t, class, std::size_t>
  friend class view_helper;
protected:
  explicit view_helper(T& r):view_helper(std::addressof(r)) {}
public:
  view_helper(T* p):buffer(p){}
  view_helper(view_helper const&)=default;
  view_helper& operator=(view_helper const&)=delete;

  Ref operator[](std::size_t n) {
    return Ref(buffer[stride*n]);
  }
  as_const_t<Ref> operator[](std::size_t n) const {
    return as_const_t<Ref>(buffer[stride*n]);
  }
  T* data() { return buffer; }
  T const* data() const { return buffer; }
  std::size_t size() const { return N; }
};

template<class T>
struct as_const<T&> {
  using type=T const&;
};
template<class T, std::size_t N, class R, std::size_t S>
struct as_const<view_helper<T,N,R,S>> {
  using type=view_helper<const T,N, as_const_t<R>,S>;
};
template<class T, std::size_t...Dims>
struct view;
template<class T, std::size_t...Dims>
using view_t = typename view<T,Dims...>::type;

template<class T, std::size_t X>
struct view<T, X> {
  using type = view_helper<T, X>;
};
template<class T, std::size_t X0, std::size_t...Xs>
struct view<T, X0, Xs...> {
  using type = view_helper< T, X0, view_t<T, Xs...>, (1*...*Xs)>;
};

now we have

int raw_buffer[24];
view<int,6,2,2> bob(raw_buffer);

is a 3 dimensional view of a 24 element array of integers as if it was a int[6][2][2].

Implementing iterators takes just a bit of work on top of this; the easy way is to just write an indexed-range-to-iteration thing that keeps track of a pointer to the range and a std::size_t index, then invokes (*range)[i] when you dereference.

Live example.

edited Jul 10, 2024 at 19:36

answered Jul 10, 2024 at 19:03

Yakk - Adam Nevraumont

280k31 gold badges367 silver badges585 bronze badges

Sign up to request clarification or add additional context in comments.

5 Comments

Peter Over a year ago

In this case, a multi-dimensional array is a contiguous array, each element of which is a contiguous array of elements (int in the original question). As such, the behaviour is well-defined. For example, assuming n and n + 1 are a valid first indices of data, &data[n][X-1] + 1 == &data[n+1][0]. C-style arrays (like int data[Y][X]{};) are never jagged arrays, since all dimensions (Y and X) must be compile-time constants.

Yakk - Adam Nevraumont Over a year ago

@Peterpeter careful: "reachable" definitions and valid pointers can bite your assumptions. Neither C nor C++ assumes addresses are linear: both support segmented memory models, for instance. And rules saying that a pointer of type X into an array of X can use pointer arithmetic to go within the array and one past the end exist. Which won't let you span the two adjacent arrays...

Lundin Over a year ago

"While the standard will tell you not to, there is far far far too much code in production that does this for it to ever not be supported in an actual shipping commercial general purpose compiler." I think you are making the wrong assumption about mainstream compilers such as gcc giving a rat's behind about being a useful quality implementation for the benefit of programmers. They only care about performance and reading standards to the letter - if nobody ends up using the compiler they couldn't care less.

Yakk - Adam Nevraumont Over a year ago

@Lundin Ya, no, that isn't how it works. I'm sorry if gcc has upset you, but they aren't going to add an optimization that probably breaks most software of any decent size written in the last 40 years. I'd be shocked if even the Linux Kernel would work. They might do it with "flags to turn on optionally".

Lundin Over a year ago

@Yakk-AdamNevraumont Oh but they've already done so multiple times in the past, "strict aliasing" being the most spectacular fiasco. Also there's numerous Torvalds tantrums out there when he did something poorly-defined in C and watched everything break because of it...

alfC · Accepted Answer · 2024-07-11 23:22:45Z

The first question you must ask yourself is whether you want to use memory in the stack vs. in the heap. This is connected to the size and whether the sizes are runtime.

If you want to use memory in the stack, you can use

std::array<std::array<int, Y>, X> which has more guarantees and is better behaved than int[X][Y] (as others pointed out). As a bonus, you can copy them and pass them in functions using this type. In particular, you are guaranteed that the memory is continuous, and you can access &arr[0][0] to &arr[0][0] + X*Y.

If you want X and Y to be runtime and you want or don't mind using the heap, use a proper multidimensional array library, such as Boost.MultiArray, Eigen, or Multi (my lib, https://github.com/correaa/boost-multi).

All of them have ways to access the flatten view of the elements (for example, arr.elements().begin()/arr.elements().end() in the last case).

Collectives™ on Stack Overflow

Multidimensional C-array and std algorithms

2 Answers 2

5 Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

5 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related