2

I have the following use case:

int data[Y][X]{};

// I hope to do the following things:
int* begin = &data[0][0];
int* end = begin + Y * X;
std::fill(begin, end, 1);
std::all_of(begin, end, ...);

My questions:

  1. Is begin + Y * X well-defined here?
  2. If 1 -> not, are there well-defined ways to flatten 2D array to an iterator pair? It does not matter which C++ standard is used.
  3. If 2 -> no, are there "zero-cost" ways to apply std::copy etc. to data? That is, can I apply these algorithms in a way as efficient as if I apply to a real int data[Y * X]?

According to Multidimensional array indexing using pointer to elements, it seems UB to "flatten" the 2D array to an iterator pair, at least begin + Y * X is not well-defined. However, in What is the safe way to fill multidimensional array using std::fill?, all answers take the begin + Y * X approach. So I'm really confused.

6
  • 3
    IMHO it is UB to "flatten" a 2d array like that. What you can do is write a matrix class that uses a 1d array/vector as to storage and then pretend that is 2d with its interface. This lets you write to the whole array at once and gives you the semantics you wish to have. Commented Jul 10, 2024 at 17:31
  • 1
    C++ doesn't really have multi-dimensional arrays. What it has are arrays of arrays. Marching a pointer off the end of an array is verboten, but what you're doing falls into the category of Usually Works the Way You Expect. I'm paranoid, so if I have to do stuff like this I make one big 1D array, wrap it in a class, and handle the indexing myself with the function call operator. Sort-of example intended to be a dynamic array. Commented Jul 10, 2024 at 17:31
  • Why use C-style arrays at all? Why not use std::array or std::vector? Commented Jul 10, 2024 at 17:32
  • C++ doesn't really have multidimensional arrays, but have a look at std::mdspan this allows you to very efficiently create a multidmensional view on 1D data (like std::array, std::vector). Or you can create a wrapper class with a 1D array inside it and do the offset calculations yourself. Commented Jul 10, 2024 at 17:35
  • 1
    std::array<int, X*Y> flat; auto f2d=flat | std::views::chunk(Y); f2d[i][j]; Commented Jul 10, 2024 at 17:45

2 Answers 2

2

In practice, you can treat a N-dimensional array allocated like that as a one-dimensional array. While the standard will tell you not to, there is far far far too much code in production that does this for it to ever not be supported in an actual shipping commercial general purpose compiler.

At some point the wording of the standard will be modified to permit it, I'd be willing to bet good money on that, if it already hasn't been slipped in. But the last time I looked (and it was a while ago) it was still undefined behavior.

You do have to be careful and ensure you don't have a "jagged" array, ie an array of pointers to arrays. But your example is not an example of that.

Due to my dislike of using UB, I personally end up writing code to manage a flat buffer and provide a n-dimensional view of it that works just like it was a tiered buffer. As a toy 2 dimensional example:

template<class T>
struct as_const {
  using type=T const;
};
template<class T>
using as_const_t = typename as_const<T>::type;

template<class T, std::size_t N, class Ref=T&, std::size_t stride=1>
class view_helper {
public:
  T* buffer = nullptr;

  template<class, std::size_t, class, std::size_t>
  friend class view_helper;
protected:
  explicit view_helper(T& r):view_helper(std::addressof(r)) {}
public:
  view_helper(T* p):buffer(p){}
  view_helper(view_helper const&)=default;
  view_helper& operator=(view_helper const&)=delete;

  Ref operator[](std::size_t n) {
    return Ref(buffer[stride*n]);
  }
  as_const_t<Ref> operator[](std::size_t n) const {
    return as_const_t<Ref>(buffer[stride*n]);
  }
  T* data() { return buffer; }
  T const* data() const { return buffer; }
  std::size_t size() const { return N; }
};

template<class T>
struct as_const<T&> {
  using type=T const&;
};
template<class T, std::size_t N, class R, std::size_t S>
struct as_const<view_helper<T,N,R,S>> {
  using type=view_helper<const T,N, as_const_t<R>,S>;
};
template<class T, std::size_t...Dims>
struct view;
template<class T, std::size_t...Dims>
using view_t = typename view<T,Dims...>::type;

template<class T, std::size_t X>
struct view<T, X> {
  using type = view_helper<T, X>;
};
template<class T, std::size_t X0, std::size_t...Xs>
struct view<T, X0, Xs...> {
  using type = view_helper< T, X0, view_t<T, Xs...>, (1*...*Xs)>;
};

now we have

int raw_buffer[24];
view<int,6,2,2> bob(raw_buffer);

is a 3 dimensional view of a 24 element array of integers as if it was a int[6][2][2].

Implementing iterators takes just a bit of work on top of this; the easy way is to just write an indexed-range-to-iteration thing that keeps track of a pointer to the range and a std::size_t index, then invokes (*range)[i] when you dereference.

Live example.

Sign up to request clarification or add additional context in comments.

5 Comments

In this case, a multi-dimensional array is a contiguous array, each element of which is a contiguous array of elements (int in the original question). As such, the behaviour is well-defined. For example, assuming n and n + 1 are a valid first indices of data, &data[n][X-1] + 1 == &data[n+1][0]. C-style arrays (like int data[Y][X]{};) are never jagged arrays, since all dimensions (Y and X) must be compile-time constants.
@Peterpeter careful: "reachable" definitions and valid pointers can bite your assumptions. Neither C nor C++ assumes addresses are linear: both support segmented memory models, for instance. And rules saying that a pointer of type X into an array of X can use pointer arithmetic to go within the array and one past the end exist. Which won't let you span the two adjacent arrays...
"While the standard will tell you not to, there is far far far too much code in production that does this for it to ever not be supported in an actual shipping commercial general purpose compiler." I think you are making the wrong assumption about mainstream compilers such as gcc giving a rat's behind about being a useful quality implementation for the benefit of programmers. They only care about performance and reading standards to the letter - if nobody ends up using the compiler they couldn't care less.
@Lundin Ya, no, that isn't how it works. I'm sorry if gcc has upset you, but they aren't going to add an optimization that probably breaks most software of any decent size written in the last 40 years. I'd be shocked if even the Linux Kernel would work. They might do it with "flags to turn on optionally".
@Yakk-AdamNevraumont Oh but they've already done so multiple times in the past, "strict aliasing" being the most spectacular fiasco. Also there's numerous Torvalds tantrums out there when he did something poorly-defined in C and watched everything break because of it...
0

The first question you must ask yourself is whether you want to use memory in the stack vs. in the heap. This is connected to the size and whether the sizes are runtime.

If you want to use memory in the stack, you can use

std::array<std::array<int, Y>, X> which has more guarantees and is better behaved than int[X][Y] (as others pointed out). As a bonus, you can copy them and pass them in functions using this type. In particular, you are guaranteed that the memory is continuous, and you can access &arr[0][0] to &arr[0][0] + X*Y.

If you want X and Y to be runtime and you want or don't mind using the heap, use a proper multidimensional array library, such as Boost.MultiArray, Eigen, or Multi (my lib, https://github.com/correaa/boost-multi).

All of them have ways to access the flatten view of the elements (for example, arr.elements().begin()/arr.elements().end() in the last case).

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.