1

Consider a dataset with N samples, where every sample consists of head elements followed by tail elements. I'd like to perform an stable reordering such that N tails are grouped together at the top, and N heads are grouped together at the bottom.

For example, the dataset:

 -1  -2  -3   1   2   3   4   5
 -4  -5  -6   6   7   8   9  10
 -7  -8  -9  11  12  13  14  15
-10 -11 -12  16  17  18  19  20

has has N = 4 samples, with head = 3 (negative integers), and tail = 5 (positive integers). The desired transformation would yield:

  1   2   3   4   5   6   7   8
  9  10  11  12  13  14  15  16
 17  18  19  20  -1  -2  -3  -4
 -5  -6  -7  -8  -9 -10 -11 -12

I implemented a solution based on repeated application of rotations. The rotations are those implemented by the C++ algorithm std::rotate:

std::rotate(first, n_first, last) swaps the elements in the range [first, last) in such a way that the element n_first becomes the first element of the new range and n_first - 1 becomes the last element.

My implementation (shown below) provides a correct solution and performs well for my problems. However, it needs to perform N rotations, where the complexity of every rotation increases from O(head + tail) to O(N * tail + head).

Are you aware of an algorithm with better complexity?.

My code follows:

#include <algorithm>
#include <vector>
#include <iostream>
#include <iomanip>

template <typename I> // I models Forward Iterator
I reorder(I f, I l, std::size_t head_size, std::size_t tail_size)
{
  std::size_t k = 1;
  auto m = std::next(f, head_size);
  auto t = std::next(m, tail_size);
  while (t != l) {
    f = std::rotate(f, m, std::next(m, tail_size));
    m = std::next(f, ++k * head_size);
    t = std::next(m, tail_size);
  };
  return std::rotate(f, m, t);
}

template <typename C>
void show(const char* message, const C& c)
{
  std::size_t shown { 0 };
  std::cout << message << "\n";
  for (auto && ci : c)
    std::cout << std::setw(3) << ci
              << (++shown % 8 == 0 ? "\n" : " ");
}

int main()
{
  std::vector<int> v {
    -1,  -2,  -3,  1,  2,  3,  4,  5,
    -4,  -5,  -6,  6,  7,  8,  9, 10,
    -7,  -8,  -9, 11, 12, 13, 14, 15,
   -10, -11, -12, 16, 17, 18, 19, 20 };

  std::size_t head_size { 3 };
  std::size_t tail_size { 5 };

  show("before reorder", v);

  reorder(v.begin(), v.end(), head_size, tail_size);

  show("after reorder", v);

  return 0;
}

Compile and run:

$ clang++ example.cpp -std=c++14
before reorder
 -1  -2  -3   1   2   3   4   5
 -4  -5  -6   6   7   8   9  10
 -7  -8  -9  11  12  13  14  15
-10 -11 -12  16  17  18  19  20
after reorder
  1   2   3   4   5   6   7   8
  9  10  11  12  13  14  15  16
 17  18  19  20  -1  -2  -3  -4
 -5  -6  -7  -8  -9 -10 -11 -12

Details about the problem domain

I'm reading images where every line is preceded by pixel metadata, and followed by the raw pixels, like so:

[metadata] [pixels...]
[metadata] [pixels...]
[ many more of these]
[metadata] [pixels...]

I need to pack all the pixels together and pass them to OpenGL, but I also need to maintain the metadata accessible for the program. So, I do this:

 [all the pixels] [all the metadata]

and pass [all the pixels] to my graphics card while keeping a handle to [all the metadata] in the CPU.

Edit

Thanks for your responses - I've ended up implementing an alternative solution that does not rely on reordering. However, the point remains: can you improve the algorithm for "in place" dataset reordering? It is similar to in place matrix transpose.

1
  • Are metadata and pixel chunks sizes known in advance? If yes not simply mem copy those? Commented Feb 5, 2017 at 1:38

2 Answers 2

2

One very fast solution is to create another matrix and place elements directly where they belong.

For example, say you have n rows, t tails and h heads. The first tail on the first row (1, 1) is going to ( (h * n+1)/(h+t), (h * n+1)%(h+t)). I will let you formulate for the general case (i,j) going to (k,l). In any case, it's a calculation involving integer division and modulus.

Sign up to request clarification or add additional context in comments.

Comments

1

Starting from the input format:

[metadata] [pixels...]
[metadata] [pixels...]
[ many more of these]
[metadata] [pixels...]

What you should do is to read these directly into the structure you need, which as far as I can see are two: contiguous pixels and contiguous metadata. So:

std::vector<Pixel> pixels;
std::vector<Metadata> meta;
// maybe reserve() a reasonable amount based on input size

// for each record in input:
    pixels.push_back(...);
    meta.push_back(...);

Now you have one vector with all the pixel data to pass to the GPU, and one with all the metadata for yourself. No need to copy memory around.

1 Comment

That's how I ended up implementing it - but I'm still curious about an effective algorithm for reordering (similar to "in memory matrix transpose").

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.