Consider a dataset with N samples, where every sample consists of head elements followed by tail elements. I'd like to perform an stable reordering such that N tails are grouped together at the top, and N heads are grouped together at the bottom.
For example, the dataset:
-1 -2 -3 1 2 3 4 5
-4 -5 -6 6 7 8 9 10
-7 -8 -9 11 12 13 14 15
-10 -11 -12 16 17 18 19 20
has has N = 4 samples, with head = 3 (negative integers), and tail = 5 (positive integers). The desired transformation would yield:
1 2 3 4 5 6 7 8
9 10 11 12 13 14 15 16
17 18 19 20 -1 -2 -3 -4
-5 -6 -7 -8 -9 -10 -11 -12
I implemented a solution based on repeated application of rotations. The rotations are those implemented by the C++ algorithm std::rotate:
std::rotate(first, n_first, last)swaps the elements in the range[first, last)in such a way that the elementn_firstbecomes the first element of the new range andn_first - 1becomes the last element.
My implementation (shown below) provides a correct solution and performs well for my problems. However, it needs to perform N rotations, where the complexity of every rotation increases from O(head + tail) to O(N * tail + head).
Are you aware of an algorithm with better complexity?.
My code follows:
#include <algorithm>
#include <vector>
#include <iostream>
#include <iomanip>
template <typename I> // I models Forward Iterator
I reorder(I f, I l, std::size_t head_size, std::size_t tail_size)
{
std::size_t k = 1;
auto m = std::next(f, head_size);
auto t = std::next(m, tail_size);
while (t != l) {
f = std::rotate(f, m, std::next(m, tail_size));
m = std::next(f, ++k * head_size);
t = std::next(m, tail_size);
};
return std::rotate(f, m, t);
}
template <typename C>
void show(const char* message, const C& c)
{
std::size_t shown { 0 };
std::cout << message << "\n";
for (auto && ci : c)
std::cout << std::setw(3) << ci
<< (++shown % 8 == 0 ? "\n" : " ");
}
int main()
{
std::vector<int> v {
-1, -2, -3, 1, 2, 3, 4, 5,
-4, -5, -6, 6, 7, 8, 9, 10,
-7, -8, -9, 11, 12, 13, 14, 15,
-10, -11, -12, 16, 17, 18, 19, 20 };
std::size_t head_size { 3 };
std::size_t tail_size { 5 };
show("before reorder", v);
reorder(v.begin(), v.end(), head_size, tail_size);
show("after reorder", v);
return 0;
}
Compile and run:
$ clang++ example.cpp -std=c++14
before reorder
-1 -2 -3 1 2 3 4 5
-4 -5 -6 6 7 8 9 10
-7 -8 -9 11 12 13 14 15
-10 -11 -12 16 17 18 19 20
after reorder
1 2 3 4 5 6 7 8
9 10 11 12 13 14 15 16
17 18 19 20 -1 -2 -3 -4
-5 -6 -7 -8 -9 -10 -11 -12
Details about the problem domain
I'm reading images where every line is preceded by pixel metadata, and followed by the raw pixels, like so:
[metadata] [pixels...]
[metadata] [pixels...]
[ many more of these]
[metadata] [pixels...]
I need to pack all the pixels together and pass them to OpenGL, but I also need to maintain the metadata accessible for the program. So, I do this:
[all the pixels] [all the metadata]
and pass [all the pixels] to my graphics card while keeping a handle to [all the metadata] in the CPU.
Edit
Thanks for your responses - I've ended up implementing an alternative solution that does not rely on reordering. However, the point remains: can you improve the algorithm for "in place" dataset reordering? It is similar to in place matrix transpose.