I have 10 arrays. I want to sort them. But since their elements have the same behavior , I want to save computations and sort only one, and the others will be sorted based on the sorted array. I'm using thrust. There is an optimal why to do it? Thank you in advance.
-
1sort_by_key on the first data set, passing the first data set as the keys, and an index sequence (0, 1, 2, ...) as the values. Then use the rearranged index sequence in a thrust gather or scatter operation to rearrange the remaining indices.Robert Crovella– Robert Crovella2018-08-27 20:10:02 +00:00Commented Aug 27, 2018 at 20:10
-
1Thank you very much Dear Mr. Robert Crovella. Can you please provide the code since I'm novice with thrust and cuda. @RobertCrovellaDriss DS Idrissi– Driss DS Idrissi2018-08-27 20:14:14 +00:00Commented Aug 27, 2018 at 20:14
2 Answers
From the comments, my suggestion was:
Use thrust::sort_by_key on the first data set (array), passing the first data set as the keys, and an index sequence (0, 1, 2, ...) as the values. Then use the rearranged index sequence in a thrust gather or scatter operation to rearrange the remaining arrays.
As requested, here is a worked example:
$ cat t282.cu
#include <thrust/sort.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/copy.h>
#include <thrust/sequence.h>
#include <iostream>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/iterator/zip_iterator.h>
const size_t ds = 5;
typedef float ft;
int main(){
ft a1[ds] = {0.0f, -3.0f, 4.0f, 2.0f, 1.0f};
// data setup
thrust::device_vector<ft> d_a1(a1, a1+ds);
thrust::device_vector<ft> d_a2(ds);
thrust::device_vector<ft> d_a3(ds);
thrust::device_vector<ft> d_a2r(ds);
thrust::device_vector<ft> d_a3r(ds);
thrust::device_vector<size_t> d_i(ds);
thrust::sequence(d_i.begin(), d_i.end());
thrust::sequence(d_a2.begin(), d_a2.end());
thrust::sequence(d_a3.begin(), d_a3.end());
// sort
thrust::sort_by_key(d_a1.begin(), d_a1.end(), d_i.begin());
// copy, using sorted indices
thrust::copy_n(thrust::make_permutation_iterator(thrust::make_zip_iterator(thrust::make_tuple(d_a2.begin(), d_a3.begin())), d_i.begin()), ds, thrust::make_zip_iterator(thrust::make_tuple(d_a2r.begin(), d_a3r.begin())));
// output results
thrust::host_vector<ft> h_a1 = d_a1;
thrust::host_vector<ft> h_a2 = d_a2r;
thrust::host_vector<ft> h_a3 = d_a3r;
std::cout << "a1: " ;
thrust::copy_n(h_a1.begin(), ds, std::ostream_iterator<ft>(std::cout, ","));
std::cout << std::endl << "a2: " ;
thrust::copy_n(h_a2.begin(), ds, std::ostream_iterator<ft>(std::cout, ","));
std::cout << std::endl << "a3: " ;
thrust::copy_n(h_a3.begin(), ds, std::ostream_iterator<ft>(std::cout, ","));
std::cout << std::endl;
}
$ nvcc -o t282 t282.cu
$ cuda-memcheck ./t282
========= CUDA-MEMCHECK
a1: -3,0,1,2,4,
a2: 1,0,4,3,2,
a3: 1,0,4,3,2,
========= ERROR SUMMARY: 0 errors
$
Here, in lieu of a thrust::gather or thrust::scatter operation, I'm simply doing a thrust::copy_n with a thrust::permutation_iterator, in order to effect the reordering. I combine the remaining arrays to be reordered using thrust::zip_iterator, but this isn't the only way to do it.
Note that I'm not doing it for 10 arrays but for 3, however this should illustrate the method. The extension to 10 arrays should be just mechanical. Note however that the method would have to be modified somewhat for more than 10-11 arrays, as thrust::tuple is limited to 10 items. As a modification, you could simply call thrust::copy_n in a loop, once for each array to be reordered, rather than using zip_iterator. I don't think this should make a large difference in efficiency.
Comments
A few ways to go about doing this (irrespective of Thrust):
- Initialize an array of indices
indicesto0, 1, 2, 3...etc. - Sort
indices, with the comparison function being accessing elements in one of the arrays (the one for which comparisons are cheapest), and comparing those. Call the resulting array - For each one of the 10 arrays,
arrapply a Gather operation using the sortedindicesandarras the data from which to gather. i.e.sorted_arr[i] = arr[indices[i]]for alli.
- Initialize an array of indices
- Adapt one of the sort implementations to also do "index log-keeping", i.e. whenever you swap or position data in a "real" array, also set an index in an indices array.
- Run this index-keeping sorty on one of the 10 arrays (the one that's cheapest to sort).
- Apply the Gather from 1.3 to the other 9 arrays
Let
cheapbe the cheapest array to sort (or to compare elements)- Create an array of pairs
pairs[i] = { i, cheap[i] }of the approrpriate type. - Have the comparison of these pairs only use the second element of the pair.
- Sort
pairs - Project
pairsonto its first element:indices[i] = pairs[i].first - Project
pairsonto its second element:sorted_cheap[i] = pairs[i].second - Apply the Gather from 1.3 to the other nine arrays
- Create an array of pairs
The second option should be the fastest, but would require more effort; and with thrust, it's probably quite difficult. Either the first or the third should be the easiest; and thrust accepts custom comparators, right? If it doesn't, you might have to define a wrapper class with the appropriate comparators.