CUDA: Sort an array according to the order defined by another array using thrust

Question

I have 10 arrays. I want to sort them. But since their elements have the same behavior , I want to save computations and sort only one, and the others will be sorted based on the sorted array. I'm using thrust. There is an optimal why to do it? Thank you in advance.

sort_by_key on the first data set, passing the first data set as the keys, and an index sequence (0, 1, 2, ...) as the values. Then use the rearranged index sequence in a thrust gather or scatter operation to rearrange the remaining indices. — Robert Crovella
– Robert Crovella, Commented Aug 27, 2018 at 20:10
Thank you very much Dear Mr. Robert Crovella. Can you please provide the code since I'm novice with thrust and cuda. @RobertCrovella — Driss DS Idrissi
– Driss DS Idrissi, Commented Aug 27, 2018 at 20:14

Robert Crovella · Accepted Answer · 2018-08-28 00:41:31Z

From the comments, my suggestion was:

Use thrust::sort_by_key on the first data set (array), passing the first data set as the keys, and an index sequence (0, 1, 2, ...) as the values. Then use the rearranged index sequence in a thrust gather or scatter operation to rearrange the remaining arrays.

As requested, here is a worked example:

$ cat t282.cu
#include <thrust/sort.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/copy.h>
#include <thrust/sequence.h>
#include <iostream>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/iterator/zip_iterator.h>

const size_t ds = 5;
typedef float ft;

int main(){
  ft a1[ds] = {0.0f, -3.0f, 4.0f, 2.0f, 1.0f};
// data setup
  thrust::device_vector<ft> d_a1(a1, a1+ds);
  thrust::device_vector<ft> d_a2(ds);
  thrust::device_vector<ft> d_a3(ds);
  thrust::device_vector<ft> d_a2r(ds);
  thrust::device_vector<ft> d_a3r(ds);
  thrust::device_vector<size_t> d_i(ds);
  thrust::sequence(d_i.begin(), d_i.end());
  thrust::sequence(d_a2.begin(), d_a2.end());
  thrust::sequence(d_a3.begin(), d_a3.end());
// sort
  thrust::sort_by_key(d_a1.begin(), d_a1.end(), d_i.begin());
// copy, using sorted indices
  thrust::copy_n(thrust::make_permutation_iterator(thrust::make_zip_iterator(thrust::make_tuple(d_a2.begin(), d_a3.begin())), d_i.begin()), ds, thrust::make_zip_iterator(thrust::make_tuple(d_a2r.begin(), d_a3r.begin())));
// output results
  thrust::host_vector<ft> h_a1 = d_a1;
  thrust::host_vector<ft> h_a2 = d_a2r;
  thrust::host_vector<ft> h_a3 = d_a3r;
  std::cout << "a1: " ;
  thrust::copy_n(h_a1.begin(), ds, std::ostream_iterator<ft>(std::cout, ","));
  std::cout << std::endl << "a2: " ;
  thrust::copy_n(h_a2.begin(), ds, std::ostream_iterator<ft>(std::cout, ","));
  std::cout << std::endl << "a3: " ;
  thrust::copy_n(h_a3.begin(), ds, std::ostream_iterator<ft>(std::cout, ","));
  std::cout << std::endl;
}
$ nvcc -o t282 t282.cu
$ cuda-memcheck ./t282
========= CUDA-MEMCHECK
a1: -3,0,1,2,4,
a2: 1,0,4,3,2,
a3: 1,0,4,3,2,
========= ERROR SUMMARY: 0 errors
$

Here, in lieu of a thrust::gather or thrust::scatter operation, I'm simply doing a thrust::copy_n with a thrust::permutation_iterator, in order to effect the reordering. I combine the remaining arrays to be reordered using thrust::zip_iterator, but this isn't the only way to do it.

Note that I'm not doing it for 10 arrays but for 3, however this should illustrate the method. The extension to 10 arrays should be just mechanical. Note however that the method would have to be modified somewhat for more than 10-11 arrays, as thrust::tuple is limited to 10 items. As a modification, you could simply call thrust::copy_n in a loop, once for each array to be reordered, rather than using zip_iterator. I don't think this should make a large difference in efficiency.

einpoklum · Accepted Answer · 2018-08-27 20:18:21Z

A few ways to go about doing this (irrespective of Thrust):

‏
1. Initialize an array of indices indices to 0, 1, 2, 3... etc.
2. Sort indices, with the comparison function being accessing elements in one of the arrays (the one for which comparisons are cheapest), and comparing those. Call the resulting array
3. For each one of the 10 arrays, arr apply a Gather operation using the sorted indices and arr as the data from which to gather. i.e. sorted_arr[i] = arr[indices[i]] for all i.
‏
1. Adapt one of the sort implementations to also do "index log-keeping", i.e. whenever you swap or position data in a "real" array, also set an index in an indices array.
2. Run this index-keeping sorty on one of the 10 arrays (the one that's cheapest to sort).
3. Apply the Gather from 1.3 to the other 9 arrays
Let cheap be the cheapest array to sort (or to compare elements)
1. Create an array of pairs pairs[i] = { i, cheap[i] } of the approrpriate type.
2. Have the comparison of these pairs only use the second element of the pair.
3. Sort pairs
4. Project pairs onto its first element: indices[i] = pairs[i].first
5. Project pairs onto its second element: sorted_cheap[i] = pairs[i].second
6. Apply the Gather from 1.3 to the other nine arrays

The second option should be the fastest, but would require more effort; and with thrust, it's probably quite difficult. Either the first or the third should be the easiest; and thrust accepts custom comparators, right? If it doesn't, you might have to define a wrapper class with the appropriate comparators.

Collectives™ on Stack Overflow

CUDA: Sort an array according to the order defined by another array using thrust

2 Answers 2

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related