When using OpenMP, I would like to declare a user-defined reduction for a class template.

#include <omp.h>
#include <iostream>

template<typename T>
class Foo
{
  public:
  T Data_;

  template<typename U> friend Foo<U> operator+( const Foo<U>& lhs, const Foo<U>& rhs );
};

template<typename U>
Foo<U> operator+( const Foo<U>& lhs, const Foo<U>& rhs )
{
  Foo<U> Addition;

  Addition.Data_ = lhs.Data_ + rhs.Data_;

  return Addition;
}

#pragma omp declare reduction( + : template<typename U> Foo<U> : omp_out = omp_out + omp_in ) initializer (omp_priv=omp_orig)

int main( int argc, char* argv[] )
{
  Foo<int> Array[100];
  
  for ( int i = 0 ; i < 100 ; ++i )
  {
    Array[i].Data_ = i;
  }
  
  Foo<int> Sum {0};
  
  #pragma omp parallel for num_threads(4) reduction( + : Sum )
  for ( int i = 0 ; i < 100 ; ++i )
  {
    Sum.Data_ += Array[i].Data_;
  }
  
  std::cout << Sum.Data_ << std::endl;
  
  return 0;
}

But I get the following error:

error: expected type-specifier before 'template'
#pragma omp declare reduction( + : template<typename U> Foo<U> : omp_out = omp_out + omp_in ) initializer (omp_priv=omp_orig)
                                   ^~~~~~~~

I can fix the error by replacing template<typename U> Foo<U> with Foo<int>.

But I would like to know whether there is a solution that keeps the template.

Thanks.

1 Answer

Supposing you have a templated function

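template<typename T>
T f(T x, T y) { return x < y ? x : y; }  // example combiner (a minimum); any associative, commutative operation works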

you can templatize the reduction:

#pragma omp declare reduction                                   \
  (rwzt:T:omp_out=f<T>(omp_out,omp_in))

And use that as:

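template<typename T>
T generic_reduction( const vector<T>& tdata ) {
  // The initializer gives each thread's private copy the starting value of tmin.
  #pragma omp declare reduction                                   \
    (rwzt:T:omp_out=f<T>(omp_out,omp_in))                         \
    initializer(omp_priv=omp_orig)

  T tmin = tdata.at(0);  // assumes tdata is non-empty
  #pragma omp parallel for reduction(rwzt:tmin)
  for ( size_t i = 0; i < tdata.size(); ++i )
    tmin = f<T>( tmin, tdata[i] );
  return tmin;
}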

auto tm = generic_reduction<float>( /* some vector<float> */ );
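Applied to the Foo<T> class from the question, the same idea might look like the sketch below. The names sum_foo and foo_plus are made up for illustration, and the initializer assumes that a value-initialized Foo<T> (Data_ equal to zero) is the identity for operator+. The point is that inside the function template T is already a known type, so the type list of the declare reduction directive can name Foo<T> directly.

```cpp
#include <cstddef>
#include <vector>

template<typename T>
struct Foo
{
  T Data_;
};

template<typename T>
Foo<T> operator+( const Foo<T>& lhs, const Foo<T>& rhs )
{
  Foo<T> Addition;
  Addition.Data_ = lhs.Data_ + rhs.Data_;
  return Addition;
}

// Hypothetical wrapper: the reduction is declared where T is in scope.
template<typename T>
Foo<T> sum_foo( const std::vector<Foo<T>>& data )
{
  // Each private copy starts as a value-initialized Foo<T> (assumed to be
  // the identity of operator+), and partial results are combined with operator+.
  #pragma omp declare reduction( foo_plus : Foo<T> : omp_out = omp_out + omp_in ) \
    initializer( omp_priv = Foo<T>() )

  Foo<T> total = Foo<T>();
  #pragma omp parallel for reduction( foo_plus : total )
  for ( std::size_t i = 0; i < data.size(); ++i )
  {
    total = total + data[i];
  }
  return total;
}
```

Calling sum_foo on a std::vector<Foo<int>> whose Data_ members hold 0 through 99 should give a Data_ of 4950, with or without OpenMP enabled at compile time.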

I'm somewhat bothered by the fact that this needs a named function to contain the reduction, rather than having all the code inlined. I cannot figure out a way to do this with a lambda in the calling environment.
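If the set of instantiations is known in advance, one possible way to avoid the wrapping function template is to list the concrete types in a single namespace-scope declaration, since the directive accepts a comma-separated type list. This is only a sketch: zero_foo, foo_plus, sum_ints and sum_doubles are hypothetical names, and the function-call form of the initializer is used so that one clause can serve every listed type.

```cpp
#include <cstddef>
#include <vector>

template<typename T>
struct Foo
{
  T Data_;
};

template<typename T>
Foo<T> operator+( const Foo<T>& lhs, const Foo<T>& rhs )
{
  Foo<T> Addition;
  Addition.Data_ = lhs.Data_ + rhs.Data_;
  return Addition;
}

// Hypothetical helper: sets a private copy to the additive identity.
template<typename T>
void zero_foo( Foo<T>* p )
{
  p->Data_ = T();
}

// One declaration for a fixed list of types; it is not a template.
#pragma omp declare reduction( foo_plus : Foo<int>, Foo<double> : omp_out = omp_out + omp_in ) \
  initializer( zero_foo( &omp_priv ) )

int sum_ints( const std::vector<Foo<int>>& data )
{
  Foo<int> total;
  total.Data_ = 0;
  #pragma omp parallel for reduction( foo_plus : total )
  for ( std::size_t i = 0; i < data.size(); ++i )
  {
    total = total + data[i];
  }
  return total.Data_;
}

double sum_doubles( const std::vector<Foo<double>>& data )
{
  Foo<double> total;
  total.Data_ = 0.0;
  #pragma omp parallel for reduction( foo_plus : total )
  for ( std::size_t i = 0; i < data.size(); ++i )
  {
    total = total + data[i];
  }
  return total.Data_;
}
```

The trade-off is that every new instantiation has to be added to the list by hand, which is what the function-template approach avoids.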

2 Comments

Good. This seems to be well supported by mainstream compilers. Just a minor remark: I think it is better to use const vector<T>& instead of vector<T> to avoid unneeded copies.
@JérômeRichard With vectors of a size worth parallelizing you're absolutely right. Code edited.
