Here is a simplified piece of code that causes an error for me (Vec4d comes from Agner Fog's Vector Class Library, VCL):

#define AVX256_ALIGNED_MALLOC(type,size) (type *)_aligned_malloc(size * sizeof(type),32)
#define AVX256_FREE(ptr) _aligned_free(ptr)

int N = 1024;
std::vector<double> A(N);
double* Aaligned = AVX256_ALIGNED_MALLOC(double, N);
memcpy(Aaligned, &A[0], N * sizeof(double));

int N4=N>>2;

for(size_t i = 4; i <N4-4; ++i)
{
    //....
    ... = ((Vec4d*)(Aaligned - 1))[i] + ((Vec4d*)(Aaligned))[i] + ((Vec4d*)(Aaligned + 1))[i];
}

AVX256_FREE(Aaligned);

It is clear to me that I am allowed to use

((Vec4d*)(Aaligned))[i]

Can you confirm that I cannot use

((Vec4d*)(Aaligned-1))[i] 

or ((Vec4d*)(Aaligned+1))[i]? Any hints? Many thanks. Luc

  • The misaligned access may fail, depending on how the compiler implements it. Use Vec4d().load rather than Vec4d().load_a if you want to use misaligned data. Commented Oct 28, 2022 at 6:48
  • You can find aligned container classes for the Vector Class Library at github.com/vectorclass/add-on/tree/master/containers Commented Oct 28, 2022 at 6:50
  • Is it even safe to point a Vec4d* at some double[] data? __m256d is defined as __attribute__((may_alias)), but Vec4d isn't, so that's a strict-aliasing violation. Use load intrinsics like _mm256_loadu_pd, or the .load member function. Also, use an aligned allocator on your std::vector<double, custom_allocator> if you really want alignment, or just use unaligned loads on your std::vector directly. Definitely don't allocate + memcpy! Commented Oct 28, 2022 at 14:53
  • Thanks for these hints. Could "load" carry a prohibitive run-time cost compared with the (Vec4d*) cast? Thank you. Commented Nov 18, 2022 at 15:15
  • Thanks for these hints. Could "load" carry a prohibitive run-time cost compared with the (Vec4d*) cast? Put another way, does "load" perform a copy, or does it simply wrap a memory buffer? Thank you. Commented Dec 2, 2022 at 14:56
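
To make the suggestion in the comments concrete, here is a minimal sketch of the three-neighbour sum written with explicit unaligned loads instead of pointer casts. It is only an illustration under stated assumptions: `Vec4` is a hypothetical stand-in type (not part of VCL) so that the sketch compiles without the library, `stencil3` is a made-up name, and the loop runs over element indices rather than over Vec4d blocks as in the question. With VCL itself, the comments indicate that `Vec4d().load(p)` is the call that accepts misaligned pointers and `load_a(p)` the one that requires alignment.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for VCL's Vec4d (an assumption for illustration only).
// It holds four doubles by value, so a "load" copies 32 bytes out of memory
// into the object; it does not wrap or alias the source buffer.
struct Vec4 {
    double v[4];

    // Unaligned load: memcpy has no alignment requirement and no aliasing issue.
    Vec4& load(const double* p) { std::memcpy(v, p, sizeof v); return *this; }

    // Unaligned store of the four lanes.
    void store(double* p) const { std::memcpy(p, v, sizeof v); }
};

// Lane-wise addition of two vectors.
inline Vec4 operator+(const Vec4& a, const Vec4& b) {
    Vec4 r;
    for (int k = 0; k < 4; ++k) r.v[k] = a.v[k] + b.v[k];
    return r;
}

// Computes out[j] = in[j-1] + in[j] + in[j+1], four elements per iteration,
// skipping the first and last four elements. Works directly on the
// std::vector, so no aligned copy of the data is needed.
std::vector<double> stencil3(const std::vector<double>& in) {
    std::vector<double> out(in.size(), 0.0);
    const double* a = in.data();
    for (std::size_t j = 4; j + 8 <= in.size(); j += 4) {
        Vec4 sum = Vec4().load(a + j - 1)   // shifted left by one element
                 + Vec4().load(a + j)       // centre
                 + Vec4().load(a + j + 1);  // shifted right by one element
        sum.store(out.data() + j);
    }
    return out;
}
```

Calling `stencil3` on a vector filled with 0, 1, 2, ... should give three times the index for every interior element. On the cost question: a vector value has to be brought into a register before it can be added in either formulation, so an explicit load is not expected to add a copy that the cast avoids. That is a general expectation, not a measurement.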
