| author | Thiago Macieira <thiago.macieira@intel.com> | 2024-10-03 16:37:06 -0700 |
|---|---|---|
| committer | Thiago Macieira <thiago.macieira@intel.com> | 2024-10-28 21:15:06 -0700 |
| commit | cabadef38341a6c29c49a64d8fea18d606637619 (patch) | |
| tree | 0b96e36b4b95d91bc820f020f14d3b605e7598d8 /src/corelib/serialization/qdatastream.cpp | |
| parent | 8a8e91a7c141994c331d2b9da7d9c5e2129c70b9 (diff) | |
QUtf8: add AVX512VL/AVX10.1-256 version of simd{Encode,Decode}Ascii()
We keep the AVX2 looping code and just add the code to perform short
loads using masks. This means the SSE2 code for short content gets
dead-code-eliminated. I also made the masked path the preferred one for
inputs of exactly 32 characters.
The best looping code I could come up with that used the VPMOVUSBW
instruction [1] was much worse than the AVX2 code, for either function.
Both functions may benefit from 512-bit support, but benchmarking on
real hardware is required.
[1] https://analysis.godbolt.org/z/scEa8bW1T
Change-Id: Ie76ef558f52bb2cf1f60fffd192d947ecb011706
Reviewed-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
Diffstat (limited to 'src/corelib/serialization/qdatastream.cpp')
0 files changed, 0 insertions, 0 deletions
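
The "short loads using masks" idea from the commit message can be sketched as follows. This is a minimal illustration, not the code from this commit: the helper name `decodeShortAscii`, its 16-byte limit, and its return convention are assumptions made for the example. When the compiler targets AVX512VL and AVX512BW, a masked load reads only the first `len` bytes (masked-out lanes are not accessed), the bytes are zero-extended to 16 bits, and a masked store writes only the valid prefix. Otherwise a scalar fallback with the same behavior is used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__AVX512VL__) && defined(__AVX512BW__) && defined(__GNUC__)
#  include <immintrin.h>
#  define SKETCH_USE_MASKED 1
#endif

// Hypothetical helper (not Qt's simdDecodeAscii): widens up to 16 bytes of
// ASCII to UTF-16 and returns how many characters were written. Conversion
// stops at the first byte with the high bit set.
static inline size_t decodeShortAscii(char16_t *dst, const unsigned char *src, size_t len)
{
    assert(len <= 16);
#ifdef SKETCH_USE_MASKED
    // One mask bit per valid input byte; len == 16 yields 0xFFFF.
    const __mmask16 loadMask = static_cast<__mmask16>((1u << len) - 1);
    // Masked-out lanes are zeroed and never read from memory.
    const __m128i data = _mm_maskz_loadu_epi8(loadMask, src);
    // Bit i is set when byte i has its top bit set, i.e. is not ASCII.
    const unsigned nonAscii = _mm_movepi8_mask(data);
    const size_t n = nonAscii ? size_t(__builtin_ctz(nonAscii)) : len;
    const __mmask16 storeMask = static_cast<__mmask16>((1u << n) - 1);
    // Zero-extend 16 bytes to 16 UTF-16 units, then store only the first n.
    _mm256_mask_storeu_epi16(dst, storeMask, _mm256_cvtepu8_epi16(data));
    return n;
#else
    // Scalar fallback with the same observable behavior.
    size_t n = 0;
    while (n < len && src[n] < 0x80) {
        dst[n] = src[n];
        ++n;
    }
    return n;
#endif
}
```

Because the masks cover every length from 0 to 16 in a single load and store, a helper like this needs no separate byte-by-byte tail handling, which is consistent with the message's note that the SSE2 code for short content becomes dead code.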
