I'm using the StringZilla string library, which is meant to behave much like std::string. It has a flag called SZ_USE_MISALIGNED_LOADS, which is enabled by default on x86 and is defined like this:
/**
* @brief A misaligned load can be - trying to fetch eight consecutive bytes from an address
* that is not divisible by eight. On x86 enabled by default. On ARM it's not.
*
* Most platforms support it, but there is no industry standard way to check for those.
* This value will mostly affect the performance of the serial (SWAR) backend.
*/
#ifndef SZ_USE_MISALIGNED_LOADS
#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
#define SZ_USE_MISALIGNED_LOADS (1) // true or false
#else
#define SZ_USE_MISALIGNED_LOADS (0) // true or false
#endif
#endif
So I understand that unaligned loads are supported on x86_64 (my platform). However, when I turn on UBSAN I get a warning about a misaligned access. It happens when I construct a string from a char*. The StringZilla class creates a string view from my char* and then does:
void init(string_view other) noexcept(false)
{
    // "other" is the string_view I passed in the constructor
    sz_ptr_t start; // after allocating memory start = 0x7bfff5900029
    if (!_with_alloc(
            [&](sz_alloc_type &alloc) { return (start = sz_string_init_length(&string_, other.size(), &alloc)); }))
        throw std::bad_alloc();
    sz_copy(start, (sz_cptr_t)other.data(), other.size());
}
Then inside sz_copy (the serial version, sz_copy_serial) it looks like this:
SZ_PUBLIC void sz_copy_serial(sz_ptr_t target, sz_cptr_t source, sz_size_t length) {
    // The most typical implementation of `memcpy` suffers from Undefined Behavior:
    //
    //      for (char const *end = source + length; source < end; source++) *target++ = *source;
    //
    // As NULL pointer arithmetic is undefined for calls like `memcpy(NULL, NULL, 0)`.
    // That's mitigated in C2y with the N3322 proposal, but our solution uses a design, that has no such issues.
    // https://developers.redhat.com/articles/2024/12/11/making-memcpynull-null-0-well-defined
#if SZ_USE_MISALIGNED_LOADS
    while (length >= 8) *(sz_u64_t *)target = *(sz_u64_t const *)source, target += 8, source += 8, length -= 8;
#endif
    while (length--) *(target++) = *(source++);
}
The report comes from the very first 8-byte copy, i.e.:
*(sz_u64_t *)target = *(sz_u64_t const *)source
Here "target" holds 0x7bfff5900029, which I assume is the address returned by the allocation. UBSAN says:
runtime error: store to misaligned address 0x7bfff5900029 for type 'sz_u64_t' (aka 'unsigned long'), which requires 8 byte alignment
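To isolate the pattern, here is a minimal sketch of the two ways I can think of to move one 8-byte word. The helper names (store_u64_cast, store_u64_memcpy) are hypothetical and not part of StringZilla; the first mirrors the cast-and-store expression above, and I would expect UBSAN's alignment check to flag it for any pointer that is not 8-byte aligned, while the second goes through memcpy and should not be flagged.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not StringZilla API): the same cast-and-store pattern
// as the SZ_USE_MISALIGNED_LOADS branch. The language requires both pointers
// to be suitably aligned for std::uint64_t, whatever the CPU tolerates, so
// this is only here to show what the sanitizer is looking at.
inline void store_u64_cast(char *target, char const *source) {
    *(std::uint64_t *)target = *(std::uint64_t const *)source;
}

// Hypothetical helper: the same 8-byte move expressed with memcpy, which has
// no alignment requirement on either pointer.
inline void store_u64_memcpy(char *target, char const *source) {
    std::uint64_t word;
    std::memcpy(&word, source, sizeof word); // read 8 bytes from any address
    std::memcpy(target, &word, sizeof word); // write 8 bytes to any address
}
```

Calling store_u64_memcpy(dst + 1, src + 3) on plain char buffers copies the 8 bytes without any report on my side of the reasoning; whether it compiles down to the same single move instruction as the cast version is something I have not verified.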
My questions are:
- If x86_64 supports unaligned reads and this feature is enabled by default, why does UBSAN report it as undefined behaviour? Should I disable the flag manually? If I disable it, the copies are done byte by byte in a loop.
- Why is the allocation that stores the string at address 0x7bfff5900029 in the first place? That is an odd address, in terms of alignment, to get back from an allocation function; they usually return 8- or 16-byte-aligned pointers.
- Is this custom function implemented only because memcpy(NULL, NULL, 0) is UB? Wouldn't it be better to just check for NULL and call memcpy, which is likely to be much faster than any manual loop?
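For the last question, this is roughly what I have in mind. It is only a sketch under my own assumptions: copy_checked is a hypothetical name, not a StringZilla function, and I am assuming the only problematic case is the zero-length call where the pointers may be NULL, so an early return on length == 0 is enough of a guard.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper (not part of StringZilla): skip the memcpy call
// entirely when there is nothing to copy, so memcpy never sees the
// (NULL, NULL, 0) combination that the library comment is worried about.
inline void copy_checked(char *target, char const *source, std::size_t length) {
    if (length == 0) return;              // covers the memcpy(NULL, NULL, 0) case
    std::memcpy(target, source, length);  // no alignment requirement on either pointer
}
```

With this, copy_checked(nullptr, nullptr, 0) is a no-op, and a copy into an odd address such as dst + 1 goes through memcpy instead of a cast to sz_u64_t.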