I want to use vector<char> as a buffer. The interface is perfect for my needs, but there's a performance penalty when resizing it beyond its current size, since the memory is initialized. I don't need the initialization, since the data will be overwritten in any case by some third-party C functions. Is there a way or a specific allocator to avoid the initialization step? Note that I do want to use resize(), not other tricks like reserve() and capacity(), because I need size() to always represent the significative size of my "buffer" at any moment, while capacity() might be greater than its size after a resize(), so, again, I cannot rely on capacity() as a significative information for my application. Furthemore, the (new) size of the vector is never known in advance, so I cannot use std::array. If vector cannot be configured that way, I'd like to know what kind of container or allocator I could use instead of vector<char, std::alloc>. The only requirement is that the alternative to vector must at most be based on STL or Boost. I have access to C++11.
6 Answers
It is a known issue that initialization can not be turned off even explicitly for std::vector.
People normally implement their own pod_vector<> that does not do any initialization of the elements.
Another way is to create a type which is layout-compatible with char, whose constructor does nothing:
struct NoInitChar
{
char value;
NoInitChar() noexcept {
// do nothing
static_assert(sizeof *this == sizeof value, "invalid size");
static_assert(__alignof *this == __alignof value, "invalid alignment");
}
};
int main() {
std::vector<NoInitChar> v;
v.resize(10); // calls NoInitChar() which does not initialize
// Look ma, no reinterpret_cast<>!
char* beg = &v.front().value;
char* end = beg + v.size();
}
15 Comments
NoInitChar has the same size and alignment as char. Those static asserts are primarily compilable documentation.char size and alignment (whatever a compiler could gain from adding unneccessary padding, but well, the standard at least allows it). So those asserts are not true by standard, even if probably true on any reasonable platform. But interesting answer +1.std::vector::vector and std::vector::resize take default parameter const T& value=T(), which means that the uninitialized value would be copied over the whole vector, thus leading to UB due to read of uninitialized variable.There's nothing in the standard library that meets your requirements, and nothing I know of in boost either.
There are three reasonable options I can think of:
- Stick with
std::vectorfor now, leave a comment in the code and come back to it if this ever causes a bottleneck in your application. - Use a custom allocator with empty
construct/destroymethods - and hope your optimiser will be smart enough to remove any calls to them. - Create a wrapper around a a dynamically allocated array, implementing only the minimal functionality that you require.
11 Comments
_Construct if std::allocator is used, because the result is "as if" each element had been initialized by std::allocator<T>::construct, but for a custom allocator that optimization is not used. See <bits/stl_uninitialized.h> and the __uninitialized_fill_n_a function which is overloaded on the allocator type and either dispatches to __uninitialized_fill_n or allocator_traits<A>::constructdefault_init_constructor of this answer implements an allocator whose construct method default initializes if no arguments are given, i.e., it implements solution 2 proposed by Joe Gauderin.As an alternative solution that works with vectors of different pod types:
template<typename V>
void resize(V& v, size_t newSize)
{
struct vt { typename V::value_type v; vt() {}};
static_assert(sizeof(vt[10]) == sizeof(typename V::value_type[10]), "alignment error");
typedef std::vector<vt, typename std::allocator_traits<typename V::allocator_type>::template rebind_alloc<vt>> V2;
reinterpret_cast<V2&>(v).resize(newSize);
}
And then you can:
std::vector<char> v;
resize(v, 1000); // instead of v.resize(1000);
This is most likely UB, even though it works properly for me for cases where I care more about performance. Difference in generated assembly as produced by clang:
test():
push rbx
mov edi, 1000
call operator new(unsigned long)
mov rbx, rax
mov edx, 1000
mov rdi, rax
xor esi, esi
call memset
mov rdi, rbx
pop rbx
jmp operator delete(void*)
test_noinit():
push rax
mov edi, 1000
call operator new(unsigned long)
mov rdi, rax
pop rax
jmp operator delete(void*)
3 Comments
resize_hack or similar in recognition of the fact.So to summarize the various solutions found on stackoverflow:
- use a special default-init allocator. (https://stackoverflow.com/a/21028912/1984766)
drawback: changes the vector-type tostd::vector<char, default_init_allocator<char>> vec; - use a wrapper-struct
struct NoInitChararound a char that has an empty constructor and therefore skips the value-initialization (https://stackoverflow.com/a/15220853/1984766)
drawback: changes the vector-type tostd::vector<NoInitChar> vec; - temporarily cast the
vector<char>tovector<NoInitChar>and resize it (https://stackoverflow.com/a/57053750/1984766)
drawback: does not change the type of the vector but you need to callyour_resize_function (vec, x)instead ofvec.resize (x).
With this post I wanted to point out that all of these methods need to be optimized by the compiler in order to speed up your program. I can confirm that the initialization of the new chars when resizing is indeed optimized away with every compiler I tested. So everything looks good ...
But --> since method 1 & 2 change the type of the vector, what happens when you use these vectors under more "complex" circumstances.
Consider this example:
#include <time.h>
#include <vector>
#include <string_view>
#include <iostream>
//high precision-timer
double get_time () {
struct timespec timespec;
::clock_gettime (CLOCK_MONOTONIC_RAW, ×pec);
return timespec.tv_sec + timespec.tv_nsec / (1000.0 * 1000.0 * 1000.0);
}
//method 1 --> special allocator
//reformated to make it readable
template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
private:
typedef std::allocator_traits<A> a_t;
public:
template<typename U>
struct rebind {
using other = default_init_allocator<U, typename a_t::template rebind_alloc<U>>;
};
using A::A;
template <typename U>
void construct (U* ptr) noexcept (std::is_nothrow_default_constructible<U>::value) {
::new (static_cast<void*>(ptr)) U;
}
template <typename U, typename...Args>
void construct (U* ptr, Args&&... args) {
a_t::construct (static_cast<A&>(*this), ptr, std::forward<Args>(args)...);
}
};
//method 2 --> wrapper struct
struct NoInitChar {
public:
NoInitChar () noexcept { }
NoInitChar (char c) noexcept : value (c) { }
public:
char value;
};
//some work to waste time
template<typename T>
void do_something (T& vec, std::string_view str) {
vec.push_back ('"');
vec.insert (vec.end (), str.begin (), str.end ());
vec.push_back ('"');
vec.push_back (',');
}
int main (int argc, char** argv) {
double timeBegin = get_time ();
std::vector<char> vec; //normal case
//std::vector<char, default_init_allocator<char>> vec; //method 1
//std::vector<NoInitChar> vec; //method 2
vec.reserve (256 * 1024 * 1024);
for (int i = 0; i < 1024 * 1024; ++i) {
do_something (vec, "foobar1");
do_something (vec, "foobar2");
do_something (vec, "foobar3");
do_something (vec, "foobar4");
do_something (vec, "foobar5");
do_something (vec, "foobar6");
do_something (vec, "foobar7");
do_something (vec, "foobar8");
vec.resize (vec.size () + 64);
}
double timeEnd = get_time ();
std::cout << (timeEnd - timeBegin) * 1000 << "ms" << std::endl;
return 0;
}
You would expect that method 1 & 2 outperform the normal vector with every "recent" compiler since the resize is free and the other operations are the same. Well think again:
g++ 7.5.0 g++ 8.4.0 g++ 9.3.0 clang++ 9.0.0
vector<char> 95ms 134ms 133ms 97ms
method 1 130ms 159ms 166ms 91ms
method 2 166ms 160ms 159ms 89ms
All test-applications are compiled like this and executed 50times taking the lowest measurement:
$(cc) -O3 -flto -std=c++17 sample.cpp
2 Comments
do_something(). That's why I said never trust casual benchmarks.It's very rare that you would need to do this; I strongly encourage you to benchmark your situation to be absolutely sure this hack is needed.
Even then, I prefer the NoInitChar solution. (See Maxim's answer)
But if you're sure you would benefit from this, and NoInitChar doesn't work for you, and you're using clang, or gcc, or MSVC as your compiler, consider using folly's routine for this purpose.
See https://github.com/facebook/folly/blob/master/folly/memory/UninitializedMemoryHacks.h
The basic idea is that each of these library implementations has a routine for doing uninitialized resizing; you just need to call it.
While hacky, you can at least console yourself with knowing that facebook's C++ code relies on this hack working properly, so they'll be updating it if new versions of these library implementations require it.
reserve/capacity? Orstd::array, or an appropriate constructor..Not very clear what you want to do/avoid..operator[]will assert if you use that funcationality to access the 'fake' elements etc.