One often reads that there is little performance difference between dynamically allocated array and std::vector.
Here are two versions of the problem 10 of project Euler test with two versions:
with std::vector:
const __int64 sum_of_primes_below_vectorversion(int max)
{
auto primes = new_primes_vector(max);
__int64 sum = 0;
for (auto p : primes) {
sum += p;
}
return sum;
}
const std::vector<int> new_primes_vector(__int32 max_prime)
{
std::vector<bool> is_prime(max_prime, true);
is_prime[0] = is_prime[1] = false;
for (auto i = 2; i < max_prime; i++) {
is_prime[i] = true;
}
for (auto i = 1; i < max_prime; i++) {
if (is_prime[i]) {
auto max_j = max_prime / i;
for (auto j = i; j < max_j; j++) {
is_prime[j * i] = false;
}
}
}
auto primes_count = 0;
for (auto i = 0; i < max_prime; i++) {
if (is_prime[i]) {
primes_count++;
}
}
std::vector<int> primes(primes_count, 0);
for (auto i = 0; i < max_prime; i++) {
if (is_prime[i]) {
primes.push_back(i);
}
}
return primes;
}
Note that I also tested the version version with the call to the default constructor of std::vector and without the precomputation of its final size.
Here is the array version:
const __int64 sum_of_primes_below_carrayversion(int max)
{
auto p_length = (int*)malloc(sizeof(int));
auto primes = new_primes_array(max, p_length);
auto last_index = *p_length - 1;
__int64 sum = 0;
for (int i = 0; i < last_index; i++) {
sum += primes[i];
}
free((__int32*)(primes));
free(p_length);
return sum;
}
const __int32* new_primes_array(__int32 max_prime, int* p_primes_count)
{
auto is_prime = (bool*)malloc(max_prime * sizeof(bool));
is_prime[0] = false;
is_prime[1] = false;
for (auto i = 2; i < max_prime; i++) {
is_prime[i] = true;
}
for (auto i = 1; i < max_prime; i++) {
if (is_prime[i]) {
auto max_j = max_prime / i;
for (auto j = i; j < max_j; j++) {
is_prime[j * i] = false;
}
}
}
auto primes_count = 0;
for (auto i = 0; i < max_prime; i++) {
if (is_prime[i]) {
primes_count++;
}
}
*p_primes_count = primes_count;
int* primes = (int*)malloc(*p_primes_count * sizeof(__int32));
int index_primes = 0;
for (auto i = 0; i < max_prime; i++) {
if (is_prime[i]) {
primes[index_primes] = i;
index_primes++;
}
}
free(is_prime);
return primes;
}
This is compiled with the MVS2013 compiler, with optimization flags O2. I don't really see what should be the big difference, because of the move semantics (allowing returning the big vector by value without copy).
Here are the results, with an input of 2E6:
C array version
avg= 0.0438
std= 0.00928224
vector version
avg= 0.0625
std= 0.0005
vector version (no realloc)
avg= 0.0687
std= 0.00781089
The statistics are on 10 trials. I think there are quite some differences here. Is it because something in my code to be improved?
edit: after correction of my code (and another improvement), here are my new results:
C array version
avg= 0.0344
std= 0.00631189
vector version
avg= 0.0343
std= 0.00611637
vector version (no realloc)
avg= 0.0469
std= 0.00997447
which confirms that there is no penalty of std::vector compare to C arrays (and that one should avoid reallocating).
const, that may inhibit optimization