5

I have tried to allocate space for 10^7 integers in heap and stack memory to see which one is faster. Obviously allocating in heap-memory was much faster but I don't understand the reason.

#include <bits/stdc++.h>
#include <chrono>

using namespace std;
using namespace std::chrono;

int main()
{
  high_resolution_clock::time_point t1 = high_resolution_clock::now();

  int *p = new int[1e7];

  high_resolution_clock::time_point t2 = high_resolution_clock::now();
  auto duration = duration_cast<microseconds>( t2 - t1 ).count();
  cout << duration / 1e6 << "\n"; // 5e-06



  t1 = high_resolution_clock::now();

  vector<int> v(1e7);

  t2 = high_resolution_clock::now();
  duration = duration_cast<microseconds>( t2 - t1 ).count();
  cout << duration / 1e6 << "\n"; // 0.112284

  return 0;
}
10
  • 17
    Where do you think you allocated space for 10^7 integers on the stack? Commented Jul 30, 2019 at 14:20
  • 3
    Why one shouldn't include bits/stdc++.h... Commented Jul 30, 2019 at 14:23
  • 1
    About using namespace std... Commented Jul 30, 2019 at 14:24
  • 2
    @ReticulatedSpline However, that memory isn't actually reserved by the OS until you try to write to it. That's a "feature" of Linux (and a few other OSes) which can be disabled if you value system stability. Commented Jul 30, 2019 at 14:27
  • 2
    std::vector allocates its data on the heap as well! Only a few member variables, including a pointer, actually reside on the stack; try sizeof(std::vector<int>), giving the number of bytes that really are allocated on stack... Commented Jul 30, 2019 at 14:27

4 Answers 4

17

new int[1e7] allocates space for 1e7 int values and doesn't initialize them.

vector<int> v(1e7); creates a vector<int> object on the stack, and the constructor for that object allocates space for 1e7 int values on the heap. It also initializes each int value to 0.

The difference in speed is because of the initialization.

To compare the speed of stack allocation you need to allocate an array on the stack:

int data[1e7];

but be warned: there's a good chance that this will fail because the stack isn't large enough for that big an array.

Sign up to request clarification or add additional context in comments.

1 Comment

You have nailed it!
7

I am only a beginner, but let me give what I understand mainly to test myself.

In

int *p = new int[1e7];

you are allocating consecutive memory for 10 million integers on the heap.

In

vector<int> v(1e7);

you are allocating on the stack memory for a vector<int> object. Among the members of that object there is a pointer to an int[1e7] on the heap, that is also allocated. Moreover, all the values in it are initialized with the value of int() (with 0s). See constructor (2) of std::vector.

Comments

6

Other answers pointed out that there is at least a "hidden" initialization in vector constructor.

But your example has another problem: maybe it does not even measure what you think it does. Benchmarking unoptimized code in C++ is pretty much meaningless and properly timing optimized code is hard.

Let's take a look at your (modified for readability) example compiled by Clang with -O3 optimization level: godbolt link.

double test1() {
  high_resolution_clock::time_point t1 = high_resolution_clock::now();

  int *p = new int[1e7];

  high_resolution_clock::time_point t2 = high_resolution_clock::now();
  auto duration = duration_cast<microseconds>( t2 - t1 ).count();
  return duration / 1e6; // 5e-06
}

compiled to:

test1():                              # @test1()
        push    rbx
        call    std::chrono::_V2::system_clock::now()
        mov     rbx, rax
        call    std::chrono::_V2::system_clock::now()
        sub     rax, rbx
        movabs  rcx, 2361183241434822607
        imul    rcx
        mov     rax, rdx
        shr     rax, 63
        sar     rdx, 7
        add     rdx, rax
        cvtsi2sd        xmm0, rdx
        divsd   xmm0, qword ptr [rip + .LCPI0_0]
        pop     rbx
        ret
.LCPI1_0:
        .quad   4696837146684686336     # double 1.0E+6

First part does not even call operator new! Compiler saw through your program and realized that you never used allocated array so it removed allocation from resulting executable.

So first part of your program does not allocate array on heap at all when compiled with such settings making measurements meaningless.

I recommend to read about benchmarking and use specialized micro benchmark frameworks to make such tests. Take a look at Google Benchmark (and online QuickBench) and it's documentation.

1 Comment

I like your answer here, it makes an important point--I think you say "Benchmarking unoptimized code in C++ is pretty much meaningless" because in the end you will probably allow the compiler to optimize the code. It might help to add that if that's what is meant. It seems to me if you never intend to optimize it, then benchmarking that might be what you want.
0

I would like to note that the stack allocation takes absolutely no time at the run time; all the work is done by compiler. The comparison is meaningless regardless of optimization.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.