One important reason is performance: shared_ptr::get() doesn't have to dereference a pointer to find the object address if it's stored directly inside the shared_ptr object.
But apart from performance, an implementation where the smart pointer holds only a pointer to the control block wouldn't support all the things you can do with shared_ptr, e.g.
std::shared_ptr<int> pi(new int(0));
std::shared_ptr<void> pv = pi;
std::shared_ptr<int> pi2 = std::static_pointer_cast<int>(pv);
struct A {
int i;
};
std::shared_ptr<A> pa(new A);
std::shared_ptr<int> pai(pa, &pa->i);
struct B { virtual ~B() = default; };
struct C : B { };
std::shared_ptr<B> pb(new C);
std::shared_ptr<C> pc = std::dynamic_pointer_cast<C>(pb);
In these examples pv, pai and pb store a pointer that is not the same type as the pointer owned by the control block, so there must be a second pointer (which might be a different type) stored in the shared_ptr itself.
For pv and pb it would be possible to make it work, by converting the pointer stored in the control block to the type that needs to be returned. That would work in some cases, although there are examples using multiple inheritance that would not work correctly.
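One way the multiple inheritance problem can show up is sketched below. The types Left, Right and Both and the two helper functions are made up for illustration; they are not from the question. The two base subobjects of a single Both object live at different addresses, so the pointer a shared_ptr<Right> has to return is not simply the owned pointer reinterpreted as another type. A type-erased control block would have no way of knowing which adjustment the particular shared_ptr needs.

```cpp
#include <memory>

// Hypothetical types: two non-empty bases, so the base subobjects cannot overlap.
struct Left  { int l = 1; virtual ~Left() = default; };
struct Right { int r = 2; virtual ~Right() = default; };
struct Both : Left, Right { };

// Returns true if the Left and Right views of one Both object
// hold different addresses in their shared_ptr.
bool base_views_differ() {
    std::shared_ptr<Both> both = std::make_shared<Both>();
    std::shared_ptr<Left> left = both;    // implicit derived-to-base conversion
    std::shared_ptr<Right> right = both;  // same owner, differently adjusted pointer
    return static_cast<void*>(left.get()) != static_cast<void*>(right.get());
}

// Reads Right::r through a shared_ptr<Right> created by the normal conversion,
// which stores the already adjusted Right* in the shared_ptr itself.
int read_right() {
    std::shared_ptr<Right> right = std::make_shared<Both>();
    return right->r;
}
```

Because the adjusted pointer is stored in each shared_ptr, get() can just return it without knowing anything about the dynamic type of the owned object.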
But for the pai example (which uses the aliasing constructor) there is no way to make that work without storing a pointer separate to the one in the control block, because the two pointers are completely unrelated types and you can't convert between them.
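A minimal sketch of what the aliasing constructor gives you, under the assumption that A::i is initialised to 7 (the original struct leaves it uninitialised) and with helper function names invented for this example. The aliasing shared_ptr shares ownership of the whole A, but its stored pointer is the address of the member.

```cpp
#include <memory>

struct A {
    int i = 7;  // initial value is an assumption, added so the result is well defined
};

// Returns true if pai points at the member while sharing ownership with pa.
bool alias_shares_ownership() {
    std::shared_ptr<A> pa(new A);
    std::shared_ptr<int> pai(pa, &pa->i);  // aliasing constructor
    return pai.get() == &pa->i             // stored pointer is the member's address
        && pa.use_count() == 2             // both count as owners of the A object
        && !pa.owner_before(pai)           // neither orders before the other,
        && !pai.owner_before(pa);          // i.e. they share one control block
}

// The member stays valid after pa is reset, because pai keeps the whole A alive.
int member_outlives_owner() {
    std::shared_ptr<A> pa(new A);
    std::shared_ptr<int> pai(pa, &pa->i);
    pa.reset();
    return *pai;
}
```

The control block only knows about the A* it has to delete, so the int* has to be kept somewhere else, and the shared_ptr object itself is the only place left.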
You said in a comment:
I see and in case of make_shared, second pointer points to the address internal to the allocated block. (I actually tried this already and it seems that way)
Yes, that's correct. There is still a second pointer, but both pointers refer into the same block of memory. This has the advantage that only one memory allocation is needed instead of two separate ones for the object and the control block. Additionally, the object and control block are adjacent in memory so are more likely to share a cache line. If the CPU has got the ref-count in its cache already then it probably also has the object in its cache, so accessing them both is faster and means there is another cache line available to be used for other data.
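The single allocation can be observed with a counting allocator. This sketch uses std::allocate_shared, the allocator-aware sibling of make_shared, rather than make_shared itself, because that makes the allocation visible without replacing the global operator new. CountingAlloc and the helper function are hypothetical names written for this example. The standard only recommends a single allocation here; the count of 1 is what the common standard library implementations do in practice.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical minimal allocator that counts calls to allocate().
template <class T>
struct CountingAlloc {
    using value_type = T;
    int* count;  // counter owned by the caller, shared by all rebound copies

    explicit CountingAlloc(int* c) noexcept : count(c) {}
    template <class U>
    CountingAlloc(const CountingAlloc<U>& other) noexcept : count(other.count) {}

    T* allocate(std::size_t n) {
        ++*count;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const CountingAlloc<T>& a, const CountingAlloc<U>& b) noexcept {
    return a.count == b.count;
}
template <class T, class U>
bool operator!=(const CountingAlloc<T>& a, const CountingAlloc<U>& b) noexcept {
    return !(a == b);
}

// Number of allocate() calls made while creating one shared int,
// or -1 if the stored value is not what was passed in.
int allocations_for_allocate_shared() {
    int count = 0;
    {
        std::shared_ptr<int> p = std::allocate_shared<int>(CountingAlloc<int>(&count), 42);
        if (*p != 42) return -1;
    }  // p is destroyed here, releasing the combined block
    return count;
}
```

With shared_ptr<int>(new int(42)) the object is allocated by the caller first and the control block is allocated separately afterwards, which is the two-allocation case described above.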
std::unique_ptr doesn't perform any extra heap allocations, while make_shared does. So of course the memory layout will be different. If you mean auto up = std::shared_ptr(ptr); then yes, it is a common implementation technique to have make_shared combine memory for the object itself, and for the control block, together into a single heap allocation, while shared_ptr's constructor obviously can't do that (the allocation for the object has already happened).