I don't KNOW the answer, but here's some thoughts:
There is absolutely no REQUIREMENT that malloc is implemented in a particular way. However, an unbalanced tree is just as bad as a linked list in the worst case. A balanced linked list is a lot more maintenance. A tree, where you have two links per node, also takes up more memory than a single-linked list. Removing nodes in a linked list is easier as well as inserting at the end is very easy.
And in most systems, there is (almost) exactly one free for every malloc - so if you make one faster by making the other slower, you gain very little.
It is also relatively common that the "next allocation is the same as the previous one", meaning that if the last allocation is first in list, it's an O(1) operation.
In real-time systems, buckets are often used for allocations, such that there are a number of fixed sizes, and each time something is allocated out of the main heap, the size is rounded to the nearest larger size, and when freed it goes into the bucket of that size (which is a linked list). If there is a free element of that size already, then that allocation is used. Besides the speed of allocation/free being O(1), this has the benefit of reducing fragmentation - it's not completely impossible to "rip all the heap into small pieces, and then not have any large lumps left", but it's at least not possible to take up most of the memory by simply allocating a byte more each time until you have half the heap size in one allocation.
(Also, in Linux's GLIBC, allocations over a certain size do not end up in the linked list at all - they are allocated directly via mmap and released back with munmap when free is called)
Finally, algorithmic complexity is not everything - in real life, it is the actual time spent on something that matters - even if the algorithm has O(n) but each operation is fast, it can beat O(logn). Similarly, particularly in C++, small allocations are highly dominant, which means that overhead of more memory per node is an important factor.
mallocuses a linked list? just because one (of many many) implementation used it?ptmalloc2uses chunks in a linked list, and it's in the latest version of glibc.mallocandfreefunctions, using something else, and measure the time it takes to allocate, compared to current implementations.malloctake a large portion of the time. But in non-trivial programs, it's very often not where the time is spent - obviously highly dependent on exactly what the program does! But memory allocation has been around for the past 40 years at least, and the model has, largely, been the same - and I'm sure some people HAVE studied it.