
Background: I'm developing a multiplatform framework of sorts that will be used as a base for both game and util/tool creation. The basic idea is to have a pool of workers, each executing in its own thread. (Furthermore, workers will also be able to spawn at runtime.) Each thread will have its own memory manager.

I have long thought about creating my own memory management system, and I think this project will be perfect to finally give it a try. I find such a system fitting because typical uses of this framework (games and texture editing tools) will often require memory allocation in real time.

Problems:

  • No generally applicable solution(?) - The framework will be used both for games/visualization (not AAA, but indie/play) and for tool/application creation. My understanding is that in game development it is usual (at least for console games) to allocate a big chunk of memory only once at initialization, and then manage this memory internally in the memory manager. But is this technique applicable to a more general application?

    In a game you could theoretically know how much memory your scenes and resources will need, but for example, a photo editing application will load resources of all different sizes... So in the latter case a more dynamic memory "chunk size" would be needed? Which leads me to the next problem:

  • Moving already allocated data and keeping valid pointers - Normally when allocating on the heap, you acquire a simple pointer to the memory chunk. In a custom memory manager, as far as I understand it, the analogous approach is to return a pointer to somewhere free in the pre-allocated chunk. But what happens if the pre-allocated chunk is too small and needs to be resized or even defragmented? The data would need to be moved around in memory and the old pointers would become invalid. Is there a way to transparently wrap these pointers in some way, but still use them "outside" the memory management as if they were usual C++ pointers?

  • Third party libraries - If there is no way to transparently use a custom memory management system for all memory allocation in the application, every third party library I link against will still use the "old" OS memory allocations internally. I have learned that it is common for libraries to expose functions for setting custom allocation functions that the library will use, but it is not guaranteed that every library I use will have this ability.

Questions: Is it possible and feasible to implement a memory manager that can use a dynamically sized memory chunk pool? If so, how would defragmentation and memory resize work, without breaking currently in-use pointers? And finally, how is such a system best implemented to work with third party libraries?

I'm also thankful for any related reading material, papers, articles and whatnot! :-)

  • I'll add a couple of interesting resources I've found as a comment for future readers (trying not to clutter up the actual question): swedishcoding.com/2008/08/31/are-we-out-of-memory and the book 'Game Engine Architecture' by Jason Gregory, Section 5.2 Memory Management. Commented Jun 25, 2013 at 20:08
  • Don't reinvent the wheel if you don't have to and aren't being paid to do it. Commented Jun 25, 2013 at 20:18
  • @AJMansfield Doing it for the learning experience and out of curiosity! Also, I could not find a good answer to my specific questions, and I guess someone else could be interested in the same thing. Commented Jun 25, 2013 at 20:23

4 Answers


As someone who has written many memory managers and heap implementations for AAA games over the last few generations of consoles, let me tell you it's simply not worth it anymore.

Your information is old. Back in the GameCube era [circa 2003] we used to do what you describe: allocate a large chunk and carve it out manually using custom algorithms tweaked for each game.

Once virtual memory came along (the Xbox era), games got more complicated [and so made more allocations and became multithreaded], and address fragmentation made this untenable. So we switched to custom allocators that handle only certain types of requests: for instance physical memory, lock-free small-block low-fragmentation heaps, or thread-local caches of recently used blocks.

As built-in memory managers have become better, it has become harder to beat them: certainly in the general case, and a close thing even for specific use cases. The Doug Lea allocator [or whatever the mainstream Linux C++ toolchains ship with now] and the latest Windows low-fragmentation heap are really very good, and you'd do far better investing your time elsewhere.

I've got spreadsheets at work measuring all kinds of metrics for a whole load of allocators: all the big-name ones and a fair few I've collected over the years. Basically, whilst the specialist allocators can win on a few metrics [lowest overhead per allocation, spatial proximity, lowest fragmentation, etc.], on overall metrics the mainstream ones are simply the best.

As a user of your library, my personal preferred option is that you just allocate memory when you need it. Use operator new/the new operator, and I can use the standard C++ mechanisms to replace those and use my custom heap (if I indeed have one), or alternatively I can use platform-specific ways of replacing your allocations (e.g. XMemAlloc on Xbox). I don't need tagging [capturing callstacks is far superior, and I can do that if I want]. Lower down the list comes you giving me an interface that you'll call when you need to allocate memory; this is just a pain for you to implement, and I'll probably just pass it on to operator new anyway. The worst thing you can do is 'know best' and create your own custom heaps. If memory allocation performance is a problem, I'd much rather you share the solution the whole game uses than roll your own.


2 Comments

Thanks for the in-depth explanation! :-) I was mainly going by what was written in Game Engine Architecture by Jason Gregory, where he explains more or less how memory management was done in Uncharted for the PS3. However, I do take your word that trying to outsmart the OS implementations is probably a bad idea. But my main goal and question was essentially to learn how a custom memory allocator would be implemented, even though the performance would be very application specific. So, if I decide to give it a try in the end, it will be an optional setting per application (and thread).
Go for it, one of the most fun things you can do in modern programming is actually implementing a heap! It's a shame you live in a world where this probably won't matter, but don't let that stop you, as it really is fun. And possibly useful.

If you're looking to write your own malloc()/free(), etc., you should probably start by checking out the source code for existing systems such as dlmalloc. This is a hard problem, though, for what it's worth: writing your own malloc library is Hard, and beating existing general-purpose malloc libraries will be Even Harder.

7 Comments

An interactive 3D graphics framework hardly qualifies as general purpose.
Sure, but the poster is asking how best to implement an allocator that will work both for his game and for his toolchain and with different workloads and threading environments. He is not, as far as I can tell, asking for a 3D graphics framework at all.
But that is at the core of his mentioned projects. Sure, it would be a bad idea to implement custom memory management everywhere, but using it judiciously, after profiling, can bring huge improvements.
Though I'm doing something very similar atm ( bitbucket.org/manasij7479/gl ) and have not yet run into a scenario in which I'd need my own allocator. So, your advice has some merit unless he is much further ahead of me.
Not really -- application specific pooling techniques don't make sense for general purpose, threaded allocator libraries. They certainly won't yield general efficiencies -- a specific app may benefit hugely in terms of memory-locality, fragmentation, etc., by using a particular allocator library with particular tuning/performance biases, but another application might be impacted terribly by using the same library. Different application domains require different trade-offs in their allocators. General purpose allocator libraries strive to yield good across-the-board performance.

And now, here is the correct answer: DON'T IMPLEMENT YET ANOTHER MEMORY MANAGER.

It is incredibly hard to implement a memory manager that does not fail under different kinds of usage patterns and events. You may be able to build a specific manager that works well under YOUR usage patterns, but writing one which works well for MANY users is a full-time job that almost no one has really done well. Worse, it is fantastically easy to implement a memory manager that works great 99% of the time and then, 1% of the time, crashes or suddenly consumes most or all available memory on your system due to unexpected heap fragmentation.

I say this as someone who has written multiple memory managers, watched multiple people write their own memory managers, and watched even more people attempt to write memory managers and fail. This problem is deceptively difficult, not because it's hard to write templated allocators and generic types with inheritance and such, but because the other solutions given in this thread tend to fail under corner cases of load behavior. Once you start supporting byte alignments (as all real-world allocators must), heap fragmentation rears its ugly head. Cute heuristics that work great for small test programs fail miserably when subjected to large, real-world programs.

And once you get it working, someone else will need: cookies to verify against memory stomps; heap usage reporting; memory pools; pools of pools; memory leak tracking and reporting; heap auditing; chunk splitting and coalescing; thread-local storage; lookasides; CPU and process-level page faulting and protection; setting and checking and clearing "free-memory" patterns aka 0xdeadbeef; and whatever else I can't think of off the top of my head.

Writing yet another memory manager falls squarely under the heading of Premature Optimization. Since there are multiple free, good memory managers with thousands of hours of development and testing behind them, you have to justify spending your own time in such a way that the result provides some measurable improvement over what other people have already done, and which you can use for free.

If you are SURE you want to implement your own memory manager (and hopefully you are NOT sure after reading this message), read through the dlmalloc sources in detail, then read through the tcmalloc sources in detail as well, then read through the mimalloc sources in detail, THEN make sure you understand the performance trade-offs in implementing a thread-safe versus a thread-unsafe memory manager, and why the naive implementations tend to give poor performance results.

EDIT: I wrote this answer ten years ago, and it's become trebly true over the past ten years. DON'T IMPLEMENT YET ANOTHER MEMORY MANAGER; especially, NEVER work for anyone who tells you to implement one during a job interview.


  1. Prepare more than one solution and let the user of the framework adopt any particular one. Policy classes for the generic allocator you develop would do this nicely.

  2. A nice way to get around this is to wrap pointers in a class with an overloaded * operator. Make the internal data of that class only an index into the memory pool. Now you can just change the index quickly after a background thread copies the data over.

  3. Most good C++ libraries support allocators, and you should implement one. You can also overload global operator new so your version gets used. And keep in mind that you generally won't need to think about a library allocating or deallocating large amounts of data; that is generally the responsibility of client code.

1 Comment

Thanks, these are great short answers to all my questions. As you propose, I will make the whole management part optional, and I'll try to explore different solutions to choose from if it is enabled. Good advice on overloading the * operator, I will look into this.
