This is getting a bit into micro-optimization territory (though not necessarily with microscopic effects; I've gotten "mega" effects from it), but you can implement linked lists (and all other linked structures, including trees) using contiguous arrays as the memory backing. That largely mitigates the memory inefficiency (the need to dynamically allocate each node separately) and likewise mitigates the loss of spatial locality, all without reaching for custom memory allocators like free lists. We can actually do this sort of thing:
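
Here's a minimal sketch of what I mean (the struct name, the `int32_t` payload, and the exact layout are just illustrative choices for this sketch, not a definitive implementation):

```cpp
#include <cstdint>
#include <vector>

// A doubly linked list whose nodes all live contiguously in one growable array,
// linked by 32-bit indices instead of pointers. -1 plays the role of null.
struct ArrayLinkedList
{
    struct Node
    {
        int32_t value; // payload (just an int here to keep the sketch small)
        int32_t prev;  // index of the previous node, or -1
        int32_t next;  // index of the next node, or -1 (also reused below as the free stack link)
    };

    std::vector<Node> nodes; // all nodes, stored contiguously in one array
    int32_t head = -1;       // index of the first node in the list
    int32_t tail = -1;       // index of the last node in the list
    int32_t free_head = -1;  // top of the "free stack": first reclaimable slot, or -1
};
```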

I'm still rather competitive performance-wise in my field, and sometimes I feel like my ability to outperform some of my competitors comes not from algorithmic magic so much as from using linked lists, leaning heavily on my profilers, and little things like that, while many of my peers have become allergic to linked lists entirely. The difference is that I use them in a way that doesn't degrade locality of reference too much or require a dynamic allocation per node. Of course, if I implemented them the same way I did when I started out, when CPU caches were almost non-existent and pointers were a quarter of the size, they'd perform very poorly nowadays. But I've turned my old implementations into "array linked lists" to great benefit, and in ways that still outperform replacing them with plain random-access sequences (arrays with linear-time insertions/removals to/from the sequence).

Now if we want to insert a new node at the back of the "array linked list", we first pop an index off the free stack if it's not empty, and overwrite the array at that position (if the free stack is empty, i.e. free_head is -1, we just append to the back of the array instead of overwriting an existing slot). Then we make our tail point to this new index and update the links on the new node (the rest is a straightforward linked list implementation aside from the free stack bookkeeping).
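
A rough sketch of that insertion path, continuing the hypothetical struct above (here I reuse the node's `next` field to chain free slots, which is just one way to represent the free stack):

```cpp
// Insert at the back: reclaim a slot from the free stack if possible,
// otherwise append a brand new slot to the array. Returns the node's index.
int32_t push_back(ArrayLinkedList& list, int32_t value)
{
    int32_t index;
    if (list.free_head != -1)
    {
        // Pop a previously freed slot off the free stack and overwrite it.
        index = list.free_head;
        list.free_head = list.nodes[index].next; // 'next' chains free slots here
        list.nodes[index] = {value, list.tail, -1};
    }
    else
    {
        // Free stack is empty (free_head == -1): append a new slot instead.
        index = static_cast<int32_t>(list.nodes.size());
        list.nodes.push_back({value, list.tail, -1});
    }

    // The rest is ordinary doubly linked list bookkeeping, just with indices.
    if (list.tail != -1)
        list.nodes[list.tail].next = index;
    else
        list.head = index; // list was empty
    list.tail = index;
    return index;
}
```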

The resulting diagram is scary-looking (and I might have drawn it incorrectly -- squinting at it now, it looks like I may have missed updating a link), but only because I'm drawing how the nodes are laid out in memory in this array. It's a straightforward linked list implementation except that we're using indices and have this "free stack/list" concept to reclaim previously removed node positions in the array on subsequent insertions to the back of the list.
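
And the removal side, which is what feeds that free stack (again just a sketch under the same assumptions as above):

```cpp
// Remove a node: unlink it from its neighbours, then push its slot onto the
// free stack so a later insertion can reclaim it instead of growing the array.
void erase(ArrayLinkedList& list, int32_t index)
{
    ArrayLinkedList::Node& node = list.nodes[index];

    // Unlink the node from its neighbours (or from head/tail at either end).
    if (node.prev != -1) list.nodes[node.prev].next = node.next;
    else                 list.head = node.next;
    if (node.next != -1) list.nodes[node.next].prev = node.prev;
    else                 list.tail = node.prev;

    // Push the freed slot onto the free stack ('next' becomes the free stack link).
    node.next = list.free_head;
    list.free_head = index;
}
```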

And I devoted a rather hefty section to the linked list because so many people nowadays seem to think linked lists are almost useless, favoring random-access sequences even when they need to remove from the middle of massive sequences. They have a point given how important the CPU cache is nowadays. But linked structures can be a lot more useful in daily scenarios if we use this sort of "array linked list" approach to mitigate the immediate loss of spatial locality you get between separately allocated nodes. In that case we're storing the nodes contiguously in an array and just using indices to skip around in it to get from one element to the next or previous. Of course, if we end up having to skip around all over the place, we've lost the spatial locality between consecutively linked nodes in the array, but it takes a lot of insertions and removals from the middle before the layout degrades that badly, and it's rather easy to restore (and we may still benefit from temporal locality with neighboring nodes in the array if we access them before they're evicted from the cache). In my measurements they still tend to outperform a plain random-access sequence on large inputs that need frequent removals from the middle.
