To skip all those unwanted nodes, I think what you're looking for is a sparse but contiguously-allocated octree, which can be easily indexed by (x, y, z). You also seem to want sequential ordering.
For this, with a maximal tree i.e. what you call a regular tree, we use something like Morton ordering to rapidly and sequentially index into the 1D array representing the tree, with just a few arithmetic or bitshift operations. This works fine when we've allocated the full tree, because Morton assumes the full range of indices for that size of tree, are available in the array.
However, once we shorten the tree array to only as many nodes as we actually use (say 40%, but even 99%), then Morton indexing throws array out-of-bounds errors. At first, this apppears to be a catch-22 situation. Good news: it's really not.
What we have to do is to build our tree into a maximal tree structure first, one that can be Morton-indexed. Then extract all relevant nodes and do some array index indirection.
- While building the maximal tree, we mark which nodes were actually used (versus just being "present"). Keep track of the Morton indices to those nodes, in a separate vector / list (
vector.push_back()). This is the reverse-mapped LUT in array form via vector.data().
- Create the forward-mapped lookup table (maybe
std::unordered_map<uint32, uint32>) which maps the Morton index (which we will use to access using (x,y,z)) to the index used to store the node in the compact array. This is simply a reversal of the keys to value, and values to keys, from the vector above.
We then read / write our sparse / compact tree array as follows (assume a quadtree for simplicity):
//...up here we do reverse and forward mapping...
Node sparseTree = new Node[nodesUsed];
uint32 mortonIndex = pos.Y * maxX + pos.X;
uint32 sparseIndex = mortonToSparse[mortonIndex]; //forward LUT
Node node = sparseTree[sparseIndex];
Note on performance
Iterating a 1D array using Morton order (or Hilbert curves) is one of the fastest possible data access arrangements since it preserves locality of data for CPU cache.
I never use double- or triple-nested loops to iterate (x,y,z). Even if you set constant upper bounds on those loops, the nesting alone has me question whether the compiler will unroll these loops - and if they remain nested, walks can potentially be very slow due to conditional branches.
Rather always iterate in 1D over the tree array:
for (uint32 sparseIndex = 0; sparseIndex < sparseTreeLength; sparseIndex++)
{
uint32 mortonIndex = sparseToMorton[sparseIndex]; //reverse LUT
uint32 x = mortonIndex % maxX;
uint32 y = mortonIndex / maxX; //maybe use floor()
...
}