Sorting algorithm using bitmask and binary operators

Question

I'm struggling to understand how the following algorithm (in Java) works.

private static void sort(int[] values, k) 
{
    int mask = 0x00000001 << k;
    int insertIndex = 0;
    int ArrayList<Integer> big = new ArrayList<Integer>();

    for (int i = 0; i < values.length; ++i) {
        if ((values[i] & mask) == 0) 
            values[insertIndex++] = values[i];
        else
            big.add(values[i]);
    }

    for (int i = 0; i < big.size(); ++i)
        values[insertIndex++] = big.get(i);
}

public static void sort(int[] values) 
{
    for (int i = 0; i < 31; ++i)
        sort(values, i);
}

Here's what I undestand so far:

At first 0x00000001 (32 bit?) gets left-shifted by k. So mask now is some other number. Then we're checking if current value values[i] added with mask using Binary-And operator equals to 0.

I can't understand the role of (values[i] & mask) == 0 clause. Second for-loop messing with my head also. And why are we iterating in public static void sort(int[] values) only 31 times?

This algorithm doesn't sort negative integers correctly. How so? How can it be modified so that negative integers get sorted too?

It is said that this algorithm uses similar concepts of one the well-known sorting algorithms. Like Heap-sort, Insertion Sort, Quick-sort, Merge-Sort, Bucket-Sort or Radix-Sort. I eliminated the possibility of Quick-sort, because it uses partitions and recursion. Merge sort uses recursion and merges sub-arrays, so I eliminated it too. Insertion-Sort isn't likely to be the one either, because of the drastic time-complexity difference. Insertions sort O(n^2) and given algorithm is O(n). Bucket-sort doesn't actually sort anything, it just divides array in sub-arrays, which then can be sorted using some sorting algorithm. Heap-sort is comparison-based algorithm, and the given algorithms doesn't look like one. So the only possibility that's left is Radix-Sort, which is not a comparison-based algorithm. So my best bet is that the given algorithm is similar to Radix sort. Am I right or am I hopelessly lost?

It looks like it's local to private static void sort function. I forgot to initialize int insertIndex. Fixed. — the_dude
– the_dude, Commented Nov 14, 2014 at 14:00

isharacomix · Accepted Answer · 2014-11-14 14:54:25Z

Yep, this is an implementation of radix sort using bitwise operators. The implementation isn't very intuitive thanks to the bitwise operators.

The algorithm works by sorting through the list one digit at a time. This is why there are 31 loops... because there are only 32 bits in a Java integer, and the leftmost bit indicates that the value is negative, so there are only 31 bits in a positive Java integer.

The mask is used to keep track of which place is being checked. 0x0001 is the ones place, 0x0002 is the twos place, 0x0004 is the threes place, and 0x0001 << n is the nth place.

The sorting is done by putting all of the integers where the checked bit is 0 on the left side of the array, and all of the integers where the checked bit is 1 on the right. This is pretty easy: mask & values[i] == 0 if the masked bit of the value is a 0.

Things get tricky when we start moving variables around. Whenever we find a value with a 0 in our checked bit, we want to move it to the left. However, this involves overwriting a value. insertIndex and i both start at 0. Each time we find a value at position 'i' that we need to move to the left, we move it to insertIndex, and if we find a value we need to move to the right, we store it in a temporary arraylist for later.

0 0 1 0 1 0   i and insertindex are the same, so we copy the 0 to itself
^

0 0 1 0 1 0   Again.
  ^

0 0 1 0 1 0   Here we find our first 1. We don't increment insertindex
    ^         We store the 1 in the temp array list. Nothing is copied.

0 0 0 0 1 0   This is a zero. We copy the zero from i to insertindex
    ^<^

0 0 0 0 1 0   Another 1 goes in the temp array list. Nothing is copied, we don't 
              increment insertindex.
      ^ ^

0 0 0 0 1 0   We copy the last zero from i to insertindex. Remember that the
      ^<<<^   zero we overwrite is actually safely in the third bit - it
              was copied earlier.

Now for the second for loop, we take all of the values we stored in our temp array list and we put them on the now-empty right side of the array. Everything on this side is somewhere else on the "left side" of the array.

0 0 0 0 1 0   Retrieve the first value from the temp array list
        ^

0 0 0 0 1 1   Retrieve the second value from the temp array list
          ^

Hope that helps! If you want to extend this to negative values, try looking at Two's Complement notation

I solved negative number problem. I add Integer.MAX_VALUE to all elements in array to make all negative numbers positive/non-negative, sort the array using original Radix-Sort and then subtract Integer.MAX_VALUE to get the original input already sorted.

Sergey Kalinichenko · Accepted Answer · 2014-11-14 14:09:18Z

3

This is an implementation of radix sort, with the radix being 2. This variety looks at the least significant digit first. The implementation works only with positive numbers.

Your function goes through the array 31 times. Each path sorts based on the value of the k-th bit. Note that when the algorithm re-arranges the numbers, it preserves the relative order of items with the identical k-th bit. This is critical to the implementation, because it lets the algorithm preserve the relative order achieved at earlier stages.

answered Nov 14, 2014 at 14:09

Sergey Kalinichenko

729k85 gold badges1.2k silver badges1.6k bronze badges

Comments

Konrad Höffner · Accepted Answer · 2014-11-14 14:09:43Z

2

First, the way you formulate the question sounds like it is some kind of homework, as you have the code and some presumably true statements but don't have the name or source. If this is true, please be up front so that we can help you in the right way.

The int mask = 0x00000001 << k; statement along with (values[i] & mask) == 0 checks if exactly bit k is set. Remember that "a & b = 1" if and only if a = 1 and b = 1, so that all bits not on position k are surely 0 and the bit on position k is equal to that bit in values[i].

The for-loop loops for 31 times, not 30 ( 1..30 is 30 numbers + 1 for the zero). The reason is that a Java integer has exactly 32 bits with the highest one used only by negative numbers. That is also the reason why it doesn't work correctly with negative numbers.

answered Nov 14, 2014 at 14:09

Konrad Höffner

12.6k20 gold badges75 silver badges139 bronze badges

3 Comments

the_dude Over a year ago

Yes, it is a homework. What's wrong with asking question about a homework? So does it mean that if I iterate 32 times the negatives integers get sorted too?

Konrad Höffner Over a year ago

There is nothing wrong with it but if you purposefully obscure this, the readability of your question and adequacy of your answers will suffer, as homework has a learning motivation. If you iterate 32 times I think the negatives will get sorted too but you need to look up how two's complement integers work to know if the sorting is correct (and not just consistent).

the_dude Over a year ago

I wasn't obscuring anything. I wanted to show that I actually put some effort to understand the algorithm before actually asking a question. Because I don't want you to do my homework for me, I want you to help me do it myself.

DarkDust · Accepted Answer · 2014-11-14 14:02:22Z

1

Yes, this is an implementation of Radix sort.

answered Nov 14, 2014 at 14:02

DarkDust

92.7k19 gold badges199 silver badges231 bronze badges

Comments

Eran · Accepted Answer · 2014-11-14 14:10:59Z

It looks like the first call to sort(values, i) puts all the integers whose lowest bit is 0 at the start of the array, and all the integers whose lowest bit is 1 at the end of the array.
The second call puts all the integers whose seconds lowest bit is 0 at the start of the array, and all the integers whose seconds lowest bit is 1 at the end of the array (without changing the internal order of each group).
...
The last call puts all the integers whose 31st bit is 0 at the start of the array, and all the integers whose 31st bit is 1 at the end of the array (without changing the internal order of each group)

Thus the sort(int[] values) sorts the array in ascending order, assuming that it only contains positive integers (since it ignores the sign bit).

Dror Lotan · Accepted Answer · 2022-12-07 07:30:35Z

0

This article explain in detail the algorithm:

https://www.researchgate.net/publication/259044206_The_Bitwise_Operations_Related_to_a_Fast_Sorting_Algorithm

As people said it's an implementation of radix sort where you have 2 "buckets" (for bit zero and one).

The algorithm loop k iteration on n elements. n = number of elements in array. k = number of bits in element.

All in all the complexity is O(n*k) where k is constant so O(n).

edited Dec 7, 2022 at 7:30

answered Dec 7, 2022 at 7:28

Dror Lotan

275 bronze badges

Collectives™ on Stack Overflow

Sorting algorithm using bitmask and binary operators

6 Answers 6

1 Comment

Comments

3 Comments

Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

6 Answers 6

1 Comment

Comments

3 Comments

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related