21

So, the correct way of calculating mid in a binary search is mid = low + ((high - low) / 2), which avoids overflow in the sum low + high.

My implementation uses unsigned 64-bit variables, and I can't see my arrays ever getting big enough to cause an overflow. Do I still need to use the above formula, or can I use mid = (low + high) / 2?

What's best practice here?

3
  • @EdHeal I think that these work out to be the same algebraically and even with rounding factored in. Commented Jan 13, 2014 at 20:55
  • @EdHeal I apologize if I'm missing something, but don't those come out the same in both cases? Commented Jan 13, 2014 at 21:22
  • 4
    low + (high - low) / 2 is not guaranteed to be safe either. I work on code that supports negative indices. A positive high and negative low could overflow high - low. Commented Jan 13, 2014 at 21:43

4 Answers

14

If there is no possibility of overflow, the overflow-safe way of computing the midpoint is technically unnecessary: you can use the unsafe formula if you wish. However, it's probably a good idea to keep the safe formula anyway, in case your program is modified some day in a way that breaks your assumptions. I think that spending a single extra CPU instruction to future-proof your code is a great investment in its maintainability.
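As a minimal sketch of what keeping the safe formula looks like with unsigned 64-bit indices: the function name find_u64, the int element type, and the half-open range convention are assumptions made for illustration, not taken from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of key in the sorted array a[0..n), or n if key is absent.
   find_u64 is a hypothetical name used only for this sketch. */
uint64_t find_u64(const int *a, uint64_t n, int key)
{
    assert(a != NULL || n == 0);

    uint64_t low = 0, high = n;                 /* half-open range [low, high) */
    while (low < high) {
        /* high - low cannot wrap because low < high, and the result
           never exceeds high, so no intermediate value overflows. */
        uint64_t mid = low + (high - low) / 2;
        if (a[mid] < key)
            low = mid + 1;
        else if (a[mid] > key)
            high = mid;
        else
            return mid;
    }
    return n;                                   /* not found */
}
```

The only difference from the naive version is the single line computing mid, which is the cost being weighed above.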


1 Comment

Also: you never know when somebody is going to cut and paste your code and use it elsewhere, where your assumptions don't hold. Maybe they will run it on a 32-bit machine and use huge arrays, or something similar. If you know a programming idiom that always works, don't replace it with one that only sometimes works unless there's a good reason (a better one than saving a few keystrokes or two machine instructions in a loop). Just my $0.02.
9

Check this article: Nearly All Binary Searches and Mergesorts are Broken

Better practice (for today)

Probably faster, and arguably as clear, is: 6: int mid = (low + high) >>> 1;

and after that:

In C and C++ (where you don't have the >>> operator), you can do this: 6: mid = ((unsigned int)low + (unsigned int)high) >> 1;

And at the end:

Update 17 Feb 2008: Thanks to Antoine Trux, Principal Member of Engineering Staff at Nokia Research Center Finland for pointing out that the original proposed fix for C and C++ (Line 6), was not guaranteed to work by the relevant C99 standard (INTERNATIONAL STANDARD - ISO/IEC - 9899 - Second edition - 1999-12-01, 3.4.3.3), which says that if you add two signed quantities and get an overflow, the result is undefined. The older C Standard, C89/90, and the C++ Standard are both identical to C99 in this respect. Now that we've made this change, we know that the program is correct;)
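The corrected C form can be sketched as follows. The function name midpoint_int is hypothetical, and the sketch assumes non-negative indices with low <= high and a platform where unsigned int can hold the sum of two int values (checked at compile time below).

```c
#include <assert.h>
#include <limits.h>

#if UINT_MAX / 2 < INT_MAX
#error "this sketch assumes unsigned int can hold the sum of two non-negative ints"
#endif

/* Midpoint of two non-negative int indices.
   midpoint_int is a hypothetical name used only for this sketch. */
int midpoint_int(int low, int high)
{
    assert(0 <= low && low <= high);

    /* Convert each operand BEFORE adding. Signed overflow is undefined
       behavior, whereas the unsigned sum here is at most 2 * INT_MAX
       and therefore fits in unsigned int. */
    return (int)(((unsigned int)low + (unsigned int)high) >> 1);
}
```

Casting the already-computed signed sum would not help, because the undefined overflow would have happened before the cast.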

Bottom line: there will always be some case in which a given formula doesn't work.


7

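Don Knuth's method uses only bitwise operations and has no possibility of overflow:

return (low & high) + ((low ^ high) >> 1);

EDIT: this follows from the identity low + high = (low ^ high) + ((low & high) << 1). Note the parentheses around the shift: in C-like languages + binds more tightly than <<.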

page 19, The Art of Computer Programming, Vol. 4, Donald E. Knuth
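A small sketch of this method for unsigned 64-bit values, together with a check of the identity it rests on. The names midpoint_bits and check_sum_identity are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Floor of (low + high) / 2, computed without ever forming low + high.
   midpoint_bits is a hypothetical name used only for this sketch. */
uint64_t midpoint_bits(uint64_t low, uint64_t high)
{
    /* low & high: bits set in both operands, which count once in the average.
       low ^ high: bits set in exactly one operand, which count half. */
    return (low & high) + ((low ^ high) >> 1);
}

/* Checks a + b == (a ^ b) + 2 * (a & b) in wrapping 64-bit arithmetic:
   XOR is the sum without carries, AND shifted left is the carries. */
void check_sum_identity(uint64_t a, uint64_t b)
{
    assert(a + b == (a ^ b) + ((a & b) << 1));
}
```

Halving both terms of the identity gives the return expression: the carry term halves exactly, and the XOR term is rounded down by the shift.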


0

So, the correct way of calculating mid in a binary search is mid = low + ((high - low) / 2) in order to handle overflow errors.

This is incorrect. Consider low = 18 * pow(10, 18) and high = 1 * pow(10, 18). Both mid = (low + high) / 2 and your method overflow, yielding 276627963145224192 instead of the correct 9500000000000000000.
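The numbers above can be reproduced with a short sketch using 64-bit unsigned arithmetic. The function names are hypothetical, and the second pair of checks (the same two values in sorted order) is an addition for comparison, not part of the claim above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names, used only for this sketch. */
uint64_t mid_naive(uint64_t low, uint64_t high)  { return (low + high) / 2; }
uint64_t mid_offset(uint64_t low, uint64_t high) { return low + (high - low) / 2; }

void demo_wraparound(void)
{
    const uint64_t a = UINT64_C(18000000000000000000);  /* 18 * 10^18 */
    const uint64_t b = UINT64_C(1000000000000000000);   /*  1 * 10^18 */

    /* Order used above (low > high): the sum wraps in mid_naive,
       and high - low wraps in mid_offset. */
    assert(mid_naive(a, b)  == UINT64_C(276627963145224192));
    assert(mid_offset(a, b) == UINT64_C(276627963145224192));

    /* Same values with low <= high: only the naive sum wraps. */
    assert(mid_naive(b, a)  == UINT64_C(276627963145224192));
    assert(mid_offset(b, a) == UINT64_C(9500000000000000000));
}
```

So the offset formula fails here only because low exceeds high, which is the precondition a binary search normally maintains.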

1 Comment

Why is low > high here? That should be impossible in a binary search.
