Calculating midpoint index in binary search

Question

So, the correct way of calculating mid in a binary search is mid = low + ((high - low) / 2) in order to handle overflow errors.

My implementation uses unsigned 64-bit variables, and I don't ever see a situation where my arrays get big enough to cause an overflow. Do I still need to use the above implementation, or can I use mid = (low + high) / 2?

What's best practice here?

@EdHeal I think that these work out to be the same algebraically, even with rounding factored in. (templatetypedef, commented Jan 13, 2014 at 20:55)
@EdHeal I apologize if I'm missing something, but don't those come out the same in both cases? (templatetypedef, commented Jan 13, 2014 at 21:22)
low + (high - low) / 2 is not guaranteed to be safe either. I work on code that supports negative indices. A positive high and negative low could overflow high - low. (Eric Postpischil, commented Jan 13, 2014 at 21:43)
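The negative-index case raised in the last comment can be handled by taking the distance in unsigned arithmetic. The following is only a sketch, assuming 32-bit signed indices and low <= high; the name midpoint_signed is made up for illustration and does not come from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: midpoint of signed indices that may be negative.
   Requires low <= high. */
static int32_t midpoint_signed(int32_t low, int32_t high)
{
    assert(low <= high);
    /* Conversion to uint32_t is well defined (modulo 2^32). Because
       low <= high, the wrapped difference equals the true distance,
       even when high - low would overflow int32_t. */
    uint32_t distance = (uint32_t)high - (uint32_t)low;
    /* distance / 2 is at most 2^31 - 1, so it fits in int32_t, and
       low + distance / 2 can never exceed high. */
    return low + (int32_t)(distance / 2);
}
```

With low = INT32_MIN and high = INT32_MAX, the signed expression high - low would overflow, while this version returns -1.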

Sergey Kalinichenko · Accepted Answer · 2014-01-13 20:55:29Z

14

If there is no possibility of overflow, the overflow-safe way of computing the midpoint is technically unnecessary: you can use the unsafe formula if you wish. However, it's probably a good idea to keep the safe formula anyway, in case your program is modified some day in a way that breaks your assumptions. I think that adding a single CPU instruction to make your code future-proof is a great investment in the maintainability of your code.
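For illustration, here is a minimal sketch of a binary search that keeps the overflow-safe formula. It assumes size_t indices into a sorted array of uint64_t values; the names midpoint and find_index are hypothetical and not taken from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Overflow-safe midpoint for unsigned indices; requires low <= high. */
static size_t midpoint(size_t low, size_t high)
{
    assert(low <= high);
    return low + (high - low) / 2;
}

/* Hypothetical helper: returns the index of key in the sorted array
   a[0..n-1], or n if key is absent. */
static size_t find_index(const uint64_t *a, size_t n, uint64_t key)
{
    size_t low = 0, high = n;          /* half-open range [low, high) */
    while (low < high) {
        size_t mid = midpoint(low, high);
        if (a[mid] < key)
            low = mid + 1;             /* key can only be right of mid */
        else
            high = mid;                /* key is at mid or left of it */
    }
    return (low < n && a[low] == key) ? low : n;
}
```

The extra cost over (low + high) / 2 is one subtraction per iteration, and the result stays correct even when both indices are close to SIZE_MAX.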



1 Comment

JVMATL Over a year ago

Also, you never know when somebody is going to cut and paste your code and use it elsewhere, where your assumptions don't hold. Maybe they will run it on a 32-bit machine with huge arrays, or something similar. If you know a programming idiom that always works, don't replace it with one that only occasionally works unless there's a good reason (something better than saving a few keystrokes or two machine instructions in a loop). Just my $0.02.

Dabo · Accepted Answer · 2014-01-13 21:11:42Z

Check the article "Nearly All Binary Searches and Mergesorts are Broken". It says:

Better practice (for today)

Probably faster, and arguably as clear is: 6: int mid = (low + high) >>> 1;

And after that:

In C and C++ (where you don't have the >>> operator), you can do this: 6: mid = ((unsigned int)low + (unsigned int)high) >> 1;

And at the end:

Update 17 Feb 2008: Thanks to Antoine Trux, Principal Member of Engineering Staff at Nokia Research Center Finland for pointing out that the original proposed fix for C and C++ (Line 6), was not guaranteed to work by the relevant C99 standard (INTERNATIONAL STANDARD - ISO/IEC - 9899 - Second edition - 1999-12-01, 3.4.3.3), which says that if you add two signed quantities and get an overflow, the result is undefined. The older C Standard, C89/90, and the C++ Standard are both identical to C99 in this respect. Now that we've made this change, we know that the program is correct;)

Bottom line: there will always be some case in which a given formula won't work.
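The C variant quoted from the article can be sketched as follows. This assumes both indices are non-negative and that unsigned int can hold twice INT_MAX, which the static assertion checks; the function name midpoint_nonnegative is made up for this example.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: unsigned int can represent 2 * INT_MAX (true on common
   platforms, where UINT_MAX == 2 * INT_MAX + 1). */
_Static_assert(UINT_MAX / 2 >= INT_MAX, "unsigned int too narrow");

/* Hypothetical helper: midpoint of two non-negative int indices. */
static int midpoint_nonnegative(int low, int high)
{
    assert(low >= 0 && high >= 0);
    /* The addition is done in unsigned arithmetic, so the sum cannot
       hit signed overflow; the shifted result is at most INT_MAX. */
    return (int)(((unsigned int)low + (unsigned int)high) >> 1);
}
```

The casts matter: adding the two int values first and converting afterwards would already have invoked undefined behavior on overflow, which is the point of the article's update.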

user1095108 · Accepted Answer · 2022-07-03 19:32:26Z

7

Don Knuth's method works perfectly through a bitmask with no possibility of an overflow:

return (low & high) + ((low ^ high) >> 1)

EDIT: low + high = (low ^ high) + ((low & high) << 1)

page 19, The Art of Computer Programming, Vol. 4, Donald E. Knuth
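A small sketch of the bitwise formula on unsigned 64-bit values may make the reasoning concrete. The function name midpoint_bitwise is an assumption for illustration only.

```c
#include <stdint.h>

/* Hypothetical helper: floor((low + high) / 2) without forming the sum.
   Bits set in both operands (low & high) each contribute twice to the
   sum, so they contribute once to the half. Bits set in exactly one
   operand (low ^ high) contribute once to the sum, so they are halved. */
static uint64_t midpoint_bitwise(uint64_t low, uint64_t high)
{
    return (low & high) + ((low ^ high) >> 1);
}
```

Neither term can exceed the larger operand, so nothing wraps, and the result does not depend on which argument is larger. For example, midpoint_bitwise(4, 10) is 7, and passing UINT64_MAX for both arguments returns UINT64_MAX.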

Answered Jan 20, 2018 at 4:39 by coder; edited Jul 3, 2022 at 19:32 by user1095108.

Elia Immanuel Auer · Accepted Answer · 2021-06-11 18:17:20Z

0

So, the correct way of calculating mid in a binary search is mid = low + ((high - low) / 2) in order to handle overflow errors.

This is incorrect. Consider low = 18 * pow(10, 18) and high = 1 * pow(10, 18). Both mid = (low + high) / 2 and your method overflow, yielding 276627963145224192 instead of the correct 9500000000000000000.
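The numbers above can be reproduced with unsigned 64-bit arithmetic, which wraps modulo 2^64. This is just a sketch; the names mid_naive and mid_offset are hypothetical labels for the two formulas under discussion.

```c
#include <stdint.h>

/* (low + high) / 2: the sum wraps once low + high exceeds UINT64_MAX. */
static uint64_t mid_naive(uint64_t low, uint64_t high)
{
    return (low + high) / 2;
}

/* low + (high - low) / 2: the difference wraps whenever low > high,
   and the final addition can then wrap as well. */
static uint64_t mid_offset(uint64_t low, uint64_t high)
{
    return low + (high - low) / 2;
}
```

With low = 18000000000000000000 and high = 1000000000000000000, both functions return 276627963145224192. With the arguments swapped so that low <= high, mid_offset returns 9500000000000000000, while mid_naive still returns 276627963145224192 because the sum itself exceeds the 64-bit range.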


1 Comment

user1095108 Over a year ago

Why is low > high? It should be impossible.
