Return to Answer

Mention throwing as an option

Source Link

edited Nov 5, 2021 at 10:31

88.7k
14
104
327

If we're returning an index, that also should be a std::size_t, as int might not be sufficient. However, that leaves a problem - we can no longer use -1 to indicate that the value is before the 0th element of the array. Our options for that case include:

return (std::size_t)-1
add 1 to the result (so we return the position after the found position, and caller must subtract 1 to use it)
return the index of the first element greater or equal to target (caller can check if it's equal or not) - this is the convention used by std::lower_bound()
throw an exception (this is expensive, so not a good choice for a "normal" result)

return a std::optional<std::size_t>optional.

return (std::size_t)-1
add 1 to the result (so we return the position after the found position, and caller must subtract 1 to use it)
return the index of the first element greater or equal to target (caller can check if it's equal or not) - this is the convention used by std::lower_bound()
return a std::optional<std::size_t>

return (std::size_t)-1
add 1 to the result (so we return the position after the found position, and caller must subtract 1 to use it)
return the index of the first element greater or equal to target (caller can check if it's equal or not) - this is the convention used by std::lower_bound()
throw an exception (this is expensive, so not a good choice for a "normal" result)

return a std::optional.

Source Link

answered Nov 2, 2021 at 9:41

Toby Speight

88.7k
14
104
327

size_t is in the std namespace, so should be written std::size_t.

We don't need to include <cmath>.

main() should return non-zero if any of the tests failed.

Length+size is one way to pass a collection (and yes, std::size_t is the appropriate type for the size here); a more flexible (but more advanced) way is to write a general template that accepts a C++ Range object.

return (std::size_t)-1
add 1 to the result (so we return the position after the found position, and caller must subtract 1 to use it)
return the index of the first element greater or equal to target (caller can check if it's equal or not) - this is the convention used by std::lower_bound()
return a std::optional<std::size_t>

I recommend the third option, for the reason I've highlighted.

        const auto mid = start + ((end - start) / 2);

That's good - you've avoided a common mistake there. (If we were to write (start + end) / 2, we would be at risk of integer overflow).

As you observed, the test cases are very repetitive. We can extract the common parts to a separate function. My function looks something like this (most of the complexity is in giving good diagnostic output):

#include <vector>

static int test_binary_search(const std::vector<int>& v,
                              int target,
                              std::size_t min_result,
                              std::size_t max_result = 0)
{
    if (max_result == 0) {
        // fill in optional argument
        max_result = min_result;
    }

    auto const result = binarySearch(v.data(), v.size(), target);
    if (min_result <= result && result <= max_result) {
        // test passed
        return 0;
    }

    // We failed - emit some diagnostics
    std::cerr << "Test failed: search for " << target
              << " in {";
    auto *sep = "";
    for (auto i: v) {
        std::cerr << sep << i;
        sep = ",";
    }
    std::cerr << "} returned " << result
              << "; expected " << min_result;
    if (min_result != max_result) {
        std::cerr << "-" << max_result;
    }
    std::cerr << '\n';

    return 1;
}

Then it's very easy to use:

int selfTest()
{
    return test_binary_search({1,2,3,4,5,6,7}, 4, 3)
        +  test_binary_search({1,2,3,4,5,6,7}, 1, 0)
        +  test_binary_search({1,2,3,4,5,6,7}, 7, 6)
        +  test_binary_search({1,2,2,2,2,3}, 2, 1, 4)
        +  test_binary_search({-10, -2, 5, 6, 10}, -2, 1);
}

A couple of things I fixed in passing: error messages should go to std::cerr rather than std::cout, and prefer plain newline (\n) rather than std::endl (which also flushes the output stream).

Having the test function print the inputs and expected result range helps avoid inconsistencies like this one:

result = binarySearch(test1, 7, 4);
if(test1[result] != 4)
{
    std::cout << "Test A failed. Expected 2, got " << result << std::endl;
    failed++;
}

(In fact, almost all the tests have wrong "Expected" value in the string).