3

I have written a hexadecimal-parsing function for my uint128 structure, which is internally just two uint64_t's - hi and lo. Here's the function in question:

uint128_t parse_from_hex(const char *string, uint128_t previous_value, const size_t num_digits) {
    if (string == NULL || string[0] == '\0')
        return previous_value;

    if (num_digits + 1 > 2 * SIZEOF_INT128) { // SIZEOF_INT128 is 16 - meaning 16 bytes
        return UINT128_MAX; // the maximum value of 128bit uint, if we overflow
    }

    int64_t current_digit = parse_hex_digit(string[0]);
    if (current_digit < 0)
        return UINT128_ZERO; // a global variable which I use multiple times which represents a 0
    return parse_from_hex(string + 1,
                          uint128_or_uint64(uint128_shift_left(previous_value, 4), (uint64_t)current_digit),
                          num_digits + 1);
}

For some reason, gcc does not optimize the function even though the recursive call is clearly made a single time at the end of the function. The other functions used in the parsing function don't have any side effects and return a new value, so I do not think that the problem is with them. I have tried making the uint128_t struct members non-const (originally they were non-const) as well as the function arguments non-const, but that didn't help either. Originally compiled with Ofast, but also tried with O3 and O2 - no luck. Could anyone who knows better on the subject please help? I thought I understood it quite well but clearly I'm missing something.

8
  • 1
    Is there any reason you'd do this recursively? This is a job for a loop. Commented Nov 25, 2020 at 3:44
  • 2
    Here's a godbolt link showing the original code not being tail recursive: godbolt.org/z/Ws4YTz Commented Nov 25, 2020 at 3:46
  • 2
    GCC 10.2 will do tail recursion here, Clang doesn't. Commented Nov 25, 2020 at 3:47
  • What version of compiler isn't doing the tail call optimization? Commented Nov 25, 2020 at 3:58
  • 1
    GCC does have an easier time optimizing and in many cases unrolling loops than dealing with tail recursion which is more of a functional programming pattern. The optimizer deals with patterns people use most frequently. Even optimized this code looks like it's a lot more complicated than a loop-based version of same. Commented Nov 25, 2020 at 4:19

1 Answer 1

2

As it has been pointed out by @BillLynch in the comments - it's clang which doesn't optimize the function for some reason, not GCC. On my PC GCC 10.2.0 optimizes the function properly, so there's no problem here.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.