Commonmark migration

Native Double.TryParse took ~4500 ms.

Custom Parsers.FastTryParseDouble took ~950 ms.

Performance gain was ~370%

Removing test case to discourage pedantry.
Alain

I'm trying to beat the native Double.TryParse as much as possible when parsing large (simple) CSV files with millions of rows. I do not have to support exponential notation, thousands separators, Inf, -Inf, NaN, or anything exotic. Just millions of doubles in "0.00" format.

TestSuccess("0", 0d);
TestSuccess("1", 1d);
TestSuccess("-1", -1d);
TestSuccess("123.45678", 123.45678);
TestSuccess("-123.45678", -123.45678);
TestSuccess("12345678901234", 12345678901234d);
TestSuccess("-12345678901234", -12345678901234d);
TestSuccess("0.12345678901234", 0.12345678901234);
TestSuccess("-0.12345678901234", -0.12345678901234);
TestSuccess(".12345678901234", 0.12345678901234);
TestSuccess("-.12345678901234", -0.12345678901234);
TestSuccess("0.00000987654321", 0.00000987654321);
TestSuccess("-0.00000987654321", -0.00000987654321);
TestSuccess("1234567890123.0123456789", 1234567890123.0123456789);
TestSuccess("-1234567890123.0123456789", -1234567890123.0123456789);
TestSuccess("123456789000000000000000", 123456789000000000000000d);
TestSuccess("-123456789000000000000000", -123456789000000000000000d);
TestSuccess("0.00000000000000000123456789", 0.00000000000000000123456789);
TestSuccess("-0.00000000000000000123456789", -0.00000000000000000123456789);
// Special case, an empty dash is interpreted as negative zero (natively not parsable)
TestSuccess("-", -0d);

I'm trying to beat the native Double.TryParse as much as possible when parsing large (simple) CSV files with millions of rows. I do not have to support exponential notation, thousands separators, Inf, -Inf, NaN, or anything exotic. Just millions of doubles in "0.##" format.

TestSuccess("0", 0d);
TestSuccess("1", 1d);
TestSuccess("-1", -1d);
TestSuccess("123.45", 123.45);
TestSuccess("-123.45", -123.45);
TestSuccess("12345678901234", 12345678901234d);
TestSuccess("-12345678901234", -12345678901234d);
TestSuccess("0.12", 0.12);
TestSuccess("-0.12", -0.12);
TestSuccess("0.00", 0.00);
TestSuccess("-0.00", -0.00);
TestSuccess("1234567890123.01", 1234567890123.01);
TestSuccess("-1234567890123.01", -1234567890123.01);
TestSuccess("123456789000000000000000", 123456789000000000000000d);
TestSuccess("-123456789000000000000000", -123456789000000000000000d);
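
The TestSuccess helper itself is not shown in the post. A minimal sketch of what such a helper might look like follows; the delegate type, the extra parser argument, the class name, and the relative tolerance are all assumptions for illustration, not the author's code.

```csharp
using System;
using System.Globalization;

public static class ParserTests
{
    // Hypothetical delegate matching the shape of Double.TryParse-style parsers.
    public delegate bool TryParseDouble(string input, out double result);

    // Hypothetical helper: throws if the parser rejects the input or
    // returns a value too far from the expected one.
    public static void TestSuccess(TryParseDouble parser, string input, double expected)
    {
        if (!parser(input, out double actual))
            throw new Exception($"Failed to parse \"{input}\"");

        // Assumed tolerance: digit-by-digit accumulation may not round
        // exactly like the native parser, so allow a small relative error.
        double tolerance = Math.Abs(expected) * 1e-12;
        if (Math.Abs(actual - expected) > tolerance)
            throw new Exception($"\"{input}\": expected {expected:R}, got {actual:R}");
    }

    // Reference parser, used here only to exercise the helper.
    public static bool NativeParse(string input, out double result) =>
        double.TryParse(input, NumberStyles.Float, CultureInfo.InvariantCulture, out result);
}
```

With this shape, a call would read `ParserTests.TestSuccess(ParserTests.NativeParse, "123.45", 123.45);`, and the custom parser could be passed in place of NativeParse.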
Slight tweak seems to help
Alain
    unchecked
    {
        while (true)
        {
            // Return now if we have reached the end of the string
            if (currentIndex >= length)
            {
                result *= sign;
                return true;
            }
            nextChar = input[currentIndex++];
            // Break if the result wasn't a digit between 0 and 9
            if (nextChar < '0' || nextChar > '9') break;
            // Multiply by 10 and add the next digit.
            result = result * 10 + (nextChar - '0');
        }
        // The next character should be a decimal character, or else it's invalid.
        if (nextChar != CharDecimalSeparator) return false;
        double fractionalPart = 0d;
        int fractionLengh = length - currentIndex;
        while (currentIndex < length)
        {
            nextChar = input[currentIndex++];
            // If we encounter a non-digit now, it's an error
            if (nextChar < '0' || nextChar > '9') return false;
            fractionalPart = fractionalPart * 10 + (nextChar - '0');
        }
        // Adjust the magnitude of the fractional part and add to the result
        result += fractionalPart * (fractionLengh < NegPow10.Length ?
            NegPow10[fractionLengh] : Math.Pow(10, -fractionLengh));
        // Apply the sign (1 or -1) before returning.
        result *= sign;
    }
    return true;
}

Native Double.TryParse took ~4400 ms.

Custom Parsers.FastTryParseDouble took ~1200 ms.

Performance gain was ~260%

    unchecked
    {
        while (true)
        {
            // Return now if we have reached the end of the string
            if (currentIndex >= length)
            {
                result *= sign;
                return true;
            }
            nextChar = input[currentIndex++];
            // Break if the result wasn't a digit between 0 and 9
            if (nextChar < '0' || nextChar > '9') break;
            // Multiply by 10 and add the next digit.
            result = result * 10 + (nextChar - '0');
        }
        // The next character should be a decimal character, or else it's invalid.
        if (nextChar != CharDecimalSeparator) return false;
        double fractionalPart = 0d;
        int fractionLengh = length - currentIndex;
        while (currentIndex < length)
        {
            nextChar = input[currentIndex++];
            // If we encounter a non-digit now, it's an error
            if (nextChar < '0' || nextChar > '9') return false;
            fractionalPart = fractionalPart * 10 + (nextChar - '0');
        }
        // Add the fractional part to the result, apply sign, and return
        if (fractionLengh < NegPow10.Length)
            result = (result + fractionalPart * NegPow10[fractionLengh]) * sign;
        else
            result = (result + fractionalPart * Math.Pow(10, -fractionLengh)) * sign;
    }
    return true;
}

Native Double.TryParse took ~4500 ms.

Custom Parsers.FastTryParseDouble took ~950 ms.

Performance gain was ~370%
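
The revision only shows the tail of the method, so here is a rough, self-contained sketch of how the whole thing might fit together. The method signature, the sign handling, the value of CharDecimalSeparator, and the contents and size of the NegPow10 table are assumptions; only the two loops and the final scaling step follow the fragment above (with the local renamed to fractionLength).

```csharp
using System;

public static class Parsers
{
    // Assumed value; the original constant is not shown.
    private const char CharDecimalSeparator = '.';

    // Assumed table: NegPow10[i] == 10^-i. The original size is not shown.
    private static readonly double[] NegPow10 =
    {
        1e0, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7,
        1e-8, 1e-9, 1e-10, 1e-11, 1e-12, 1e-13, 1e-14, 1e-15
    };

    public static bool FastTryParseDouble(string input, out double result)
    {
        result = 0d;
        if (string.IsNullOrEmpty(input)) return false;

        int length = input.Length;
        int currentIndex = 0;
        double sign = 1d;
        if (input[0] == '-')
        {
            sign = -1d;
            currentIndex = 1;
        }

        // Integer part: accumulate digits until the end or a non-digit.
        // Note: a lone "-" falls through here and yields negative zero.
        char nextChar = '\0';
        while (true)
        {
            if (currentIndex >= length)
            {
                result *= sign;
                return true;
            }
            nextChar = input[currentIndex++];
            if (nextChar < '0' || nextChar > '9') break;
            result = result * 10 + (nextChar - '0');
        }

        // Anything other than the decimal separator is invalid here.
        if (nextChar != CharDecimalSeparator)
        {
            result = 0d;
            return false;
        }

        // Fractional part: accumulate as an integer, then scale down once.
        double fractionalPart = 0d;
        int fractionLength = length - currentIndex;
        while (currentIndex < length)
        {
            nextChar = input[currentIndex++];
            if (nextChar < '0' || nextChar > '9')
            {
                result = 0d;
                return false;
            }
            fractionalPart = fractionalPart * 10 + (nextChar - '0');
        }

        // Table lookup for short fractions, Math.Pow as the fallback.
        double scale = fractionLength < NegPow10.Length
            ? NegPow10[fractionLength]
            : Math.Pow(10, -fractionLength);
        result = (result + fractionalPart * scale) * sign;
        return true;
    }
}
```

One thing to keep in mind with this approach: negative powers of ten are not exactly representable in binary floating point, so multiplying by a precomputed 10^-n can differ from the native parser's correctly rounded result in the last bit.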

Fixed two bugs - one where strings without a decimal were parsed incorrectly, and an index out of bounds on NegPow
Alain
deleted 10 characters in body; edited tags; edited title
200_success
Tweeted twitter.com/StackCodeReview/status/1024218566822907904
edited tags
t3chb0t
Improved formatting
Alain
Support for unlimited significant digits.
Alain
added 1786 characters in body
Alain
Alain