
In my project I have to perform division, multiplication, subtraction, and addition on a matrix of double elements. The problem is that as the size of the matrix increases, the accuracy of my output is drastically affected. Currently I am using double for each element, which I believe uses 8 bytes of memory and has about 16 significant decimal digits of precision, irrespective of where the decimal point falls. Even for a large matrix, the memory occupied by all the elements is in the range of a few kilobytes, so I can afford data types that require more memory. So I wanted to know which data type is more precise than double. I tried searching in some books and found long double, but I don't know what its precision is. And what if I want more precision than that?

  • Check out the GMP project. Also, there are methods to minimize round-off error in computations. Commented Mar 27, 2013 at 13:14
  • In case you can rely on external dependencies, Boost 1.53 has a Multiprecision library that can help you! Commented Mar 27, 2013 at 13:21
  • Using a little algebra to rearrange mathematical calculations can help to reduce rounding errors. Commented Mar 27, 2013 at 13:26
  • Switching to a larger type merely delays the numerical collapse. To avoid it completely, crack out a numerical analysis book and read the chapter on "stability". Commented Mar 27, 2013 at 13:30
  • Numerical collapse is the phenomenon you're experiencing: rounding errors accumulate and lead to a wrong answer. Commented Mar 27, 2013 at 13:35

4 Answers

According to Wikipedia, the 80-bit "Intel" x87 extended-precision long double (usually padded to 12 or 16 bytes in memory) has a 64-bit mantissa with no implicit bit, which gets you about 19 significant decimal digits. This has been the almost universal standard for long double for ages, but recently things have started to change.

The newer 128-bit quad-precision format has 112 mantissa bits plus an implicit bit, which gets you 34 decimal digits. GCC implements it as the __float128 type, and there is (if memory serves) a compiler option to make long double refer to it.


10 Comments

so who would you recommend between long double & __float128, considering the tradeoff involved in speed & accuracy?
@Cool_Coder I don't know the characteristics of your program, but since it's easy, just try both!
ok I will & let you know. Just for the sake of it let me know if the following is incorrect: __float128 *nicePrecision = new __float128();
128-bit floats aren't all that new. They're the long double type on SPARC, which has been around for ages (as in, a little more than twenty years).
@PeteBecker That's still a lot newer than the 8087! And the standardization only came about in 2008, unless I'm mistaken. Anyway my impression is that they're gaining traction now because the legacy 80-bit hardware is mostly gone.

You might want to consider the sequence of operations, i.e., do the additions in an ordered sequence starting with the smallest values. This increases the overall accuracy of the results for the same mantissa precision:

1e00 + 1e-16 + ... + 1e-16 (1e16 times) = 1e00
1e-16 + ... + 1e-16 (1e16 times) + 1e00 = 2e00

The point is that adding small numbers to a large number makes them disappear, so the latter approach reduces the numerical error.

Comments


Floating point data types with greater precision than double are going to depend on your compiler and architecture.

In order to get more than double precision, you may need to rely on a math library that supports arbitrary-precision calculations. These probably won't be fast, though.

3 Comments

"These probably won't be fast enough" <- Fast enough for what? What makes you say that? And what alternatives do you suggest if one does need more precision?!
You sort of seem to be ignoring the existence of long double. The same issues do sort of apply, but to a much lesser extent.
@us2012 I just said probably won't be fast, not not fast enough. So yes, it depends a lot on what the OP is trying to do. I'd suggest a math library if I knew one, but my experience with arbitrary precision like this is limited to other languages.

On Intel architectures, long double is the 80-bit extended-precision type.

What kind of values do you want to represent? Maybe you are better off using fixed-point arithmetic.

3 Comments

long float? Really? 80 bits precision and how many go into the exponent?
Depends on the compiler; with MS, a long double has the same precision as a double.
I meant long double, it was just a glitch.
