12

I am generating 10 random floats between 6 and 8 (all for good reason), and writing them to a MySQL database in serialized form. But one quirk seems to emerge at storage time:

Before storing I'm just outputting the same data to see what it looks like, and this is the result I get

a:10:{i:0;d:6.20000000000000017763568394002504646778106689453125;i:1;d:7.5999999999999996447286321199499070644378662109375;i:2;d:6.4000000000000003552713678800500929355621337890625;..}

As you can see, I'm getting long numbers like 6.20000000000000017763568394002504646778106689453125 instead of what I'd really like to see, just 6.2. This happens only when I serialize the data; if I just output the array, I do get the floats to one decimal. Here is my code:

function random_float ($min,$max) {
   return ($min+lcg_value()*(abs($max-$min)));
}

$a1 = random_float(6, 8);
$a1 = round($a1, 1);
$a2 = random_float(6, 8);
$a2 = round($a2, 1);    
$a3 = random_float(6, 8);
$a3 = round($a3, 1);
    ...
$array = array($a1, $a2, $a3, $a4, $a5, $a6, $a7, $a8, $a9, $a10);

echo serialize($array);
1
  • Looks like echo rounds floats itself, but that's strange Commented Jul 10, 2009 at 13:38

9 Answers

16

A number like 6.2 can't be represented exactly using floating-point math in computers, as there is no finite base-2 representation of it. What you are seeing when echo-ing the number is something intended for human reading, and thus the value is rounded to fewer digits than the float actually carries (floats are accurate to about 7 significant decimal digits for 32-bit and 15 to 17 for 64-bit FP values).

When serializing those values, however, you really want the exact value (i.e. all bits that are in there) and not just the nearest "nice" value. There could be more than one float/double representation which evaluates to approximately 6.2, and when serializing you usually want to store the exact value, down to the last bit, in order to restore it correctly. That's why you're getting ridiculous "accuracy" in values there. It's all just to preserve the exact bit representation of what you started with.
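To make the difference between the display form and the stored value concrete, here is a minimal sketch. It assumes 64-bit IEEE 754 doubles (what PHP uses for floats on common platforms) and relies on the precision setting that controls float-to-string conversion; the variable names are only illustrative.

```php
<?php
// Sketch assuming 64-bit IEEE 754 doubles.
$value = 6.2;                   // actually stored as the nearest double to 6.2

ini_set('precision', '14');     // display precision used by echo and string casts
$short = (string) $value;       // human-friendly form

ini_set('precision', '17');     // 17 significant digits identify a double uniquely
$full = (string) $value;

echo $short, "\n";              // 6.2
echo $full, "\n";               // 6.2000000000000002

// Both strings parse back to the same double.
var_dump((float) $short === (float) $full);
```

The longer form is not more accurate; it simply exposes digits of the stored binary value that the shorter form rounds away.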

But why exactly do you want to control the serialized output that tightly? I mean, it's just there so you can round-trip your data structure and read it back in later. You certainly don't want to use that serialized representation somewhere in output for humans or so. So if it's just about "nice-looking" values, you shouldn't use serialize which has an entirely different purpose.


6 Comments

ah, thanks for the explanation, that'll probably go a long way. The reason I would want to keep my serialized array straightforward is because of my fear that a longer serial string will overall increase load on the query server. I would love to learn that I am wrong. thanks to everyone else for the prompt responses...you guys rock! Just to keep things simple though (ie no extra functions), I just might stick with this solution, though Gumbo yours is very elegant as well.
Well, databases are pretty much optimized for the use case of retrieving data. Whether you are reading 200 bytes or 2000 shouldn't make much of a difference unless you do something really load-intense. But you can still optimize when it proves to be a bottleneck.
I ran into this issue when storing serialized data. I was storing large numbers of records so using literally 58 digits to store what should have been 4 characters just seemed unacceptable. I used the number_format solution provided by shadowhand below.
I solved this by setting both precision and serialize_precision to the same value (10): ini_set('precision', 10); ini_set('serialize_precision', 10); You can also set this in your php.ini
An explanation without an answer is like firemen showing up to a fire and then joining the spectators.
7

Store them as strings after using number_format:

$number = number_format($float, 2);
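As a hedged illustration of this approach: number_format() inserts a comma as the thousands separator by default, so passing the separators explicitly keeps the stored string castable back to a float. The sample value and variable names below are assumptions made for the example.

```php
<?php
$float = 1234.5678;

// Default arguments: ',' is used as the thousands separator.
$grouped = number_format($float, 2);         // "1,234.57"

// Explicit '.' decimal point and an empty thousands separator.
$plain = number_format($float, 2, '.', '');  // "1234.57"

$serialized = serialize($plain);             // s:7:"1234.57";
$restored = (float) unserialize($serialized);

echo $grouped, "\n", $plain, "\n", $serialized, "\n";
var_dump($restored);
```

For the values in the question (between 6 and 8) the grouping separator never appears, but the explicit form avoids surprises if the range grows later.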

2 Comments

number_format is not a safe way to convert numbers to strings: by default it inserts a comma as the thousands separator, so 1234.56 becomes "1,234.56". To convert, append an empty string to the number instead: $number .= '';
JSON APIs that require floats will not necessarily accept strings. As an example, the Business Central API will reply with an error if you pass a string in the JSON where it expects a float.
6

Just reduce the precision:

ini_set('serialize_precision',2);

2 Comments

That should be a correct answer, because it's the right solution, not a workaround.
You don't need to go that low, just count the zeroes... ini_set('serialize_precision',10); will do.
3

Store them as integers (shift the first decimal place in front of the point by multiplying it by 10) and convert them back if you need it:

function random_float($min,$max) {
    return ($min+lcg_value()*(abs($max-$min)));
}

$array = array();
for ($i=0; $i<10; $i++) {
    $array[] = (int) round(random_float(6, 8) * 10);
}
$serialized = serialize($array);
var_dump($serialized);

$array = unserialize($serialized);
foreach ($array as $key => $val) {
    $array[$key] = $val / 10;
}
var_dump($array);

1 Comment

that is indeed a good solution, but I think I'll stick with the serialization for now, for reasons I wrote in the above comment. thanks for your help though
2

Casting also works, and it is faster. Example:

$a = 0.631;
$b = serialize($a);
$c = serialize((string)$a);
var_dump($b);

string(57) "d:0.6310000000000000053290705182007513940334320068359375;"

var_dump($c);

string(12) "s:5:"0.631";"

var_dump(unserialize($b));

float(0.631)

var_dump(unserialize($c));

string(5) "0.631"

The important thing is to cast it back on unserialize:

var_dump((float)unserialize($c));

float(0.631)


2

The php.ini file contains a serialize_precision directive, which allows you to control how many significant digits will be serialized for your floats. In your case, storing just one decimal of numbers between 6 and 8 means two significant digits.

You can set this setting in php.ini file or directly in your script:

ini_set('serialize_precision', 2);

If you do not care about the exact number of significant digits, but do care about not getting a spaghetti of digits resulting from the way float numbers are stored, you can also try a value of -1, which invokes a "special rounding algorithm". This is likely to do exactly what is required:

ini_set('serialize_precision', -1);

You can even reset it back to its original value after your serialization:

    $prec = ini_get('serialize_precision');
    ini_set('serialize_precision', -1);

    ... // your serialization here

    ini_set('serialize_precision', $prec);
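The save-and-restore pattern above could be wrapped in a small helper. This is only a sketch: serialize_with_precision() is a hypothetical function name, not part of PHP, and it assumes PHP 7.1 or later for the -1 value.

```php
<?php
// Hypothetical helper (not a PHP built-in): serialize with a temporary
// serialize_precision, then put the previous setting back.
function serialize_with_precision($data, $precision = -1)
{
    $previous = ini_get('serialize_precision');
    ini_set('serialize_precision', (string) $precision);
    try {
        return serialize($data);
    } finally {
        // Restore the original setting even if serialize() throws.
        ini_set('serialize_precision', $previous);
    }
}

$array = array(6.2, 7.6, 6.4);
$compact = serialize_with_precision($array);

echo $compact, "\n"; // a:3:{i:0;d:6.2;i:1;d:7.6;i:2;d:6.4;}
var_dump(unserialize($compact) === $array);
```

With -1 each float is written with the fewest digits that still read back as the same value, so the round trip stays exact.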


1

Here's my take on Gumbo's answer. I put IteratorAggregate on there so it would be foreach-able, but you could add Countable and ArrayAccess also.

<?php

class FloatStorage implements IteratorAggregate
{
  protected $factor;
  protected $store = array();

  public function __construct( array $data, $factor=10 )
  {
    $this->factor = $factor;
    $this->store = $data;
  }

  public function __sleep()
  {
    array_walk( $this->store, array( $this, 'toSerialized' ) );
    return array( 'factor', 'store' );
  }

  public function __wakeup()
  {
    array_walk( $this->store, array( $this, 'fromSerialized' ) );
  }

  protected function toSerialized( &$num )
  {
    $num *= $this->factor;
  }

  protected function fromSerialized( &$num )
  {
    $num /= $this->factor;
  }

  public function getIterator()
  {
    return new ArrayIterator( $this->store );
  }
}

function random_float ($min,$max) {
   return ($min+lcg_value()*(abs($max-$min)));
}

$original = array();
for ( $i = 0; $i < 10; $i++ )
{
  $original[] = round( random_float( 6, 8 ), 1 );
}

$stored = new FloatStorage( $original );

$serialized = serialize( $stored );
$unserialized = unserialize( $serialized );

echo '<pre>';
print_r( $original );
print_r( $serialized );
print_r( $unserialized );
echo '</pre>';


1

For me I found 3 ways:

  1. convert the float to an integer after multiplying it by a big number (for example 1,000,000); it's not a very convenient way, as you must not forget to divide by the same 1,000,000 when it's used
  2. to use preg_replace('/d:([0-9]+(\.[0-9]+)?([Ee][+-]?[0-9]+)?);/e', "'d:'.((float)$1).';'", $value); where $value is your float; found here
  3. also, to round the float by round() and store it in array as string.

In any case I use variant #2
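Note that the /e modifier used in variant #2 was deprecated in PHP 5.5 and removed in PHP 7.0. A rough equivalent with preg_replace_callback() might look like the sketch below; the sample input string and variable names are assumptions made for illustration.

```php
<?php
// Assumption: $serialized holds serialize() output with long float literals.
$serialized = 'a:2:{i:0;d:6.20000000000000017763568394002504646778106689453125;'
            . 'i:1;d:7.5999999999999996447286321199499070644378662109375;}';

// Float-to-string conversion follows the "precision" setting; 14 is the usual default.
ini_set('precision', '14');

$shortened = preg_replace_callback(
    '/d:([0-9]+(\.[0-9]+)?([Ee][+-]?[0-9]+)?);/',
    function ($m) {
        // Cast to float and back to string to drop the noise digits.
        return 'd:' . ((float) $m[1]) . ';';
    },
    $serialized
);

echo $shortened, "\n"; // a:2:{i:0;d:6.2;i:1;d:7.6;}
```

One caveat: the pattern also matches text of the form d:...; inside serialized strings, and changing that text would invalidate the recorded string length, so this is only safe for data known to contain plain numbers.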


1

Setting your serialize_precision value in php.ini to -1 will solve the floating point issue, or you can set it to a value that you prefer, as per the specifications here: http://php.net/manual/en/ini.core.php#ini.serialize-precision

PHP versions <= 5.3.5 shipped with the default value of "100", while the default now at version 7.0.33 is "17", although the package bundled with your distro might have shipped with a "-1"

As pointed out in other responses, you can override this setting in the application itself or even a custom php.ini that your VirtualHost container or .htaccess specifies.

I hope that helps :)
