PHP's JSON floating-point precision conundrum

Preamble

A while back I developed an interface that used JSON plus a signature. When integrating with a Java service, the signature would not verify no matter what. After carefully comparing the strings, I found that when PHP does json_encode, it removes all meaningless zeros from floating-point values (echo and var_dump do it, too), whereas the Java side does not. So I wrote in the documentation: "Please remove the meaningless 0's in the JSON." #doge

This came up again recently. The requirement directly specified the display type and a precision of exactly two decimal places, so I had to start looking into how floating-point precision can be preserved in PHP's JSON output.

For the principle behind removing meaningless zeros, see the section "Principle - Floating Point Display in PHP" below.

For the solution to the requirement, see the section "String Processing - Regular" below.

Here is the overall solution process and related principles.

Solutions

json_encode constant parameter (unsolvable)

Relevant knowledge

The function prototype for json_encode is as follows:
json_encode(mixed $value, int $flags = 0, int $depth = 512): string|false

As you know, the first advanced use of json_encode is its second parameter, flags, the optional JSON encoding options: all sorts of handy constants, such as the one I have used the longest, JSON_UNESCAPED_UNICODE. So the first thing that came to mind was to check whether there is a constant for this requirement.

Looking at the source code, the JSON constants are defined in php-src/ext/json/php_json.h as follows:

/* json_encode() options */
#define PHP_JSON_HEX_TAG                    (1<<0)
#define PHP_JSON_HEX_AMP                    (1<<1)
#define PHP_JSON_HEX_APOS                   (1<<2)
#define PHP_JSON_HEX_QUOT                   (1<<3)
#define PHP_JSON_FORCE_OBJECT               (1<<4)
#define PHP_JSON_NUMERIC_CHECK              (1<<5)
#define PHP_JSON_UNESCAPED_SLASHES          (1<<6)
#define PHP_JSON_PRETTY_PRINT               (1<<7)
#define PHP_JSON_UNESCAPED_UNICODE          (1<<8)
#define PHP_JSON_PARTIAL_OUTPUT_ON_ERROR    (1<<9)
#define PHP_JSON_PRESERVE_ZERO_FRACTION     (1<<10)
#define PHP_JSON_UNESCAPED_LINE_TERMINATORS (1<<11)

PHP_JSON_UNESCAPED_UNICODE is 1<<8, which is exactly 256. Each constant is a single bit so that the flags can be combined easily. There are several ways to write a combination, such as json_encode($data, JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES), json_encode($data, JSON_UNESCAPED_UNICODE + JSON_UNESCAPED_SLASHES), and json_encode($data, 256 + 64). They all amount to the same thing, although | is the safer spelling because + gives a wrong result if a flag is repeated.
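To make the equivalence concrete, here is a minimal sketch. The sample payload is made up for illustration; only the flag constants come from the text above.

```php
<?php
// Sample payload (hypothetical) with a slash and a non-ASCII character.
$data = ['path' => 'a/b', 'name' => "\u{e9}"];

// Three spellings of the same flag set: 256 | 64 = 320.
$a = json_encode($data, JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES);
$b = json_encode($data, JSON_UNESCAPED_UNICODE + JSON_UNESCAPED_SLASHES);
$c = json_encode($data, 256 + 64);

var_dump($a === $b && $b === $c); // bool(true)

// Caveat: "+" is only safe while each flag appears once; "|" is idempotent.
$or  = JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_SLASHES; // still 64
$sum = JSON_UNESCAPED_SLASHES + JSON_UNESCAPED_SLASHES; // 128, i.e. JSON_PRETTY_PRINT
var_dump($or === JSON_UNESCAPED_SLASHES, $sum === JSON_PRETTY_PRINT);
```

The last two lines show why the bitwise OR is the usual idiom: repeating a flag with + silently turns it into a different flag.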

PHP json_encode Chinese documentation
PHP json_encode constants documentation

See also "json_encode constant parameter version applicability" below.

The ones that have to do with numbers are PHP_JSON_NUMERIC_CHECK and PHP_JSON_PRESERVE_ZERO_FRACTION.

// Encodes numeric strings as numbers.
JSON_NUMERIC_CHECK (int)

// Ensures that float values are always encoded as a float value.
JSON_PRESERVE_ZERO_FRACTION (int)

Combination test

$str_arr = [
    'str1' => '1',
    'str2' => '1.0',
    'str3' => '1.00',
    'str4' => '1.1',
    'str5' => '1.10',
    'str6' => '1.110'
];
$s_j1 = json_encode($str_arr, JSON_NUMERIC_CHECK);
$s_j2 = json_encode($str_arr, JSON_PRESERVE_ZERO_FRACTION);
$s_j3 = json_encode($str_arr, JSON_NUMERIC_CHECK | JSON_PRESERVE_ZERO_FRACTION);
echo $s_j1,PHP_EOL;
echo $s_j2,PHP_EOL;
echo $s_j3,PHP_EOL;
echo PHP_EOL;

$float_arr = [
    'f1' => 1,
    'f2' => 1.0,
    'f3' => 1.00,
    'f4' => 1.1,
    'f5' => 1.10,
    'f6' => 1.110
];
$f_j1 = json_encode($float_arr, JSON_NUMERIC_CHECK);
$f_j2 = json_encode($float_arr, JSON_PRESERVE_ZERO_FRACTION);
$f_j3 = json_encode($float_arr, JSON_NUMERIC_CHECK | JSON_PRESERVE_ZERO_FRACTION);
echo $f_j1,PHP_EOL;
echo $f_j2,PHP_EOL;
echo $f_j3,PHP_EOL;

Results

{"str1":1,"str2":1,"str3":1,"str4":1.1,"str5":1.1,"str6":1.11}
{"str1":"1","str2":"1.0","str3":"1.00","str4":"1.1","str5":"1.10","str6":"1.110"}
{"str1":1,"str2":1.0,"str3":1.0,"str4":1.1,"str5":1.1,"str6":1.11}

{"f1":1,"f2":1,"f3":1,"f4":1.1,"f5":1.1,"f6":1.11}
{"f1":1,"f2":1.0,"f3":1.0,"f4":1.1,"f5":1.1,"f6":1.11}
{"f1":1,"f2":1.0,"f3":1.0,"f4":1.1,"f5":1.1,"f6":1.11}

Conclusion

As you can see, JSON_NUMERIC_CHECK encodes all numeric strings as numbers, as the documentation describes. Meaningless zeros are still dropped.

JSON_PRESERVE_ZERO_FRACTION, by contrast, behaves a bit strangely: it keeps only a single 0, and only when the fractional part is entirely zero (1.0 and 1.00 both become 1.0, while 1.10 still becomes 1.1).

For the reason, see the handling of JSON_PRESERVE_ZERO_FRACTION in the source code section below.

Obviously, flags are not going to cut it.

Configuration item "serialize_precision" ("precision") (cannot be resolved)

The documentation has this to say:

If the argument is an array or object, it will be serialized recursively.

The encoding is affected by the flags parameter passed in. In addition, the encoding of floating point values depends on serialize_precision.

serialize_precision document location

serialize_precision int
The number of significant digits stored when serializing floating point numbers. -1 means that an enhanced algorithm for rounding such numbers will be used.

PHP's serialize_precision configuration item controls the precision of floating-point numbers during serialization, while precision controls the usual display.

Take a number, echo json_encode(17.2);, and set serialize_precision from low to high. Small values cut the number short, values from 3 to 16 give 17.2, and no setting pads the output to 17.20.

This shows the effect of the configuration quite clearly, and it is just as clear that it cannot meet the requirement.
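The experiment can be reproduced with a short loop. This is a sketch that assumes PHP 7.1 or later and that serialize_precision may be changed at runtime with ini_set; the chosen sample values are arbitrary.

```php
<?php
// Encode the same float under several serialize_precision settings.
// -1 is the default since PHP 7.1; 17 was the default in the 5.* versions.
$results = [];
foreach ([-1, 2, 3, 17] as $p) {
    ini_set('serialize_precision', (string) $p);
    $results[$p] = json_encode(17.2);
    echo str_pad((string) $p, 3, ' ', STR_PAD_LEFT), ' => ', $results[$p], PHP_EOL;
}

// Restore the default so later code is unaffected.
ini_set('serialize_precision', '-1');
```

With these settings the loop prints 17.2 for -1 and 3, 17 for 2, and 17.199999999999999 for 17. None of them produces trailing zeros, which is the point of the section.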

Off-topic:

During testing I found that in PHP 7.1 and above, if serialize_precision is set to a large value, such as 17 (the default in the 5.* versions), the result is 17.199999999999999. precision works the same way for ordinary display such as echo and print_r (since PHP 7.1, var_dump and var_export follow serialize_precision instead).

So for everyday use, leaving it at the default of -1 is recommended.

String Processing - Regular

In this case, it seems impossible to meet the requirement at the encoding or configuration level, so let's use the simplest and most direct approach: a regular expression applied directly to the string.

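foreach ($data as &$item) {
    if (is_numeric($item)) {
        // Format every numeric value as a string with two decimal places
        $item = sprintf("%.2f", $item);
    }
}
unset($item); // break the reference left over from the loop
$json = json_encode($data);
// Strip the quotes around decimal strings so they become JSON numbers
$pattern = '/"(-?\d+\.\d+)"/';
$replacement = '$1';
$new_json = preg_replace($pattern, $replacement, $json);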

This code converts every numeric value to a string with two decimal places, runs json_encode, and then removes the surrounding double quotes from every quoted string that consists only of digits with a "." in between, turning it back into a number.

If your JSON is more complex (nested data, or strings that merely look like decimals), you need to tweak the regular expression.
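For nested data, one possible variation is sketched below. It is not the code from this article: the function name json_encode_fixed2 and the "@@" sentinel are assumptions made for illustration. It assumes PHP 7.3 or later (for JSON_THROW_ON_ERROR) and that "@@" never occurs in real data. The sentinel marks only values that were floats, so a genuine string such as "1.50" keeps its quotes.

```php
<?php
// Hypothetical helper: encode $data as JSON, rendering every float with two decimals.
function json_encode_fixed2(array $data, int $flags = 0): string
{
    // Tag each finite float with a sentinel; %F is the locale-independent float format.
    array_walk_recursive($data, function (&$value): void {
        if (is_float($value) && is_finite($value)) {
            $value = sprintf('@@%.2F@@', $value);
        }
    });

    $json = json_encode($data, $flags | JSON_THROW_ON_ERROR);

    // Drop the quotes and the sentinel, leaving a bare JSON number.
    return preg_replace('/"@@(-?\d+\.\d{2})@@"/', '$1', $json);
}

echo json_encode_fixed2(['price' => 1.0, 'label' => '1.50', 'items' => [2.5, 3]]), PHP_EOL;
// {"price":1.00,"label":"1.50","items":[2.50,3]}
```

Integers are left untouched here; whether they should also be rendered as 3.00 depends on the requirement.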

Principle - Floating Point Display in PHP

Let's look at this code and guess what its output will be:

echo 1.0, PHP_EOL;
var_dump(1);
var_dump(1.0);
var_dump(1.0 === 1);
var_dump(1.00 === 1.0);

Results:

1
int(1)
float(1)
bool(false)
bool(true)

So why do we get such strange output as float(1), or true for 1.00 === 1.0? The reason lies in how the PHP kernel implements the variable container zval (Zend value), and in how values are displayed.

PHP is a weakly typed language: a variable can hold a value of any type, thanks to the implementation of zval. A zval, or _zval_struct, is a structure that records the value and its type, along with reference-counting information (the reference count is related to garbage collection). There is no "display precision" property or configuration.

So when var_dump is used, what is shown is the variable's type, float, and the stored value in its shortest meaningful form, which is float(1). And with the === comparison, the stored values are equal and the types are equal, so naturally the result is true.
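One way to see that 1.0 and 1.00 are stored identically is to look at the raw bytes. The sketch below assumes PHP 7.1 or later, where pack() supports the 'E' format (big-endian double).

```php
<?php
// pack('E', ...) serializes a float as an 8-byte big-endian IEEE 754 double.
$a = bin2hex(pack('E', 1.0));
$b = bin2hex(pack('E', 1.00));

echo $a, PHP_EOL;     // 3ff0000000000000
var_dump($a === $b);  // bool(true): both literals produce the same stored value
var_dump(1.0 === 1);  // bool(false): same numeric value, but float vs. int
```

Nothing in those 8 bytes records how many zeros the literal was written with, which is why no display function can bring them back.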

Corresponding source code

The output function for floating point types is smart_str_append_double:

// Zend\zend_smart_str.c
ZEND_API void ZEND_FASTCALL smart_str_append_double(
		smart_str *str, double num, int precision, bool zero_fraction) {
	char buf[ZEND_DOUBLE_MAX_LENGTH];
	/* Model snprintf precision behavior. */
	zend_gcvt(num, precision ? precision : 1, '.', 'E', buf);
	smart_str_appends(str, buf);
	if (zero_fraction && zend_finite(num) && !strchr(buf, '.')) {
		smart_str_appendl(str, ".0", 2);
	}
}

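This is where JSON_PRESERVE_ZERO_FRACTION takes effect: when zero_fraction is set, the function checks whether the formatted number is finite and contains no decimal point (that is, it looks like an integer) and, if so, appends ".0".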
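The effect of that check can be observed from userland. A small sketch, assuming the default serialize_precision of -1 (set explicitly here so the snippet is self-contained):

```php
<?php
ini_set('serialize_precision', '-1'); // make sure the default is in effect

$flag = JSON_PRESERVE_ZERO_FRACTION;

echo json_encode(10.0), PHP_EOL;        // 10    (no flag: the float looks like an int)
echo json_encode(10.0, $flag), PHP_EOL; // 10.0  (no "." in the buffer, so ".0" is appended)
echo json_encode(1.10, $flag), PHP_EOL; // 1.1   (already has a ".", nothing is appended)
echo json_encode(10, $flag), PHP_EOL;   // 10    (an int never reaches the float branch)
```

So the flag can only ever add a single ".0"; it cannot restore a second decimal place.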

Call sites of smart_str_append_double

// ext\standard\
PHPAPI zend_result php_var_export_ex(zval *struc, int level, smart_str *buf) {
    ...
    case IS_DOUBLE:
        smart_str_append_double(
            buf, Z_DVAL_P(struc), (int) PG(serialize_precision), /* zero_fraction */ true);
        break;
    ...
}

// Zend\zend_ast.c
static ZEND_COLD void zend_ast_export_zval(smart_str *str, zval *zv, int priority, int indent) {
    ...
    case IS_DOUBLE:
        smart_str_append_double(
            str, Z_DVAL_P(zv), (int) EG(precision), /* zero_fraction */ false);
        break;
    ...
}

You can clearly see that serialize_precision and precision are passed in here.

The function that smart_str_append_double uses to turn a float into a string is zend_gcvt:

// Zend\zend_strtod.c
ZEND_API char *zend_gcvt(double value, int ndigit, char dec_point, char exponent, char *buf) {
    ...
    if ((decpt >= 0 && decpt > ndigit) || decpt < -3) { /* use E-style */
		/* exponential format (e.g. 1.2345e+13) */
        ...
    } else if (decpt < 0) {
		/* standard format 0. */
		*dst++ = '0';   /* zero before decimal point */
		*dst++ = dec_point;
		do {
			*dst++ = '0';
		} while (++decpt < 0);
		src = digits;
		while (*src != '\0') {
			*dst++ = *src++;
		}
		*dst = '\0';
	} else {
		/* standard format */
		for (i = 0, src = digits; i < decpt; i++) {
			if (*src != '\0') {
				*dst++ = *src++;
			} else {
				*dst++ = '0';
			}
		}
		if (*src != '\0') {
			if (src == digits) {
				*dst++ = '0';   /* zero before decimal point */
			}
			*dst++ = dec_point;
			for (i = decpt; digits[i] != '\0'; i++) {
				*dst++ = digits[i];
			}
		}
		*dst = '\0';
	}
	zend_freedtoa(digits);
	return (buf);  
}

The E-notation output is implemented here, and so is the absence of meaningless zeros: only the significant digits held in digits are copied to the output.

How to Display a Fixed Precision

If you want to show an exact precision, you have to convert to a string type, and there are two ways to do this:

$number = 1;
echo sprintf("%.2f", $number);
echo number_format($number, 2, '.', '');

Both functions have existed since PHP 4 and can be used with confidence.

Note that if the number has more decimal places than the requested precision, both methods round it (half up) rather than truncating.

Also, the third parameter of number_format is the decimal separator and the fourth is the thousands separator. The defaults are "." and "," respectively. Pay particular attention to the thousands separator when the result has to be used in calculations or displayed as a plain number.
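The thousands-separator pitfall is easy to demonstrate. A minimal sketch with an arbitrary sample value, assuming the default C locale:

```php
<?php
$n = 1234.5678;

$plain   = sprintf('%.2f', $n);             // "1234.57"
$grouped = number_format($n, 2);            // "1,234.57" (default thousands separator ",")
$bare    = number_format($n, 2, '.', '');   // "1234.57"

echo $plain, PHP_EOL, $grouped, PHP_EOL, $bare, PHP_EOL;

// Converting back to a number: parsing stops at the first ",".
var_dump((float) $grouped); // float(1)
var_dump((float) $bare);    // float(1234.57)
```

If the formatted string may ever be read back as a number, passing an empty thousands separator avoids this silent truncation.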

About "double" and "float"

The floating-point type in PHP is implemented with the C double type. It follows the IEEE 754 standard for 64-bit double-precision floating-point numbers; there is no single-precision type.

In PHP, the names double and float are used in a confusing way. In the source code you mostly see double, and type checks use IS_DOUBLE. But since PHP 7, a type declaration must use float, e.g. function f(float $num): float, which seems to be an attempt to be consistent with the naming in other languages.

I briefly tested the type-related functions on different versions. The results in my 8.2 environment were odd; the extra file-position output may come from an extension such as Xdebug overriding var_dump rather than from PHP 8.2 itself:

gettype(1.0); // double
var_dump(1.0); // my 8.2 environment shows double plus the file position; 8.3 and other versions show float

Other functions:

// These are all just aliases and do the same thing
is_float();
is_double();

floatval();
doubleval();

...

Comparison and exact calculation of floating-point types

PHP float type documentation

As you can see from the float documentation, comparing floats for equality and relying on exact results is officially discouraged because of precision issues: "Never trust a floating-point result to the last digit, and never compare two floating-point numbers for equality." (For example, 0.1 + 0.2 does not equal 0.3 in a computer; it equals 0.30000000000000004.)

Ordinary arithmetic is generally not affected much, but if precision requirements are high, it is recommended to use the BC Math functions or the GMP functions.

Before comparing, use the round() function to round the floating-point values. (This is similar to the approach given in the official documentation, but easier to understand.)

$x = 8 - 6.4;  // which is equal to 1.6
$y = 1.6;
var_dump($x == $y); // is not true

PHP thinks that 1.6 (coming from a difference) is not equal to 1.6. To make it work, use round()

var_dump(round($x, 2) == round($y, 2)); // this is true

This happens probably because $x is not really 1.6, but 1.599999.. and var_dump shows it to you as being 1.6.
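For comparison, the tolerance-based approach can be sketched as follows. The helper name floats_equal and the default tolerance of 1e-9 are assumptions chosen for this example; the right tolerance depends on the magnitude of the values involved.

```php
<?php
// Hypothetical helper: two floats count as equal when they differ by less than $epsilon.
function floats_equal(float $a, float $b, float $epsilon = 1e-9): bool
{
    return abs($a - $b) < $epsilon;
}

$x = 8 - 6.4;
$y = 1.6;

var_dump($x == $y);                     // bool(false)
var_dump(floats_equal($x, $y));         // bool(true)
var_dump(floats_equal(0.1 + 0.2, 0.3)); // bool(true)
var_dump(floats_equal(1.6, 1.61));      // bool(false)
```

Note that PHP_FLOAT_EPSILON itself would be too strict for this pair: the two values differ by about 4.4e-16, which is twice that constant.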

Underscores in float literals

Since PHP 7.4, numeric literals, floats included, may contain underscores. They only improve readability, much like thousands separators:

1_000.0 == 1000.0; // true

Miscellaneous

json_encode constant parameter version applicability

PHP_JSON_HEX_TAG, PHP_JSON_HEX_AMP, PHP_JSON_HEX_APOS, PHP_JSON_HEX_QUOT, PHP_JSON_FORCE_OBJECT: available in PHP 5.3.0 and above.
PHP_JSON_NUMERIC_CHECK: available in PHP 5.3.3 and above.
PHP_JSON_UNESCAPED_SLASHES, PHP_JSON_PRETTY_PRINT, PHP_JSON_UNESCAPED_UNICODE: available in PHP 5.4.0 and above.
PHP_JSON_PARTIAL_OUTPUT_ON_ERROR: available in PHP 5.5.0 and above.
PHP_JSON_PRESERVE_ZERO_FRACTION: available in PHP 5.6.6 and above.
PHP_JSON_UNESCAPED_LINE_TERMINATORS: available in PHP 7.1.0 and above.

Serialization of objects - JsonSerializable (other advanced uses of json_encode)

When reading the json_encode documentation, you will also come across the JsonSerializable interface:

JsonSerializable Document Location

Classes that implement JsonSerializable can customize their JSON representation in json_encode().

Custom JSON serialization is more commonly seen in Go, and the two can be understood side by side.

Java also has a JsonSerializable of the same name. There, the class information is carried into the JSON as well, which makes deserialization possible, but it is not commonly used.

class IDou implements JsonSerializable
{
    public function __construct(protected $name, protected $year)
    {}

    public function jsonSerialize(): mixed
    {
        return ['name' => $this->name, 'year' => $this->year];
    }
}

echo json_encode(new IDou('cxk', 2.5));

Results:

{"name":"cxk","year":2.5}
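JsonSerializable can also be tied back to the precision question. The sketch below uses a hypothetical Price class (not from the original code) and assumes PHP 8.0 or later. It fixes the number of decimals inside jsonSerialize, but the result is a JSON string rather than a number.

```php
<?php
// Hypothetical value object that serializes its amount with exactly two decimals.
final class Price implements JsonSerializable
{
    public function __construct(private float $amount)
    {
    }

    public function jsonSerialize(): mixed
    {
        // Returns a string, so the encoded value is quoted: "2.50", not 2.50.
        return sprintf('%.2F', $this->amount);
    }
}

$json = json_encode(['price' => new Price(2.5)]);
echo $json, PHP_EOL; // {"price":"2.50"}
```

If the consumer insists on a bare number with two decimals, the quotes still have to be removed afterwards with string processing, because jsonSerialize can only return values that json_encode then formats by its own rules.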