# Is floating point math broken?

Consider the following code:

```
0.1 + 0.2 == 0.3   ->  false
0.1 + 0.2          ->  0.30000000000000004
```

Why do these inaccuracies happen?

Another way to look at this: 64 bits are used to represent numbers. As a consequence, no more than 2**64 = 18,446,744,073,709,551,616 different numbers can be represented precisely.

However, there are infinitely many real numbers between 0 and 1 alone. IEEE 754 defines an encoding that uses these 64 bits efficiently for a much larger number range, plus NaN and +/- Infinity, so there are gaps between exactly representable numbers that are filled with numbers that can only be approximated.

Unfortunately 0.3 sits in a gap.

Many of this question's numerous duplicates ask about the effects of floating point rounding on specific numbers. In practice, it is easier to get a feeling for how it works by looking at exact results of calculations of interest rather than by just reading about it. Some languages provide ways of doing that - such as converting a float or double to BigDecimal in Java.

Since this is a language-agnostic question, it needs language-agnostic tools, such as a Decimal to Floating-Point Converter.

Applying it to the numbers in the question, treated as doubles:

- 0.1 converts to 0.1000000000000000055511151231257827021181583404541015625
- 0.2 converts to 0.200000000000000011102230246251565404236316680908203125
- 0.3 converts to 0.299999999999999988897769753748434595763683319091796875
- 0.30000000000000004 converts to 0.3000000000000000444089209850062616169452667236328125

Adding the first two numbers manually, or in a decimal calculator such as Full Precision Calculator, shows that the exact sum of the actual inputs is 0.3000000000000000166533453693773481063544750213623046875.

If it were rounded down to the equivalent of 0.3, the rounding error would be 0.0000000000000000277555756156289135105907917022705078125. Rounding up to the equivalent of 0.30000000000000004 gives exactly the same rounding error, 0.0000000000000000277555756156289135105907917022705078125. Since the two candidates are equidistant, the round-to-even tie breaker applies.

Returning to the floating point converter, the raw hexadecimal for 0.30000000000000004 is 3fd3333333333334, which ends in an even digit and is therefore the correct round-to-even result.

## Math.sum (JavaScript): a kind of operator replacement

```javascript
.1 + .0001 + -.1          // --> 0.00010000000000000286
Math.sum(.1, .0001, -.1)  // --> 0.0001
```

```javascript
Object.defineProperties(Math, {
    sign: {
        value: function (x) { return x ? x < 0 ? -1 : 1 : 0; }
    },
    precision: {
        // Round (or floor/ceil, via `type`) a value to `precision` decimal places.
        value: function (value, precision, type) {
            var v = parseFloat(value),
                p = Math.max(precision, 0) || 0,
                t = type || 'round';
            return (Math[t](v * Math.pow(10, p)) / Math.pow(10, p)).toFixed(p);
        }
    },
    scientific_to_num: {
        // this is from https://gist.github.com/jiggzson
        value: function (num) {
            // if the number is in scientific notation, remove it
            if (/e/i.test(num)) {
                var zero = '0',
                    parts = String(num).toLowerCase().split('e'), // split into coeff and exponent
                    e = parts.pop(),                              // store the exponential part
                    l = Math.abs(e),                              // get the number of zeros
                    sign = e / l,
                    coeff_array = parts[0].split('.');
                if (sign === -1) {
                    num = zero + '.' + new Array(l).join(zero) + coeff_array.join('');
                } else {
                    var dec = coeff_array[1];
                    if (dec) l = l - dec.length;
                    num = coeff_array.join('') + new Array(l + 1).join(zero);
                }
            }
            return num;
        }
    },
    get_precision: {
        // Number of decimal places in a value, scientific notation included.
        value: function (number) {
            var arr = Math.scientific_to_num((number + "")).split(".");
            return arr[1] ? arr[1].length : 0;
        }
    },
    diff: {
        value: function (A, B) {
            var prec = this.max(this.get_precision(A), this.get_precision(B));
            return +this.precision(A - B, prec);
        }
    },
    sum: {
        value: function () {
            var prec = 0, sum = 0;
            for (var i = 0; i < arguments.length; i++) {
                prec = this.max(prec, this.get_precision(arguments[i]));
                sum += +arguments[i]; // force float to convert strings to numbers
            }
            return Math.precision(sum, prec);
        }
    }
});
```

The idea is to use Math methods instead of operators to avoid visible float errors:

```javascript
Math.diff(0.2, 0.11) == 0.09  // true
0.2 - 0.11 == 0.09            // false
```

Also note that Math.diff and Math.sum auto-detect the precision to use, and that Math.sum accepts any number of arguments.

A different question has been marked as a duplicate of this one:

> In C++, why is the result of cout << x different from the value that a debugger is showing for x?

The x in the question is a float variable.

One example would be:

```cpp
float x = 9.9F;
```

The debugger shows 9.89999962; the output of the cout operation is 9.9.

The answer turns out to be that cout's default precision is 6, so the value is rounded to 6 significant digits before printing.

See here for reference