# Round-offs in floating-point arithmetic

A friend of mine recently proposed the following problem on twitter:

I didn’t pay much attention to his warning and fell for the trap. I though I could write a small program in two minutes to compute the series and find what was the value of x[80]. So here is (a slightly modified version of) the C++ code that I put together in a couple of minutes.

C++

When I ran it I was surprised to notice that the series was converging to 100 by x[26].

Actually, the initial program didn’t call std::setprecision and the numbers you get without that are less precise, but that doesn’t change the convergence, since it is just a printing artifact.

Finding the series interesting I searched a bit and then I understood his warning. I found this was a well known problem proposed around 1980 by Jean-Michel Muller and discussed in several papers by Prof. W. Kahan.

Given the function
Œ(y, z) := 108 – ( 815 – 1500/z )/y
and initial values x0 := 4 and x1 := 4.25 , deﬁne xn+1 := Œ(xn, xn-1) for n = 1, 2, 3, … in turn.
Our task is to compute xN for some moderately big preassigned integer N, say N = 80 .

This exercise is intended to show the problem that arises in using floating-point numbers. The float and double (both an IEEE Standard for Floating-Point Arithmetic, IEEE 754) representations use inverse powers of 2, which means most numbers require a an infinite number of bits for a precise representation. Numbers such as 0.25 or 0.875 can be exactly encoded as 1/4 and 1/2+1/4+1/8, but numbers such as 0.10 cannot be encoded with a finite sum of such terms. The result is problems with accuracy of calculations. Rand-offs can propagate through calculations in unexpected ways, just like Muller’s recurrence shows.

The actual limit of Muller’s series is not 100, but 5.

I was curios then how the decimal type from .NET compares to double. decimal (that uses base 10 instead of 2) has more precision (but a smaller range) than float or double which makes is more suitable for some applications, such as financial ones. (For a discussion of when to use decimal and when to use double see decimal vs double! – Which one should I use and when?).

So here is my C# program that uses decimal.

The output of this program is:

This represents and improvement, but in the end suffers from the same accumulated round-offs problem. It takes more iterations, but eventually the series also converges to 100.

My friend then suggested trying a data type that doesn’t suffer from rounding issues. Such a type is BigRational for F# (it can be used with any .NET language). It is available in the F# PowerPack that is an open-source project available on CodePlex. Below is the F# equivalent of the previous program that uses BigRational.

The output looks like this:

Now this is a totally different story. The values converge to the expected value of 5.

You probably noticed the casting to double for printing. It is necessary because otherwise the output would look like this:

That isn’t very helpful. I can’t even read these insane large numbers, not to mention dividing them. So to get the actual real number and be able to compare with the previous programs a conversion to double is necessary.

As I mentioned earlier, BigRational can also be used from C#.

It yields the very same output as the F# program so I will not list it again. However, below is a comparison table with the results for various number data types.

index C++ with float C++/C# with double C# with decimal C#/F# with BigRational
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
4
4.25
4.47058868408203
4.64474487304688
4.77070617675781
4.85921478271484
4.98312377929688
6.39543151855469
27.6326293945313
86.9937591552734
99.2555084228516
99.9625854492188
99.9981307983398
99.9999084472656
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
4
4.25
4.47058823529412
4.64473684210522
4.77053824362508
4.85570071256856
4.91084749866063
4.94553739553051
4.966962408041
4.98004220429301
4.98790923279579
4.99136264131455
4.96745509555227
4.42969049830883
-7.81723657845932
168.939167671065
102.039963152059
100.09994751625
100.004992040972
100.000249579237
100.00001247862
100.000000623922
100.000000031196
100.00000000156
100.000000000078
100.000000000004
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
4
4.25
4.47058823529411764705882353
4.64473684210526315789473686
4.77053824362606232294617603
4.85570071258907363420428376
4.91084749908279320044042857
4.94553740412391672477683015
4.96696258176270059878160878
4.98004570135563116267108889
4.98797944847839228829979003
4.99277028806206866201151005
4.99565589150664533306792637
4.99739126838157043427422171
4.99843394394934565979621707
4.99906007206149646425952424
4.99943593895922460992955065
4.99966156035548033890851805
4.99979762579572007199519838
4.99989263769854913604459541
5.00021692999623515255759378
5.00575688343630115907717069
5.11585535860978057839952266
7.26513170553842597520695497
36.178328937337879304087182981
91.17958879988455033108590199
99.51631713443793014723080822
99.97569833055963020623148188
99.99878462167868201734350518
99.99993923036059445960870932
99.99999696151664049461733529
99.99999984807584112595945239
99.99999999240379245628007687
99.99999999962018963513083004
99.99999999998100948212683970
99.99999999999905047411745292
99.99999999999995252370620598
99.99999999999999762618532030
99.99999999999999988130926632
99.99999999999999999406546333
99.99999999999999999970327317
99.99999999999999999998516366
99.99999999999999999999925818
99.99999999999999999999996291
99.99999999999999999999999815
99.99999999999999999999999991
100.00000000000000000000000000
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
100
4
4.25
4.47058823529412
4.64473684210526
4.77053824362606
4.85570071258907
4.91084749908279
4.94553740412392
4.9669625817627
4.98004570135563
4.98797944847839
4.99277028806207
4.99565589150663
4.99739126838134
4.99843394394482
4.99906007197089
4.99943593714684
4.99966152410377
4.99979690071342
4.99987813547793
4.9999268795046
4.99995612706116
4.99997367600571
4.99998420552027
4.99999052328223
4.99999431395856
4.99999658837126
4.99999795302136
4.99999877181231
4.99999926308721
4.99999955785226
4.99999973471133
4.99999984082679
4.99999990449607
4.99999994269764
4.99999996561859
4.99999997937115
4.99999998762269
4.99999999257362
4.99999999554417
4.9999999973265
4.9999999983959
4.99999999903754
4.99999999942252
4.99999999965351
4.99999999979211
4.99999999987527
4.99999999992516
4.9999999999551
4.99999999997306
4.99999999998384
4.9999999999903
4.99999999999418
4.99999999999651
4.9999999999979
4.99999999999874
4.99999999999925
4.99999999999955
4.99999999999973
4.99999999999984
4.9999999999999
4.99999999999994
4.99999999999996
4.99999999999998
4.99999999999999
4.99999999999999
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5

The conclusion is that you should be aware that round-offs can accumulate and lead to unexpected results. Use the most appropriate data types suitable. Do not use double (not to mention float) for financial data.

### 2 Replies to “Round-offs in floating-point arithmetic”

Example in boost::multiprecision

#include “stdafx.h”
#include
#include

#include

using boost::multiprecision::cpp_dec_float_100;

template
T x(int n)
{
static T cache[Size + 1] = { 0 };
if (n == 0) cache[n] = 4.0;
else if (n == 1) cache[n] = 4.25;
else cache[n] = 108 – (815 – 1500 / cache[n – 2]) / cache[n – 1];

return cache[n];
}

int _tmain(int argc, _TCHAR* argv[])
{
for (int i = 0; i <= 80; ++i)
{
std::cout << "x[" << i << "]=" << std::setprecision(15) << x(i) << std::endl;
}

return 0;
}

2. Very nice problem, very nice solution, and very interesting article (also the original). Left me wondered how short/fast would it be in Python:

from fractions import Fraction as F

def fn(y,z):
return F(108) – (F(815) – F(1500)/z)/y

def run(N):
a, b = F(4), F(4.25)
for n in range(N):
print n, a
b, a = fn(b,a), b
return b

On my machine Python executes `run(1000)` in 1.6 seconds (without the print) and `run(2000) in 10 seconds.

This site uses Akismet to reduce spam. Learn how your comment data is processed.