Numerical Analysis

Programming language questions
cfbsoftware
Posts: 55
Joined: Wed Sep 18, 2013 10:06 pm
Contact:

Re: Numerical Analysis

Post by cfbsoftware »

There was a similar discussion with 20+ entries on the subject real math earlier this month on the ETH Oberon mailing list. You can see it in the mailing list archives starting here:

https://lists.inf.ethz.ch/pipermail/obe ... 07857.html
lschwerdfager
Posts: 3
Joined: Tue Jun 19, 2018 2:07 am

Re: Numerical Analysis

Post by lschwerdfager »

With respect to the following test code, note that the "x outside Foo" value is 2.255140518769849E-17 while the
"x inside Foo" value is -2.036944831557141E-17. Accordingly, it appears that the negative Sqrt argument value originates somehow as a result of passing u and v as Foo arguments. These arguments are stored on the stack; presumably, the stack values are 64 bits wide while the actual REAL values are 80 bits wide. This might account for the loss of precision and hence the negative value being passed to Sqrt. Does anyone have an opinion on this line of possible explanation?

MODULE NumAnalysis;

IMPORT Math, StdLog;

PROCEDURE Foo (u, v: REAL): REAL;

VAR
x: REAL;
result: REAL;

BEGIN
x := 1.0 - u * u - v * v;
StdLog.String ("x inside Foo is ");
StdLog.Real (x);
StdLog.Ln;
result := Math.Sqrt (x);
RETURN result
END Foo;

PROCEDURE Foo2 (x: REAL): REAL;

VAR
result: REAL;

BEGIN
result := Math.Sqrt (x);
RETURN result
END Foo2;

PROCEDURE Test*;

CONST
u = 0.9925092578236596;
v = 0.1221694443563053;

VAR
x, w: REAL;

BEGIN
IF 1.0 - u * u + v * v >= 0.0 THEN
StdLog.String ("1.0 - u * u - v * v = ");
StdLog.Real (1.0 - u * u - v * v);
StdLog.Ln;
StdLog.Ln;
w := Foo2 (1.0 - u * u - v * v);
StdLog.String ("Foo2 is ");
StdLog.Real (w);
StdLog.Ln;
x := 1.0 - u * u - v * v;
StdLog.String ("x outside Foo is ");
StdLog.Real (x);
StdLog.Ln;
w := Foo (u, v);
StdLog.String ("Foo is ");
StdLog.Real (w);
StdLog.Ln
ELSE
StdLog.String ("Out of bounds!");
StdLog.Ln
END
END Test;

END NumAnalysis.

NumAnalysis.Test
Josef Templ
Posts: 262
Joined: Tue Sep 17, 2013 6:50 am

Re: Numerical Analysis

Post by Josef Templ »

The expression x := 1.0 - u * u - v * v; is evaluated at run-time with 80 bits internally and the result is rounded
and assigned to a 64 bit variable.
The compiler evaluates constant REAL expressions at compile time with 64 bit IEEE floating point arithmetic
because it saves intermediate results to 64-bit variables.
This explains the difference.
In general, REAL operations should not make any assumptions about the precision.

The Help text file "Platform-Specific Issues" contains some information about that.


BTW, the first IF in Test looks buggy.

- Josef
User avatar
Robert
Posts: 177
Joined: Sat Sep 28, 2013 11:04 am
Location: Edinburgh, Scotland

Re: Numerical Analysis

Post by Robert »

lschwerdfager wrote:Accordingly, it appears that the negative Sqrt argument value originates somehow as a result of passing u and v as Foo arguments. These arguments are stored on the stack; presumably, the stack values are 64 bits wide while the actual REAL values are 80 bits wide.
Does anyone have an opinion on this line of possible explanation?
No, the explanation is, as Josef said, the difference between compile time (64-bit intermediate results) and run-time (80-bit intermediate results here) arithmetic. Argument passing is not involved.
REAL values are 64 bits.
The software stack is 64 bits.
The hardware floating point unit stack is 80 bits.
Four answers are possible: see the (hopefully) simple example below:

I have written your u & v constants as exactly the values they take internally in 64-bits.

Code: Select all

Compile time arithmetic ...
u                :  0.9925092578236596
v                :  0.1221694443563053
1 -  u^2 - v^2   :        2.25514E−017
1 - (u^2 + v^2)  :        0.00000E+000

Run time arithmetic ...
x                :  0.9925092578236596
y                :  0.1221694443563053
1 -  x^2 - y^2   :       −2.03694E−017
1 - (x^2 + y^2)  :       −2.03830E−017

High precision arithmetic ...
u            =   0.99250925 78236596 18580336 42783993 85511875 15258789 06250000 0000
v            =   0.12216944 43563053 06742527 25020778 59826385 97488403 32031250 0000
1 - u² - v²  =  −0.00000000 00000000 20359482 85512916 29686870 97978011 28353744 1162

Code: Select all

PROCEDURE  Do*;
  CONST
    fw  =  20;
    dp  =  16;
    sf  =   6;
    u   =  0.9925092578236596185803364278399385511875152587890625;
    v   =  0.1221694443563053067425272502077859826385974884033203125;
  VAR
    x, y, a, b, c, d  :  REAL;
  BEGIN
    f.SetToEnd;
    f.StrLn ('Compile time arithmetic ...');
    f.StrRealLn ('u                :', u, fw, dp);
    f.StrRealLn ('v                :', v, fw, dp);
    f.StrRealLn ('1 -  u^2 - v^2   :', 1 -  u * u - v * v,  fw, -sf);
    f.StrRealLn ('1 - (u^2 + v^2)  :', 1 - (u * u + v * v), fw, -sf);

    f.Ln;
    f.StrLn ('Run time arithmetic ...');
    x  :=  u;
    y  :=  v;
    f.StrRealLn ('x                :', x, fw, dp);
    f.StrRealLn ('y                :', y, fw, dp);
    f.StrRealLn ('1 -  x^2 - y^2   :', 1 -  x * x - y * y,  fw, -sf);
    f.StrRealLn ('1 - (x^2 + y^2)  :', 1 - (x * x + y * y), fw, -sf);
  END  Do;
Post Reply