Opened 3 years ago

Closed 3 years ago

#2854 closed defect (worksforme)

results in winGRASS7 32bit vs 64bit: should they be identical?

Reported by: hellik Owned by: grass-dev@…
Priority: normal Milestone: 7.0.5
Component: Raster Version: svn-releasebranch70
Keywords: Cc:
CPU: x86-64 Platform: MSWindows 7

Description

taken from the dev ML [1]:

>>Like this you check if the difference is very small and below the
>>acceptable threshold.
>>In C, we use GRASS_EPSILON for these comparisons.
>
> now done with GRASS_EPSILON   1.0e-15
>
> r.mapcalc expression=difftestepsilon = if( abs( accum32bit at accumtest -
> accum64bit at accumtest) <= 1.0e-15, 1 , 100 )
>
> attached a screenshot of this map
>
> accum_diff_wingrass32bitvs64bit.png
> <http://osgeo-org.1560.x6.nabble.com/file/n5243286/accum_diff_wingrass32bitvs64bit.png>

Mhh..

Can you open a ticket with all relevant bits + the difference
screenshot along with a legend and "r.univar -e" output?

tested with the NC sample data set on the same 64bit- win7 box running 32bit and 64bit winGRASS binaries.

GRASS version: 7.0.3RC1                                                         
GRASS SVN Revision: 67443                                                       
Build Date: 2015-12-31                                                          
Build Platform: i386-w64-mingw32                                                
GDAL/OGR: 1.11.3                                                                
PROJ.4: 4.9.2                                                                   
GEOS: 3.5.0                                                                     
SQLite: 3.7.17                                                                  
Python: 2.7.4                                                                   
wxPython: 2.8.12.1                                                              
Platform: Windows-7-6.1.7601-SP1 (OSGeo4W) 

g.region -p -a raster=elevation@PERMANENT align=elevation@PERMANENT             
projection: 99 (Lambert Conformal Conic)
zone:       0
datum:      nad83
ellipsoid:  a=6378137 es=0.006694380022900787
north:      228500
south:      215000
west:       630000
east:       645000
nsres:      10
ewres:      10
rows:       1350
cols:       1500
cells:      2025000

r.watershed elevation=elevation@PERMANENT accumulation=accum_wg32bit            
SECTION 1a (of 4): Initiating Memory.
SECTION 1b (of 4): Determining Offmap Flow.
SECTION 2: A* Search.
SECTION 3a: Accumulating Surface Flow with MFD.
SECTION 3b: Adjusting drainage directions.
SECTION 4: Closing Maps.
(Thu Jan 07 11:30:02 2016) Command finished (40 sec)  

r.univar -e map=accum_wg32bit@watershedtest                                     
total null and non-null cells: 2025000
total null cells: 0
Of the non-null cells:
----------------------
n: 2025000
minimum: -638533
maximum: 330838
range: 969371
mean: -262.231
mean of absolute values: 685.177
standard deviation: 13021.3
variance: 1.69554e+008
variation coefficient: -4965.58 %
sum: -531016814.625176
1st quartile: 2.03496
median (even number of cells): 4.11548
3rd quartile: 9.49
90th percentile: 34.8313
(Thu Jan 07 11:31:41 2016) Command finished (17 sec)
System Info                                                                     
GRASS version: 7.0.3RC1                                                         
GRASS SVN Revision: 67443                                                       
Build Date: 2015-12-31                                                          
Build Platform: x86_64-w64-mingw32                                              
GDAL/OGR: 1.11.3                                                                
PROJ.4: 4.9.2                                                                   
GEOS: 3.5.0                                                                     
SQLite: 3.7.17                                                                  
Python: 2.7.5                                                                   
wxPython: 2.8.12.1                                                              
Platform: Windows-7-6.1.7601-SP1 (OSGeo4W)   

g.region -p                                                                     
projection: 99 (Lambert Conformal Conic)
zone:       0
datum:      nad83
ellipsoid:  a=6378137 es=0.006694380022900787
north:      228500
south:      215000
west:       630000
east:       645000
nsres:      10
ewres:      10
rows:       1350
cols:       1500
cells:      2025000

r.watershed elevation=elevation@PERMANENT accumulation=accum_wg64bit            
SECTION 1a (of 4): Initiating Memory.
SECTION 1b (of 4): Determining Offmap Flow.
SECTION 2: A* Search.
SECTION 3a: Accumulating Surface Flow with MFD.
SECTION 3b: Adjusting drainage directions.
SECTION 4: Closing Maps.
(Thu Jan 07 11:38:49 2016) Command finished (4 sec)  

r.univar -e map=accum_wg64bit@watershedtest                                     
total null and non-null cells: 2025000
total null cells: 0
Of the non-null cells:
----------------------
n: 2025000
minimum: -638533
maximum: 330838
range: 969371
mean: -262.231
mean of absolute values: 685.177
standard deviation: 13021.3
variance: 1.69554e+008
variation coefficient: -4965.58 %
sum: -531016814.625176
1st quartile: 2.03496
median (even number of cells): 4.11548
3rd quartile: 9.49
90th percentile: 34.8313
(Thu Jan 07 11:39:36 2016) Command finished (6 sec) 
r.mapcalc expression=difftestepsilon = if( abs( accum_wg32bit@watershedtest - accum_wg64bit@watershedtest  ) <= 1.0e-15 , 1 ,100 )
r.univar -e map=difftestepsilon@watershedtest                                   
total null and non-null cells: 2025000
total null cells: 0
Of the non-null cells:
----------------------
n: 2025000
minimum: 1
maximum: 100
range: 99
mean: 32.5001
mean of absolute values: 32.5001
standard deviation: 46.1113
variance: 2126.25
variation coefficient: 141.881 %
sum: 65812680
1st quartile: 1
median (even number of cells): 1
3rd quartile: 100
90th percentile: 100

[1] https://lists.osgeo.org/pipermail/grass-dev/2016-January/078282.html

Attachments (1)

difftestepsilon.png (86.1 KB) - added by hellik 3 years ago.
diff epsilon

Download all attachments as: .zip

Change History (20)

Changed 3 years ago by hellik

Attachment: difftestepsilon.png added

diff epsilon

comment:1 Changed 3 years ago by neteler

Please also add the "r.univar -e ..." output of

r.mapcalc "difftestepsilon_values = if( abs( accum_wg32bit@watershedtest - accum_wg64bit@watershedtest  ) <= 1.0e-15 , 1 , accum_wg32bit@watershedtest - accum_wg64bit@watershedtest )"

to see the actual differences and not 100. Thanks.

comment:2 in reply to:  1 ; Changed 3 years ago by hellik

Replying to neteler:

Please also add the "r.univar -e ..." output of

r.mapcalc "difftestepsilon_values = if( abs( accum_wg32bit@watershedtest - accum_wg64bit@watershedtest  ) <= 1.0e-15 , 1 , accum_wg32bit@watershedtest - accum_wg64bit@watershedtest )"

to see the actual differences and not 100. Thanks.

r.mapcalc expression=difftestepsilon_values = if(abs( accum_wg32bit@watershedtest - accum_wg64bit@watershedtest  ) <= 1.0e-15, 1 , accum_wg32bit@watershedtest - accum_wg64bit@watershedtest )
r.univar -e map=difftestepsilon_values@watershedtest                            
total null and non-null cells: 2025000
total null cells: 0
Of the non-null cells:
----------------------
n: 2025000
minimum: -8.14907e-010
maximum: 1
range: 1
mean: 0.681817
mean of absolute values: 0.681817
standard deviation: 0.465771
variance: 0.216942
variation coefficient: 68.3131 %
sum: 1380679.99999932
1st quartile: 3.55271e-015
median (even number of cells): 1
3rd quartile: 1
90th percentile: 1

comment:3 in reply to:  2 ; Changed 3 years ago by hellik

Replying to hellik:

r.mapcalc expression=difftestepsilon_values = if(abs( accum_wg32bit@watershedtest - accum_wg64bit@watershedtest  ) <= 1.0e-15, 1 , accum_wg32bit@watershedtest - accum_wg64bit@watershedtest )
r.mapcalc expression=difftestepsilon_values2 = if(abs( accum_wg32bit@watershedtest - accum_wg64bit@watershedtest  ) <= 1.0e-15, null()  , accum_wg32bit@watershedtest - accum_wg64bit@watershedtest )
r.univar -e map=difftestepsilon_values@watershedtest,difftestepsilon_values2@watershedtest
total null and non-null cells: 4050000
total null cells: 1380680
Of the non-null cells:
----------------------
n: 2669320
minimum: -8.14907e-010
maximum: 1
range: 1
mean: 0.51724
mean of absolute values: 0.51724
standard deviation: 0.499703
variance: 0.249703
variation coefficient: 96.6094 %
sum: 1380679.99999868
1st quartile: -1.77636e-015
median (even number of cells): 1
3rd quartile: 1
90th percentile: 1

comment:4 in reply to:  3 ; Changed 3 years ago by hellik

Replying to hellik:

Replying to hellik:

r.mapcalc expression=difftestepsilon_values = if(abs( accum_wg32bit@watershedtest - accum_wg64bit@watershedtest  ) <= 1.0e-15, 1 , accum_wg32bit@watershedtest - accum_wg64bit@watershedtest )
r.mapcalc expression=difftestepsilon_values2 = if(abs( accum_wg32bit@watershedtest - accum_wg64bit@watershedtest  ) <= 1.0e-15, null()  , accum_wg32bit@watershedtest - accum_wg64bit@watershedtest )
r.univar -e map=difftestepsilon_values2@watershedtest                           
total null and non-null cells: 2025000
total null cells: 1380680
Of the non-null cells:
----------------------
n: 644320
minimum: -8.14907e-010
maximum: 4.65661e-010
range: 1.28057e-009
mean: -1.12517e-012
mean of absolute values: 1.73591e-012
standard deviation: 0
variance: 0
variation coefficient: -0 %
sum: -7.24971475962732e-007
1st quartile: -3.55271e-015
median (even number of cells): -1.77636e-015
3rd quartile: 3.55271e-015
90th percentile: 1.77636e-014

comment:5 in reply to:  1 ; Changed 3 years ago by hellik

Replying to neteler:

Please also add the "r.univar -e ..." output of

r.mapcalc "difftestepsilon_values = if( abs( accum_wg32bit@watershedtest - accum_wg64bit@watershedtest  ) <= 1.0e-15 , 1 , accum_wg32bit@watershedtest - accum_wg64bit@watershedtest )"

to see the actual differences and not 100. Thanks.

tested with r.slope.aspect

r.slope.aspect elevation=elevation@PERMANENT slope=slope_wg32bit aspect=aspect_wg32bit
r.slope.aspect elevation=elevation@PERMANENT slope=slope_wg64bit aspect=aspect_wg64bit
r.mapcalc expression=diff_slope = slope_wg32bit@slope_test - slope_wg64bit@slope_test

r.univar -e map=diff_slope@slope_test                                           
total null and non-null cells: 2025000
total null cells: 5696
Of the non-null cells:
----------------------
n: 2019304
minimum: 0
maximum: 0
range: 0
mean: 0
mean of absolute values: 0
standard deviation: 0
variance: 0
variation coefficient: nan %
sum: 0
1st quartile: 0
median (even number of cells): 0
3rd quartile: 0
90th percentile: 0

and also tested with r.topidx

r.topidx input=elevation@PERMANENT output=topidx_wg32bit
r.topidx input=elevation@PERMANENT output=topidx_wg64bit
r.mapcalc expression=diff_topidx_epsilon = if( abs( topidx_wg32bit@test_topidx - topidx_wg64bit@test_topidx ) <= 1.0e-15 , null()  , topidx_wg32bit@test_topidx - topidx_wg64bit@test_topidx )
r.mapcalc expression=diff_topidx_epsilon = if( abs( topidx_wg32bit@test_topidx - topidx_wg64bit@test_topidx ) <= 1.0e-15 , null()  , topidx_wg32bit@test_topidx - topidx_wg64bit@test_topidx )

r.univar -e map=diff_topidx_epsilon@test_topidx                                 
total null and non-null cells: 2025000
total null cells: 1920831
Of the non-null cells:
----------------------
n: 104169
minimum: -3.55271e-015
maximum: 3.55271e-015
range: 7.10543e-015
mean: -1.67607e-015
mean of absolute values: 1.7849e-015
standard deviation: 0
variance: 0
variation coefficient: -0 %
sum: -1.74594561030972e-010
1st quartile: -1.77636e-015
median (odd number of cells): -1.77636e-015
3rd quartile: -1.77636e-015
90th percentile: -1.77636e-015

comment:6 in reply to:  5 Changed 3 years ago by hellik

Replying to hellik:

tested with r.slope.aspect

r.slope.aspect elevation=elevation@PERMANENT slope=slope_wg32bit aspect=aspect_wg32bit
r.slope.aspect elevation=elevation@PERMANENT slope=slope_wg64bit aspect=aspect_wg64bit
r.mapcalc expression=diff_slope = slope_wg32bit@slope_test - slope_wg64bit@slope_test

r.univar -e map=diff_slope@slope_test                                           
total null and non-null cells: 2025000
total null cells: 5696
Of the non-null cells:
----------------------
n: 2019304
minimum: 0
maximum: 0
range: 0
mean: 0
mean of absolute values: 0
standard deviation: 0
variance: 0
variation coefficient: nan %
sum: 0
1st quartile: 0
median (even number of cells): 0
3rd quartile: 0
90th percentile: 0
r.mapcalc expression=diff_aspect = aspect_wg32bit@slope_test - aspect_wg64bit@slope_test

r.univar -e map=diff_aspect@slope_test                                          
total null and non-null cells: 2025000
total null cells: 5696
Of the non-null cells:
----------------------
n: 2019304
minimum: 0
maximum: 0
range: 0
mean: 0
mean of absolute values: 0
standard deviation: 0
variance: 0
variation coefficient: nan %
sum: 0
1st quartile: 0
median (even number of cells): 0
3rd quartile: 0
90th percentile: 0

comment:7 in reply to:  4 ; Changed 3 years ago by neteler

Replying to hellik: ...

> minimum: -8.14907e-010
> maximum: 4.65661e-010
> range: 1.28057e-009
> mean: -1.12517e-012
> mean of absolute values: 1.73591e-012
> standard deviation: 0
> variance: 0
> variation coefficient: -0 %

Doesn't this qualify for "basically identical results as much as different architectures can achieve"?

Some info pages (for who wants to learn more):

I'm not a CPU precision guru but to me it looks ok. Others?

comment:8 in reply to:  7 Changed 3 years ago by hellik

Replying to neteler:

Replying to hellik: ...

> minimum: -8.14907e-010
> maximum: 4.65661e-010
> range: 1.28057e-009
> mean: -1.12517e-012
> mean of absolute values: 1.73591e-012
> standard deviation: 0
> variance: 0
> variation coefficient: -0 %

Doesn't this qualify for "basically identical results as much as different architectures can achieve"?

Some info pages (for who wants to learn more):

I'm not a CPU precision guru but to me it looks ok. Others?

I'm also no CPU guru. :-)

all these tests with 32bit and 64 bit binaries are done on the same machine. I find it interesting that some calculations are identical between 32 bit and 64 bit (e.g. Slope and aspect), others are not (e.g. accumulation by r.watershed or r.Topic).

though, the diff values are very low and results should be treated as identical?

Maybe adding some notes in the manual (if not already there?)

comment:9 in reply to:  7 ; Changed 3 years ago by glynn

Replying to neteler:

Doesn't this qualify for "basically identical results as much as different architectures can achieve"?

In theory, it should be possible to obtain identical results on any architecture using IEEE-754 floating-point for calculations using only +, -, *, / and sqrt() (no transcendental functions such as sin, cos, log, exp, etc). But that may have a noticeable performance cost (e.g. using -ffloat-store to discard excess precision, forgoing certain optimisations, etc).

Regarding the use of an "epsilon":

  1. The accuracy of a floating-point value is proportional to its magnitude. For IEEE-754, FLT_EPSILON is ~1.192e-7 while DBL_EPSILON is ~2.22e-16. This represents the difference between 1.0 and the smallest distinct, representable value greater than 1.0.

Consequently, it provides a measure of the worst-case rounding error for values between 1.0 (inclusive) and 2.0 (exclusive). For round-to-nearest, the worst-case rounding error will be plus or minus half that value. For larger or smaller values, the maximum rounding error will be scaled accordingly. I.e. a typical "tolerant comparison" would be along the lines of

    if (fabs(a-b) < N * epsilon * max(fabs(a),fabs(b))) ...
  1. Rounding error occurs at each stage in the computation. The expected absolute error for a complex calculation will be at least the sum of the absolute errors for the individual steps.
  1. When subtractions are involved, you can't rely upon the absolute error in the result being proportional to the magnitude of the result, as subtracting similar values will produce a result whose magnitude may be much smaller than either of the individual values, but whose absolute error is proportional to that of the individual values.

(A concrete example of that last point is that a single-pass variance calculation using sum-of-squares minus sum-squared can yield a negative value for the variance if the deviations are small relative to the mean).

  1. Other common cases where the accuracy of the result can be far worse than that of intermediate results or the inputs include finding roots of high-degree polynomials (an extreme example) and acos(x) where x is close to 1.0 (as x tends to 1.0, the derivative of acos(x) tends to infinity while acos(x) itself tends to zero; the first magnifies the absolute error while the second further magnifies the relative error).

comment:10 Changed 3 years ago by neteler

Milestone: 7.0.3

Ticket retargeted after milestone closed

comment:11 Changed 3 years ago by neteler

Milestone: 7.0.4

Ticket retargeted after 7.0.3 milestone closed

comment:12 in reply to:  9 ; Changed 3 years ago by mmetz

Replying to glynn:

Replying to neteler:

Doesn't this qualify for "basically identical results as much as different architectures can achieve"?

In theory, it should be possible to obtain identical results on any architecture using IEEE-754 floating-point for calculations using only +, -, *, / and sqrt() (no transcendental functions such as sin, cos, log, exp, etc). But that may have a noticeable performance cost (e.g. using -ffloat-store to discard excess precision, forgoing certain optimisations, etc).

According to the GCC compiler options used for WinGRASS, WinGRASS does not necessarily adhere to IEEE-754 standards, see https://gcc.gnu.org/wiki/FAQ#PR323

The results are thus as precise as requested per GCC compiler options for the given architecture. The same GCC compiler options can lead to different results under different architectures.

Generally, with limited fp precision,

(a * b) / c != a * (b / c)

I tested this with r.watershed and got differences in the range also observed between 32bit and 64bit architectures. The differences between architectures are thus within the expected range of floating-point results. Identical results for different architectures could maybe obtained with gcc -fexcess-precision=standard -std=c99.

In other words, the results obtained with WinGRASS 32bit and WinGRASS 64bit are both correct, even if they differ from each other. The results are as precise as possible / as requested.

comment:13 in reply to:  12 Changed 3 years ago by hellik

Replying to mmetz:

Replying to glynn:

Replying to neteler:

Doesn't this qualify for "basically identical results as much as different architectures can achieve"?

In theory, it should be possible to obtain identical results on any architecture using IEEE-754 floating-point for calculations using only +, -, *, / and sqrt() (no transcendental functions such as sin, cos, log, exp, etc). But that may have a noticeable performance cost (e.g. using -ffloat-store to discard excess precision, forgoing certain optimisations, etc).

According to the GCC compiler options used for WinGRASS, WinGRASS does not necessarily adhere to IEEE-754 standards, see https://gcc.gnu.org/wiki/FAQ#PR323

The results are thus as precise as requested per GCC compiler options for the given architecture. The same GCC compiler options can lead to different results under different architectures.

Generally, with limited fp precision,

(a * b) / c != a * (b / c)

I tested this with r.watershed and got differences in the range also observed between 32bit and 64bit architectures. The differences between architectures are thus within the expected range of floating-point results. Identical results for different architectures could maybe obtained with gcc -fexcess-precision=standard -std=c99.

In other words, the results obtained with WinGRASS 32bit and WinGRASS 64bit are both correct, even if they differ from each other. The results are as precise as possible / as requested.

Identical results for different architectures could maybe

obtained with gcc -fexcess-precision=standard -std=c99.

should this be tested?

comment:14 in reply to:  12 ; Changed 3 years ago by glynn

Replying to mmetz:

The results are thus as precise as requested per GCC compiler options for the given architecture. The same GCC compiler options can lead to different results under different architectures.

There should be some set of options which produce the correct result on any architecture which uses IEEE-754.

Generally, with limited fp precision,

(a * b) / c != a * (b / c)

This shouldn't be relevant. The source code won't change between architectures, and gcc shouldn't perform such approximations unless specifically requested via -ffast-math or similar (apparently -Ofast enables this, but none of the numbered optimisation levels do).

The main issue which occurs with the default options is excess precision. The x87 FPU uses 80 bits by default; you need to use e.g. -ffloat-store or -fexcess-precision=standard to force the result to be rounded to 64 bits. SSE always rounds to 64 bits.

In other words, the results obtained with WinGRASS 32bit and WinGRASS 64bit are both correct, even if they differ from each other. The results are as precise as possible / as requested.

The issue is how you test for correctness. You can either build for deterministic calculation so that exact equality comparisons will work, or you can use a tolerance. The latter option requires case-by-case analysis (see point 4 in comment:9).

comment:15 in reply to:  14 ; Changed 3 years ago by hellik

Replying to glynn:

Replying to mmetz:

The results are thus as precise as requested per GCC compiler options for the given architecture. The same GCC compiler options can lead to different results under different architectures.

There should be some set of options which produce the correct result on any architecture which uses IEEE-754.

Generally, with limited fp precision,

(a * b) / c != a * (b / c)

This shouldn't be relevant. The source code won't change between architectures, and gcc shouldn't perform such approximations unless specifically requested via -ffast-math or similar (apparently -Ofast enables this, but none of the numbered optimisation levels do).

The main issue which occurs with the default options is excess precision. The x87 FPU uses 80 bits by default; you need to use e.g. -ffloat-store or -fexcess-precision=standard to force the result to be rounded to 64 bits. SSE always rounds to 64 bits.

In other words, the results obtained with WinGRASS 32bit and WinGRASS 64bit are both correct, even if they differ from each other. The results are as precise as possible / as requested.

The issue is how you test for correctness. You can either build for deterministic calculation so that exact equality comparisons will work, or you can use a tolerance. The latter option requires case-by-case analysis (see point 4 in comment:9).

some tests shows that some examples in manuals (modules and addons) seems not to work due to this issue in winGRASS 7 64-it.

what what be a usable solution?

comment:16 in reply to:  15 Changed 3 years ago by glynn

Replying to hellik:

some tests shows that some examples in manuals (modules and addons) seems not to work due to this issue in winGRASS 7 64-it.

what what be a usable solution?

Fix the examples.

There are reasons why building for deterministic floating point might be worth the performance cost, but examples relying upon 80-bit intermediate precision isn't such a reason, IMHO.

Can you identify the examples in question? Is it just a case of performing approximate-equality comparisons with too low a tolerance?

comment:17 in reply to:  15 Changed 3 years ago by wenzeslaus

Replying to hellik:

some tests shows that some examples in manuals (modules and addons) seems not to work due to this issue in winGRASS 7 64-it.

what what be a usable solution?

For the actual tests as in testsuite, use assertRastersNoDifference with precision parameter for comparing the resulting raster to the reference. Also functions like assertRasterFitsInfo have precision parameter.

comment:18 Changed 3 years ago by martinl

Milestone: 7.0.47.0.5

comment:19 Changed 3 years ago by martinl

Resolution: worksforme
Status: newclosed

Closing, feel free to reopen if needed.

Note: See TracTickets for help on using tickets.