Opened 11 years ago

Closed 10 years ago

#2053 closed defect (fixed)

r.recode: 0.0 or 1.0 as limits do not seem to be taken into account

Reported by: Nikos Alexandris Owned by: grass-dev@…
Priority: major Milestone: 7.0.0
Component: Raster Version: unspecified
Keywords: r.recode, DCELL, CELL, rules, needinfo Cc:
CPU: Unspecified Platform: Unspecified

Description

Attempting to recode a double precision raster map to an integer by either using a "rules" file or directly using "...<< EOF" with the following sequence "0.0:1.0:0:255" does not work. Both stats and histogram of the recoded raster map, e.g. a recoded image derived from a Red-Reflectance image ranging in

min=0
max=0.774115699104528

are flattened out! The only values to be found in the recoded image are 0 and 255. However, using a minimum "from=" value of "0.001" produces the expected recoded image, i.e. using a rules file that contains

0.001:1.0:0:255

produces a recoded image which is composed by many integer values, i.e. ranging from 5 up to 197. The histogram of this recoded image looks "nice" as well.

See also: http://lists.osgeo.org/pipermail/grass-dev/2013-August/065280.html. Tested in GRASS 6.4.4svn, GRASS 7.0.svn revision=57312

Attachments (1)

reclass.diff (1.9 KB ) - added by annakrat 10 years ago.
lib raster rounding changes

Download all attachments as: .zip

Change History (19)

comment:1 by mlennert, 11 years ago

Summary: r.recode is buggy when minimum "from=0.0"r.recode: 0.0 or 1.0 as limits do not seem to be taken into account

I'm having a similar (probably related) issue:

Given a raster map with a classic normalized difference index ranging theoretically from -1 to 1. The exact values are:

> r.univar mymap
minimum: -0.971429
maximum: 0.766129

When I recode using the following rules:

-1.0:0:0.0
0:1.0:1.0

I get

> r.stats -c mybinarymap
0 5804
1 9487108

Changing the rules to

-0.99:0.0:0
0.0:1.0:1

or

-1.0:0.0:0
0.0:0.99:1

I get

> r.stats -c mybinarymap
0 5626374
1 3866538

Moritz

comment:2 by annakrat, 10 years ago

I've just run into this problem too. For some reason, r.recode tries to guess the input type from the recode rules which seems to be nonsense and leads to the errors above. -1.00 is considered as -1 and it reads the values from the input map as integers although the map can be DCELL. A simple change like this (they might be some more elegant way) seems to do what everyone expects:

Index: read_rules.c
===================================================================
--- read_rules.c	(revision 61256)
+++ read_rules.c	(working copy)
@@ -50,7 +50,7 @@
     DCELL oLow, oHigh, nLow, nHigh;
     int line, n;
 
-    in_type = out_type = CELL_TYPE;
+    in_type = Rast_map_type(name, "");
 
     rules = (char **)G_malloc(INCR * sizeof(char *));
     rule_size = INCR;
@@ -98,16 +98,12 @@
 	   lookup the values in the quant table */
 	switch (sscanf(buf, "%lf:%lf:%lf:%lf", &oLow, &oHigh, &nLow, &nHigh)) {
 	case 3:
-	    update_type(&in_type, oLow);
-	    update_type(&in_type, oHigh);
 	    update_type(&out_type, nLow);
 	    update_rules(buf);
 	    Rast_fpreclass_add_rule(&rcl_struct, oLow, oHigh, nLow, nLow);
 	    break;
 
 	case 4:
-	    update_type(&in_type, oLow);
-	    update_type(&in_type, oHigh);
 	    update_type(&out_type, nLow);
 	    update_type(&out_type, nHigh);
 	    update_rules(buf);

It could be even enough if r.recode reads the map values as doubles always and takes care just of the output map type.

comment:3 by annakrat, 10 years ago

BTW, using -d (force DCELL) flag I get correct result, however the resulting map is DCELL. This flag should affect just the output map type.

comment:4 by annakrat, 10 years ago

Keywords: needinfo added
Priority: normalmajor

Any opinion on the suggested change? This is in my view quite serious bug.

in reply to:  4 ; comment:5 by glynn, 10 years ago

Replying to annakrat:

Any opinion on the suggested change? This is in my view quite serious bug.

If it's going to deduce the type from the rules, I suggest checking for a decimal point prior to parsing rather than checking whether the parsed value is exactly representable as an integer. IOW, "0.0:1.0:0:255" will perform a float-int recode while "0:1:0:255" will perform a int-int conversion.

I wouldn't suggest using the input map's type for the input type unless this will produce identical results.

in reply to:  5 ; comment:6 by wenzeslaus, 10 years ago

Replying to glynn:

Replying to annakrat:

Any opinion on the suggested change? This is in my view quite serious bug.

If it's going to deduce the type from the rules, I suggest checking for a decimal point prior to parsing rather than checking whether the parsed value is exactly representable as an integer. IOW, "0.0:1.0:0:255" will perform a float-int recode while "0:1:0:255" will perform a int-int conversion.

I wouldn't suggest using the input map's type for the input type unless this will produce identical results.

I think that what is expected to determine the type of input is the input map. The output should be determined by the rules, since it seems as a common practice - the numbers giving the type. This does not apply the to input map because it already has a type. r.mapcalc has the same behavior unless you do explicit retype (using int() but using number format is not explicit retype in my opinion).

In other words, I consider the current behavior wrong and determining the input type from decimal point does not improve the situation. If rules are 0:1:0:255 and the input is double, the result will be always 0 for the whole map. This is not what I expect. For 0:10:0:255 and input again double from 0 to 10, the output will not look completely wrong as in previous case which is perhaps even more dangerous because I just don't expect the numbers from input map to by trimmed to integers before the computation (although I expect the integer on the output).

I also don't see any case where it would be advantageous to trim the numbers before the recoding itself.

What is your reason for trimming the numbers on input ("to produce identical results")?

in reply to:  6 comment:7 by mlennert, 10 years ago

Replying to wenzeslaus:

Replying to glynn:

Replying to annakrat:

Any opinion on the suggested change? This is in my view quite serious bug.

If it's going to deduce the type from the rules, I suggest checking for a decimal point prior to parsing rather than checking whether the parsed value is exactly representable as an integer. IOW, "0.0:1.0:0:255" will perform a float-int recode while "0:1:0:255" will perform a int-int conversion.

I wouldn't suggest using the input map's type for the input type unless this will produce identical results.

I think that what is expected to determine the type of input is the input map.

Glynn, I imagine your suggestion is meant to leave more flexibility ? I can see a point in that (and thus including an "explicit retype" in r.recode, to use Václav's words).

However, I also agree with Václav that user expectations would be that if the input map is floating point, a rule of -1:1:0:255 would take into account all input values, not only -1, 0, and 1.

So, if we follow Glynn's suggestion, we should at least include a big fat warning message indicating what the module is doing.

Moritz

in reply to:  5 ; comment:8 by glynn, 10 years ago

Replying to glynn:

I wouldn't suggest using the input map's type for the input type unless this will produce identical results.

From examining lib/raster/fpreclass.c, it appears that they will produce identical results (int and float values are simply converted to double), so this is a non-issue.

However, this might be a good time to re-examine how floating-point values are converted to integers both in fpreclass.c and put_row.c.

Both cases simply use a C cast, which is defined as discarding the fractional part, i.e. rounding toward zero (inwards). The GRASS quantisation functions are only used when reading floating-point maps as integer, not when writing.

This is one of the least useful rounding modes, due to its anti-symmetric behaviour (i.e. positive numbers are rounded down, negative numbers are rounded up). Rounding to nearest integer or rounding toward negative infinity (i.e. floor()) are usually preferred.

Another issue with a C cast is that if the result is outside the range of an integer, the behaviour is undefined. Ideally such values should be either converted to null or clamped to the most positive/negative representable value.

in reply to:  8 ; comment:9 by annakrat, 10 years ago

Replying to glynn:

Replying to glynn:

I wouldn't suggest using the input map's type for the input type unless this will produce identical results.

From examining lib/raster/fpreclass.c, it appears that they will produce identical results (int and float values are simply converted to double), so this is a non-issue.

So, if we decide to go with a simple fix, using input map type (as suggested in the code snippet above) would be a solution, right?

However, this might be a good time to re-examine how floating-point values are converted to integers both in fpreclass.c and put_row.c.

Both cases simply use a C cast, which is defined as discarding the fractional part, i.e. rounding toward zero (inwards). The GRASS quantisation functions are only used when reading floating-point maps as integer, not when writing.

This is one of the least useful rounding modes, due to its anti-symmetric behaviour (i.e. positive numbers are rounded down, negative numbers are rounded up). Rounding to nearest integer or rounding toward negative infinity (i.e. floor()) are usually preferred.

Floor sounds definitely better. How invasive and difficult would be this 're-examining'?

in reply to:  9 ; comment:10 by glynn, 10 years ago

Replying to annakrat:

So, if we decide to go with a simple fix, using input map type (as suggested in the code snippet above) would be a solution, right?

Right. Forcing the input type to DCELL would also work.

Floor sounds definitely better. How invasive and difficult would be this 're-examining'?

It's just a matter of replacing casts such as "(CELL)x" (as well as implicit conversions) with "(CELL)floor(x)" (or "(CELL)round(x), or "(CELL)floor(x+0.5)" as round() is C99).

For fpreclass.c, this applies to Rast_fpreclass_perform_di (L599), Rast_fpreclass_perform_fi (L641), and Rast_fpreclass_perform_ii (L683).

For put_row.c, this applies to convert_and_write_fi (L636) and convert_and_write_di (L652).

in reply to:  10 comment:11 by annakrat, 10 years ago

Replying to glynn:

Replying to annakrat:

So, if we decide to go with a simple fix, using input map type (as suggested in the code snippet above) would be a solution, right?

Right. Forcing the input type to DCELL would also work.

OK, done in r62518.

by annakrat, 10 years ago

Attachment: reclass.diff added

lib raster rounding changes

in reply to:  10 ; comment:12 by annakrat, 10 years ago

Replying to glynn:

Replying to annakrat:

So, if we decide to go with a simple fix, using input map type (as suggested in the code snippet above) would be a solution, right?

Right. Forcing the input type to DCELL would also work.

Floor sounds definitely better. How invasive and difficult would be this 're-examining'?

It's just a matter of replacing casts such as "(CELL)x" (as well as implicit conversions) with "(CELL)floor(x)" (or "(CELL)round(x), or "(CELL)floor(x+0.5)" as round() is C99).

For fpreclass.c, this applies to Rast_fpreclass_perform_di (L599), Rast_fpreclass_perform_fi (L641), and Rast_fpreclass_perform_ii (L683).

For put_row.c, this applies to convert_and_write_fi (L636) and convert_and_write_di (L652).

I attached the diff for this, I decided for floor(x+0.5) logic (but we could use just floor if someone is against this). I would rather get some approval from other developers for this.

Otherwise this ticket should be solved once r62518 is backported.

in reply to:  12 ; comment:13 by glynn, 10 years ago

Replying to annakrat:

I attached the diff for this, I decided for floor(x+0.5) logic (but we could use just floor if someone is against this). I would rather get some approval from other developers for this.

The advantage of floor() is that it will produce identical results to the existing implementation for positive values.

The main disadvantage of floor() is that minuscule rounding errors can result in the wrong value, e.g. (CELL)floor(1.9999999) is 1 rather than 2.

in reply to:  13 comment:14 by annakrat, 10 years ago

Replying to glynn:

Replying to annakrat:

I attached the diff for this, I decided for floor(x+0.5) logic (but we could use just floor if someone is against this). I would rather get some approval from other developers for this.

The advantage of floor() is that it will produce identical results to the existing implementation for positive values.

The main disadvantage of floor() is that minuscule rounding errors can result in the wrong value, e.g. (CELL)floor(1.9999999) is 1 rather than 2.

Thanks for the summary, but I was mainly asking for opinions which one should we use. I prefer rounding to the nearest integer, even though it can give different results. Anyone has the same/different opinion?

in reply to:  12 ; comment:15 by annakrat, 10 years ago

Replying to annakrat:

Otherwise this ticket should be solved once r62518 is backported.

Backported in r62557.

in reply to:  15 comment:16 by lucadelu, 10 years ago

Replying to annakrat:

Replying to annakrat:

Otherwise this ticket should be solved once r62518 is backported.

Backported in r62557.

Could we close this ticket?

comment:17 by Nikos Alexandris, 10 years ago

Works for me.

comment:18 by mlennert, 10 years ago

Resolution: fixed
Status: newclosed

Works for me as well, so closing.

Moritz

Note: See TracTickets for help on using tickets.