Opened 11 years ago
Closed 10 years ago
#2053 closed defect (fixed)
r.recode: 0.0 or 1.0 as limits do not seem to be taken into account
Reported by: | Nikos Alexandris | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 7.0.0 |
Component: | Raster | Version: | unspecified |
Keywords: | r.recode, DCELL, CELL, rules, needinfo | Cc: | |
CPU: | Unspecified | Platform: | Unspecified |
Description
Attempting to recode a double precision raster map to an integer by either using a "rules" file or directly using "...<< EOF" with the following sequence "0.0:1.0:0:255" does not work. Both stats and histogram of the recoded raster map, e.g. a recoded image derived from a Red-Reflectance image ranging in
min=0 max=0.774115699104528
are flattened out! The only values to be found in the recoded image are 0 and 255. However, using a minimum "from=" value of "0.001" produces the expected recoded image, i.e. using a rules file that contains
0.001:1.0:0:255
produces a recoded image which is composed by many integer values, i.e. ranging from 5 up to 197. The histogram of this recoded image looks "nice" as well.
See also: http://lists.osgeo.org/pipermail/grass-dev/2013-August/065280.html. Tested in GRASS 6.4.4svn, GRASS 7.0.svn revision=57312
Attachments (1)
Change History (19)
comment:1 by , 11 years ago
Summary: | r.recode is buggy when minimum "from=0.0" → r.recode: 0.0 or 1.0 as limits do not seem to be taken into account |
---|
comment:2 by , 11 years ago
I've just run into this problem too. For some reason, r.recode tries to guess the input type from the recode rules which seems to be nonsense and leads to the errors above. -1.00 is considered as -1 and it reads the values from the input map as integers although the map can be DCELL. A simple change like this (they might be some more elegant way) seems to do what everyone expects:
Index: read_rules.c =================================================================== --- read_rules.c (revision 61256) +++ read_rules.c (working copy) @@ -50,7 +50,7 @@ DCELL oLow, oHigh, nLow, nHigh; int line, n; - in_type = out_type = CELL_TYPE; + in_type = Rast_map_type(name, ""); rules = (char **)G_malloc(INCR * sizeof(char *)); rule_size = INCR; @@ -98,16 +98,12 @@ lookup the values in the quant table */ switch (sscanf(buf, "%lf:%lf:%lf:%lf", &oLow, &oHigh, &nLow, &nHigh)) { case 3: - update_type(&in_type, oLow); - update_type(&in_type, oHigh); update_type(&out_type, nLow); update_rules(buf); Rast_fpreclass_add_rule(&rcl_struct, oLow, oHigh, nLow, nLow); break; case 4: - update_type(&in_type, oLow); - update_type(&in_type, oHigh); update_type(&out_type, nLow); update_type(&out_type, nHigh); update_rules(buf);
It could be even enough if r.recode reads the map values as doubles always and takes care just of the output map type.
comment:3 by , 11 years ago
BTW, using -d (force DCELL) flag I get correct result, however the resulting map is DCELL. This flag should affect just the output map type.
follow-up: 5 comment:4 by , 10 years ago
Keywords: | needinfo added |
---|---|
Priority: | normal → major |
Any opinion on the suggested change? This is in my view quite serious bug.
follow-ups: 6 8 comment:5 by , 10 years ago
Replying to annakrat:
Any opinion on the suggested change? This is in my view quite serious bug.
If it's going to deduce the type from the rules, I suggest checking for a decimal point prior to parsing rather than checking whether the parsed value is exactly representable as an integer. IOW, "0.0:1.0:0:255" will perform a float-int recode while "0:1:0:255" will perform a int-int conversion.
I wouldn't suggest using the input map's type for the input type unless this will produce identical results.
follow-up: 7 comment:6 by , 10 years ago
Replying to glynn:
Replying to annakrat:
Any opinion on the suggested change? This is in my view quite serious bug.
If it's going to deduce the type from the rules, I suggest checking for a decimal point prior to parsing rather than checking whether the parsed value is exactly representable as an integer. IOW, "0.0:1.0:0:255" will perform a float-int recode while "0:1:0:255" will perform a int-int conversion.
I wouldn't suggest using the input map's type for the input type unless this will produce identical results.
I think that what is expected to determine the type of input is the input map. The output should be determined by the rules, since it seems as a common practice - the numbers giving the type. This does not apply the to input map because it already has a type. r.mapcalc
has the same behavior unless you do explicit retype (using int()
but using number format is not explicit retype in my opinion).
In other words, I consider the current behavior wrong and determining the input type from decimal point does not improve the situation. If rules are 0:1:0:255
and the input is double, the result will be always 0 for the whole map. This is not what I expect. For 0:10:0:255
and input again double from 0 to 10, the output will not look completely wrong as in previous case which is perhaps even more dangerous because I just don't expect the numbers from input map to by trimmed to integers before the computation (although I expect the integer on the output).
I also don't see any case where it would be advantageous to trim the numbers before the recoding itself.
What is your reason for trimming the numbers on input ("to produce identical results")?
comment:7 by , 10 years ago
Replying to wenzeslaus:
Replying to glynn:
Replying to annakrat:
Any opinion on the suggested change? This is in my view quite serious bug.
If it's going to deduce the type from the rules, I suggest checking for a decimal point prior to parsing rather than checking whether the parsed value is exactly representable as an integer. IOW, "0.0:1.0:0:255" will perform a float-int recode while "0:1:0:255" will perform a int-int conversion.
I wouldn't suggest using the input map's type for the input type unless this will produce identical results.
I think that what is expected to determine the type of input is the input map.
Glynn, I imagine your suggestion is meant to leave more flexibility ? I can see a point in that (and thus including an "explicit retype" in r.recode, to use Václav's words).
However, I also agree with Václav that user expectations would be that if the input map is floating point, a rule of -1:1:0:255 would take into account all input values, not only -1, 0, and 1.
So, if we follow Glynn's suggestion, we should at least include a big fat warning message indicating what the module is doing.
Moritz
follow-up: 9 comment:8 by , 10 years ago
Replying to glynn:
I wouldn't suggest using the input map's type for the input type unless this will produce identical results.
From examining lib/raster/fpreclass.c, it appears that they will produce identical results (int and float values are simply converted to double), so this is a non-issue.
However, this might be a good time to re-examine how floating-point values are converted to integers both in fpreclass.c and put_row.c.
Both cases simply use a C cast, which is defined as discarding the fractional part, i.e. rounding toward zero (inwards). The GRASS quantisation functions are only used when reading floating-point maps as integer, not when writing.
This is one of the least useful rounding modes, due to its anti-symmetric behaviour (i.e. positive numbers are rounded down, negative numbers are rounded up). Rounding to nearest integer or rounding toward negative infinity (i.e. floor()) are usually preferred.
Another issue with a C cast is that if the result is outside the range of an integer, the behaviour is undefined. Ideally such values should be either converted to null or clamped to the most positive/negative representable value.
follow-up: 10 comment:9 by , 10 years ago
Replying to glynn:
Replying to glynn:
I wouldn't suggest using the input map's type for the input type unless this will produce identical results.
From examining lib/raster/fpreclass.c, it appears that they will produce identical results (int and float values are simply converted to double), so this is a non-issue.
So, if we decide to go with a simple fix, using input map type (as suggested in the code snippet above) would be a solution, right?
However, this might be a good time to re-examine how floating-point values are converted to integers both in fpreclass.c and put_row.c.
Both cases simply use a C cast, which is defined as discarding the fractional part, i.e. rounding toward zero (inwards). The GRASS quantisation functions are only used when reading floating-point maps as integer, not when writing.
This is one of the least useful rounding modes, due to its anti-symmetric behaviour (i.e. positive numbers are rounded down, negative numbers are rounded up). Rounding to nearest integer or rounding toward negative infinity (i.e. floor()) are usually preferred.
Floor sounds definitely better. How invasive and difficult would be this 're-examining'?
follow-ups: 11 12 comment:10 by , 10 years ago
Replying to annakrat:
So, if we decide to go with a simple fix, using input map type (as suggested in the code snippet above) would be a solution, right?
Right. Forcing the input type to DCELL would also work.
Floor sounds definitely better. How invasive and difficult would be this 're-examining'?
It's just a matter of replacing casts such as "(CELL)x" (as well as implicit conversions) with "(CELL)floor(x)" (or "(CELL)round(x), or "(CELL)floor(x+0.5)" as round() is C99).
For fpreclass.c, this applies to Rast_fpreclass_perform_di (L599), Rast_fpreclass_perform_fi (L641), and Rast_fpreclass_perform_ii (L683).
For put_row.c, this applies to convert_and_write_fi (L636) and convert_and_write_di (L652).
comment:11 by , 10 years ago
follow-ups: 13 15 comment:12 by , 10 years ago
Replying to glynn:
Replying to annakrat:
So, if we decide to go with a simple fix, using input map type (as suggested in the code snippet above) would be a solution, right?
Right. Forcing the input type to DCELL would also work.
Floor sounds definitely better. How invasive and difficult would be this 're-examining'?
It's just a matter of replacing casts such as "(CELL)x" (as well as implicit conversions) with "(CELL)floor(x)" (or "(CELL)round(x), or "(CELL)floor(x+0.5)" as round() is C99).
For fpreclass.c, this applies to Rast_fpreclass_perform_di (L599), Rast_fpreclass_perform_fi (L641), and Rast_fpreclass_perform_ii (L683).
For put_row.c, this applies to convert_and_write_fi (L636) and convert_and_write_di (L652).
I attached the diff for this, I decided for floor(x+0.5) logic (but we could use just floor if someone is against this). I would rather get some approval from other developers for this.
Otherwise this ticket should be solved once r62518 is backported.
follow-up: 14 comment:13 by , 10 years ago
Replying to annakrat:
I attached the diff for this, I decided for floor(x+0.5) logic (but we could use just floor if someone is against this). I would rather get some approval from other developers for this.
The advantage of floor() is that it will produce identical results to the existing implementation for positive values.
The main disadvantage of floor() is that minuscule rounding errors can result in the wrong value, e.g. (CELL)floor(1.9999999) is 1 rather than 2.
comment:14 by , 10 years ago
Replying to glynn:
Replying to annakrat:
I attached the diff for this, I decided for floor(x+0.5) logic (but we could use just floor if someone is against this). I would rather get some approval from other developers for this.
The advantage of floor() is that it will produce identical results to the existing implementation for positive values.
The main disadvantage of floor() is that minuscule rounding errors can result in the wrong value, e.g. (CELL)floor(1.9999999) is 1 rather than 2.
Thanks for the summary, but I was mainly asking for opinions which one should we use. I prefer rounding to the nearest integer, even though it can give different results. Anyone has the same/different opinion?
follow-up: 16 comment:15 by , 10 years ago
comment:16 by , 10 years ago
comment:18 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Works for me as well, so closing.
Moritz
I'm having a similar (probably related) issue:
Given a raster map with a classic normalized difference index ranging theoretically from -1 to 1. The exact values are:
When I recode using the following rules:
I get
Changing the rules to
or
I get
Moritz