Opened 10 years ago
Closed 10 years ago
#2466 closed defect (fixed)
r.param.scale slow on Windows when there is no data in raster
Reported by: | annakrat | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | 7.0.0 |
Component: | Raster | Version: | unspecified |
Keywords: | r.param.scale, nan | Cc: | |
CPU: | Unspecified | Platform: | MSWindows 8 |
Description
When I run r.param.scale on Windows with raster map with a lot of no data, the computation is significantly slower (90 s on Ubuntu, 50 min on Windows). r.param.scale doesn't propagate nulls. I read somewhere that computing with nan can be slower on some architectures, could this be the reason?
I attached a patch to propagate nans. I also realized that at one point during preprocessing, there are large numbers computed (especially with big cell size and high window size), so I thought double would be more appropriate.
If there are no objection, I would commit it to test it on Windows.
Attachments (1)
Change History (6)
by , 10 years ago
Attachment: | rparamscale.diff added |
---|
follow-up: 2 comment:1 by , 10 years ago
Replying to annakrat:
When I run r.param.scale on Windows with raster map with a lot of no data, the computation is significantly slower (90 s on Ubuntu, 50 min on Windows). r.param.scale doesn't propagate nulls. I read somewhere that computing with nan can be slower on some architectures, could this be the reason?
Normal floating-point operations will be handled directly by the CPU, without the OS getting involved. But it's possible that converting NaN to integer raises an exception which must be handled by the OS.
Aside from performance, it's not guaranteed that converting a NaN to an integer will produce the integer null value (-231), although that happens to be the case on Linux/x86-64.
Code which converts potentially-null values needs to handle nulls explicitly, e.g.
double d = ...; int x; if (Rast_is_d_null_value(&d)) Rast_set_c_null_value(&x, 1); else x = (CELL)floor(d);
Also, conversion should use normally use floor, ceil or rounding (floor(x+0.5)). A C cast rounds toward zero (i.e. downward for positive values, upward for negative values), which is rarely the correct choice.
I attached a patch to propagate nans. I also realized that at one point during preprocessing, there are large numbers computed (especially with big cell size and high window size), so I thought double would be more appropriate.
Are these values ever converted to integer? If so, conversion of out-of-range values needs to be handled explicitly.
follow-up: 3 comment:2 by , 10 years ago
Replying to glynn:
Replying to annakrat:
When I run r.param.scale on Windows with raster map with a lot of no data, the computation is significantly slower (90 s on Ubuntu, 50 min on Windows). r.param.scale doesn't propagate nulls. I read somewhere that computing with nan can be slower on some architectures, could this be the reason?
Normal floating-point operations will be handled directly by the CPU, without the OS getting involved. But it's possible that converting NaN to integer raises an exception which must be handled by the OS.
Aside from performance, it's not guaranteed that converting a NaN to an integer will produce the integer null value (-231), although that happens to be the case on Linux/x86-64.
Code which converts potentially-null values needs to handle nulls explicitly, e.g.
double d = ...; int x; if (Rast_is_d_null_value(&d)) Rast_set_c_null_value(&x, 1); else x = (CELL)floor(d);Also, conversion should use normally use floor, ceil or rounding (floor(x+0.5)). A C cast rounds toward zero (i.e. downward for positive values, upward for negative values), which is rarely the correct choice.
It's not clear to me, what the algorithm should do when there is a null value at one of the cells in the search window. It could just skip the entire computation and write null in the central cell. This would be probably the most correct option but it results in null edges. Or we could assign 0 values but then the result on edges would be distorted by the zeros. Which approach should we implement?
I attached a patch to propagate nans. I also realized that at one point during preprocessing, there are large numbers computed (especially with big cell size and high window size), so I thought double would be more appropriate.
Are these values ever converted to integer? If so, conversion of out-of-range values needs to be handled explicitly.
I don't think so.
follow-up: 4 comment:3 by , 10 years ago
Replying to annakrat:
It's not clear to me, what the algorithm should do when there is a null value at one of the cells in the search window. It could just skip the entire computation and write null in the central cell. This would be probably the most correct option but it results in null edges. Or we could assign 0 values but then the result on edges would be distorted by the zeros. Which approach should we implement?
AFAICT, you either need to accept that nulls will propagate (i.e. if any cell in the window is null, the result will be null), or replace the least-squares fitting algorithm with something which can handle nulls.
FWIW, that module dates back to at least GRASS 4.x, when neither nulls nor floating-point existed.
follow-up: 5 comment:4 by , 10 years ago
Replying to glynn:
AFAICT, you either need to accept that nulls will propagate (i.e. if any cell in the window is null, the result will be null), or replace the least-squares fitting algorithm with something which can handle nulls.
I changed it so that nulls propagate (r62648). I will test on Windows if there will be any speed difference.
comment:5 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Replying to annakrat:
Replying to glynn:
AFAICT, you either need to accept that nulls will propagate (i.e. if any cell in the window is null, the result will be null), or replace the least-squares fitting algorithm with something which can handle nulls.
I changed it so that nulls propagate (r62648). I will test on Windows if there will be any speed difference.
Tested on Windows, it is running fast now. Backported in r62692. Closing ticket.
r.param.scale patch