Opened 5 years ago
Closed 3 years ago
#2343 closed defect (fixed)
GRASS 7: "inf" values break insert statements in v.rast.stats
Reported by: | sbl | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | 7.2.0 |
Component: | Python | Version: | svn-releasebranch70 |
Keywords: | v.rast.stats | Cc: | |
CPU: | All | Platform: | All |
Description
When running v.rast.stats on a dataset with small geometries (which in my case were represented by 2 pixels in the raster), the parameter "coeff_var" can result in "inf" value. "inf" breaks the INSERT statement and results in an DBMI error "no such column: inf", which at the end crashes v.rast.stats.
Just like "nan", "inf" could be replaced by NULL. Changing line 263 in v.rast.stats.py into this: if value.lower().endswith('nan') or value.lower().endswith('inf'): should work.
Both GRASS 7.0 and 7.1 are affected. Please find attached patches for both versions...
Attachments (3)
Change History (17)
Changed 5 years ago by
Attachment: | v.rast.stats.g70.diff added |
---|
Changed 5 years ago by
Attachment: | v.rast.stats.g71.diff added |
---|
Changed 5 years ago by
Attachment: | v.rast.stats.g64.diff added |
---|
comment:1 Changed 5 years ago by
comment:2 follow-up: 3 Changed 5 years ago by
Replying to sbl:
"inf" breaks the INSERT statement and results in an DBMI error "no such column: inf", which at the end crashes v.rast.stats.
Just like "nan", "inf" could be replaced by NULL.
FWIW, PostgreSQL supports infinity and NaN values, e.g.
CREATE TABLE test ( value REAL ) ; INSERT INTO test VALUES (REAL 'Infinity'); INSERT INTO test VALUES (REAL '+Infinity'); INSERT INTO test VALUES (REAL '-Infinity'); INSERT INTO test VALUES (REAL 'NaN'); SELECT * FROM test ; value ----------- Infinity Infinity -Infinity NaN (4 rows)
I don't know whether this is part of the SQL standard, or whether it is supported by the other back-ends. It won't work for columns of type NUMERIC.
comment:3 follow-up: 7 Changed 5 years ago by
Replying to glynn:
FWIW, PostgreSQL supports infinity and NaN values, e.g. I don't know whether this is part of the SQL standard, or whether it is supported by the other back-ends. It won't work for columns of type NUMERIC.
Thanks for the hint, I was not aware of that.
Now I investigated a bit and it seems that SQLite (as default DBMI in GRASS 7) also supports NaN and Infinity. However, using quoted literal strings instead of NULL in v.rast.stas (regardless if it is "NaN", "inf", "Infinity", "INF" ...), always resulted in a value 0 in the attribute table (in my example at least). So, supporting NaN or Inf is unfortunately not as straightforward as I hoped...
At the end it is also a question wether NaN or Inf is preferable over NULL in the database... (you can fnd a discussion about it in an SQLite user forum: https://groups.google.com/forum/#!msg/sqlite_users/aQX_WvVwBS8/Tqo_agM8_14J)
comment:4 follow-up: 5 Changed 5 years ago by
Replying to sbl:
When running v.rast.stats on a dataset with small geometries (which in my case were represented by 2 pixels in the raster), the parameter "coeff_var" can result in "inf" value. "inf" breaks the INSERT statement and results in an DBMI error "no such column: inf", which at the end crashes v.rast.stats.
Just like "nan", "inf" could be replaced by NULL. Changing line 263 in v.rast.stats.py into this: if value.lower().endswith('nan') or value.lower().endswith('inf'): should work.
Both GRASS 7.0 and 7.1 are affected. Please find attached patches for both versions...
Could you provide a dataset to test the patches?
Thanks
comment:5 follow-up: 6 Changed 5 years ago by
Replying to lucadelu:
Could you provide a dataset to test the patches?
You can reproduce the issue with these commands:
g.region -p n=1 s=0 e=2 w=0 res=1 r.mapcalc --o expression="rastermap1=1" r.mapcalc --o expression="rastermap2=double(if(x()==0.5,-1,1))" r.to.vect --o --v input=rastermap1 output=vectormap type=area v.rast.stats -c --verbose map=vectormap raster=rastermap2 column_prefix=value
comment:6 Changed 5 years ago by
Replying to sbl:
Replying to lucadelu:
Could you provide a dataset to test the patches?
You can reproduce the issue with these commands:
g.region -p n=1 s=0 e=2 w=0 res=1 r.mapcalc --o expression="rastermap1=1" r.mapcalc --o expression="rastermap2=double(if(x()==0.5,-1,1))" r.to.vect --o --v input=rastermap1 output=vectormap type=area v.rast.stats -c --verbose map=vectormap raster=rastermap2 column_prefix=value
In r62938 I should fixed the error. I change a little bit your patch to cover multiple way to catch Infinity value.
If it is working I'll backport the changeset to GRASS70.
I also applied your patch for GRASS64 version in r62944.
comment:7 Changed 5 years ago by
Replying to sbl:
At the end it is also a question wether NaN or Inf is preferable over NULL in the database...
Well, for rasters, NaN is null (in 6.x, one specific NaN is treated as null, in 7.x any NaN value is treated as null). Also, code which is expecting a definite value probably won't handle a NaN correctly because of their unusual semantics.
Ideally, infinities should be preserved rather than converted to an SQL NULL. If this is just a matter of fixing the int<->string conversions (rather than blindly using e.g. sprintf/sscanf "%f"), then it should be done.
comment:8 follow-up: 9 Changed 5 years ago by
...should r62938 be reverted and implemented differently?
comment:9 Changed 5 years ago by
Replying to neteler:
...should r62938 be reverted and implemented differently?
Maybe. It depends upon how much work is involved.
Infinities are sufficiently rare that it probably doesn't matter how they're handled (e.g. r.mapcalc explicitly checks for division by zero, log(0) etc and returns null). If it's just a matter of writing float_to_string() and string_to_float() functions which deal with them correctly (and we know how to write them), it should be done. If there are other complexities, or if every supported database has a different representation, don't bother.
Storing NaNs? in a database is probably undesirable. I wouldn't assume that all databases will implement the same semantics, or even sane semantics (the fact that they aren't equal to themselves, that all of x<NaN, x>NaN, NaN<x and NaN>x are false, etc tends to wreak havoc with any algorithm which isn't expecting them). Code which retrieves values from a database is more likely to have been written to handle SQL NULLs correctly than to handle NaNs? correctly.
comment:10 Changed 4 years ago by
What to do with this ticket?
could we close it or find another appropriate solution...
comment:11 Changed 4 years ago by
Milestone: | 7.0.0 → 7.1.0 |
---|
Re-targeting the ticket for the time being...
comment:13 Changed 3 years ago by
I would suggest to close this ticket. Writing NULL instead of infinity seems an adequate behaviour to me.
comment:14 Changed 3 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
Closing for now.
Feel free to reopen if the current solution of replacing "Inf" "Nan" with NULL is not considered sufficient any more, e.g. because Inf and Nan show up more frequent and there is a practical need / application for distinguishing Inf from NULL.
Applies for GRASS 6.4 too (new patch attached)...