Opened 4 years ago

Closed 2 years ago

#2343 closed defect (fixed)

GRASS 7: "inf" values break insert statements in v.rast.stats

Reported by: sbl Owned by: grass-dev@…
Priority: normal Milestone: 7.2.0
Component: Python Version: svn-releasebranch70
Keywords: v.rast.stats Cc:
CPU: All Platform: All

Description

When running v.rast.stats on a dataset with small geometries (which in my case were represented by 2 pixels in the raster), the parameter "coeff_var" can result in "inf" value. "inf" breaks the INSERT statement and results in an DBMI error "no such column: inf", which at the end crashes v.rast.stats.

Just like "nan", "inf" could be replaced by NULL. Changing line 263 in v.rast.stats.py into this: if value.lower().endswith('nan') or value.lower().endswith('inf'): should work.

Both GRASS 7.0 and 7.1 are affected. Please find attached patches for both versions...

Attachments (3)

v.rast.stats.g70.diff (688 bytes) - added by sbl 4 years ago.
v.rast.stats.g71.diff (688 bytes) - added by sbl 4 years ago.
v.rast.stats.g64.diff (839 bytes) - added by sbl 4 years ago.

Download all attachments as: .zip

Change History (17)

Changed 4 years ago by sbl

Attachment: v.rast.stats.g70.diff added

Changed 4 years ago by sbl

Attachment: v.rast.stats.g71.diff added

Changed 4 years ago by sbl

Attachment: v.rast.stats.g64.diff added

comment:1 Changed 4 years ago by sbl

Applies for GRASS 6.4 too (new patch attached)...

comment:2 in reply to:  description ; Changed 4 years ago by glynn

Replying to sbl:

"inf" breaks the INSERT statement and results in an DBMI error "no such column: inf", which at the end crashes v.rast.stats.

Just like "nan", "inf" could be replaced by NULL.

FWIW, PostgreSQL supports infinity and NaN values, e.g.

CREATE TABLE test ( value REAL ) ;
INSERT INTO test VALUES (REAL 'Infinity');
INSERT INTO test VALUES (REAL '+Infinity');
INSERT INTO test VALUES (REAL '-Infinity');
INSERT INTO test VALUES (REAL 'NaN');
SELECT * FROM test ;
   value   
-----------
  Infinity
  Infinity
 -Infinity
       NaN
(4 rows)

I don't know whether this is part of the SQL standard, or whether it is supported by the other back-ends. It won't work for columns of type NUMERIC.

comment:3 in reply to:  2 ; Changed 4 years ago by sbl

Replying to glynn:

FWIW, PostgreSQL supports infinity and NaN values, e.g. I don't know whether this is part of the SQL standard, or whether it is supported by the other back-ends. It won't work for columns of type NUMERIC.

Thanks for the hint, I was not aware of that.

Now I investigated a bit and it seems that SQLite (as default DBMI in GRASS 7) also supports NaN and Infinity. However, using quoted literal strings instead of NULL in v.rast.stas (regardless if it is "NaN", "inf", "Infinity", "INF" ...), always resulted in a value 0 in the attribute table (in my example at least). So, supporting NaN or Inf is unfortunately not as straightforward as I hoped...

At the end it is also a question wether NaN or Inf is preferable over NULL in the database... (you can fnd a discussion about it in an SQLite user forum: https://groups.google.com/forum/#!msg/sqlite_users/aQX_WvVwBS8/Tqo_agM8_14J)

comment:4 in reply to:  description ; Changed 4 years ago by lucadelu

Replying to sbl:

When running v.rast.stats on a dataset with small geometries (which in my case were represented by 2 pixels in the raster), the parameter "coeff_var" can result in "inf" value. "inf" breaks the INSERT statement and results in an DBMI error "no such column: inf", which at the end crashes v.rast.stats.

Just like "nan", "inf" could be replaced by NULL. Changing line 263 in v.rast.stats.py into this: if value.lower().endswith('nan') or value.lower().endswith('inf'): should work.

Both GRASS 7.0 and 7.1 are affected. Please find attached patches for both versions...

Could you provide a dataset to test the patches?

Thanks

comment:5 in reply to:  4 ; Changed 4 years ago by sbl

Replying to lucadelu:

Could you provide a dataset to test the patches?

You can reproduce the issue with these commands:

g.region -p n=1 s=0 e=2 w=0 res=1 
r.mapcalc --o expression="rastermap1=1"
r.mapcalc --o expression="rastermap2=double(if(x()==0.5,-1,1))"
r.to.vect --o --v input=rastermap1 output=vectormap type=area
v.rast.stats -c --verbose map=vectormap raster=rastermap2 column_prefix=value

comment:6 in reply to:  5 Changed 4 years ago by lucadelu

Replying to sbl:

Replying to lucadelu:

Could you provide a dataset to test the patches?

You can reproduce the issue with these commands:

g.region -p n=1 s=0 e=2 w=0 res=1 
r.mapcalc --o expression="rastermap1=1"
r.mapcalc --o expression="rastermap2=double(if(x()==0.5,-1,1))"
r.to.vect --o --v input=rastermap1 output=vectormap type=area
v.rast.stats -c --verbose map=vectormap raster=rastermap2 column_prefix=value

In r62938 I should fixed the error. I change a little bit your patch to cover multiple way to catch Infinity value.

If it is working I'll backport the changeset to GRASS70.

I also applied your patch for GRASS64 version in r62944.

comment:7 in reply to:  3 Changed 4 years ago by glynn

Replying to sbl:

At the end it is also a question wether NaN or Inf is preferable over NULL in the database...

Well, for rasters, NaN is null (in 6.x, one specific NaN is treated as null, in 7.x any NaN value is treated as null). Also, code which is expecting a definite value probably won't handle a NaN correctly because of their unusual semantics.

Ideally, infinities should be preserved rather than converted to an SQL NULL. If this is just a matter of fixing the int<->string conversions (rather than blindly using e.g. sprintf/sscanf "%f"), then it should be done.

comment:8 Changed 4 years ago by neteler

...should r62938 be reverted and implemented differently?

comment:9 in reply to:  8 Changed 4 years ago by glynn

Replying to neteler:

...should r62938 be reverted and implemented differently?

Maybe. It depends upon how much work is involved.

Infinities are sufficiently rare that it probably doesn't matter how they're handled (e.g. r.mapcalc explicitly checks for division by zero, log(0) etc and returns null). If it's just a matter of writing float_to_string() and string_to_float() functions which deal with them correctly (and we know how to write them), it should be done. If there are other complexities, or if every supported database has a different representation, don't bother.

Storing NaNs? in a database is probably undesirable. I wouldn't assume that all databases will implement the same semantics, or even sane semantics (the fact that they aren't equal to themselves, that all of x<NaN, x>NaN, NaN<x and NaN>x are false, etc tends to wreak havoc with any algorithm which isn't expecting them). Code which retrieves values from a database is more likely to have been written to handle SQL NULLs correctly than to handle NaNs? correctly.

comment:10 Changed 3 years ago by lucadelu

What to do with this ticket?

could we close it or find another appropriate solution...

comment:11 Changed 3 years ago by sbl

Milestone: 7.0.07.1.0

Re-targeting the ticket for the time being...

comment:12 Changed 3 years ago by neteler

Milestone: 7.1.07.2.0

Milestone renamed

comment:13 Changed 2 years ago by mlennert

I would suggest to close this ticket. Writing NULL instead of infinity seems an adequate behaviour to me.

comment:14 Changed 2 years ago by sbl

Resolution: fixed
Status: newclosed

Closing for now.

Feel free to reopen if the current solution of replacing "Inf" "Nan" with NULL is not considered sufficient any more, e.g. because Inf and Nan show up more frequent and there is a practical need / application for distinguishing Inf from NULL.

Note: See TracTickets for help on using tickets.