Opened 6 years ago

Last modified 11 months ago

#2499 new defect

v.in.ogr in does not handle "nan" in attribute tables properly

Reported by: sbl Owned by: grass-dev@…
Priority: normal Milestone: 7.8.3
Component: Vector Version: unspecified
Keywords: v.in.ogr, DBMI, SQLite, nan Cc:
CPU: Unspecified Platform: Unspecified

Description

v.in.ogr in GRASS 7.0 (r58048) does not seem to handle "nan" in attribute tables (of shape files) properly. Looks like it is treated like a name of an attribute column.

When I try to import a shape file I get the following error message:

DBMI-SQLite driver error:
Error in sqlite3_prepare():
no such column: nan

After converting the shape-file to sqlite-format (ogr2ogr) in advance of the import to GRASS, everything works fine. Maybe "nan" got replaced by another type of NoData? value during that operaton...

Change History (14)

comment:1 Changed 5 years ago by martinl

Milestone: 7.0.07.0.5

comment:2 Changed 4 years ago by neteler

Milestone: 7.0.57.0.6

comment:3 Changed 3 years ago by neteler

Milestone: 7.0.67.0.7

comment:4 Changed 22 months ago by martinl

Milestone: 7.0.77.6.1

comment:5 Changed 20 months ago by martinl

Milestone: 7.6.17.6.2

Ticket retargeted after milestone closed

comment:6 Changed 12 months ago by neteler

Is this still relevant?

comment:7 Changed 11 months ago by alexbruy

Seems still true with 7.8.2. For example, when I run v.net.centrality it fails during saving results with the following message

Flattening the graph...
Graph was built
Computing degree centrality measure
Computing betweenness and/or closeness centrality measure
0..5..10..15..20..25..30..35..40..45..50..55..60..65..70..75..80..85..90..95..Computing eigenvector centrality measure
Writing data into the table...
3..7..11..15..19..23..26..30..34..38..42..46..50..53..57..61..65..69..73..76..80..84..88..92..96..100
DBMI-SQLite driver error:
Error in sqlite3_prepare():
no such column: nan
DBMI-SQLite driver error:
Error in sqlite3_prepare():
no such column: nan
ERROR: Cannot insert new record: INSERT INTO output748b41e8dcf84a96b7d78ba17246cc20 VALUES(27,0.100000,0.000000,30.000000,-nan)

Using network.shp from QGIS test dataset at https://github.com/qgis/QGIS/tree/master/python/plugins/processing/tests/testdata/custom/grass7

comment:8 Changed 11 months ago by neteler

Milestone: 7.6.27.8.3
Summary: GRASS70: v.in.ogr in does not handle "nan" in attribute tables properlyv.in.ogr in does not handle "nan" in attribute tables properly

For some discussion, see also #2343

comment:9 Changed 11 months ago by mmetz

This is not a problem of v.in.ogr, but of SQLite, and possibly other GRASS db drivers.

A solution would be to change all GRASS modules that update attribute tables and replace nan/-nan with NULL.

Another solution would be to replace nan/-nan with NULL in db_execute_immediate() which is probably easier than changing all modules but might have unexpected side effects because SQL statements passed to the db drivers can be quite complex.

comment:10 in reply to:  7 Changed 11 months ago by mmetz

Replying to alexbruy:

Seems still true with 7.8.2. For example, when I run v.net.centrality it fails

I could not reproduce the problem, trying all output options of v.net.centrality. Please provide the exact command(s) you used.

comment:11 Changed 11 months ago by sbl

Here is how it was handeled in Pandas: https://github.com/pandas-dev/pandas/pull/8208

As a simple example for reproduction (though not with v.net.centrality) you could do:

db.execute sql="CREATE TABLE test (column double precision);"
db.execute sql="INSERT INTO test VALUES (-nan);"
# in contrast to
db.execute sql="INSERT INTO test VALUES (NULL);"

SQLite seems to have a workaround: http://system.data.sqlite.org/index.html/tktview/e06c4caff3c433c80616ae5c6df63fc830825e59 But that does not seem to be available in default packages of SQLite and other DBMIs seem to struggle With nan/NaN as well… So, this does not seem to be trivial...

comment:12 in reply to:  11 Changed 11 months ago by mmetz

Replying to sbl:

Here is how it was handeled in Pandas: https://github.com/pandas-dev/pandas/pull/8208

As a simple example for reproduction (though not with v.net.centrality) you could do:

db.execute sql="CREATE TABLE test (column double precision);"
db.execute sql="INSERT INTO test VALUES (-nan);"
# in contrast to
db.execute sql="INSERT INTO test VALUES (NULL);"

SQLite seems to have a workaround: http://system.data.sqlite.org/index.html/tktview/e06c4caff3c433c80616ae5c6df63fc830825e59 But that does not seem to be available in default packages of SQLite and other DBMIs seem to struggle With nan/NaN as well… So, this does not seem to be trivial...

The SQLite workaround with GetAllAsText seems ugly. What works is

db.execute sql="INSERT INTO test VALUES ('-nan');"

but then SQLite converts '-nan' to -0 (zero) which is wrong. Same for inf/-inf. Using NULL is also not correct, because NULL means "no entry", whereas nan and inf are numbers that can be represented as floating point numbers.

comment:13 Changed 11 months ago by sbl

In addition, the different DMBS behave differently with regards to nan/inf...

This is how PostgreSQL handles NaN / Inf in double columns:

CREATE TEMPORARY TABLE test_insert_nan (double_column double precision);
INSERT INTO test_insert_nan VALUES ('nan');
INSERT INTO test_insert_nan VALUES ('-nan');
INSERT INTO test_insert_nan VALUES ('inf');
INSERT INTO test_insert_nan VALUES ('-inf');
SELECT * FROM test_insert_nan;
double_column
double precision
"NaN"
"NaN"
"Infinity"
"-Infinity"
SELECT * FROM test_insert_nan WHERE double_column > 0;
double_column
double precision
"NaN"
"NaN"
"Infinity"
SELECT * FROM test_insert_nan WHERE double_column < 0;
double_column
double precision
"-Infinity"

However, in PostgreSQL does not insert nan / inf strings columns of type numeric...

comment:14 in reply to:  13 Changed 11 months ago by mmetz

Replying to sbl:

In addition, the different DMBS behave differently with regards to nan/inf...

This is how PostgreSQL handles NaN / Inf in double columns:

[...]

SELECT * FROM test_insert_nan WHERE double_column > 0;
double_column
double precision
"NaN"
"NaN"
"Infinity"

this is wrong because nan is not > 0. nan is also not < 0 and not equal 0. nan is also not equal nan. Apparently a bug in how PostgreSQL handles NaN.

Regarding v.in.ogr, I guess that setting nan/inf to NULL is the safest option.

GRASS modules that create attribute tables should not insert nan/inf if these values should not occur according to the logic of the module, as e.g. v.net.centrality. If nan/inf does occur, the respective module needs to be fixed.

Note: See TracTickets for help on using tickets.