Opened 12 years ago

Closed 9 years ago

Last modified 9 years ago

#198 closed defect (invalid)

v.in.ascii: column scanning is borked

Reported by: hamish
Owned by: grass-dev@…
Priority: critical
Milestone: 6.4.2
Component: Vector
Version: svn-develbranch6
Keywords: v.in.ascii
Cc: martinl
CPU: All
Platform: All

Description

Hi,

this bug is related to the old RT bugs 2763 and 5209.

http://intevation.de/rt/webrt?serial_num=2763
http://intevation.de/rt/webrt?serial_num=5209

and the clumsy empty last-column work-around in v.in.gpsbabel:

http://trac.osgeo.org/grass/browser/grass/trunk/scripts/v.in.gpsbabel/v.in.gpsbabel#L298

"FIXME: if last field (comments) is empty it causes a not-enough fields error in v.in.ascii"

The column type scanning step in v.in.ascii's points mode no longer accepts empty columns as NULL, and the affected columns are dropped from the imported table. Note that passing empty values in double columns works in GRASS 6.2.3!

It would be nice if a numeric column could hold an empty field or the word 'NULL' for a missing value, and could hold "nan" or "inf", without the scanning function deciding that the column contains strings. (For varchar columns, however, the word 'NULL' should not be stripped.)
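The requested scanning behaviour can be sketched roughly as follows. This is a Python illustration only, not the v.in.ascii C code: the names `field_type`, `scan_column_types`, `MISSING` and `RANK` are hypothetical, and it assumes that an empty field or the literal word NULL never widens a column's type, while "nan" and "inf" count as doubles.

```python
# Hypothetical tokens treated as "no value" during type scanning (an assumption).
MISSING = {"", "NULL"}

# Ordering used to widen a column's type; "missing" never widens anything.
RANK = {"missing": 0, "integer": 1, "double": 2, "string": 3}


def field_type(token):
    """Classify one field as 'missing', 'integer', 'double' or 'string'."""
    token = token.strip()
    if token in MISSING:
        return "missing"
    try:
        int(token)
        return "integer"
    except ValueError:
        pass
    try:
        float(token)  # Python's float() also accepts "nan", "inf" and "-inf"
        return "double"
    except ValueError:
        return "string"


def scan_column_types(lines, sep="|"):
    """Return the widest type seen in each column of the given rows."""
    types = []
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        # Grow the result if this row has more columns than any row so far.
        types.extend(["missing"] * (len(fields) - len(types)))
        for i, token in enumerate(fields):
            seen = field_type(token)
            if RANK[seen] > RANK[types[i]]:
                types[i] = seen
    return types


# Illustrative rows (not the exact test.dat below): six fields each.
rows = [
    "1|2.3|4.5|Foo|3.1415|4",
    "2|2.4|4.6|Bar||",
    "3|2.5|4.7|Baz|nan|NULL",
]
print(scan_column_types(rows))
# ['integer', 'double', 'double', 'string', 'double', 'integer']
```

A column that only ever contains missing values stays "missing" in this sketch, so the declared column type (if any) would have to decide. The sketch only classifies fields; it does not strip or rewrite 'NULL' in string columns.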

Input file:

cat << EOF > test.dat
cat|x|y|name|value|count
1|2.3|4.5|Foo|3.1415|4
2|2.4|4.6|Bar|||
EOF

Import without column declaration:

G64svn> v.in.ascii in=test.dat out=test_null_import skip=1 \
           cat=1 x=2 y=3 --verbose

Scanning input for column types...
Maximum input row length: 25
Maximum number of columns: 6
Minimum number of columns: 6
Column: 1 type: integer
Column: 2 type: double
Column: 3 type: double
Column: 4 type: string length: 3
Column: 5 type: string length: 0
Column: 6 type: string length: 0
Importing points...
Populating table...
Building topology for vector map <test_null_import>...
2 primitives registered      
Building areas:  100%
0 areas built      
0 isles built
Attaching islands: 
Attaching centroids:  100%
Topology was built
Number of nodes     :   2
Number of primitives:   2
Number of points    :   2
Number of lines     :   0
Number of boundaries:   0
Number of centroids :   0
Number of areas     :   0
Number of isles     :   0
v.in.ascii complete.

G64svn> v.info -c test_null_import
Displaying column types/names for database connection of layer 1:
INTEGER|int_1
DOUBLE PRECISION|dbl_1
DOUBLE PRECISION|dbl_2
CHARACTER|str_1
  • what happened to columns 5 and 6?
 Column: 5 type: string length: 0
 Column: 6 type: string length: 0
  • Columns 5 and 6 incorrectly scanned as (empty) "string" type.

Also, I am not sure if hiding the column scanning result behind --verbose mode is advisable, given that it is buggy and it is the first line of defense when the input file contains typos.

Import with column declaration:

G64svn> v.in.ascii in=test.dat out=test_null_import skip=1 \
          cat=1 x=2 y=3 --verbose \
          columns='cat int, x double, y double, name varchar(10), value double, count int'


Scanning input for column types...
Maximum input row length: 25
Maximum number of columns: 6
Minimum number of columns: 6
Column: 1 type: integer
Column: 2 type: double
Column: 3 type: double
Column: 4 type: string length: 3
Column: 5 type: string length: 0
Column: 6 type: string length: 0
WARNING: Table <test_null_import> linked to vector map <test_null_import>
         does not exist
ERROR: Column number 5 defined as double has string values
  • In addition to the previous errors, the meaning of the "table does not exist" warning is a mystery.

Changing the empty "||" to "|NULL|" doesn't help: the scanning step declares it a string column (length: 4) and refuses to continue.

This is important code, so tread with the greatest care...

Hamish

Attachments (2)

v.in.ascii.patch (610 bytes) - added by mmetz 11 years ago.
patch for missing values
srs.txt (770 bytes) - added by mmetz 9 years ago.
las2txt output

Change History (14)

Changed 11 years ago by mmetz

Attachment: v.in.ascii.patch added

patch for missing values

comment:1 Changed 11 years ago by mmetz

Try the attached patch for the missing values problem. NULL, nan and inf are still not recognized. There is also still a nonsense warning for completely empty columns declared as double, but the import succeeds.

Markus M

comment:2 Changed 11 years ago by hamish

Milestone: 6.4.0 → 6.4.1

patch applied in 6.5 and 7; looks like it's fine but deferring backport to relbr64 until 6.4.1 to allow more testing.

Hamish

comment:3 Changed 10 years ago by (none)

Milestone: 6.4.1

Milestone 6.4.1 deleted

comment:4 Changed 10 years ago by hamish

Milestone: 6.4.1

comment:5 in reply to:  2 ; Changed 9 years ago by martinl

Cc: martinl added

Replying to hamish:

patch applied in 6.5 and 7; looks like it's fine but deferring backport to relbr64 until 6.4.1 to allow more testing.

it's already in relbr64. So can we close the ticket?

comment:6 in reply to:  5 Changed 9 years ago by martinl

Resolution: fixed
Status: new → closed

Replying to martinl:

Replying to hamish:

patch applied in 6.5 and 7; looks like it's fine but deferring backport to relbr64 until 6.4.1 to allow more testing.

it's already in relbr64. So can we close the ticket?

Closing for now.

comment:7 Changed 9 years ago by mmetz

Milestone: 6.4.1 → 6.4.2
Resolution: fixed
Status: closed → reopened

Still not working. Test data are LiDAR laz data available here

http://liblas.org/samples/

The file I used is srs.laz

The commands

las2txt -i srs.laz -o srs.ascii --parse xyztiaunrcCpedRGB --delimiter "|"

# check ascii file
head srs.ascii 
289814.15|4320978.61|170.76|499450.80599405|260|||6|0|2|Ground|0|0|0|0|0|0
289814.64|4320978.84|170.76|499450.80600805|280|||6|0|2|Ground|0|0|0|0|0|0
289815.12|4320979.06|170.75|499450.80602205|280|||6|0|2|Ground|0|0|0|0|0|0

# import in GRASS
las2txt -i srs.laz --stdout --parse xyztiaunrcCpedRGB --delimiter "|" | v.in.ascii in=- out=srs_ascii -z x=1 y=2 z=3 --o

# only the first 5 columns were imported

# check table contents
v.db.select srs_ascii where="cat = 1"
cat|dbl_1|dbl_2|dbl_3|dbl_4|int_1
1|289814.15|4320978.61|170.76|499450.80599405|260

Markus M

comment:8 in reply to:  7 ; Changed 9 years ago by mmetz

Resolution: invalid
Status: reopened → closed

Replying to mmetz:

Still not working. Test data are LiDAR laz data available here

http://liblas.org/samples/

The file I used is srs.laz

[snip]

# only the first 5 columns were imported

It's not v.in.ascii, it's G_getl2() that fails to fetch the whole line, probably because of some obscure encoding of the output of las2txt which I am not able to figure out, or las2txt writes weird characters for empty fields.

Closing as invalid.

comment:9 in reply to:  8 ; Changed 9 years ago by hamish

Replying to mmetz:

It's not v.in.ascii, it's G_getl2() that fails to fetch the whole line, probably because of some obscure encoding of the output of las2txt which I am not able to figure out, or las2txt writes weird characters for empty fields.

Hi,

instead of piping to v.in.ascii can you save to a file which we can have a peek at in hexdump? what version of las2txt? does the same happen with the snake lidar sample data from the grass wiki lidar page or just this dataset?

(if you found it others probably will too)
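As an alternative to eyeballing a hexdump, the saved file can be checked for NUL bytes with a few lines of Python. This is a minimal sketch: `find_nul_bytes` is a hypothetical helper, and the file name in the usage comment is simply the one from the las2txt command above.

```python
def find_nul_bytes(data, limit=10):
    """Return (line_number, byte_offset) pairs for the first `limit` NUL bytes.

    `data` is the raw file content as bytes; line numbers are 1-based and
    offsets are 0-based positions within the line.
    """
    hits = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        start = 0
        while len(hits) < limit:
            pos = line.find(b"\x00", start)
            if pos < 0:
                break
            hits.append((lineno, pos))
            start = pos + 1
        if len(hits) >= limit:
            break
    return hits


# Usage (the file must be opened in binary mode):
# with open("srs.ascii", "rb") as f:
#     print(find_nul_bytes(f.read()))
```

An empty result means the file contains no NUL bytes within the scanned data; any hit points at a field that looks empty in a terminal but is not.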

Hamish

Changed 9 years ago by mmetz

Attachment: srs.txt added

las2txt output

comment:10 in reply to:  9 ; Changed 9 years ago by mmetz

Replying to hamish:

Replying to mmetz:

It's not v.in.ascii, it's G_getl2() that fails to fetch the whole line, probably because of some obscure encoding of the output of las2txt which I am not able to figure out, or las2txt writes weird characters for empty fields.

Hi,

instead of piping to v.in.ascii can you save to a file which we can have a peek at in hexdump? what version of las2txt? does the same happen with the snake lidar sample data from the grass wiki lidar page or just this dataset?

las2txt version: libLAS 1.6.1 with GeoTIFF 1.3.0 GDAL 1.8.0 LASzip 1.2.0

The same happens with "Serpent Mound Model LAS Data.las" from the grass wiki lidar page.

Attached is the las2txt output for srs.laz.

I am pretty sure this problem is caused by las2txt which does not check if a given attribute exists. If it does not exist, some weird value is written.

Markus M

comment:11 in reply to:  10 ; Changed 9 years ago by hamish

Replying to mmetz:

Attached is the las2txt output for srs.laz.

I am pretty sure this problem is caused by las2txt which does not check if a given attribute exists. If it does not exist, some weird value is written.

correct. columns 6 and 7 are not empty.

as viewed in less:

289814.15|4320978.61|170.76|499450.80599405|260|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289814.64|4320978.84|170.76|499450.80600805|280|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289815.12|4320979.06|170.75|499450.80602205|280|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289815.60|4320979.28|170.74|499450.80603605|280|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289816.08|4320979.50|170.68|499450.80605005|260|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289816.56|4320979.71|170.66|499450.80606405|240|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289817.03|4320979.92|170.63|499450.80607806|240|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289817.53|4320980.16|170.62|499450.80609206|280|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289818.01|4320980.38|170.61|499450.80610606|280|^@|^@|6|0|2|Ground|0|0|0|0|0|0
289818.50|4320980.59|170.58|499450.80612006|260|^@|^@|6|0|2|Ground|0|0|0|0|0|0

^@ means the null char.

I think it is reasonable for G_getl2() to stop on null terminators, and there's nothing more to do here but file a bug with las2txt.
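Why the import stops after the fifth column can be illustrated with a small model of C string semantics. This is a Python sketch, not the G_getl2() source: `c_string_view` and `visible_fields` are hypothetical names, and the only assumption is that everything after the first NUL byte is invisible to code that treats the line as a NUL-terminated string.

```python
def c_string_view(raw_line):
    """Return the part of `raw_line` that NUL-terminated string handling would see."""
    return raw_line.split(b"\x00", 1)[0]


def visible_fields(raw_line, sep=b"|"):
    """Split only the visible part of the line into fields."""
    return c_string_view(raw_line).split(sep)


# First row of the las2txt output, with the NUL bytes written explicitly.
raw = (b"289814.15|4320978.61|170.76|499450.80599405|260|"
       b"\x00|\x00|6|0|2|Ground|0|0|0|0|0|0")

print(len(raw.split(b"|")))   # 17 fields are physically present in the line
print(visible_fields(raw))
# [b'289814.15', b'4320978.61', b'170.76', b'499450.80599405', b'260', b'']
```

Only the five leading values survive, followed by one empty field, which matches the five attribute columns seen in the imported table.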

Hamish

comment:12 in reply to:  11 Changed 9 years ago by mmetz

Replying to hamish:

Replying to mmetz:

Attached is the las2txt output for srs.laz.

I am pretty sure this problem is caused by las2txt which does not check if a given attribute exists. If it does not exist, some weird value is written.

correct. columns 6 and 7 are not empty.

[snip]

I think it is reasonable for G_getl2() to stop on null terminators, and there's nothing more to do here but file a bug with las2txt.

I would rather call this a user error on my part: the proper way would be to inspect the .la[s|z] file with lasinfo first, decide which attributes to import based on those available, and then set the --parse options accordingly. Or use v.in.lidar, which does it all automatically ;-)

Markus M
