Opened 15 years ago

Closed 15 years ago

Last modified 15 years ago

#809 closed defect (fixed)

v.db.addtable consistently fails in winGrass

Reported by: JonBall Owned by: grass-dev@…
Priority: critical Milestone: 6.4.0
Component: Vector Version: svn-releasebranch64
Keywords: v.db.addtable, wingrass Cc:
CPU: x86-64 Platform: MSWindows Vista

Description

Platform: Vista grass obtained from: Trento Italy site grass binary for platform: as Downloaded GRASS Version: 6.4.0SVN (r39626) 2009-10-27

v.db.addtable consistently fails but returns no error message. When I try to use this command at the Windows Grass Command Line (Windows shell), I always get an empty shell prompt:


GRASS 6.4.0svn (spearfish60)> v.db.addtable map=mysoils layer=2 col="left intege r,right integer" "C:/GRASS-6-SVN/msys/bin/sh.exe"-2.04$


The only apparent way to proceed it to type 'exit' at the shell prompt and the command will terminate without creating a table. I am trying to reproduce the script in the documentation for v.to.db (http://grass.itc.it/grass64/manuals/html64_user/v.to.db.html - "Upload category numbers of left and right area, to an attribute table of boundaries common for the areas").

This error is consistent whatever data I try and use, whether it is the demo Spearfish60, North-Carolina or some other data.

I have reproduced the same error on two independent installations of winGrass on two different machines (both using the same pre-compiled binaries).

I have tried switching to sqlite (same error). I have tried re-installing winGrass.

Attachments (1)

Vlib_field_dbln_spaces.diff (1.4 KB ) - added by hamish 15 years ago.
patch to deal with spaces in dbln DBs

Download all attachments as: .zip

Change History (27)

comment:1 by martinl, 15 years ago

Priority: blockercritical
Version: svn-trunksvn-releasebranch64

comment:2 by hellik, 15 years ago

tested in MSYS-shell:

GRASS 6.4.0svn (nc_spm_08):c:/osgeo4w/apps/grass/bin > v.db.addtable map=mysoils layer=2 col="left integer, right int
eger"
Using vector map name extended by layer number as table name: mysoils_2
Creating table with columns (cat integer, left integer, right integer)
The table <mysoils_2> is now part of vector map <mysoils> and may be
deleted or overwritten by GRASS modules
Select privileges were granted on the table
Reading features...
 100%
Updating database...
 100%
0 categories read from vector map (layer 2)
0 records updated/inserted (layer 2)
Current attribute table links:
layer <1> table <mysoils> in database <C:\gisdata\grassdata/nc_spm_08/user1/dbf/> through driver <dbf> with key <cat>
layer <2> table <mysoils_2> in database <C:\gisdata\grassdata/nc_spm_08/user1/dbf/> through driver <dbf> with key <cat>
Vector map <mysoils@user1> is connected by:
GRASS 6.4.0svn (nc_spm_08):c:/osgeo4w/apps/grass/bin > 

comment:3 by hamish, 15 years ago

Keywords: v.db.addtable wingrass added; addtable removed

comment:4 by hamish, 15 years ago

hopefully fixed by r40271. please test, as like hellik I can't reproduce it.

I notice that v.db.droptable is not available from the GUI menu?

Hamish

comment:5 by hamish, 15 years ago

hmm, there is still a quoting bug in v.to.db called by v.db.addtable:

Reading features...
 100%
and: no such driver available
WARNING: unable to start driver <and>
ERROR: Unable to open database <C:\Documents> by driver <and>

(the mapset is in C:\Documents and Settings\)

ah, here we go: Line 634 of lib/vector/Vlib/field.c scanning the dbln file does not like spaces in the db path string-

  ndef = sscanf(buf, "%s %s %s %s %s", fldstr, tab, col, db, drv);

Hamish

in reply to:  5 comment:6 by hamish, 15 years ago

Replying to hamish:

ah, here we go: Line 634 of lib/vector/Vlib/field.c scanning the dbln file does not like spaces in the db path string-

  ndef = sscanf(buf, "%s %s %s %s %s", fldstr, tab, col, db, drv);

(I've got no idea how to fix that)

comment:7 by hamish, 15 years ago

note you have to be using layer 2 or greater to see this. for layer 1 the DB path is still the unparsed "$GISDBASE/$LOCATION_NAME/$MAPSET/dbf/". On the next iteration for layer 2 it is converted to the real expanded path.

Hamish

in reply to:  7 ; comment:8 by mmetz, 15 years ago

Replying to hamish:

note you have to be using layer 2 or greater to see this. for layer 1 the DB path is still the unparsed "$GISDBASE/$LOCATION_NAME/$MAPSET/dbf/". On the next iteration for layer 2 it is converted to the real expanded path.

I think this is a bug. v.db.addtable should always use the default, unparsed database connection settings because there is no database option in v.db.addtable.

If a table already exists in layer 1 and a new table is to be added for another layer, v.db.addtable uses the definition for layer 1 as returned by v.db.connect -g which parses "$GISDBASE/$LOCATION_NAME/$MAPSET/dbf/". Suggested fix would be to always use the default unparsed database connection, i.e. skip the check in v.db.addtable lines 119 - 136 (grass64).

Markus M

in reply to:  8 ; comment:9 by hamish, 15 years ago

Hamish:

hmm, there is still a quoting bug in v.to.db called by v.db.addtable:

...

ah, here we go: Line 634 of lib/vector/Vlib/field.c scanning the dbln file does not like spaces in the db path string-

  ndef = sscanf(buf, "%s %s %s %s %s", fldstr, tab, col, db, drv);

note you have to be using layer 2 or greater to see this.

I'm guessing G_tokenize(buf, ' ') + the same approach taken by the wxGUI for this some weeks ago and pull off db as tokens 4 to (ntok-1) and then drv with the last of G_number_of_tokens()?

I suppose a similar thing could be done with strrchr() to chop the drv off the string first, set strrchr() to '\0', then to get db go back and read %[^\n] to the end of the remaining line?

any recommendations from C string parsing experts?

Replying to mmetz:

I think this is a bug. v.db.addtable should always use the default, unparsed database connection settings because there is no database option in v.db.addtable.

note that layer 2 parsing is done by libvect (called by v.to.db) doing the damage here as it scans through the layers, if the shell script does expansion or not, it is more general & will fail with any dbln with a space in the path. the script expansion just makes this bug be triggered more often.

# finally we have to add cats into the attribute DB to make modules such as v.what.rast happy:
# (creates new row for each vector line):

what you mention here is a secondary "feature" (which probably doesn't help) who's main draw back is that it makes the vector map less portable.

If a table already exists in layer 1 and a new table is to be added for another layer, v.db.addtable uses the definition for layer 1 as returned by v.db.connect -g which parses "$GISDBASE/$LOCATION_NAME/$MAPSET/dbf/". Suggested fix would be to always use the default unparsed database connection, i.e. skip the check in v.db.addtable lines 119 - 136 (grass64).

hmmm ok;

g.copy v=roads,myroads
v.db.addtable myroads layer=2 col="left integer, right integer"

dbln:
1 myroads cat $GISDBASE/$LOCATION_NAME/$MAPSET/dbf/ dbf
2 myroads_2 cat /home/hamish/grassdata/spearfish60/user1/dbf/ dbf

if a layer 1 exists* it uses the default mapset/VAR values from db.connect, see lines 124-125 and 131-132. it only reuses (&expands) layer 1's DB path definition on line 128. A secondary question is should the new layer prefer to use the mapset's default DB or the map's existing layer 1 DB? I don't know the answer to that, I think we just have to pick one.

Is there a command print the unexpanded version or do we just resort to raw parsing of the dbln file to get it?

[*] (is it possible for there to be no layer 1 but do have a layer >1? if so will that cause problems for this script?)

Hamish

in reply to:  9 comment:10 by hamish, 15 years ago

Replying to hamish:

I'm guessing G_tokenize(buf, ' ') + the same approach taken by the wxGUI for this some weeks ago and pull off db as tokens 4 to (ntok-1) and then drv with the last of G_number_of_tokens()?

I suppose a similar thing could be done with strrchr() to chop the drv off the string first, set strrchr() to '\0', then to get db go back and read %[^\n] to the end of the remaining line?

an advantage of the G_tokenize() approach is that we can check for a minimum number of tokens.

if a layer 1 exists* it uses the default mapset/VAR values from db.connect,

that should read "does not exist"

Hamish

comment:11 by hamish, 15 years ago

stuff to audit, both for spaces in path names and for correct column parsing when fs=':' and a path like "C:\Documents and Settings" is being parsed.

lib/db/
dbmi_base/login.c:      ret = sscanf(buf, "%[^ ] %[^ ] %[^ ] %[^ ]", dr, db, usr, pwd);
dbmi_base/dbmscap.c:    if (sscanf(buf, "%[^:]:%[^:]:%[^:\n]", name, startup, comment) == 3)
dbmi_base/dbmscap.c:    else if (sscanf(buf, "%[^:]:%[^:\n]", name, startup) == 2)


lib/vector/
Vlib/field.c:   ndef = sscanf(buf, "%s %s %s %s %s", fldstr, tab, col, db, drv);

Hamish

in reply to:  11 ; comment:12 by mmetz, 15 years ago

Replying to hamish:

stuff to audit, both for spaces in path names and for correct column parsing when fs=':' and a path like "C:\Documents and Settings" is being parsed.

I had a look at the vector lib routines for reading/writing database connections (dblinks in the vector libs).

The database is written to dbln without parsing any variables in it; that makes it possible to copy entire locations or provide sample datasets with vectors and attribute tables. Any variables in the database info are parsed by Vect_get_dblink() which calls Vect_subst_var().

v.db.addtable calls v.db.connect which calls Vect_get_dblink() -> any variables like $GISDBASE/$LOCATION_NAME/$MAPSET are already substituted.

The only way I see to get the database used by an existing layer *without* parsing any variables is raw parsing of the dbln file. AFAICT, vector modules use default database connection settings when adding a new table, also when e.g. adding a table for layer 2 and there is already a table connected to layer 1. Therefore it would not be unusual behaviour if v.db.addtable also always uses the default database connections.

Markus M

in reply to:  12 ; comment:13 by hamish, 15 years ago

Replying to hamish:

stuff to audit, both for spaces in path names and for correct column parsing when fs=':' and a path like "C:\Documents and Settings" is being parsed.

Replying to mmetz:

I had a look at the vector lib routines for reading/writing database connections (dblinks in the vector libs).

The database is written to dbln without parsing any variables in it; that makes it possible to copy entire locations or provide sample datasets with vectors and attribute tables. Any variables in the database info are parsed by Vect_get_dblink() which calls Vect_subst_var().

However, there is no reason to assume that the user has not manually connected the vector to a real pathname. And in those cases any code which reads the dbln file will need to be able to deal with databases with spaces in the path name.

suggestion: This is really a failure of the dbln file format. In grass 7 change the dbln file format to use '|' as the field sep. Include code in both the g7 and g6.5 the library functions to quietly read either during the transition period. We have done something similar for the color table format which used to be "value red green blue" but is now "value red:green:blue".

v.db.addtable calls v.db.connect which calls Vect_get_dblink() -> any variables like $GISDBASE/$LOCATION_NAME/$MAPSET are already substituted.

The only way I see to get the database used by an existing layer *without* parsing any variables is raw parsing of the dbln file.

I came to the same conclusion and checked in that exact fix to svn a couple of hours ago.

AFAICT, vector modules use default database connection settings when adding a new table, also when e.g. adding a table for layer 2 and there is already a table connected to layer 1. Therefore it would not be unusual behaviour if v.db.addtable also always uses the default database connections.

For now I have followed the path of least change and left it using what the map is already using. One thing I see in favor of having it use the mapset's VAR setting instead is that you can control the VAR setting. You can't control what the map already has in it unless you rebuild the map (which is perhaps not what you want). Another plausible possibility is to add database= and driver= options to v.db.addtable.

Hamish

in reply to:  13 comment:14 by hamish, 15 years ago

Replying to hamish:

Another plausible possibility is to add database= and driver= options to v.db.addtable.

fwiw if it is to change I prefer the VAR option as it keeps v.db.addtable simple.

in reply to:  13 ; comment:15 by mmetz, 15 years ago

Replying to hamish:

Replying to mmetz:

I had a look at the vector lib routines for reading/writing database connections (dblinks in the vector libs).

The database is written to dbln without parsing any variables in it; that makes it possible to copy entire locations or provide sample datasets with vectors and attribute tables. Any variables in the database info are parsed by Vect_get_dblink() which calls Vect_subst_var().

However, there is no reason to assume that the user has not manually connected the vector to a real pathname. And in those cases any code which reads the dbln file will need to be able to deal with databases with spaces in the path name.

Right, and it should also be broken on Linux if there are whitespaces in the path name when manually connecting a table. A mystery that no one discovered that yet...

suggestion: This is really a failure of the dbln file format. In grass 7 change the dbln file format to use '|' as the field sep. Include code in both the g7 and g6.5 the library functions to quietly read either during the transition period. We have done something similar for the color table format which used to be "value red green blue" but is now "value red:green:blue".

OK. And use G_tokenize() I assume.

You can't control what the map already has in it unless you rebuild the map (which is perhaps not what you want). Another plausible possibility is to add database= and driver= options to v.db.addtable.

Makes sense, and should not make the code too complicated. Something for g7?

Markus M

by hamish, 15 years ago

Attachment: Vlib_field_dbln_spaces.diff added

patch to deal with spaces in dbln DBs

comment:16 by hamish, 15 years ago

I've just attached to this ticket a patch against 6.5 which allows spaces in pathnames for hardcoded database paths using G_tokenize().

please review.

previously there was behavior in Vect_read_dblinks() which allowed the dbln file to only contain the layer number and table name if you'd already defined the others in previous layer links.

I've tried to preserve that, but if the path has spaces I don't think it is possible to distinguish between if the last entry is supposed to be the driver or the end of the path name. Therefore we lose a tiny bit of functionality here, but I don't think anyone knew about or was using that trick anyway.

Hamish

in reply to:  15 comment:17 by hamish, 15 years ago

Replying to mmetz:

Right, and it should also be broken on Linux if there are whitespaces in the path name when manually connecting a table.

I just tested, it is.

A mystery that no one discovered that yet...

I suspect they probably have, we've just never put all the pieces together before. Especially Mac users who are more likely to see spaces in path names.

suggestion: This is really a failure of the dbln file format. In grass 7 change the dbln file format to use '|' as the field sep. Include code in both the g7 and g6.5 the library functions to quietly read either during the transition period.

...

OK. And use G_tokenize() I assume.

in G_parser() is a G_tokenize() call which defines the sep as a two-character array, which would be an easy way to temporarily allow either | or " " as the sep.

You can't control what the map already has in it unless you rebuild the map (which is perhaps not what you want). Another plausible possibility is to add database= and driver= options to v.db.addtable.

Makes sense, and should not make the code too complicated. Something for g7?

or now or 6.4.1, if it is decided that inability to set a different DB for new tables with v.db.addtable is actually a bug. (I guess you can already adjust it to the desired value v.db.connect after creation, but that's a bit ugly)

Hamish

in reply to:  16 ; comment:18 by mmetz, 15 years ago

Replying to hamish:

I've just attached to this ticket a patch against 6.5 which allows spaces in pathnames for hardcoded database paths using G_tokenize().

please review.

Tested, works fine. One minor bug though: in the new v.db.addtable line 129

eval `g.findfile elem="vector/$GIS_OPT_MAP" file=dbln`

does not work with a fully qualified name, the default settings in $MAPSET/VAR are used. Maybe something like

eval `g.findfile elem=vector file=$GIS_OPT_MAP`
eval `g.findfile elem=vector/$name file=dbln mapset=$mapset`

?

previously there was behavior in Vect_read_dblinks() which allowed the dbln file to only contain the layer number and table name if you'd already defined the others in previous layer links.

I've tried to preserve that, but if the path has spaces I don't think it is possible to distinguish between if the last entry is supposed to be the driver or the end of the path name. Therefore we lose a tiny bit of functionality here, but I don't think anyone knew about or was using that trick anyway.

In grass64+, Vect_write_dblinks() always writes the full connection definition, so I would not worry too much. I have not looked at previous versions though to see if for layers > 1, only layer number and table name were written if the rest was identical to layer 1.

For grass7, I would suggest to use fs="|" as default (anything but whitespace) whenever dblinks are written or printed, e.g. Vect_write_dblinks() and v.db.connect -p

Markus M

in reply to:  18 ; comment:19 by hamish, 15 years ago

Replying to mmetz:

Replying to hamish:

I've just attached to this ticket a patch against 6.5 which allows spaces in pathnames

...

Tested, works fine.

committed in 6.5svn and trunk.

One minor bug though: in the new v.db.addtable line 129

[g.findfile]

does not work with a fully qualified name,

interestingly it pretty much works if you just chop off the @mapset part because g.findfile searches the full mapset path as a module would.

Anyway, I've ripped this code out now in favor of using the VAR file's default DB settings for the mapset. So you are able to now control it instead of being locked to what the map was already using. Committed in 6.5svn and trunk.

For grass7, I would suggest to use fs="|" as default (anything but whitespace) whenever dblinks are written or printed, e.g. Vect_write_dblinks() and v.db.connect -p

Done and done. I'm sure we'll hear about it if there are any problems, but after some light testing it seems to work ok.

the only thing left to decide is when to backport this stuff to 6.4.

Hamish

in reply to:  11 ; comment:20 by hamish, 15 years ago

Replying to hamish:

stuff to audit:

 lib/db/
 dbmi_base/login.c:      ret = sscanf(buf, "%[^ ] %[^ ] %[^ ] %[^ ]", dr, db, usr, pwd);
 dbmi_base/dbmscap.c:    if (sscanf(buf, "%[^:]:%[^:]:%[^:\n]", name, startup, comment) == 3)
 dbmi_base/dbmscap.c:    else if (sscanf(buf, "%[^:]:%[^:\n]", name, startup) == 2)

(still needs to be done)

Hamish

in reply to:  20 ; comment:21 by hamish, 15 years ago

Replying to hamish:

stuff to audit:

  lib/db/dbmi_base/login.c
  dbmi_base/login.c:      ret = sscanf(buf, "%[^ ] %[^ ] %[^ ] %[^ ]", dr, db, usr, pwd);

This one will fail if the database path or passwd has spaces in it. In grass 6 this file is ~/.grasslogin6. In grass 7 it is in ~/.grass7/dblogin.

  • What this means is that Postgre, MySQL, etc will currently fail on WinGrass unless the databases are installed in a dir without spaces in them- which you may not have easy control over. I don't use those, so have nothing to test with. Hopefully someone who does use a DB that runs a server can have a look at this.

Because it is a different file for grass 7 we don't have to worry about transitioning. So I've gone ahead and converted that to use "|" as the sep already.

GRASS 6 will have to worry about forward/backward compatibility, so needs the full G_tokenize() treatment like vector/MAP/dbln got.


  dbmi_base/dbmscap.c:    if (sscanf(buf, "%[^:]:%[^:]:%[^:\n]", name, startup, comment) == 3)
  dbmi_base/dbmscap.c:    else if (sscanf(buf, "%[^:]:%[^:\n]", name, startup) == 2)

these two are in old commented out code from grass 5 & not of concern.

Hamish

ps - see also #629 "WinGRASS: spaces in pathnames"

in reply to:  21 comment:22 by hamish, 15 years ago

Replying to hamish:

   lib/db/dbmi_base/login.c
   dbmi_base/login.c:      ret = sscanf(buf, "%[^ ] %[^ ] %[^ ] %[^ ]", dr, db, usr, pwd);

This one will fail if the database path or passwd has spaces in it. In grass 6 this file is ~/.grasslogin6. In grass 7 it is in ~/.grass7/dblogin.

  • What this means is that Postgre, MySQL, etc will currently

fail on WinGrass unless the databases are installed in a dir without spaces in them-

...

Because it is a different file for grass 7 we don't have to worry about transitioning. So I've gone ahead and converted that to use "|" as the sep already.

GRASS 6 will have to worry about forward/backward compatibility, so needs the full G_tokenize() treatment like vector/MAP/dbln got.

It turns out that db.login doesn't actually need a backend DB running to work, so is a bit easier to test.

AFAICT the only solution is to ditch the existing grass 6 file format for ~/.grasslogin6 and replace it with one which uses '|' as the field separator. There is simply no way to tell the database path, username, and passwd apart if either the db or pw has spaces in it because the name and pw are optional. e.g.: Does the path have a single space in it or was the password omitted? No way to tell.

the ~/.grasslogin6 format is:

driver <space> database path [<space> username [<space> passwd]]

where both name and pw are optional.

IMO the lesser of evils is to change the fs to '|' now, and write in the release notes that you will have to re-login.

A way to do this is to write a new .grasslogin64 file which contains the new format and leave any existing .grasslogin6 files alone (and ignored). This is my preferred option and in the interest of keeping the release schedule on track I've provisionally committed it to 6.5svn for evaluation & comment (r40369). It is the same as what is in trunk now, except for the .file name.

Hamish

ps- I backported the G_tokenize() fix for dbln to the 6.4rb earlier today.

in reply to:  19 ; comment:23 by mmetz, 15 years ago

Replying to hamish:

Replying to mmetz:

One minor bug though: in the new v.db.addtable line 129 [g.findfile] does not work with a fully qualified name,

interestingly it pretty much works if you just chop off the @mapset part because g.findfile searches the full mapset path as a module would.

The @mapset part is already chopped off previously, you can replace line 129

eval `g.findfile elem="vector/$GIS_OPT_MAP" file=dbln`

with

eval `g.findfile elem="vector/$MAP_NAME" file=dbln mapset="$MAPSET"`

in order to get identical behaviour for fully qualified map names and map names without the @mapset part. Currently in relbr64 you get different results depending on whether you use a fully qualified name or not.

Anyway, I've ripped this code out now in favor of using the VAR file's default DB settings for the mapset. So you are able to now control it instead of being locked to what the map was already using. Committed in 6.5svn and trunk.

That's probably the best solution.

the only thing left to decide is when to backport this stuff to 6.4.

You have already backported the Vect_read_dblinks() fix, I would opt to backport v.db.addtable as well.

Markus M

in reply to:  23 comment:24 by hamish, 15 years ago

Replying to mmetz:

I would opt to backport v.db.addtable as well.

done. (now new table is created from VAR setting, not what was used previously in the map['s layer 1?])

Hamish

comment:25 by hamish, 15 years ago

Resolution: fixed
Status: newclosed

comment:26 by hamish, 15 years ago

db.login file change backported to 6.4 in r40446.

Note: See TracTickets for help on using tickets.