Opened 11 years ago

Closed 9 years ago

Last modified 9 years ago

#7 closed defect (fixed)

Location wizard: should predefine DB connection for new location

Reported by: neteler Owned by: hamish
Priority: major Milestone: 6.4.0
Component: Default Version: svn-trunk
Keywords: Cc: pkelly, cmbarton, grass-dev@…
CPU: Unspecified Platform: Unspecified

Description

The Location wizard does not predefine the DB connection for new location which leads to problems with several modules (needs to generate MAPSET/VAR file). Probably we should predefine SQLite here.

Change History (17)

comment:1 in reply to:  description Changed 11 years ago by cmbarton

Replying to neteler:

The Location wizard does not predefine the DB connection for new location which leads to problems with several modules (needs to generate MAPSET/VAR file). Probably we should predefine SQLite here.

I agree. However, except for xy locations, all are created using g.proj. So this is an issue for g.proj. That is, the location wizard does not operate 'inside' a location where it could use GRASS commands to do this.

Michael

comment:2 Changed 11 years ago by neteler

Cc: pkelly cmbarton added
Version: unspecifiedsvn-trunk

Paul,

could you please take a look? It affects g.proj which should call functionality from Vect_default_field_info() to define the VAR file for the DBMI connection in a newly created location (see lib/vector/Vlib/field.c). I guess that implementation in g.proj takes 10 minutes max... :)

Markus

comment:3 Changed 11 years ago by neteler

Milestone: 6.3.06.4.0

comment:4 Changed 11 years ago by hamish

As mentioned before, I do not think that the VAR file should be a part of the default MAPSET folder specification.

see http://thread.gmane.org/gmane.comp.gis.grass.devel/22676

e.g. for users who mainly work with rasters this is unneeded filesystem noise.

C modules which require a DB to be set should be exiting with an error if none is found (suggesting to run db.connect). Scripts like v.db.add* can run that automatically if needed. Any other module bugs can be fixed, exiting with an error is not much work if that doesn't happen already, just check a return code.

Since I commented out the auto-set to DBF in 6.3cvs's init.sh 3 months ago there have been no complaints about the change. This is not done in 6.2 either, no complaints there.

Hamish

comment:5 Changed 11 years ago by neteler

Component: Pythondefault
Milestone: 6.4.06.3.0

I disagree and think that new locations should be valid locations - or - *all* modules should be able to work with locations lacking the VAR file. I see no point in having this trivial thing lacking and then make the user stumble about several commands not functioning.

The "file system noise" is in average 70 bytes (!) per MAPSET which can be just ignored as a problem.

If we have a default DBMI driver (which is yet DBF), we also have to set default settings. No complaints in 6.2 probably because use cases are different.

Scripts like v.db.add* cannot run that automatically because we don't want to hardcode DBF (or whatever driver) many times scattered all over the source code but only in a single place (which was the init script and I still complain that this was removed leading to broken DBMI settings - see the archive for this).

Markus

comment:6 in reply to:  5 Changed 11 years ago by cmbarton

Replying to neteler:

I disagree and think that new locations should be valid locations - or - *all* modules should be able to work with locations lacking the VAR file. I see no point in having this trivial thing lacking and then make the user stumble about several commands not functioning.

The "file system noise" is in average 70 bytes (!) per MAPSET which can be just ignored as a problem.

If we have a default DBMI driver (which is yet DBF), we also have to set default settings. No complaints in 6.2 probably because use cases are different.

Scripts like v.db.add* cannot run that automatically because we don't want to hardcode DBF (or whatever driver) many times scattered all over the source code but only in a single place (which was the init script and I still complain that this was removed leading to broken DBMI settings - see the archive for this).

Markus

I agree with Markus. All locations should be equal. This extra file is irrelevant to those working only with raster, essentially invisible, and takes up no space. The alternative is fine also (i.e., the lack of VAR causes no problem with any module; it is simply created when needed). But we should not be creating locations that have hidden gotchas.

Michael

comment:7 Changed 11 years ago by hamish

re. create VAR file for each mapset if it doesn't exist (see also default DB -> SQLite)

Markus:

I disagree and think that new locations should be valid locations -

..

Michael:

But we should not be creating locations that have hidden gotchas.

In that case, wouldn't all locations / datasets created with earlier versions of grass be considered invalid? (if so we need to re-add auto-creation of VAR in init.sh)

Markus:

or - *all* modules should be able to work with locations lacking the VAR file.

This is the solution I would like to pursue. Can anyone give examples of which modules break? AFAIK they have all been fixed (v.db.add*). Do any remain?

No complaints in 6.2 probably because use cases are different.

How so? I don't understand what changed. Changes to/new v.db.add*??

For fixing C modules it should be as hard as testing a return value and calling G_fatal_error() with "DB unset. Please run db.select". For scripts it should be as simple as either calling db.select in the script if unset or just letting the C module's G_fatal_error() spit out the appropriate error.

I see no point in having this trivial thing lacking and then make the user stumble about several commands not functioning.

..

If we have a default DBMI driver (which is yet DBF), we also have to set default settings.

..

Scripts like v.db.add* cannot run that automatically because we don't want to hardcode DBF (or whatever driver) many times scattered all over the source code but only in a single place

I agree it should be set in a single place and it is annoying to go through the step manually.

v.in.ascii (.c) creates dbf/ and sets it as default if it doesn't exist: [new mapset] G6.3svn> echo "1 2 foo bar" | v.in.ascii out=foomap fs=space x=1 y=2 Scanning input for column types... Maximum input row length: 12 Maximum number of columns: 4 Minimum number of columns: 4 WARNING: Default driver / database set to:

driver: dbf database: $GISDBASE/$LOCATION_NAME/$MAPSET/dbf/

Importing points... ...

this happens in.c (aka v.in.ascii's main.c),

Fi = Vect_default_field_info(&Map, 1, NULL, GV_1TABLE);

which is from lib/vector/Vlib/field.c Vect_default_field_info()

(hardcoded to DBF)

so any C module which creates a new table should use/check that? AFAICT that seems to be the case for the C modules. grass63/vector$ grep -rI Vect_default_field_info * | \

grep -v \.svn | cut -f1 -d/

If db.select + default values is not sufficient for a single place there could be a maintenance script which tests and sets the default DB when needed. ? v.db.add* scripts could call that. I am happy to author that if it is decided to go that way.

(which was the init script and I still complain that this was removed leading to broken DBMI settings - see the archive for this).

  • I hadn't understood the reason for putting it in init.sh was that was supposed to be the central default DB set point. If that is to be the case it's better to put that in a support script than in init.sh
  • Initially removed due to nasty scripting bug which overwrote VAR

unconditionally, so if anyone decides to revert the removal, please take care not to reintroduce that.

The "file system noise" is in average 70 bytes (!) per MAPSET which can be just ignored as a problem.

I don't know if the joke translates well from English, but there is an old saying "cleanliness is next to impossible." But still a worthy goal.

Vect_default_field_info() limits it to scripts which don't create tables without using other C modules, and we can code a support script module to handle those cases with only 1 line a module.

Hamish

comment:8 Changed 11 years ago by neteler

Comment (by hamish):

In that case, wouldn't all locations / datasets created with earlier versions of grass be considered invalid? (if so we need to re-add auto-creation of VAR in init.sh)

No, because it was there unless you took it out...

Markus:

or - *all* modules should be able to work with locations lacking the VAR file.

This is the solution I would like to pursue. Can anyone give examples of which modules break? AFAIK they have all been fixed (v.db.add*). Do any remain?

AFAIK all db.* modules and scripts at least. Also I suspect the v.* scripts to face a problem.

...

v.in.ascii (.c) creates dbf/ and sets it as default if it doesn't exist:

g.proj does not for example. Likewise possibly v.in.ogr and r.in.gdal (and other modules which generate locations)

so any C module which creates a new table should use/check that? AFAICT that seems to be the case for the C modules. grass63/vector$ grep -rI Vect_default_field_info * | \

grep -v \.svn | cut -f1 -d/

Please also check all pure db.* modules.

...

The "file system noise" is in average 70 bytes (!) per MAPSET which can be just ignored as a problem.

I don't know if the joke translates well from English, but there is an old saying "cleanliness is next to impossible." But still a worthy goal.

I think we have quite some more severe other problems... think about all the todo's concerning buffer sizes.

Vect_default_field_info() limits it to scripts which don't create tables without using other C modules, and we can code a support script module to handle those cases with only 1 line a module.

Still someone has to check all modules/scripts... And as mentioned: the db.* modules do not call Vect_*().

Markus

comment:9 Changed 11 years ago by neteler

Hi,

any news here? With the upcoming release we'll get tons of requests if we don't solve this problem...

I still don't understand why Hamish insists to do it the complicated way.

Markus

comment:10 Changed 11 years ago by hamish

Cc: grass-dev@… added
Milestone: 6.3.06.4.0
Owner: changed from grass-dev@… to hamish
Status: newassigned

for 6.3.0 I have re-enabled automatic VAR creation if it doesn't exist.

more soon.....

Hamish

comment:11 Changed 11 years ago by hamish

Eric wrote:

I created a new location using today's svn source, and noticed that db.connect doesn't have its database parameter pre-populated with $GISDBASE/$LOCATION_NAME/$MAPSET/dbf as it used to; it is now blank. Possible bug? Would it possible to get the module to at least populate $GISDBASE/$LOCATION_NAME/$MAPSET for this parameter?

db.connect takes its default driver, database, schema, and group settings from the $MAPSET/VAR file by way of the db_get_default_*_name() fns. (lib/db/dbmi_base/default_name.c) This is a circular chicken-or-egg situation. (the default is to set it back to what it already is, which is a redundant task)

Martin:

yes, see http://trac.osgeo.org/grass/ticket/7

It seems to me that the VAR file should defined for newly created mapset by default to avoid possible problems.

that approach deals with the symptoms, but doesn't address the underlying problem: what to do when the default DB settings are unknown. To be robust it needs to be dealt with at module run-time and it needs to fail in a safe way if the VAR file is somehow damaged.

A secondary issue is abstracting the default DB settings back to one place so that when we make the switch to sqlite-as-default, we don't have to change ad hoc VAR file & dbf/ directory creation code in a dozen places.

  • For C code this was already condensed into Vect_default_field_info() but

could be abstracted further*,

  • For shell scripts we can add a new flag to db.connect* to check if the

DB is set, and if not then set it. This would then be called from all scripts which create vector tables outside of v.* modules, and could be used in init.sh too. (although if we have done everything correctly then we won't need to put it in init.sh or any other mapset creation code; thus I suggest we keep it out of init.sh)

[*] now in SVN/trunk as DB_DEFAULT_DRIVER, db_set_default_connection(), and 'db.connect -c'. (r30545) Init.sh and various scripts are now updated to use that.

Eric:

Thanks for the pointer. I'll hardcode my database paths for now.

see the db.connect help page example for DBF (I usually just cut and paste that to the command line when needed). It accepts variable LOCATION, MAPSET, you don't need to hardcode it. (Be sure to keep the path with $VARIABLES in single quotes!)

Martin:

anyway if you run some command which calls Vect_default_field_info() from Vlib, VAR file is created and default driver set up to dbf.

e.g. g.copy vect=

i.e. any C vector module which goes down that street should do the trick. The vector lib will set the default database if it goes to find it and finds it unset.

Paul:

Also I posted a suggested solution to the dev list a while ago, but there were no comments.

could you post a link to that in the bug report?

It seems to me that the VAR file should defined for newly created mapset by default to avoid possible problems.

For every mapset, or just for each location? If for every mapset, should it be inherited from whatever the setting for PERMANENT is, or just blindly set to dbf every time a new mapset is created? If it does have to be created for every mapset and inherited from PERMANENT, presumably the state of the DB connection for PERMANENT can't be determined without using Vectlib functions? But all the mapset/location-handling functions are in GISlib, and it would really not be a good idea to have GIS dependent on Vectlib functions (I think). So this is really quite complicated. Perhaps the functions that deal with the default database settings could be moved out of Vectlib and into GISlib.

whoa, keep it simple. if it is requested and not set, then set it. as long as C vector modules are funneled through a single vector lib function which checks & sets, and scripts doing low level stuff don't make assumptions, we should be ok.

Michael:

I'm with Marcus and Martin on this. The potential for problems is high if this is not done; the cost to do it is very small. Can we just go ahead and make this change to g.proj?

But it's not high. I commented out the auto VAR file creation sometime before Christmas, and this is the first actual-usage complaint I've heard. And it's not even a complaint, Eric just noticed that it wasn't set and was wondering why.

[now I've made it live again in init.sh, but I still don't really think it should be there; at least the 'mkdir dbf/' stuff is out of init.sh now]

AFAIK we are not (primarily) talking about g.proj, we are talking about Init.sh. If mapset/location creation is a problem all we have to is keep the 'db.connect -c' in init.sh and it will get the VAR file the first time the mapset is accessed.

Paul:

Attached is a patch (completely untested) that illustrates my idea for solving the problem.

now obviated?

I think there should also be something in lib/gis/make_mapset.c (used by g.mapsets) but there are some complications there in relation to inheriting the DB settings; see my earlier mail.

(*g.mapset* not *g.mapsets*)

.... I still don't think this belongs in any mapset creation code and don't think a VAR file should be included in the minimum requirements for the definition of a valid mapset (ie 'dir/WIND').

Hamish

comment:12 Changed 11 years ago by pkelly

comment:13 Changed 11 years ago by hamish

Michael:

If we want this to be easily changeable in the future, it should be dealt with in a central place

[...]

so that changing it once will switch from a dbf to sqlite default.

yes, now set by line 9 of include/dbmi.h: http://trac.osgeo.org/grass/browser/grass/trunk/include/dbmi.h?rev=30545

This has come up periodically over the past year, as lack of a VAR file seems to cause different scattered modules to fail, and is usually puzzling to users as to what is going wrong.

Name something that still fails and I'll fix it. I don't think there are any, but maybe there still is one- I fully accept the need for an audit.

The main worry I had was for 'v.in.ogr location=' where it will attempt to populate a DB table before ever opening the mapset (and so the init.sh 'db.connect -c' quote-unquote "solution" hasn't had the chance to run). But I've just tested that and it correctly creates the VAR file on the fly using db_set_default_connection().

There really shouldn't be a problem with C modules as the vector and db libs already had code to automatically create the VAR file as needed through the Vect_default_field_info() fn. (further investigation required! probably with tools/sql.sh* working backwards from db_set_default_connection()).

# a poor test, but ok as a first pass...
$ cd vector/

# find C modules which create a DB link:
$ ADDDB=`grep -rIl Vect_map_add_dblink * | grep -v '/.svn/'`

# the following will be safe & auto generate VAR as needed:
$ SAFE=`grep -rIl Vect_default_field_info * | grep -v '/.svn/'`

# find "A" not in "B" (or B not in A); stuff to investigate:
$ echo $ADDDB $SAFE | tr ' ' '\n' | sort | uniq -u

lidar/v.lidar.edgedetection/main.c
v.db.connect/main.c
v.transform/trans_digit.c  # ok: default_field() w/o add_dblink()

Ideally the check/create VAR lib fn should be called from the core "create new DB" lib fn, if that is not already the case. Thus the db test belongs in Vect_map_add_dblink(), Vect_add_dblink(), or Vect_check_dblink(), or Vect_write_dblinks() ? By the low-levelness of the last 3 I'd guess put it in Vect_map_add_dblink(), *if* Vect_map_add_dblink() is ever called without calling Vect_map_add_dblink() first. (and if there is a case like that maybe it is a bug?)

[*] can we rename tools/sql.sh with a more descriptive name? sql_lib_tree.sh or something? Is it trivial/hard to adapt that to run for SQLite instead of PostgreSQL?

It WAS a problem from some scripts which created DB tables on their own (db.execute, v.db.add*) but AFAIK those are few and have now all been fixed for some time. I've now updated those to call 'db.connect -c'.

It sounds like this can be fixed 2 ways: change g.proj to create a VAR file when it creates a location

this is the cart leading the horse and means the VAR creation code must be scattered in a bunch of places. Better to funnel the task into one place which everything which creates a new table must go past, ie as close as possible to when it is actually needed. For the C modules this is currently via Vect_default_field_info(), which is for:

\brief Get default information about link to database for new dblink

(this is how locations are created in the location wizard)

they needn't be and probably shouldn't be. (unless 'v.in.ogr -c locat=', but then that's already covered by v.in.ogr)

or change modules to gracefully deal with the lack of a VAR file.

this is the approach which has been taken and AFAIK is fully in place.

The ongoing discussion about which of these two ways to fix it has left this unfixed for months.

No, AFAIK it has been fixed for months already, but hacks were still left here and there in the code. Now I've cleaned those up. (r30545, r30546, r30547)

I'm agnostic about which way is best, except that I don't think that we should hack it into the location wizard wxPython code.

It shouldn't need to be in any mapset/location creation code. It should be dealt with: in a single place, at the time of DB creation, if needed.

Thus I think it will be safe to again re-disable the 'db.connect -c' call in init.sh in SVN/trunk. (due to lingering doubts in 6.3.0 Init.sh is now set to create VAR+dbf/ if not there)

I hope this answers Paul's post of Feb 19 too,

Hamish

comment:14 Changed 11 years ago by neteler

Here a related bug:

# Location without dbf/ subdir:
v.in.sites mysites out=mysites
Input format: dimension: 2   strings: 0   FP:0
dbmi: Protocol error
ERROR: Cannot open database $GISDBASE/$LOCATION_NAME/$MAPSET/dbf/

comment:15 Changed 9 years ago by hamish

CPU: Unspecified
Platform: Unspecified
Resolution: fixed
Status: assignedclosed

everything seems to run smoothly enough now. closing bug, noise & all.

comment:16 Changed 9 years ago by cmbarton

Great. I'll look into porting this to 6.5 and 7 soon. It can't be backported because these are out of sync with 6.4, so I'll have to insert them. I just noticed that the my location wizard doesn't launch at all in 7. So I'll need to check on that too.

Michael

comment:17 in reply to:  16 Changed 9 years ago by hamish

Replying to cmbarton:

Great. I'll look into porting this to 6.5 and 7 soon. It can't be backported because these are out of sync with 6.4, so I'll have to insert them. I just noticed that the my location wizard doesn't launch at all in 7. So I'll need to check on that too.

I think this was post was intended for #842, not this report which had to do with the VAR file.

but yes, in my tests so far utm+6.4 plain textbox are now working fine. In hindsight the spinbox might have just been a mostly a gimic after all. typing in two numbers is not much of a burden.

thanks- the wx location wizard has really become something to be proud of.

Hamish

Note: See TracTickets for help on using tickets.