Opened 7 years ago

Last modified 12 months ago

#2230 new defect

'g.region -p' writes new WIND file, causes race condition for parallel jobs

Reported by: hamish Owned by: grass-dev@…
Priority: normal Milestone: 6.4.6
Component: Default Version: svn-develbranch6
Keywords: g.region read-only Cc:
CPU: x86-64 Platform: Linux

Description

Hi,

In tracking down some strangeness I've just noticed that 'g.region -p' or 'g.region -g' writes out a new (unchanged) WIND file whenever it runs.

This is a problem when you have many jobs working in parallel and a bunch of them starting at the same time all call 'eval g.region -g' at the start of their scripts. About 5% of the time I'm hitting the file before it is completely written and the unfortunate g.region G_fatal_error()s with a missing projection parameter or similar.

one solution is to not open the WIND file for writing if just read-only command line terms are given, another is to write to a .tmp file and move it into place atomically.

Hamish

Change History (6)

comment:1 in reply to:  description Changed 7 years ago by neteler

Replying to hamish:

Hi,

In tracking down some strangeness I've just noticed that 'g.region -p' or 'g.region -g' writes out a new (unchanged) WIND file whenever it runs.

one solution is to not open the WIND file for writing if just read-only command line terms are given

IMHO yes: read-only flags should behave like as such. Otherwise race conditions may occur.

comment:2 in reply to:  description Changed 7 years ago by glynn

Replying to hamish:

In tracking down some strangeness I've just noticed that 'g.region -p' or 'g.region -g' writes out a new (unchanged) WIND file whenever it runs.

It does this whenever it runs without the -u flag.

One problem is that printing and setting are not mutually exclusive, i.e. you can both modify and print the current region. For example, the command

g.region -p res=10

will set the resolution to 10 units, write the modified region to the WIND file (or the region named by $WIND_OVERRIDE), and print the updated region.

Other problems are:

  • The region is adjusted (by G_adjust_Cell_head3) before writing it, so the WIND file may actually be changed by running g.region with no arguments (or just with -p/-g).
  • The region may not even be read from the WIND file (or the region named by $WIND_OVERRIDE). If GRASS_REGION is set, the region will be read from that variable, but writes still go to WIND or $WIND_OVERRIDE.

In short, simply skipping the G_put_window() call based upon which arguments are given (other than -u) would constitute a change to existing behaviour. You can't assume that the updated region will be identical to the previous region based solely upon which arguments are given.

I think that the issue is essentially that people don't realise that g.region always sets the region unless -u is given. Even without any options which explicitly modify the region, it uses $GRASS_REGION if that is set, and adjusts the region before writing it. IOW, "g.region -p" doesn't "print the region", it "sets and prints the region"; "g.region -pu" prints (and only prints) the region.

comment:3 Changed 7 years ago by wenzeslaus

So, looking at r59383, r59386, r59387 and r59391, what is the set of g.region flags which should be used by default when scripting. If the region (WIND file) cannot be changed by default, why g.region is changing it? If the region (WIND file) needs to be changed by default, we should put a big warning to documentation about scripting (which can be always parallelized).

I think that the issue is essentially that people don't realise that g.region always sets the region unless -u is given.

Again, as in previous paragraph, this is a documentation issue. This is not expected behavior and it is not properly documented.

Also I don't understand why it is necessary to write the WIND file in g.region if the content was not changed.

comment:4 in reply to:  3 Changed 7 years ago by glynn

Replying to wenzeslaus:

what is the set of g.region flags which should be used by default when scripting.

If you only want to read the region, use the "-u" flag.

This is not expected behavior and it is not properly documented.

Agreed; I overlooked it when writing the grass.script.core Python module.

Also I don't understand why it is necessary to write the WIND file in g.region if the content was not changed.

It's not simple to tell whether the content was changed, i.e. whether the region which is printed will match the contents of the WIND file.

When the region is changed explicitly (via the -d flag or any option other than save=), it's reasonable to assume that the updated region doesn't match the WIND file. But the absence of those options doesn't necessarily mean that writing to WIND (or $WIND_OVERRIDE) will be a no-op.

If the behaviour was changed so that WIND was only written when one of those options was used, that would technically be a change to the long-standing behaviour, but we would probably get away with it. Probably. Maybe that's close enough; I have no particular opinion on that.

However: we'd need to add a -f (force) flag, otherwise there'd be no way to copy the $GRASS_REGION value to WIND/$WIND_OVERRIDE, or to perform just the adjustment (G_adjust_Cell_head3).

comment:5 Changed 5 years ago by neteler

Milestone: 6.4.46.4.6

comment:6 Changed 12 months ago by wenzeslaus

Keywords: read-only added
Note: See TracTickets for help on using tickets.