#2986 closed defect (fixed)
r.mapcalc with same variable on LHS and RHS
Reported by: | mankoff | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 7.0.4 |
Component: | Docs | Version: | 7.0.3 |
Keywords: | r.mapcalc | Cc: | mankoff@… |
CPU: | Unspecified | Platform: | Unspecified |
Description ¶
It appears the following is valid syntax:
r.mapcalc "x = x + 1" --o
But a GRASS developer has said (source),
Note that you shouldn't use r.mapcalc like this (read from and write in the same map): r.mapcalc "old = old + 10" --overwrite
If this is the case, this seems like it important and should be included in the documentation.
Change History (17)
comment:1 by , 9 years ago
by , 9 years ago
Attachment: | r.mapcalc.RHS_LHS.diff added |
---|
mention LHS = RHS issue in KNOWN ISSUES section of docs
comment:2 by , 9 years ago
You're right it is in the docs. I think I checked old versions (6.4) and did not see it, and it seems important enough it should be highlighted. I've attached a patch adding this to the KNOWN ISSUES section of the docs.
follow-up: 4 comment:3 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:4 by , 9 years ago
follow-up: 6 comment:5 by , 9 years ago
Just to be clear, LHS = LHS
does work. I've never actually had it crash or return NULL or what appears to be invalid data when I've done this. But perhaps it isn't guaranteed to work...
follow-up: 7 comment:6 by , 9 years ago
Replying to mankoff:
Just to be clear,
LHS = RHS
does work. I've never actually had it crash or return NULL or what appears to be invalid data when I've done this. But perhaps it isn't guaranteed to work...
It works by coincidence rather than by design.
The data (cell/fcell) file and null bitmap are written to temporary files, which are renamed into place when the map is closed. The metadata files (cellhd, quant, histogram, categoriess, colour table, history, etc), are stored in memory and written when the map is closed. So the output map shouldn't change while it's being read.
A specific case which won't quite work is "r.mapcalc 'map = map'", because that copies the categories, colour table and history from the input to the output, but closing the output map will delete or replace those before the copy occurs. If the logic is ever extended to cover more cases, those will similarly fail.
Actually, I'm not sure that it will work on Windows, because at the point that the data and null files are renamed, the original files are still open for read (r.mapcalc never actually closes the input maps). That should be easy to fix, though (call close_maps() from within execute() before the loop which closes the output maps).
comment:7 by , 9 years ago
Replying to glynn:
Replying to mankoff:
Just to be clear,
LHS = RHS
does work.
Something else which doesn't quite work as expected is that once a name appears on the LHS of an assignment, it's marked as being a variable rather than a map. Which means that you can't use any modifiers on it (whether the neighbourhood modifier, the category modifier or any colour channel modifiers), as those can only be used on maps, not variables.
follow-up: 9 comment:8 by , 9 years ago
I'm still confused. The specific case of r.mapcalc 'map=map'
mentioned 2 comments above does seem to work for me in v7.0.3 (see code example below). Also, the last comment appears to say that everything on the LHS, even if a new unique LHSname, will be a variable and not a map (I'm not even sure what "variables" are in GRASS). But the documentation says that, "result is the name of a raster map layer", and I thought the whole point of r.mapcalc
was to produce new raster maps.
Finally, there is no evidence, at least from r.info
, of results changing type. See for example results from,
grass70 -text ./nc_spm_08_grass7/PERMANENT r.mapcalc "LHS = 1" --o r.info LHS > rinfo.1 r.mapcalc "LHS = LHS" --o r.info LHS > rinfo.2 r.mapcalc "LHS = LHS * 2" --o r.info LHS > rinfo.3 diff3 rinfo.1 rinfo.2 rinfo.3
Finally, would it be possible to change this behavior, so that r.mapcalc 'map = map * 2'
is valid syntax? It seems like it would be incredibly useful to support this standard programming syntax. If it isn't a major code issue, I'll open a ticket with this feature request.
follow-up: 12 comment:9 by , 9 years ago
Replying to mankoff:
I'm still confused. The specific case of
r.mapcalc 'map=map'
mentioned 2 comments above does seem to work for me in v7.0.3 (see code example below).
If the map had categories or quantisation rules, they will be discarded. A colour table should also be discarded but isn't because Rast_remove_colors(mapname,"") doesn't work (it compares the mapset literally with the current mapset without checking for the empty string).
Also, the last comment appears to say that everything on the LHS, even if a new unique LHSname, will be a variable and not a map (I'm not even sure what "variables" are in GRASS).
The result of any expression can be assigned to a variable; that's what "=" does. If the assignment is a top-level expression (i.e. not a sub-expression), the value is written to an output map with the same name as the variable. Assignments other than at the top level (e.g. within eval(), which exists for this specific purpose) don't generate an output map. E.g. the i.pansharpen script uses the following r.mapcalc expression:
eval(k = "$ms1" + "$ms2" + "$ms3") "$outr" = 1.0 * "$ms3" * "$panmatch3" / k "$outg" = 1.0 * "$ms2" * "$panmatch2" / k "$outb" = 1.0 * "$ms1" * "$panmatch1" / k
Here, k is just a variable; because it occurs as a sub-expression of the eval() call rather than as a top-level expression, it doesn't result in an output map named "k". Its value is used in the subsequent expressions, avoiding the need to repeat the calculation for each one.
Regardless of whether a variable is written to an output map, it's still a variable (i.e. something which is calculated for each row), not an input map (thus you cannot use any modifiers on it). If you use a name for both a variable and an input map, the variable hides the input map for all expressions following the assignment which creates the variable.
Finally, would it be possible to change this behavior, so that
r.mapcalc 'map = map * 2'
is valid syntax?
No. It's already syntactically valid, the issue is that modifying a map while it is being read is undefined behaviour which only "sort-of" works (e.g. it may not work at all if r.external and r.external.out are in use such that the input and output maps refer to the same data file). That's an issue with the core GRASS libraries, and not something which r.mapcalc can realistically work around.
follow-up: 11 comment:10 by , 9 years ago
Would it be possible to support this by making r.mapcalc
a wrapper, which replaces the LHS
with something unique (if it is detected on the RHS
, and then as a final step, setting the user result
to the LHS-unique variable?
follow-up: 15 comment:11 by , 9 years ago
Replying to mankoff:
Would it be possible to support this by making
r.mapcalc
a wrapper, which replaces theLHS
with something unique (if it is detected on theRHS
, and then as a final step, setting the userresult
to the LHS-unique variable?
It can't realistically be implemented as a wrapper, and I'm not sure that the behaviour would even be desirable. If change is warranted, it would be better to just check for the situation where an output map will overwrite any of the input maps, and report it as an error.
follow-up: 13 comment:12 by , 9 years ago
Replying to glynn:
The result of any expression can be assigned to a variable; that's what "=" does.
Do I understand it correctly that re-assigning a variable inside sub-expression like in the following example is OK?
r.mapcalc "a = eval(x = 1., x = x * 3, x / 2)"
In other words, can I use the above without any bad consequences?
comment:13 by , 9 years ago
Replying to wenzeslaus:
Do I understand it correctly that re-assigning a variable inside sub-expression like in the following example is OK?
I think so.
The parser replaces variable names with a reference to the last binding for that variable, and bindings are appended to the list after the binding is parsed (so if the RHS contains a reference to the LHS, it will refer to the previous binding rather than the one being parsed).
So in that example, you have two distinct variables named "x". The first sub-expression assigns the first one, the second sub-expression reads the first and assigns to the second, the third sub-expression reads the second.
follow-up: 16 comment:15 by , 8 years ago
Replying to glynn:
I'm not sure that the behaviour would even be desirable.
Although, in GRASS or GIS in general, we usually create new data under new names, from general programming perspective a = a + 1
makes sense. The following, which is an equivalent of a = a + 1
, actually works:
r.neighbors in=test1 out=test1 size=25 --o
I'm not sure if it is actually legal. We should make it clear. Is it module dependent? What about G_check_input_output_name()
?
Doing this with vectors gives an error (Output vector map <firestations> is used as input):
v.extract in=test2 cats=10,11,12,13 output=test2 --o
I assume this applies to all vector modules (and that Vect_check_input_output_name()
should be used to ensure it).
If change is warranted, it would be better to just check for the situation where an output map will overwrite any of the input maps, and report it as an error.
Warning for start and perhaps an (fatal) error in the future are appropriate. We should either support it or forbid it. (I don't know how people interpret "the behavior is undefined" - it is common in C/C++ programming, but I'm not so sure about GIS).
comment:16 by , 8 years ago
Replying to wenzeslaus:
Is it module dependent?
Yes. It relies upon the module having read anything relevant from the input map before it writes the corresponding data for the output map.
The cell, fcell and null files are written using tempfile-and-rename, i.e. write the data to a temporary file, rename it into place within Rast_close(). Support files are written directly, so it's necessary to order reads and writes correctly (and take into account the fact that Rast_close() writes or removes some support files).
For GDAL-linked maps (r.external, r.external.out), the behaviour is up to GDAL, but I wouldn't expect in-place modification to work.
What about
G_check_input_output_name()
?
Currently, only 5 modules appear to use it: i.pca, r.carve, r.relief, r.slope.aspect, r.param.scale. If a module has multiple inputs and multiple outputs, the number of possible combinations which need to be checked could be large.
Such notice is already present in documentation (introduced in r66548 on 20 Oct 2015). See: https://grass.osgeo.org/grass71/manuals/r.mapcalc.html
See also: #2770; https://lists.osgeo.org/pipermail/grass-dev/2011-July/054983.html
Either this bug should be closed or it should be turned into a wish - to not allow same LHS variable to appear on RHS. As I'm not so much into mapcalc language details, I can not see it as a problem, but Glynn will shed more light on such option.
Mankoff, if you consider current documentation to be inadequate, provide a patch with better wording.