Opened 18 months ago

Closed 18 months ago

Last modified 12 months ago

#2986 closed defect (fixed)

r.mapcalc with same variable on LHS and RHS

Reported by: mankoff Owned by: grass-dev@…
Priority: major Milestone: 7.0.4
Component: Docs Version: 7.0.3
Keywords: r.mapcalc Cc: mankoff@…
CPU: Unspecified Platform: Unspecified

Description

It appears the following is valid syntax:

r.mapcalc "x = x + 1" --o

But a GRASS developer has said (source),

Note that you shouldn't use r.mapcalc like this (read from and write in the same map):

r.mapcalc "old = old + 10" --overwrite

If this is the case, this seems like it important and should be included in the documentation.

Attachments (1)

r.mapcalc.RHS_LHS.diff (756 bytes) - added by mankoff 18 months ago.
mention LHS = RHS issue in KNOWN ISSUES section of docs

Download all attachments as: .zip

Change History (17)

comment:1 Changed 18 months ago by marisn

Such notice is already present in documentation (introduced in r66548 on 20 Oct 2015). See: https://grass.osgeo.org/grass71/manuals/r.mapcalc.html

... expression is any legal arithmetic expression involving existing raster map layers (except result itself) ...

See also: #2770; https://lists.osgeo.org/pipermail/grass-dev/2011-July/054983.html

Either this bug should be closed or it should be turned into a wish - to not allow same LHS variable to appear on RHS. As I'm not so much into mapcalc language details, I can not see it as a problem, but Glynn will shed more light on such option.

Mankoff, if you consider current documentation to be inadequate, provide a patch with better wording.

Changed 18 months ago by mankoff

Attachment: r.mapcalc.RHS_LHS.diff added

mention LHS = RHS issue in KNOWN ISSUES section of docs

comment:2 Changed 18 months ago by mankoff

You're right it is in the docs. I think I checked old versions (6.4) and did not see it, and it seems important enough it should be highlighted. I've attached a patch adding this to the KNOWN ISSUES section of the docs.

comment:3 Changed 18 months ago by marisn

Resolution: fixed
Status: newclosed

Documentation has been improved and changes have been backported to 7.0 (r68268 and r68269).

Thank you for reporting. Closing as solved.

comment:4 in reply to:  3 Changed 18 months ago by neteler

Replying to marisn:

Documentation has been improved and changes have been backported to 7.0 (r68268 and r68269).

FWIW, also backported to G6.4.svn to r68277

Thank you for reporting. Closing as solved.

comment:5 Changed 18 months ago by mankoff

Just to be clear, LHS = LHS does work. I've never actually had it crash or return NULL or what appears to be invalid data when I've done this. But perhaps it isn't guaranteed to work...

Last edited 18 months ago by mankoff (previous) (diff)

comment:6 in reply to:  5 ; Changed 18 months ago by glynn

Replying to mankoff:

Just to be clear, LHS = RHS does work. I've never actually had it crash or return NULL or what appears to be invalid data when I've done this. But perhaps it isn't guaranteed to work...

It works by coincidence rather than by design.

The data (cell/fcell) file and null bitmap are written to temporary files, which are renamed into place when the map is closed. The metadata files (cellhd, quant, histogram, categoriess, colour table, history, etc), are stored in memory and written when the map is closed. So the output map shouldn't change while it's being read.

A specific case which won't quite work is "r.mapcalc 'map = map'", because that copies the categories, colour table and history from the input to the output, but closing the output map will delete or replace those before the copy occurs. If the logic is ever extended to cover more cases, those will similarly fail.

Actually, I'm not sure that it will work on Windows, because at the point that the data and null files are renamed, the original files are still open for read (r.mapcalc never actually closes the input maps). That should be easy to fix, though (call close_maps() from within execute() before the loop which closes the output maps).

comment:7 in reply to:  6 Changed 18 months ago by glynn

Replying to glynn:

Replying to mankoff:

Just to be clear, LHS = RHS does work.

Something else which doesn't quite work as expected is that once a name appears on the LHS of an assignment, it's marked as being a variable rather than a map. Which means that you can't use any modifiers on it (whether the neighbourhood modifier, the category modifier or any colour channel modifiers), as those can only be used on maps, not variables.

comment:8 Changed 18 months ago by mankoff

I'm still confused. The specific case of r.mapcalc 'map=map' mentioned 2 comments above does seem to work for me in v7.0.3 (see code example below). Also, the last comment appears to say that everything on the LHS, even if a new unique LHSname, will be a variable and not a map (I'm not even sure what "variables" are in GRASS). But the documentation says that, "result is the name of a raster map layer", and I thought the whole point of r.mapcalc was to produce new raster maps.

Finally, there is no evidence, at least from r.info, of results changing type. See for example results from,

grass70 -text ./nc_spm_08_grass7/PERMANENT
r.mapcalc "LHS = 1" --o
r.info LHS > rinfo.1
r.mapcalc "LHS = LHS" --o
r.info LHS > rinfo.2
r.mapcalc "LHS = LHS * 2" --o
r.info LHS > rinfo.3
diff3 rinfo.1 rinfo.2 rinfo.3

Finally, would it be possible to change this behavior, so that r.mapcalc 'map = map * 2' is valid syntax? It seems like it would be incredibly useful to support this standard programming syntax. If it isn't a major code issue, I'll open a ticket with this feature request.

comment:9 in reply to:  8 ; Changed 18 months ago by glynn

Replying to mankoff:

I'm still confused. The specific case of r.mapcalc 'map=map' mentioned 2 comments above does seem to work for me in v7.0.3 (see code example below).

If the map had categories or quantisation rules, they will be discarded. A colour table should also be discarded but isn't because Rast_remove_colors(mapname,"") doesn't work (it compares the mapset literally with the current mapset without checking for the empty string).

Also, the last comment appears to say that everything on the LHS, even if a new unique LHSname, will be a variable and not a map (I'm not even sure what "variables" are in GRASS).

The result of any expression can be assigned to a variable; that's what "=" does. If the assignment is a top-level expression (i.e. not a sub-expression), the value is written to an output map with the same name as the variable. Assignments other than at the top level (e.g. within eval(), which exists for this specific purpose) don't generate an output map. E.g. the i.pansharpen script uses the following r.mapcalc expression:

eval(k = "$ms1" + "$ms2" + "$ms3")
"$outr" = 1.0 * "$ms3" * "$panmatch3" / k
"$outg" = 1.0 * "$ms2" * "$panmatch2" / k
"$outb" = 1.0 * "$ms1" * "$panmatch1" / k

Here, k is just a variable; because it occurs as a sub-expression of the eval() call rather than as a top-level expression, it doesn't result in an output map named "k". Its value is used in the subsequent expressions, avoiding the need to repeat the calculation for each one.

Regardless of whether a variable is written to an output map, it's still a variable (i.e. something which is calculated for each row), not an input map (thus you cannot use any modifiers on it). If you use a name for both a variable and an input map, the variable hides the input map for all expressions following the assignment which creates the variable.

Finally, would it be possible to change this behavior, so that r.mapcalc 'map = map * 2' is valid syntax?

No. It's already syntactically valid, the issue is that modifying a map while it is being read is undefined behaviour which only "sort-of" works (e.g. it may not work at all if r.external and r.external.out are in use such that the input and output maps refer to the same data file). That's an issue with the core GRASS libraries, and not something which r.mapcalc can realistically work around.

comment:10 Changed 18 months ago by mankoff

Would it be possible to support this by making r.mapcalc a wrapper, which replaces the LHS with something unique (if it is detected on the RHS, and then as a final step, setting the user result to the LHS-unique variable?

comment:11 in reply to:  10 ; Changed 18 months ago by glynn

Replying to mankoff:

Would it be possible to support this by making r.mapcalc a wrapper, which replaces the LHS with something unique (if it is detected on the RHS, and then as a final step, setting the user result to the LHS-unique variable?

It can't realistically be implemented as a wrapper, and I'm not sure that the behaviour would even be desirable. If change is warranted, it would be better to just check for the situation where an output map will overwrite any of the input maps, and report it as an error.

comment:12 in reply to:  9 ; Changed 16 months ago by wenzeslaus

Replying to glynn:

The result of any expression can be assigned to a variable; that's what "=" does.

Do I understand it correctly that re-assigning a variable inside sub-expression like in the following example is OK?

r.mapcalc "a = eval(x = 1., x = x * 3, x / 2)"

In other words, can I use the above without any bad consequences?

comment:13 in reply to:  12 Changed 16 months ago by glynn

Replying to wenzeslaus:

Do I understand it correctly that re-assigning a variable inside sub-expression like in the following example is OK?

I think so.

The parser replaces variable names with a reference to the last binding for that variable, and bindings are appended to the list after the binding is parsed (so if the RHS contains a reference to the LHS, it will refer to the previous binding rather than the one being parsed).

So in that example, you have two distinct variables named "x". The first sub-expression assigns the first one, the second sub-expression reads the first and assigns to the second, the third sub-expression reads the second.

comment:14 Changed 12 months ago by wenzeslaus

In 69730:

r.mapcalc: mention the LHS/RHS limits also in Notes and provide workaround (see #2986, r66548, r68268)

comment:15 in reply to:  11 ; Changed 12 months ago by wenzeslaus

Replying to glynn:

I'm not sure that the behaviour would even be desirable.

Although, in GRASS or GIS in general, we usually create new data under new names, from general programming perspective a = a + 1 makes sense. The following, which is an equivalent of a = a + 1, actually works:

r.neighbors in=test1 out=test1 size=25 --o

I'm not sure if it is actually legal. We should make it clear. Is it module dependent? What about G_check_input_output_name()?

Doing this with vectors gives an error (Output vector map <firestations> is used as input):

v.extract in=test2 cats=10,11,12,13 output=test2 --o

I assume this applies to all vector modules (and that Vect_check_input_output_name() should be used to ensure it).

If change is warranted, it would be better to just check for the situation where an output map will overwrite any of the input maps, and report it as an error.

Warning for start and perhaps an (fatal) error in the future are appropriate. We should either support it or forbid it. (I don't know how people interpret "the behavior is undefined" - it is common in C/C++ programming, but I'm not so sure about GIS).

comment:16 in reply to:  15 Changed 12 months ago by glynn

Replying to wenzeslaus:

Is it module dependent?

Yes. It relies upon the module having read anything relevant from the input map before it writes the corresponding data for the output map.

The cell, fcell and null files are written using tempfile-and-rename, i.e. write the data to a temporary file, rename it into place within Rast_close(). Support files are written directly, so it's necessary to order reads and writes correctly (and take into account the fact that Rast_close() writes or removes some support files).

For GDAL-linked maps (r.external, r.external.out), the behaviour is up to GDAL, but I wouldn't expect in-place modification to work.

What about G_check_input_output_name()?

Currently, only 5 modules appear to use it: i.pca, r.carve, r.relief, r.slope.aspect, r.param.scale. If a module has multiple inputs and multiple outputs, the number of possible combinations which need to be checked could be large.

Note: See TracTickets for help on using tickets.