Opened 5 years ago
Last modified 5 years ago
#4009 new defect
'where' option in v.to.rast can select wrong feature for raster attribute, in areas where features overlap
Reported by: | florisvdh | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | 7.8.3 |
Component: | Vector | Version: | svn-releasebranch76 |
Keywords: | v.to.rast sql where | Cc: | |
CPU: | x86-64 | Platform: | Linux |
Description
I use GRASS 7.6.1 on Linux Mint 18.1, i.e. Ubuntu Xenial (16.04) based. (Note, this is my first post here, I'm a beginning GRASS user, mostly working with R)
I applied something like:
v.to.rast input=polygon_layer output=output_x1 where="field1 LIKE 'x1'" \ use=attr attribute_column=field2 memory=800 --overwrite
Importantly, field1
in polygon_layer
has several possible values such as x1
, x2
, x3
and so on (81 different values in my usecase).
Moreover, in my usecase several features (polygons) have identical geometry, i.e. many sets of 2 or more polygons exist with 100% overlap among their own polygons (i.e. identical polygons), while each of these polygons has its own specific attributes: different values of field1
and so on. The problem that I met occurs for those features; possibly the same problem will occur for overlapping areas between polygons in general.
While the where
option in v.to.rast
is effective in localizing the correct areas, i.e. where field1
is x1
, the problem is that the attribute value (field2
) may come from one of the other (overlapping) features at that place, e.g. where field1
is x2
. Which feature of an overlapping set is used for the attribute probably depends on the order of those overlapping features.
From what I've seen, it appears that v.to.rast
will select the field2
value from the same feature regardless of the value for field1
in the above where
option, as long as one of those overlapping features meets the where
condition.
If this effectively is a bug in the program, maybe it happens in other modules as well, where the where
option is available.
Note: I should also mention that I used this in a simple parallelization approach in a loop (setting field1
to different values), by starting the commands as background processes until a certain number of them is running (following an example I found in the GRASS wiki). That's why the memory option was used explicitly.
Currently I worked around this problem by splitting polygon_layer
according to the value of field1
, using v.extract
(using just a simple loop here as the simple parallelization doesn't work with this one).
Attachments (1)
Change History (9)
comment:1 by , 5 years ago
comment:2 by , 5 years ago
Below are a few characteristics of the map. Note, the actual values of field 1
are not 'x1', 'x2' etc. but specific codes. Given the size of the data, I truncated the output.
$ v.info polygon_layer -c Displaying column types/names for database connection of layer <1>: INTEGER|cat TEXT|field1 DOUBLE PRECISION|field2 $ v.db.select map=polygon_layer | wc -l 85051 $ v.db.select map=polygon_layer where="field1 LIKE 'rbbhc'" | wc -l 4720 $ v.db.select map=polygon_layer where="field1 LIKE 'rbbhc'" | head cat|field1|field2 132|rbbhc|20 146|rbbhc|70 151|rbbhc|30 152|rbbhc|30 153|rbbhc|30 154|rbbhc|30 155|rbbhc|30 156|rbbhc|30 157|rbbhc|30
comment:3 by , 5 years ago
Milestone: | → 7.8.3 |
---|
comment:4 by , 5 years ago
I think I have a fix for this bug, but I would like to test the fix first with your data. Can you provide a small spatial extract of your data that can be used to reproduce the problem? Thanks!
by , 5 years ago
Attachment: | polygon_layer_debug.gpkg added |
---|
Clipped the original polygon_layer for a small region, for debugging
comment:5 by , 5 years ago
@mmetz: see the attachment.
A few examples of sets of polygons which have the same polygon but different attributes:
- cat 21936 and 21937
- cat 21829 and 21830
- cat 22114 and 22115
The original data set from which it was derived (involved more steps apart from clipping), is at https://doi.org/10.5281/zenodo.3540740 .
Some additional output to show region and a few steps:
$ g.region -p projection: 99 (Belge 1972 / Belgian Lambert 72) zone: 0 datum: bel72 ellipsoid: international north: 244030.1 south: 153054.1 west: 22029.6 east: 258861.6 nsres: 32 ewres: 32 rows: 2843 cols: 7401 cells: 21041043 $ g.region w=196065 e=199009 s=188877 n=190989 save=debug $ g.region -p projection: 99 (Belge 1972 / Belgian Lambert 72) zone: 0 datum: bel72 ellipsoid: international north: 190989 south: 188877 west: 196065 east: 199009 nsres: 32 ewres: 32 rows: 66 cols: 92 cells: 6072 $ v.clip -r input=polygon_layer output=polygon_layer_debug $ v.out.ogr input=polygon_layer_debug output=polygon_layer_debug.gpkg
follow-up: 8 comment:7 by , 5 years ago
Interesting, thanks!
I wonder how this relates to ticket 1798, which initiated the 'where' option in quite a number of GRASS modules. Hence, are other modules affected as well?
comment:8 by , 5 years ago
Replying to florisvdh:
Interesting, thanks!
I wonder how this relates to ticket 1798, which initiated the 'where' option in quite a number of GRASS modules. Hence, are other modules affected as well?
All modules using Vect_cats_in_constraint() need to be checked, I am on it.
Replying to florisvdh:
This definitely should not happen. Could you provide the output of:
?