#3361 closed enhancement (fixed)
v.select: very slow using GEOS operators
Reported by: | mlennert | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | 7.4.2 |
Component: | Vector | Version: | svn-trunk |
Keywords: | v.select GEOS within slow | Cc: | |
CPU: | Unspecified | Platform: | Unspecified |
Description
I have not made similar tests with the other operators, but using the within operator v.select is very slow.
First I create a buffer around the NC railroads map:
v.buffer railroads dist=5000 out=rail5000
Then v.select:
time v.select ain=boundary_municp bin=rail5000 op=within out=select
real 2m13.989s
user 1m57.888s
sys 0m15.956s
Using the following script, I get the identical result much faster (maybe using v.distance is another option, but I haven't tried that):
g.copy vect=boundary_municp,munic
v.db.addcolumn munic col="totalarea double precision"
v.to.db munic op=area col=totalarea
v.overlay ain=munic bin=rail5000 op=and out=munic_and_buffer
v.db.addcolumn munic_and_buffer col="area double precision"
v.to.db munic_and_buffer op=area col=area
sleep 1
v.extract boundary_municp cat=$(db.select -c sql="select a_cat from munic_and_buffer where round(area,1)/round(a_totalarea,1)=1" | awk '{printf"%s,", $1}') output=select_bis
Time for running entire script:
real 0m14.611s
user 0m6.084s
sys 0m5.084s
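To put a number on the gap between the two runs, the `real` values quoted above can be converted to seconds and divided. This is only a sketch: `to_seconds` and `speedup` are hypothetical helper names, not GRASS tools, and they assume the `XmY.YYYs` format that bash's `time` prints.

```shell
# Convert a bash `time` value such as "2m13.989s" to seconds.
# (Hypothetical helper; assumes the "<minutes>m<seconds>s" format.)
to_seconds() {
    printf '%s\n' "$1" |
        awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of two durations (slow / fast), printed with one decimal.
speedup() {
    awk -v slow="$(to_seconds "$1")" -v fast="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", slow / fast }'
}

# v.select op=within versus the v.overlay-based script
speedup 2m13.989s 0m14.611s
```

With the wall-clock times from this report, the script comes out roughly nine times faster than v.select with op=within.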
I stumbled across this because a student had a within operation that kept on running for hours and hours, and using an equivalent of the above script we were able to get the same result within minutes.
I imagine that by going through GEOS we lose the spatial index, or that there are other significant overheads, and that this is what causes such a serious slowdown. This is such a difference, however, that I wonder if there is anything we could do to optimize v.select's GEOS operators ? Or is the only solution to implement the same operators natively ? Maybe a nice GSoC project ?
I'm classifying this as an enhancement, but I'm pretty close to considering such long run times, as soon as there is a significant amount of data, a bug...
Change History (8)
comment:1 by , 8 years ago
Summary: | v.select: very slow on within (GEOS) operator → v.select: very slow using within (GEOS) operator |
---|
comment:2 by , 7 years ago
Milestone: | 7.4.0 → 7.4.1 |
---|
comment:3 by , 7 years ago
Summary: | v.select: very slow using within (GEOS) operator → v.select: very slow using GEOS operators |
---|
Actually, it is not only within. Comparing the native 'overlap' operator with its GEOS equivalent, the 'intersects' operator, I get a significant time difference:
time v.select -c ain=boundary_municp bin=rail5000 op=overlap out=select_overlap
real 0m27.363s
user 0m12.836s
sys 0m14.696s
time v.select -c ain=boundary_municp bin=rail5000 op=intersects out=select_intersects
real 1m12.190s
user 0m56.844s
sys 0m15.511s
follow-up: 7 comment:5 by , 7 years ago
Replying to mmetz:
In 72705:
Assuming that the result will be a subset of map A, selected by features from map B, the code re-organization results in a substantial speed-up. v.select is now nearly as fast as the alternative in the description.
The results of operator=overlap and the GEOS-equivalent operator=intersects are identical, but the speed difference, based on the example in the description,
v.select ain=boundary_municp bin=rail5000 out=select op=overlap/intersects
is astonishing, as of trunk r72705.
comment:6 by , 7 years ago
Milestone: | 7.4.1 → 7.6.0 |
---|
comment:7 by , 7 years ago
Replying to mmetz:
Replying to mmetz:
In 72705:
Assuming that the result will be a subset of map A, selected by features from map B, the code re-organization results in a substantial speed-up. v.select is now nearly as fast as the alternative in the description.
The results of operator=overlap and the GEOS-equivalent operator=intersects are identical, but the speed difference, based on the example in the description,
v.select ain=boundary_municp bin=rail5000 out=select op=overlap/intersects
is astonishing, as of trunk r72705.
As reported on the grass-users list, working with r72716, I actually get different results depending on whether I use intersects or overlap, when working with atype=areas and btype=lines. Don't know if this result is expected. I can provide the data privately if useful.
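One way to pin down such a discrepancy is to compare the category lists of the two output maps. The sketch below assumes the categories have already been exported to plain-text files, one per line (for instance with `v.category option=print`, shown only as a commented-out assumption); `diff_cats` is a hypothetical helper name, not part of GRASS.

```shell
# Print categories that occur in only one of two category lists.
# $1, $2: text files with one category value per line.
# (Hypothetical helper; not part of GRASS.)
diff_cats() {
    a_sorted=$(mktemp) || return 1
    b_sorted=$(mktemp) || return 1
    sort -u "$1" > "$a_sorted"
    sort -u "$2" > "$b_sorted"
    # comm -23: lines unique to the first file; comm -13: unique to the second
    comm -23 "$a_sorted" "$b_sorted" | sed 's/^/only in first: /'
    comm -13 "$a_sorted" "$b_sorted" | sed 's/^/only in second: /'
    rm -f "$a_sorted" "$b_sorted"
}

# Assumed usage inside a GRASS session (map names are illustrative):
# v.category input=select_overlap option=print > overlap.txt
# v.category input=select_intersects option=print > intersects.txt
# diff_cats overlap.txt intersects.txt
```

Empty output means both operators selected the same set of categories; any listed category is a feature picked up by only one of them.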
comment:8 by , 7 years ago
Milestone: | 7.6.0 → 7.4.2 |
---|
Ticket retargeted after milestone closed