Opened 12 years ago

Closed 12 years ago

#1706 closed defect (fixed)

TopoGeo_AddPolygon not adding polygons

Reported by: pramsey Owned by: strk
Priority: medium Milestone: PostGIS 2.0.0
Component: topology Version: master
Keywords: Cc:

Description

I just took a polygon file
that had contiguous non-overlapping polygons and did this:

cded=# select TopoGeo_AddPolygon('sheds', st_geometryn(geom,1), 5.0)
from sheds where gis_tag like 'VICT%';

ERROR:  Spatial exception - geometry intersects edge 39
CONTEXT:  PL/pgSQL function "topogeo_addlinestring" line 175 at assignment
SQL statement "SELECT array_cat(edges, array_agg(x)) FROM ( select
topology.TopoGeo_addLinestring(atopology, rec.geom, tol) as x ) as
foo"
PL/pgSQL function "topogeo_addpolygon" line 27 at assignment

This is a non-overlapping set of data, originally generated from
vector clean ArcINFO coverages...

select a.gis_tag, b.gis_tag from sheds a, sheds b where
st_overlaps(a.geom,b.geom) and a.gis_tag like 'VICT%' and b.gis_tag
like 'VICT%';

Attachments (6)

shed_vict_red.zip (2.8 KB ) - added by strk 12 years ago.
first reduction. 2 geometries. Stil too many vertexes.
before_snapping.png (2.9 KB ) - added by strk 12 years ago.
before snapping
after_snapping.png (3.1 KB ) - added by strk 12 years ago.
after snapping (note: the length of the spike is about 3 units)
diff_selfnoded_edge_spike.png (1.8 KB ) - added by strk 12 years ago.
difference between new (snapped) and old edge
int_new_old.png (3.2 KB ) - added by strk 12 years ago.
intersection (yellow) and difference (green) between old and new (snapped)
wrong_endpoint_snap.png (3.1 KB ) - added by strk 12 years ago.
edge (red) snapped (yellow) to its own endpoint

Download all attachments as: .zip

Change History (25)

comment:1 by strk, 12 years ago

Note that the order in which things are added matters, so first thing to aim for would be reproducing it always, with an ORDER BY clause.

comment:2 by strk, 12 years ago

Ok, I can reproduce sorting gid both sides. I've reduced the input to 4 records. Can I upload it here ?

comment:3 by strk, 12 years ago

Have it down to 3 simple polygons. It's interesting to note that up to tolerance=2 it works fine, tolerance=3 starts with the error message. ST_Snap is harmful!

by strk, 12 years ago

Attachment: shed_vict_red.zip added

first reduction. 2 geometries. Stil too many vertexes.

comment:4 by strk, 12 years ago

Alright, the self-noded boundary of the second polygon being added is fine.

Then we find existing edges that intersect the new line and we find the first polygon boundary edge.

When we go on to compute the difference between the self-noded boundary and the existing edge.

At that point we snap the new edge to the existing one and here the error is introduced.

Look at the pictures. The first one shows the old edge (green) and the new edge (red) before snapping. The second shows the same thing _after_ snapping the new edge to the old edge. You can see the snapping is drastically wrong. I mean… if it snapped to _closest_ point rather than to first point under threshold it would have done a much better work.

That also explains why a lower tolerance works better (snaps better). I really think we should fix ST_Snap to snap to the closest vertex for this to work better.

by strk, 12 years ago

Attachment: before_snapping.png added

before snapping

by strk, 12 years ago

Attachment: after_snapping.png added

after snapping (note: the length of the spike is about 3 units)

comment:5 by strk, 12 years ago

New edge (red) before snapping to old edge (green): before snapping

New edge (red) after snapping to old edge (green): after snapping (note: the length of the spike is about 3 units)

by strk, 12 years ago

difference between new (snapped) and old edge

by strk, 12 years ago

Attachment: int_new_old.png added

intersection (yellow) and difference (green) between old and new (snapped)

comment:6 by strk, 12 years ago

Alright, and the final crash is due to the edge being added being snapped again to its endpoints, which surpsisingly results to a drift from original shape !

Actually I wonder why we're snapping again to the endpoints..

by strk, 12 years ago

Attachment: wrong_endpoint_snap.png added

edge (red) snapped (yellow) to its own endpoint

comment:7 by strk, 12 years ago

And here's some more pictures.

This is the intersection between the new edge (snapped) and the old one (in yellow)

intersection (yellow) and difference (green) between old and new (snapped)

And here's the above intersection (now in red) after snapping self against its own endpoints (becomes the yellow one).

edge (red) snapped (yellow) to its own endpoint

Note that the drift from the original is below 3 units (everything succeeds with a tolerance of 2)

With this new informations I might have enough data to try at reproducing at smaller scale (with less data). For sure ST_Snap instability has a big role in this.

comment:8 by strk, 12 years ago

I shall add that the distance between vertices of the input geometries is sometimes < the given tolerance (5) which probably also affects operation. I suspect passing the inputs trough ST_SnapToGrid with a grid size == tolerance would fix this case.

comment:9 by strk, 12 years ago

I confirm ST_SnapToGrid with tolerance=5 fixes the reduced case, didn't test the full case.

comment:10 by strk, 12 years ago

About the reason why we're snapping again to endpoints (see comment:6) this is because when adding endpoints using TopoGeo_AddPoint we might end up snapping these endpoints to existing nodes again. This is indeed very unlikely since we're snapping the input geometries before going there, and when adding points we'd eventually be snapping _existing_edges_ to the new point added. Anyway since the noding we compute depends on GEOS and GEOS doesn't give us control over tolerance we can't really tell what will happen.

In any case all we're trying to do is make sure our endpoints still match with the node identifiers that TopoGeo_AddPoint returned us. So we do not want to snap _segments_ but only _points_ and in particular only _endpoints_. By skipping the ST_Snap call and directly adding the inserted endpoints back this case seems to be fixed in an excelent way.

I'm running more tests and more importantly trying to reproduce the issue with smaller dataset for the sake of having a regression test for it.

comment:11 by strk, 12 years ago

Bingo, handled to syntetize the issue:

SELECT TopoGeo_AddLineString('city_data', 'LINESTRING(20 20, 20 10, 10 10, 9 12, 10 20)');
SELECT TopoGeo_addLineString('city_data', 'LINESTRING(10 0, 10 10, 15 10, 20 10)', 4);

comment:12 by strk, 12 years ago

Oh, I realized TopoGeo_AddPoint, when splitting an edge, is splitting by a projected point rather than by the actual given point. I bet that's where the precision is going. Precision is about these two lines not being the same when I expected them to be:

=# with inp as (SELECT '01020000000300000000000000000034400000000000002440000000000000244000000000000024400000000000002240FFFFFFFFFFFF2740'::geometry as e, '01020000000400000000000000000022400000000000002840000000000000244000000000000024400000000000002E40000000000000244000000000000034400000000000002440'::geometry as n ) select st_astext(e) as e, st_astext(n) as n, st_equals(e,n), st_hausdorffdistance(e,n), st_astext(st_symdifference(e,n)) as sd, st_equals(st_snap(n,e,1e-14), e) as snapequals from inp;
-[ RECORD 1 ]--------+-------------------------------------------
e                    | LINESTRING(20 10,10 10,9 12)
n                    | LINESTRING(9 12,10 10,15 10,20 10)
st_equals            | f
st_hausdorffdistance | 1.77635683940025e-15
sd                   | MULTILINESTRING((10 10,9 12),(9 12,10 10))
snapequals           | t

comment:13 by strk, 12 years ago

Above you can see there are two very close-by vertexes on (9,12) but not quite the same. That vertex was the split point for the first edge added, found by noding the snapped verso of second edge against the first edge.

comment:14 by strk, 12 years ago

Uhm but then again it does make sense for TOpoGeo_AddPoint to project the point, following the rule that whoever was there first (the edge) should drive operations (the point being added must snap to the existing edge, not the other way around)

comment:15 by strk, 12 years ago

Eventually the issue is with ST_ClosestPoint not returning the _exact_ existing vertex if there is one nearby. In this case it's probably returning something pretty close to one but different.

comment:16 by strk, 12 years ago

with inp as (SELECT '01020000000300000000000000000034400000000000002440000000000000244000000000000024400000000000002240FFFFFFFFFFFF2740'::geometry as e, '01020000000400000000000000000022400000000000002840000000000000244000000000000024400000000000002E40000000000000244000000000000034400000000000002440'::geometry as n ) select st_astext(e) as e, st_astext(n) as n, st_equals(e,n), st_hausdorffdistance(e,n), st_astext(st_symdifference(e,n)) as sd, st_equals(st_snap(n,e,1e-14), e) as snapequals, st_equals(st_endpoint(n), st_startpoint(e)) as new_end_equals_exs_start, st_equals(st_startpoint(n), st_endpoint(e)) as new_start_equals_exs_end, st_equals(st_snap(n,st_endpoint(e),5), e) as snap_to_lastpoint_equals from inp;

The above shows that altough the problem is vertex (9,12) not being exactly the same, snapping the new edge to the existing edge makes them equal but snapping the new edge to _only_ the (9,12) vertex of the old edge does _not_ make them equal.

comment:17 by strk, 12 years ago

New edge snapped to old edge is also pretty weird:

select st_equals('LINESTRING(10 10,9 12,10 10,15 10,20 10)', 'LINESTRING(20 10,10 10,9 12)');

comment:18 by strk, 12 years ago

New idea: the problem is in ModEdgeSplit:

psql:bug.sql:1: NOTICE:  ModEdgeSplit: newedge1 endpoint does not match split point
psql:bug.sql:1: NOTICE:  ModEdgeSplit: newedge2 startpoint does not match split point

comment:19 by strk, 12 years ago

Resolution: fixed
Status: newclosed

I'll call this fixed with r9540 (and r9539).

You will still get some spaghetti topology, but that's due to ST_Snap being an inappropriate tool to deal with tolerances. See #1712 for one (among a few) possible ways to improve it.

Generally speaking, try not to abuse tolerance. Should be fine as long as no segment are shorter than the given tolerance. We're probably missing a function to tell us about that (min/max segment length would be good stats for vectors).

Note: See TracTickets for help on using tickets.