Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#3000 closed enhancement (fixed)

Edge linking does not use index

Reported by: strk Owned by: strk
Priority: medium Milestone: PostGIS 2.2.0
Component: topology Version: master
Keywords: performance Cc: Björn Harrtell

Description

I've found sequencial scans on the edge table during topology building which are due to the UPDATE queries to edges performed to set edge linking. The queries are of this form:

UPDATE topo_ulfareale.edge_data SET next_right_edge = -1175, abs_next_right_edge = 1175 WHERE edge_id != 1175 A
ND next_right_edge = -1114;

But being there no index on next_right_edge the planner goes for a sequencial scan.

See also #2993 for skipping the UPDATE completely (but the SELECT query would still be needed so this ticket has its own right)

Change History (7)

comment:1 by strk, 9 years ago

Ticket #3010 is proposing to add indexes on next_right_edge and next_left_edge

comment:2 by strk, 9 years ago

Cc: Björn Harrtell added

As indexes make database size bigger and slow down insertions/updates (see https://github.com/postgis/postgis/pull/30#issuecomment-68251146) an alternative would be to seek for the edge to be updated within the set of edges that should have been already fetched to check for intersection, or use the endpoint nodes indexes. That is, rewriting the update sql query.

comment:3 by strk, 9 years ago

The SQL query is in the ST_ModEdgeSplit function (and similar one appears in ST_NewEdgeSplit). This means the penalty only occurs on edge splitting, which should only happen when adding an edge which intersects an existing one.

Hint for testing: use ST_ModEdgeSplit and ST_ModEdgeHeal iteratively while changing query/indexes.

comment:4 by strk, 9 years ago

This patch modifies the SQL query of ST_ModEdgeSplit and ST_ModEdgeHeal to add a filter on start_node and end_node to get the opportunity to use existing indexes. Speeds up my syntetic test from 0.4 seconds to 0.064 seconds (splitting and re-healing an edge in a ~500k edges topo).

http://strk.keybit.net/tmp/useIndexOnSplitAndHealModEdge.patch

Björn, could you test it against your use case ? The patch will need to be finished to replicate the code in the "NewEdge?" versions of the functions.

comment:5 by strk, 9 years ago

Resolution: fixed
Status: newclosed

I've committed the improvement in trunk as of r13166 and in 2.1 branch as of r13167.

comment:6 by Björn Harrtell, 9 years ago

For posteriority - Preliminary results show that I get similar speedup with this patch as with additional indexes on edge_data (which are undesirable).

Screenshot of a time profile of the statements run at loading ~12000 unclean polygon into a new topology (run with r13167): https://cloud.githubusercontent.com/assets/141030/5582520/2759ed00-906b-11e4-89c6-48e00ebb30e3.png

comment:7 by strk, 9 years ago

Thanks. _ST_AddFaceSplit is on my radar already.

It's interesting to see 5 seconds spent in 111k SELECTs into "topo"."face" with simple "face_id" = X operation. Given you loaded 12k polygons, that's 10x selects for each input row (but this would be better discussed on the mailing list than here).

Note: See TracTickets for help on using tickets.