Opened 9 years ago
Closed 8 years ago
Last modified 8 years ago
#3000 closed enhancement (fixed)
Edge linking does not use index
|Reported by:||strk||Owned by:||strk|
I've found sequencial scans on the edge table during topology building which are due to the UPDATE queries to edges performed to set edge linking. The queries are of this form:
UPDATE topo_ulfareale.edge_data SET next_right_edge = -1175, abs_next_right_edge = 1175 WHERE edge_id != 1175 A ND next_right_edge = -1114;
But being there no index on next_right_edge the planner goes for a sequencial scan.
See also #2993 for skipping the UPDATE completely (but the SELECT query would still be needed so this ticket has its own right)
Change History (7)
comment:1 by , 8 years ago
comment:2 by , 8 years ago
As indexes make database size bigger and slow down insertions/updates (see https://github.com/postgis/postgis/pull/30#issuecomment-68251146) an alternative would be to seek for the edge to be updated within the set of edges that should have been already fetched to check for intersection, or use the endpoint nodes indexes. That is, rewriting the update sql query.
comment:3 by , 8 years ago
The SQL query is in the ST_ModEdgeSplit function (and similar one appears in ST_NewEdgeSplit). This means the penalty only occurs on edge splitting, which should only happen when adding an edge which intersects an existing one.
Hint for testing: use ST_ModEdgeSplit and ST_ModEdgeHeal iteratively while changing query/indexes.
comment:4 by , 8 years ago
This patch modifies the SQL query of ST_ModEdgeSplit and ST_ModEdgeHeal to add a filter on start_node and end_node to get the opportunity to use existing indexes. Speeds up my syntetic test from 0.4 seconds to 0.064 seconds (splitting and re-healing an edge in a ~500k edges topo).
Björn, could you test it against your use case ? The patch will need to be finished to replicate the code in the "NewEdge?" versions of the functions.
comment:5 by , 8 years ago
|Status:||new → closed|
comment:6 by , 8 years ago
For posteriority - Preliminary results show that I get similar speedup with this patch as with additional indexes on edge_data (which are undesirable).
Screenshot of a time profile of the statements run at loading ~12000 unclean polygon into a new topology (run with r13167):
comment:7 by , 8 years ago
Thanks. _ST_AddFaceSplit is on my radar already.
It's interesting to see 5 seconds spent in 111k SELECTs into "topo"."face" with simple "face_id" = X operation. Given you loaded 12k polygons, that's 10x selects for each input row (but this would be better discussed on the mailing list than here).
Ticket #3010 is proposing to add indexes on next_right_edge and next_left_edge