Opened 6 weeks ago

Last modified 5 weeks ago

#5848 new enhancement

expose a lower level function of TopoGeo_AddLineString that returns more info

Reported by: Lars Aksel Opsahl
Owned by: strk
Priority: medium
Milestone: PostGIS 3.5.3
Component: topology
Version: 3.4.x
Keywords:
Cc: Lars Aksel Opsahl

Description

We need to keep track of each edge's source when we merge multiple tables into a common topology.

The problem with TopoGeo_AddLineString is that it only returns the edge ids representing the added line.

Let's say we have an empty topology and add a line from table "one". Edge id 1 is returned, so we know that edge 1 comes from table "one".

Then we add a line from table "two" crossing edge 1. Two new edge ids are returned, so we know which edges represent the line from table "two".

The problem is that we have now lost track of which edges represent the line from table "one", because edge 1 has been split into two edges as a side effect.

In the low-level code of TopoGeo_AddLineString, all these side effects are known.

Is it OK to expose a new method that returns side effects, such as edges that have been split, and not only the edges representing the added line?

Change History (8)

comment:1 by Lars Aksel Opsahl, 6 weeks ago

The return struct for the side effects needs to contain info showing which edges now represent edge 1, for instance something like this:

[
[0,[3,4]],
[1,[1,2]] 
]

The first entry (0) always represents the new line.

The next entry represents the side effects and tells us that edge 1 has been split into edges 1 and 2.
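To make the intended use of such a return value concrete, here is a minimal Python sketch of how a caller might fold it into a local edge-to-source map. Everything in it is an assumption made for illustration: the function name apply_side_effects, the shape of the map, and the reading of key 0 as "the line just added" follow the example above, not any existing PostGIS API.

```python
from typing import Dict, List, Set, Tuple

# One entry of the proposed return value: (key, edge ids).
# Key 0 stands for the newly added line; any other key is the id of an
# existing edge that was affected (hypothetical encoding from this comment).
Entry = Tuple[int, List[int]]


def apply_side_effects(
    origin: Dict[int, Set[str]], new_source: str, result: List[Entry]
) -> Dict[int, Set[str]]:
    """Return an updated copy of the edge id -> source tables map."""
    updated = {edge: set(sources) for edge, sources in origin.items()}
    for key, edge_ids in result:
        if key == 0:
            # Edges representing the new line come from the new source.
            inherited = {new_source}
        else:
            # Edges replacing an old edge inherit that edge's sources.
            inherited = set(origin.get(key, set()))
            if key not in edge_ids:
                # The old edge id no longer exists after the operation.
                updated.pop(key, None)
        for edge in edge_ids:
            updated.setdefault(edge, set()).update(inherited)
    return updated


# Usage with the example from this comment: edge 1 came from table "one",
# then a line from table "two" is added and splits it.
origin = {1: {"one"}}
result = [(0, [3, 4]), (1, [1, 2])]
tracked = apply_side_effects(origin, "two", result)
```

With the example values, edges 1 and 2 end up mapped to table "one" and edges 3 and 4 to table "two". Sources are kept as sets so that an edge shared by lines from several tables can carry more than one origin.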

Version 1, edited 6 weeks ago by Lars Aksel Opsahl

comment:2 by strk, 6 weeks ago

The reason why those functions automatically update the definition of TopoGeometry objects is specifically to allow you to "track" that information. By saving a TopoGeometry, rather than the "edge identifier", you will be able to query its composition and automatically find which edges it is composed of at any time.

What you might be missing is a notification of WHEN the TopoGeometry object you are holding changes composition. Note that the composition of that object might also be changed by a transaction other than the one in your session.

Every now and then I think that we could be using the asynchronous notifications support, to let various listeners know when the definition of a TopoGeometry changes ( https://www.postgresql.org/docs/current/sql-notify.html )

Being notified may serve other purposes too: think of triggers on the table holding the TopoGeometry. The literal value of a TopoGeometry - for example: '(1,2,3,4)' - never changes, but the actual composition (stored in the topology relation table) can very well change, and the only option at the moment to know when something happens is to add user triggers to the relation table.

What I'm afraid of is that even if you have a function that tells you what happened to the topology edges, you'll still be missing some events. For this, a trigger on the relation table may be even better than a listen/notify system.

comment:3 by Lars Aksel Opsahl, 6 weeks ago (in reply to: 2)

Replying to strk:

> The reason why those functions automatically update the definition of TopoGeometry objects is specifically to allow you to "track" that information. By saving a TopoGeometry, rather than the "edge identifier", you will be able to query its composition and automatically find which edges it is composed of at any time.

The reason I do not use TopoGeometry is performance in the resolve overlap and gap code.

> What you might be missing is a notification of WHEN the TopoGeometry object you are holding changes composition. Note that the composition of that object might also be changed by a transaction other than the one in your session.

I run a single thread per grid cell, and the cells have no connecting edges, so this should not be a problem.

> Every now and then I think that we could be using the asynchronous notifications support, to let various listeners know when the definition of a TopoGeometry changes ( https://www.postgresql.org/docs/current/sql-notify.html )

Thanks, interesting. I need to check this out for running in parallel.

> Being notified may serve other purposes too: think of triggers on the table holding the TopoGeometry. The literal value of a TopoGeometry - for example: '(1,2,3,4)' - never changes, but the actual composition (stored in the topology relation table) can very well change, and the only option at the moment to know when something happens is to add user triggers to the relation table.

> What I'm afraid of is that even if you have a function that tells you what happened to the topology edges, you'll still be missing some events. For this, a trigger on the relation table may be even better than a listen/notify system.

This will always be an issue with databases: someone else might change a row after you have updated it. That is a db locking issue. In this case I have control over this.

comment:4 by Lars Aksel Opsahl, 6 weeks ago

For now I will do more checks after lines are added, picking out edges that are missing origin table info in my local data structure, and then add this origin info in a post operation.

comment:5 by strk, 6 weeks ago

I'm open to the introduction of a method to report all side effects. How to encode the return value is non-trivial.

Addition of a line may:

  1. Add one or more edges
  2. Split one or more edges
  3. Modify the shape of one or more edges (due to snapping)
  4. Remove one or more edges (experimental edge merging, since 3.5.0)
  5. Split one or more faces
  6. Turn a previously isolated node into a non-isolated node

A lot of things to report! Maybe it could be a table-returning function, so that the caller could append the returned set to a log table, if needed.
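As a rough illustration of the table-returning idea, the following Python sketch models what one row of such a return set might look like and how a caller could pull the edge splits out of it. All names here (SideEffectKind, SideEffectRow, split_lineage) are hypothetical and not part of PostGIS; the six kinds simply mirror the list above, and the sketch assumes that a split keeps the original edge id and adds one new edge, as in the example from comment 1.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class SideEffectKind(Enum):
    # Hypothetical names, one per effect listed in this comment.
    EDGE_ADDED = "edge_added"
    EDGE_SPLIT = "edge_split"
    EDGE_MODIFIED = "edge_modified"
    EDGE_REMOVED = "edge_removed"
    FACE_SPLIT = "face_split"
    NODE_CONNECTED = "node_connected"


@dataclass(frozen=True)
class SideEffectRow:
    kind: SideEffectKind
    element_id: int                   # edge, face or node the event is about
    related_id: Optional[int] = None  # e.g. the new edge created by a split


def split_lineage(rows: List[SideEffectRow]) -> Dict[int, List[int]]:
    """Map each split edge id to the edge ids that now cover it.

    Assumes the original edge id survives the split (hypothetical).
    """
    lineage: Dict[int, List[int]] = {}
    for row in rows:
        if row.kind is SideEffectKind.EDGE_SPLIT and row.related_id is not None:
            covering = lineage.setdefault(row.element_id, [row.element_id])
            covering.append(row.related_id)
    return lineage


# Rows a caller might receive for the example in comment 1 and append to a log.
rows = [
    SideEffectRow(SideEffectKind.EDGE_ADDED, 3),
    SideEffectRow(SideEffectKind.EDGE_ADDED, 4),
    SideEffectRow(SideEffectKind.EDGE_SPLIT, 1, related_id=2),
]
log: List[SideEffectRow] = []
log.extend(rows)
lineage = split_lineage(log)
```

A flat row per event keeps the return set easy to append to a log table, at the cost of leaving the caller to regroup related rows, as split_lineage does for splits.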

comment:6 by Lars Aksel Opsahl, 5 weeks ago

Nice, maybe we can also reuse some of this in functions like this: https://gitlab.com/nibioopensource/pgtopo_update_sql/-/blob/develop/src/sql/topo_update/function_02_add_border_split_surface.sql?ref_type=heads

For resolve overlap and gap we use a workaround for now.

Before we start on this issue we need to get control of the rest of the projects for this year, so we keep this on hold for now.

How many hours do you think this issue may take to implement?

comment:7 by strk, 5 weeks ago

The most time consuming part would be defining a clear specification for the function - what details to return for each possible operation - and making sure that those details would be enough to solve your problem.

As stated:

> The problem is that we have now lost track of which edges represent the line from table "one", because edge 1 has been split into two edges as a side effect.

The problem is specifically solved by using TopoGeometry.

Are you sure you could make your calling function faster than TopoGeometry handling if you had the split information? If so, would it be worth improving TopoGeometry handling instead?

comment:8 by strk, 5 weeks ago

For the record: the latest plpgsql version of TopoGeo_addLinestring can be found in the "stable-2.1" branch of PostGIS, in case we want to prototype a table-returning version with other details, or a version that organizes the splits in a completely different way or order to aim for further speed improvements.
