Opened 15 years ago

Closed 15 years ago

#137 closed enhancement (fixed)

ST_ShortestLine, ST_LongestLine, ST_max_distance, ST_DFullyWithin (included in #231)

Reported by: post...@… Owned by: nicklas
Priority: medium Milestone: PostGIS 1.5.0
Component: postgis Version:
Keywords: Cc:

Description (last modified by robe)

This is not a defect report ! But I had no choise but that.

It's a suggestion about some change in : st_distance(geometry,geometry) and a new function shortestline(geometry, geometry) and a try to get st_max_distance(geometry,geometry) work.

The reason for putting them together here is that they all use the same functions.

I'm new to this som you have to excuse some amateur mistakes.

I've struggled to make to patches to attach.

The on shortestline is working and adds the function shortestline like: Create OR REPLACE FUNCTION shortestline(geometry,geometry) RETURNS geometry AS '$libdir/postgis-1.4', 'LWGEOM_shortestline2d' LANGUAGE 'c' IMMUTABLE STRICT;

it also increases the speed of st_distance , at least in the few tests I have done.

The other patch maxdistance.patch is not working. I really don't understand why, but I think the idea of approach is good :-).

/Nicklas

Attachments (7)

ShortestLongest.patch (57.1 KB ) - added by nicklas 15 years ago.
shortLong_doc.patch (4.7 KB ) - added by nicklas 15 years ago.
Documentation patch for shortestline and longestline
shortest_longest_fullywithin.zip (11.0 KB ) - added by nicklas 15 years ago.
regression tests included
shortest_longest_fullywithin_new_poly_poly_20090716.patch (2 bytes ) - added by nicklas 15 years ago.
shortest_longest_fullywithin_new_poly_poly_20090716.zip (214 bytes ) - added by nicklas 15 years ago.
states.backup (5.6 MB ) - added by robe 15 years ago.
PostgreSQL backup of US States table
shortest_longest_fullywithin_new_poly_poly.zip (11.1 KB ) - added by nicklas 15 years ago.

Change History (28)

comment:1 by post...@…, 15 years ago

There were a lot of bugs in my first posting. Now everything seems to work as it should in my small tests. This patch includes:

1) a new approach to st_distance (and st_dwithin) 2) st_maxdistance that I think seems to work properly 3) st_shortestline a new function that returns a line from the points where st_distance is measured. 4) st_longestline, the same as shortestline but to the line from maxdistance.

I have had some problem with my compilation bahaving differnt from time to time but noe I think it seems to work.

There is a lot of naming that is just temporary and I don't know what conventions to follow.

hope for some feedback Nicklas Avén

comment:2 by pramsey, 15 years ago

Check out the STYLE document at the source root.

comment:3 by post...@…, 15 years ago

Some news

I've done som changes som now I think the st_maxdistance function handles polygons with holes. It ignores every intersection and just continue to find the to points most far from each other in the two input geometries. A nice result from that is that if you input the same geometry both as geometry1 and geometry2 you get the longest distance from one part of the geometry to another part.

I have also atached some images showing my results with st_shortestline and st_longestline. The length of these lines is the same as the distances of st_distance and st_maxdistance. I just made images of some polygonexamples, but I thing it seems to work propably with points and lines too.

I have also tried to follow the Style document and I have done some renaming and commenting. I guess my renaming makes the patch so much bigger now.

the syntax for the new functions is just: st_maxdistance(geometry,geometry) st_shortestline(geometry,geometry) st_longestline(geometry,geometry)

Nicklas Avén

comment:4 by lr1234567, 15 years ago

Nicklas this all sounds good. I'm hoping to try this out on a couple of use cases once I have the other testing we are doing done for our frozen release.

One small thing and people can correct me if they feel this is unnecessary.

Can you split this out. By that I mean.

1) The st_max_distance function I consider a bug fix since we have a non-working stub function for it already, not a new function so given our new unspoken policy it is probably fine to push this to 1.3.7, 1.4.1 2) st_distance enhancements since it doesn't affect exposed api can also be considered 1.3.7 , 1.4.1 material

3) but exposing the other new functions even though it sounds like they support max distance, are new functions and those would require we wait till 1.5 or gasp 2.0.

Is there anyway you can separate these out even if you just don't include the new functions in postgis.inc.sql.c.

Mark, Paul or Kevin can correct me if I am wrong on this request.

Thanks, Regina

comment:5 by lr1234567, 15 years ago

Slight clarification. This will sound kind of odd.

My feeling is include the new functions as dependency functions, but don't expose them in the .sql file. It sounds petty I guess, but would make I think dropping in a new binary easier for people.

Thanks, Regina

comment:6 by post...@…, 15 years ago

Regina, sounds ok I think

I have commented out the longest and shortest line-functions in postgis.inc.sql.c

Then I have made I very small change in moving some declarations because of a warning that ISO C90 don't want mixed declarations and code. So now I get no new warnings when I'm compiling.

One question:

In the function st_shortestline, if the two geometries intersect the returned line has both the start and the end-point in the first intersection-point. I thing that's a logical result and useble with st_startpoint or st_endpoint to identify the intersection if distance is 0. But there are severel occasions when the distance is 0 but there is no intersection-point like one geometry inside polygon. Now you get a line with starting and ending at origo. That's not good. Is a empty geometry the best or…

/Nicklas Avén

comment:7 by robe, 15 years ago

Nicklas, I think in those cases, its better to return null rather than an empty geometry. The main reasons I think its best is that I think that's the way the other functions in PostGIS work when no valid geometry can be given (such as when you ST_Simplify something too much).

The other reason is that if someone is using this to insert/update a geometry column, then most likely they have a constraint that allows null or a linestring or multilinestring. If you return empty geometry, they would need to do checking to check if its empty before they can insert — which adds another level of annoyance.

The 3rd reason is that most geos functions don't like GEOMETRY collections and barf when they see one — so if it is used as a preprocessor for something else — it may halt execution. (and an empty geometry I think is considered a geometry collection — or hmm maybe we really do support the concept of an empty LINESTRING though not sure what GEOS does with such a thing). I think Paul may have been working on that.

Alternatively I suppose you could return a point (hmm or maybe not since you would still have the annoyance of checking and the function is after all called the shortest_line).

comment:8 by post...@…, 15 years ago

Here is a new patch in wich I try to preserve the order of incoming geometries. If that order isn't preserved st_shortestline will be quite useless.

I'm not totally sure that I have succeeded to think of all possibilities where the order will be mixed up. But I will look through it some more.

In this patch the bug discussed in #146 should be fixed to.

Now, let me just give an example of the use of the function st_shortestline, why I think it should be in postgis. It is efficient and simple to use for snapping. Let's say you have an GPS-point taken on a road some where. The point will not be exactly on the road beacause of precision issues both to the GPS and the map. To find the closest point to the gps-point on the road-line using st_shortestline we just do: SELECT st_endpoint(ST_shortestline(g1,g2)) As pointonroad from (select ST_geomFromEWKT('POINT(2 4)') As g1, ST_geomFromEWKT('LINESTRING(1 2, 2 5)') As g2) a;

Here the order of inputing geometrys to st_shortestline is of importance.

/Nicklas Avén

comment:9 by robe, 15 years ago

Description: modified (diff)
Milestone: 2.0

comment:10 by robe, 15 years ago

Owner: set to robe
Status: newassigned

comment:11 by robe, 15 years ago

Type: taskpatch

comment:12 by robe, 15 years ago

Summary: st_distance, shortestline (, st_maxdistance)st_distance, shortestline

Change title and deleted attachments per Nick's request. He will put maxdistance patch in the maxdistance bug report

by nicklas, 15 years ago

Attachment: ShortestLongest.patch added

by nicklas, 15 years ago

Attachment: shortLong_doc.patch added

Documentation patch for shortestline and longestline

comment:13 by nicklas, 15 years ago

Summary: st_distance, shortestlineST_ShortestLine, ST_LongestLine, ST_max_distance, ST_DFullyWithin
Type: patchenhancement

Here we go again

I'm sorry to admit that the patch is still not tested properly. I will try to get time to understand more about how to build the tests, but if someone have the possibility to help it would be great.

In the patch is the functions: ST_ShortestLine(geometry, geometry) ST_LongestLine(geometry, geometry) ST_Max_Distance(geometry, geometry) ST_DFullyWithin(geometry, geometry, double precision)

I have tried to reflect the correct in doc/reference.xml postgis.sql.in.c uninstall_postgis.sql.in.c

The new approach to the distance-calculations seems to be more efficient in iterating through the vertexes. When using it on big geometries with many vertexes it will be significant faster, but on small geometries there is almost no difference.

/Nicklas

comment:14 by nicklas, 15 years ago

OBS! please ignore the patches with old dates. Maybe someone can remove them. /Nicklas

by nicklas, 15 years ago

regression tests included

comment:15 by nicklas, 15 years ago

The patch: shortest_longest_fullywithin_new_poly_poly.zip includes a quite wild rewrite of the old: distance2d_poly_poly in measures.c

it makes a big difference on ugly polygons. Before, if there was many holes it did the same iteration for every hole. In my tests it reduces querytime to between 10-15 % on the real ugly ones. I have added a patch in ticket #63 for the 1.4 branch /Nicklas

comment:16 by robe, 15 years ago

Nicklas,

Sorry I haven't been very encouraging looking at the patches. I promise once things settle down and we hmm finally release 1.4.0 (HINT HINT). I'll pay more attention to your patch.

I do have one for you to try out. I took united states (fairly high resolution version) and unioned each together to get an individual record for each state. To demonstrate the cascade union power of 1.4, but now ST_Distance is hmm - incredibly slow.

So here is one for you to chump on:

SELECT a.state As st_a, b.state As st_b, ST_Distance(a.the_geom, b.the_geom) As dist_m

FROM states AS a

CROSS JOIN states AS b

WHERE a.state = 'Maine' and b.state = 'Rhode Island';

The above query takes 4.7 seconds on our current 1.3 and 1.4 distance implementations and returns

131103.250239575

If I do distance between Texas and California — lets just say its so painful, I don't even have the patience to wait for it to complete.

So I would be interested in seeing what timings you get with these and the answers with your revised routines.

I'll attach the data set shortly.

by robe, 15 years ago

Attachment: states.backup added

PostgreSQL backup of US States table

comment:17 by robe, 15 years ago

Update — I mustered enough patience to wait

—yields — 745222.745755735 meters - 90407ms

SELECT a.state As st_a, b.state As st_b, ST_Distance(a.the_geom, b.the_geom) As dist_m

FROM states AS a

CROSS JOIN states AS b

WHERE a.state = 'California' and b.state = 'Texas';

comment:18 by nicklas, 15 years ago

I get the hint :-) It seems like a long way for me to get that make check going in windows, but I will give it a try later today…

About your states the result wasn't that impressing. I thought it would be a little bit better, but the real gain is when there is holes in the polygons that can be sorted away. I also think there is tuning to be done. But it is faster :-)

My results:

the first query: postgresql 8.3 postgis 1.3.6 with no patch 4371ms result 131103.250239575 postgresql 8.4 postgis 1.4.0SVN with no patch 3232ms result 131103.250239575 postgresql 8.4 postgis 1.4.0SVN with patch 2327ms result 131103.250239575

the second query: postgresql 8.3 postgis 1.3.6 with no patch 89485ms result 745222.745755735 postgresql 8.4 postgis 1.4.0SVN with no patch 63197ms result 745222.745755735 postgresql 8.4 postgis 1.4.0SVN with patch 44423ms result 745222.745755735

Beside from this I noted something very strange. Consequently the cpu is behaving differently in the different upsets. In this laptop there is a Intel Core 2 Duo (Conroe). I don't have any tools to analyze this more precise, but in the task manager I can see that in the first case with postgresql 8.3 and postgis 1.3.6 the load is balanced between the cores. Total cpu-usage is about 50 % and it uses about 50 % per core, but in the two cases with postgresql 8.4 with postgis 1.4, both with and without the patch it takes all the power from one of the cores. Still getting 50 % together, but taking it all from one of them. Then I thought that this came from something in postgresql of course, but running a patched version of the trunk is getting the same behavior as pg 8.3 postgis 1.3. Now it is agin balancing the cores and using about 50% from each.

I don't know if the graphics of the task manager is trustable but anyway it shows a difference between 1.4 and the other two, 1.3. and 1.5trunk.

can postgis in anyway affect the cpu-behavior??

/Nicklas

comment:19 by nicklas, 15 years ago

sorry, my little table with results get messed up here because the newline where igored. With som good will it's readable I hope.

comment:20 by nicklas, 15 years ago

Owner: changed from robe to nicklas
Status: assignednew

comment:21 by nicklas, 15 years ago

Resolution: fixed
Status: newclosed
Summary: ST_ShortestLine, ST_LongestLine, ST_max_distance, ST_DFullyWithinST_ShortestLine, ST_LongestLine, ST_max_distance, ST_DFullyWithin (included in #231)
Note: See TracTickets for help on using tickets.