Opened 16 years ago

Last modified 16 years ago

#51 closed enhancement (fixed)

Enhancement to ST_Summary to include info about invalid geometries

Reported by: robe Owned by: robe
Priority: low Milestone: PostGIS 1.4.0
Component: postgis Version: 1.4
Keywords: Cc:

Description

Such a function would definitely be nice to see. Perhaps a set-returning function that returns the problematic locations (ie. a geometry collection of the points or lines of intersections) and a brief text field describing the error (ie. 'self-intersecting boundary', 'hole outside polygon', 'shared edge in multipolygon', …)

Add it to the wish list :)

— Kevin

Obe, Regina wrote:

For some reason I always expected the summary function or at least have an accompanying function that does that would flag the details of invalid geometries as part of its inspection and give info similar to what ST_Invalid NOTICE gives. Why I would think such a silly thing is beyond me, or maybe such a function exists and I haven't noticed.

Is it a very costly check? Seems like it would be useful. The problem with the ST_IsValid notice is that it comes at you as one continuous stream - so if you have a whole slew of geometries, you have no idea which notice goes with what. I got one of those such cases from nowhere with 100 of 6000 someodd geometries being invalid and I was hoping to generate a report detailing to the provider why there dataset sucks and what exactly is wrong with it.

Thanks, Regina

Change History (9)

comment:1 by pramsey, 16 years ago

ST_IsValidReason in trunk for testing (requires GEOS 3.1+)

comment:2 by robe, 16 years ago

Not sure I'm missing something - but 3.1 compiles but my make check fails and all my databases still register as 1.3.4. This is testing on Open SUSE 11.0/Geos 3.1.0 latest SVN/Postgis 1.4 SVN.

When I look at regress log it gives this error

DETAIL: Creating a shell type definition. psql:lwpostgis.sql:55: ERROR: could not load library '/usr/lib/postgresql/lwpostgis.so': /usr/lib/postgresql/lwpostgis.so: undefined symbol: GEOSisValidReason

This is on my OpenSUSE 11.0 box and this is the first time I compiled postgis 1.4 on this box so could be something wrong with my install.

Also I was going to ask this before but 1.3.4 when you do all the configure stuff, it lists the paths of the things its going to use. 1.4 doesn't do that. Is there a reason why 1.4 doesn't do that? I thought it a nice feature of 1.3.4.

comment:3 by pramsey, 16 years ago

(a) you need to 'svn up' your GEOS, make and make install it before you compile your PostGIS (b) don't forget that Mark changed the install location in 1.4 (I think we might want to talk about it, it's a potential nightmare) so that the 'real' lwpostgis.sql is located in share/contrib now, not a level lower. if you install the lower you, you end up with a 1.3 database (since probably that's what you put there)

comment:4 by robe, 16 years ago

Yap figured that out. I thought I had done that, but I guess I was mistaken. I was using the sites svn nightly build which normally works, but I guess your changes were too new that they hadn't hit that. Anyrate it compiles now.

Once I test out - I'll put an entry in the docs and note that its available for PostGIS 1.4+ and requires GEOS 3.1.0+ (emm why don't we have something on GEOS 3.1.0 version info that denotes it hasn't been released yet.)

Thanks, Regina

comment:5 by robe, 16 years ago

Okay tested and documented. Also verified correctly throws an error rather than crashing when fed circular strings and also returns null when fed null.

comment:6 by mcayland, 16 years ago

pramsey: Nice work - but unfortunately r3341 breaks the build for GEOS 3.0! The fix is to wrap the entire function in an #if POSTIGS_GEOS_VERSION.. #endif section so that it only gets compiled with GEOS ≥ 3.1. I'm fairly snowed under at the moment, so feel free to add this if you manage to get a chance.

About the lwpostgis.sql install location: this is handled automatically by PGXS, so it's not something we have direct control over :( Then again, for me it was a bugfix anyway since the 1.3 hacked-up Makefiles always used to drop it in /usr/share which was totally wrong anyway…

comment:7 by pramsey, 16 years ago

Wrapped.

comment:8 by kneufeld, 16 years ago

I'm just getting around to testing this. Good work Paul. It does seem like a good start in a collection of geometry cleanup functions.

For my use cases, however, it would need to be a set returning function that returns all the locations where a geometry falls down … as geometries.

ie. select st_isvalidreason(

'POLYGON (( 0 0, 0 4, 4 4, 4 0, 0 0 ),

( 3 5, 2 5, 2 6, 3 6, 3 5 ), ( 2 2, 2 3, 3 3, 5 2, 2 2 ))'::geometry);

st_isvalidreason


Self-intersection [4 2.5]

(1 row)

should also report a 'Hole lies outside shell [3 5]' … but this is expected since this function is not a set returning function.

We already have ST_Dump, ST_DumpRings, (and ST_DumpPoints on the wishlist). It would be really nice to add ST_DumpInValidReason that returns a 'setof geometry_dump'. IE. return all the self-intersection points/lines, all holes that are outside, etc.

— Queries like this would then be possible. SELECT feat_id, (ST_DumpInValidReason(the_geom)).geom FROM myfeatures WHERE NOT ST_IsValid(the_geom);

Martin says that this information is available in JTS, just not exposed yet, so it would be some work to get it into PostGIS. But, does this sound like a do-able thing to add to the wishlist? I know I've definitely needed something like this in the past. Is this overloading the is_valid concept too much? I'm just thinking that having an exhaustive set of cleaning functions would be a very useful thing to add to PostGIS.

In the mean time, is it possible the prototype of this function change to something that returns a geometry instead of a textual description? Perhaps both .. a single instance of CREATE TYPE invalid_reason AS

(description text,

geom geometry);

— Kevin

comment:9 by pramsey, 16 years ago

The GEOS library only holds one reason at the moment (the last?)… so that's what you get for now. If Martin can provide a patch to expose all the reasons, we can move to returning more. I don't mind returning a complex type instead of just the string, if you're too damn lazy to parse out the error point :)

Note: See TracTickets for help on using tickets.