Opened 11 years ago

Closed 11 years ago

#332 closed defect (fixed)

WKT geometry tag match and mismatch

Reported by: mloskot Owned by: pramsey
Priority: medium Milestone: PostGIS 1.5.0
Component: postgis Version: master
Keywords: Cc:

Description

Depending on the manner of geometry construction from Well-Known-Text representation, WKT geometry tag is case sensitive or not (POLYGON vs polygon).

Here are details:

  • Construction using ST_GeomFromText is case-sensitive
    test=# SELECT ST_GeomFromText('polygon((0 0,4 0,4 4,0 4,0 0))');
    ERROR:  Invalid OGC WKT (does not start with P,L,M,C or G)
    
  • Construction using explicit cast to GEOMETRY is not case-sensitive
    test=# SELECT 'polygon((0 0,4 0,4 4,0 4,0 0))'::geometry;
                                                                                              geometry
    
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     010300000001000000050000000000000000000000000000000000000000000000000010400000000000000000000000000000104000000000000010400000000000000000000000000000104000000000000000000000000000000000
    (1 row)
    
    test=# SELECT ST_AsText('polygon((0 0,4 0,4 4,0 4,0 0))'::geometry);
               st_astext
    --------------------------------
     POLYGON((0 0,4 0,4 4,0 4,0 0))
    (1 row)
    

Environment:

  • OS: Windows
  • PostgreSQL
    test=# select version();
                               version
    -------------------------------------------------------------
     PostgreSQL 8.4.0, compiled by Visual C++ build 1400, 32-bit
    (1 row)
    
  • PostGIS installed from Regina's packages
    test=# SELECT postgis_full_version();
                                      postgis_full_version
    ----------------------------------------------------------------------------------------
     POSTGIS="1.5.0SVN" GEOS="3.2.0-CAPI-1.6.0" PROJ="Rel. 4.6.1, 21 August 2008" USE_STATS
    (1 row)
    

Change History (10)

comment:1 Changed 11 years ago by pimpaa

I'm not sure if this is the desired behavior, but when using

SELECT * FROM AddGeometryColumn?('public','foo','the_geom',-1,'point',2);

gives me an error too.

Only works with UPPERCASE geometry descriptors (windows, postgresql 8.3, postgis 1.4.0)

comment:2 Changed 11 years ago by pramsey

Milestone: PostGIS 1.5.0

The behavior comes from here

http://trac.osgeo.org/postgis/browser/trunk/postgis/lwgeom_ogc.c#L1045

The intent to is to only allow true OGC WKT in via the FromText?() function, even though other forms of text geometry (hex-encoded strings, EWKT that starts with 'SRID=') can be capably parsed by the input parser. This little roadblock seems to violate the "be liberal in what you accept" precept. I'm more inclined to remove it than to patch it up for case sensitivity. Other developer thoughts?

comment:3 Changed 11 years ago by mloskot

The OGC SFS documented in 99-049 does not specify the handling of case of WKT grammar elements, so it is not incorrect to assume implementors can handle it differently.

However, more recent implementation specification published as document OGC 06-103r3 does specify it precisely in 7.2.1 BNF Introduction:

The text representation of the instantiable Geometry Types implemented shall conform to this grammar. Well known text is case insensitive.

This suggests that the current inconsistent behaviour might be also invalid - it depends if and to which version of OGC specification PostGIS claims to conform.

comment:4 Changed 11 years ago by pramsey

I'm not suggesting the case sensitivity might be valid, I'm suggesting the anal retentive check inside the GeomFromText?() function that ensures that we don't handle EWKT might be a bridge to far, and we should excise it and just pass the text directly to the parser for handling, as the 'text'::geometry cast already does.

comment:5 Changed 11 years ago by mcayland

I think that we should keep the check for the moment, since at the moment the behaviour is very clear: if it works with GeomFromText?() then it's standards-compliant, if it doesn't then it's clearly flagged with GeomFromEWKT().

Until we have a new standard to work to, I think changing this behaviour will only cause confusion. IMO the problem here is that the OGC SFS/SQL-MM specifications have lagged behind reality for years now (I suspect the 99- prefix denotes 1999) and don't cover things like predicates in multiple dimensions, EMPTY geometry semantics and geodetic systems. At least with the EWKT/EWKB functions what we're saying is that this extended functionality is ours, and so we have every right to change it as we need.

comment:6 Changed 11 years ago by mloskot

Paul, my imprecise understanding, got it now.

comment:7 Changed 11 years ago by robe

Mark,

I disagree with you on this point and mostly your train of thought rather than we should or shouldn't keep this. Well I don't think we should keep this as Paul refer's to it "Anal retentive check" now that I have heard your argument.

The reason being. The spec details the minimum that we should support. So any valid geometry we need to support with what we claim to be compliant :).

However the spec doesn't say anything about extending said support and databases do it all the time -- hell we are guilty of it in a number of functions. For example we go way beyond what the spec says we should support with ST_PointOnSurface and ST_Centroid (I discovered this when trying to figure out why the hell can't I do yeh in IBM and SQL Server and eh certainly not Oracle because, well they decided to not go further than what the spec dictated in yeh or neh or stick (I think Oracle stuck to the letter on those 2)).

The spec says nothing about support ST_Union for aggregates -- I don't think ST_Union is defined for aggregates and we even state that in our docs, and sure many spatial dbs decided that was hmm stupid.

So my general argument -- as long as we can support at least the minimum of what the spec says is valid, we are okay and its our liberty to go beyond that. IBM for one in all their examples I believe has propercase and all their examples show that. -- so I think its a no-brainer that we have to fix this to support all types of cases people who haven't discovered the cap lock may use or if they are just used to typing in a different case. That is clearly a bug and while we are fixing it, we might as well make our code shorter and easier to maintain.

comment:8 Changed 11 years ago by pramsey

I'm on Regina's side, but I want to make sure she understands what she's signing on to: right now, we only accept WKT in GeomFromText?, and upper-case WKT at that, because of the silly check. The silly check can be *enhanced* to also support lower case. Or it can be *removed* to support both cases and also accept EWKT and hex-encoded WKB. I prefer the latter.

comment:9 Changed 11 years ago by robe

Yes I'm aware of what I'm signing on to. I think for this whole OGC discussion -- its really more relevant what we do in output functions -- and for inputs that we accept at a minimum what the OGC/MM requires given OGC/MM inputs.

If we loosen our constraint a little -- it doesn't change prior behavior -- old apps work unchanged and behave exactly the same (except possibly they won't be whining so when given extended input or binary -- does anyone care? :)) . So new apps can get away with passing in 3D or SRID or binary. I don't see that as big enough issue to complicate our code over.

comment:10 Changed 11 years ago by pramsey

Resolution: fixed
Status: newclosed

OK, the check for OGC-standard WKT has been removed at r5001 and as a side-effect fixed this bug.

Note: See TracTickets for help on using tickets.