Opened 11 years ago
Closed 10 years ago
#2372 closed defect (fixed)
KML with space around ordinate values considered invalid
Reported by: | strk | Owned by: | strk |
---|---|---|---|
Priority: | high | Milestone: | PostGIS 1.5.9 |
Component: | postgis | Version: | 1.5.X |
Keywords: | history, kml | Cc: | rouault |
Description
Works:
ST_GeomFromKML('<Point><coordinates>0,0</coordinates></Point>')
Fails:
ST_GeomFromKML('<Point><coordinates>0, 0</coordinates></Point>')
Fails:
ST_GeomFromKML('<Point><coordinates>0 ,0</coordinates></Point>')
If the spaces are before X or after Y ordinates, it still works.
Broken from 1.5 to 2.2..
Change History (28)
comment:1 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:2 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | assigned → new |
comment:4 by , 11 years ago
I've a working strtod implementation but to my surprise this case is under regression testsuite as one that is supposed to fail. Really !? Why ?
From regress/in_kml.sql:
-- ERROR: Spaces insides SELECT 'coordinates_14', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>1, 2 3, 4</kml:coordinates></kml:LineString>'));
comment:5 by , 11 years ago
My implementation supports the above interpreting it as SRID=4326;LINESTRING(1 2,3 4)
, which I find correct
comment:6 by , 11 years ago
Another 2 that my implementation supports:
<Point><coordinates>1,.1</coordinates></Point> becomes POINT(1.0 0.1)
<Point><coordinates>1,-.1</coordinates></Point> becomes POINT(1.0 -0.1)
Problems with that ?
comment:7 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:8 by , 11 years ago
Keywords: | history kml added |
---|---|
Resolution: | → fixed |
Status: | assigned → closed |
comment:9 by , 11 years ago
Is it really a bug ? As whitespace mean tuple separator in KML spec (cf section 16.9.1)
On the other hand, i'm fine with .1 and -.1 support. Maybe worth also to look on ST_GeomFromGML one as they were both written closely…
comment:10 by , 11 years ago
Better accepting it that bailing out, if it can be interpreted univoquely, don't you agree ? See the NASA kml, it's a real world case. Where's the KML spec ?
comment:11 by , 11 years ago
The KML spec is provided by OGC http://www.opengeospatial.org/standards/kml
The paragraph related to coordinatesType seems to me quite clear:
"String representing one or more coordinate tuples, with each tuple consisting of decimal values for geodetic longitude, geodetic latitude, and altitude. The altitude component is optional. The coordinate separator is a comma and the tuple separator is a whitespace. Longitude and latitude coordinates are expressed in decimal degrees only."
comment:12 by , 11 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
comment:13 by , 11 years ago
The spec is still respected, comma separates ordinate values, space separates tuples. Only we now additionally support blank characters surrounding the comma, what's the problem with that ? I don't think the paragraph you report forbids that.
Have you tried the NASA KML against google earth or some other viewer ? I tried it here: http://www.doogal.co.uk/KmlViewer.php (can read it)
According to the website from which the KML was downloaded, these KML can be directly opened in Google Earth: http://earthdata.nasa.gov/data/near-real-time-data/firms/active-fire-data#tab-content-2
Do we want to be picky and NOT support them because OGC wording is open to stricter interpretation than KML writers may think ?
comment:14 by , 11 years ago
No the spec is NOT respected anymore. As if you use the same char (space here) for tuple separator AND for coordinate separator you have no control on what you are parsing (meaning tuples and dimensions)
Few samples:
SELECT 'Valid KML', ST_AsEWKT(ST_GeomFromKML(' <LineString> <coordinates>-71.1663,42.2614 -71.1667,42.2616</coordinates> </LineString>')); SELECT 'Junk KML I', ST_AsEWKT(ST_GeomFromKML(' <LineString> <coordinates>-71.1663,42.2614, 12 -71.1667, 42.2616,14</coordinates> </LineString>')); SELECT 'Junk KML II', ST_AsEWKT(ST_GeomFromKML(' <LineString> <coordinates>-71.1663,42.2614, 12 42.2616, 14</coordinates> </LineString>'));
comment:15 by , 11 years ago
The "Junk KML I" I'd consider a valid 3D linestring, the "Junk KML II" I'd indeed call invalid,and the current code doesn't complain (new bug).
comment:16 by , 11 years ago
But how could 'Junk KML I' be a valid KML 2.2 3D Linestring ? It just broke the spec.
I could hear that from your point of view you it's a progress to allow some invalid data to also be parsed. But from my point of view it's a regress, because we loose the validation information.
Adding a ST_MakeValidKML seems clearly overkill, but an extra parameter to ST_GeomFromKML allowing the end user to switch from one behaviour to an another could make sense at the very minimum printing a NOTICE each time we met an invalid data.
For Junk KML II, it's was just a sample, there's possibly some other use case quite similar
comment:17 by , 11 years ago
I think 'Junk KML I' is valid because it contains 2 coordinates of 3 dimensions each, with each ordinate value separated by comma and coordinates separated by _only_ spaces.
Instead 'Junk KML II' is invalid because it contains a mix of 3d and 2d coordinates. I tried a version of 'Junk KML II' without spaces and guess what ? The old implementation considers it valid:
SELECT 'Junk KML II no spacepad', ST_AsEWKT(ST_GeomFromKML(' <LineString> <coordinates>-71.1663,42.2614,12 42.2616,14</coordinates> </LineString>'));
So accepting it is not even a regression.
You're right I generally think it's better to accept invalids and clean within the database, but anyway I've not seens a KML in the wild with mixed coordinate values so it shouldn't be a problem to bail out in that case. It'd be a matter of _checking_ num_dims variable to match the previous value whenever a new coordinate is pushed on the stack. Do you want me to do that ?
comment:18 by , 11 years ago
Cc: | added |
---|
Thanks to rouault here's another reference: https://developers.google.com/kml/documentation/kmlreference?hl=fr#point
This one is even more explicit about spaces NOT being allowed (agreeing with Colivier): "Do not include spaces between the three values that describe a coordinate."
Still, I think we should follow the robustness principle "Be conservative in what you do, be liberal in what you accept from others".
The GDAL ticket: http://trac.osgeo.org/gdal/ticket/5140
comment:19 by , 11 years ago
I'm OK to be liberal, on this one, as soon as:
- we keep the ability to also known if the parsed data was (or not) valid according the spec
- very obviously if we still able to rightly parse every use case needed by the initial spec.
I suggested printing NOTICE, or extra parameter to do so, do you have a preference or an another idea ?
comment:20 by , 11 years ago
See how you like r11603 for handling the mixed-dimensions KML. Note it will still fail to detect multi-component geometries with different components having different dimensions (but this isn't a regression).
About notifying not-strictly-valid I think raising a NOTICE is going to be the simplest implementation as it won't introduce any new parameter.
comment:21 by , 11 years ago
For the NOTICE, when do we want to raise it ? Should space ONLY be allowed to separate coordinates ? In other words, are these all to raise a NOTICE ?
1: <> 1,2<> 2: <>1,2 <> 3: <>1 ,2<> 4: <>1, 2<>
And what about multiple spaces ? (as the spec say "a whitespace"):
5: <>1,2 1,2<>
comment:22 by , 11 years ago
I will take time to check r11603 since tomorrow noon.
Related to NOTICE, every time we accept to parse the input as it's indeed not conform to the spec.
Space is tuple separator so all theses use cases should falling in invalid ones. (1,2,5 previously was tolerated, but no reason to do so with NOTICE)
comment:23 by , 11 years ago
Humm i've replyied a bit too fast on my latest post:
After checking back in OGC KML 2.2 spec, it itself contains some examples (e.g in 9.10.4) allowing space before/after tuples, or also several spaces as a tuple separator
So explain why 1,2,5 use cases were previously allowed, in first coordinates parser implementation.
So mean NOTICE only on 3 and 4 uses cases.
I've give a look on latest trunk Here an enhanced version of coordinates regress/in_kml.sql who should reflect current discussion:
-- -- Coordinates -- -- X,Y SELECT 'coordinates_1', ST_AsEWKT(ST_GeomFromKML('<kml:Point><kml:coordinates>1,2</kml:coordinates></kml:Point>')); -- ERROR: single X SELECT 'coordinates_2', ST_AsEWKT(ST_GeomFromKML('<kml:Point><kml:coordinates>1</kml:coordinates></kml:Point>')); -- X,Y,Z SELECT 'coordinates_3', ST_AsEWKT(ST_GeomFromKML('<kml:Point><kml:coordinates>1,2,3</kml:coordinates></kml:Point>')); -- ERROR: 4 dimension SELECT 'coordinates_4', ST_AsEWKT(ST_GeomFromKML('<kml:Point><kml:coordinates>1,2,3,4</kml:coordinates></kml:Point>')); -- ERROR: Only commas SELECT 'coordinates_5', ST_AsEWKT(ST_GeomFromKML('<kml:Point><kml:coordinates>,</kml:coordinates></kml:Point>')); SELECT 'coordinates_6', ST_AsEWKT(ST_GeomFromKML('<kml:Point><kml:coordinates> , </kml:coordinates></kml:Point>')); -- ERROR: empty or spaces SELECT 'coordinates_7', ST_AsEWKT(ST_GeomFromKML('<kml:Point><kml:coordinates></kml:coordinates></kml:Point>')); SELECT 'coordinates_8', ST_AsEWKT(ST_GeomFromKML('<kml:Point><kml:coordinates> </kml:coordinates></kml:Point>')); -- ERROR: End on comma SELECT 'coordinates_9', ST_AsEWKT(ST_GeomFromKML('<kml:Point><kml:coordinates>1,2,3,</kml:coordinates></kml:Point>')); SELECT 'coordinates_10', ST_AsEWKT(ST_GeomFromKML('<kml:Point><kml:coordinates>1,2,</kml:coordinates></kml:Point>')); -- ERROR: Begin on comma SELECT 'coordinates_11', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>,1 2,3</kml:coordinates></kml:LineString>')); -- Whitespaces before and after SELECT 'coordinates_12', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates> 1,2 3,4 </kml:coordinates></kml:LineString>')); SELECT 'coordinates_13', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates> 1,2 3,4 </kml:coordinates></kml:LineString>')); -- NOTICE: Spaces insides -- Cf #2372 SELECT 'coordinates_14', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>1, 2 3,4</kml:coordinates></kml:LineString>')); SELECT 'coordinates_15', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>1,2 3 ,4</kml:coordinates></kml:LineString>')); SELECT 'coordinates_16', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>1,2, 3 4, 5,6</kml:coordinates></kml:LineString>')); -- Several spaces as tuples separator SELECT 'coordinates_17', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>1,2 3,4</kml:coordinates></kml:LineString>')); SELECT 'coordinates_18', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates> 1,2 3,4 </kml:coordinates></kml:LineString>')); -- ERROR: Mixed dimension SELECT 'coordinates_19', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>1,2 3,4,5</kml:coordinates></kml:LineString>')); SELECT 'coordinates_20', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>1,2,3 4,5</kml:coordinates></kml:LineString>')); SELECT 'coordinates_21', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>1,2 ,3</kml:coordinates></kml:LineString>')); SELECT 'coordinates_22', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>1, 2,3</kml:coordinates></kml:LineString>')); SELECT 'coordinates_23', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>1,2,3 ,4,5</kml:coordinates></kml:LineString>')); SELECT 'coordinates_24', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>1,2, 3,4,5</kml:coordinates></kml:LineString>')); -- ERROR: Mixed dimension and extra tuple separator (Cf #2372) SELECT 'coordinates_25', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>1,2 3,4, 5</kml:coordinates></kml:LineString>')); -- ERROR: Junk SELECT 'coordinates_26', ST_AsEWKT(ST_GeomFromKML('<kml:LineString><kml:coordinates>!@#$%^*()"</kml:coordinates></kml:LineString>'));
comment:24 by , 11 years ago
Colivier: could you provide testsuite changes in form of a patch ? Please patch both .sql and _expected (with the expected message).
comment:26 by , 10 years ago
Resolution: | → wontfix |
---|---|
Status: | reopened → closed |
comment:27 by , 10 years ago
Resolution: | wontfix |
---|---|
Status: | closed → reopened |
the issue in the ticket summary was actually fixed (ie: now space around ordinate values are accepted)
comment:28 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
raising NOTICEs for not strictly-valid cases could be another ticket (@coliver, if you want)
Example KML containing the unsupported representation: http://firms.modaps.eosdis.nasa.gov/active_fire/kml/Europe_24h.kml