Ticket #1366 (closed feature: fixed)

Opened 5 years ago

Last modified 3 years ago

Atom Format

Reported by: seang Owned by: tschaub
Priority: critical Milestone: 2.9 Release
Component: Format Version: SVN
Keywords: Cc:
State: Complete

Description (last modified by tschaub)

Add an atom format for generating atom feed docs from features or atom entry docs from a single feature. Use georss:where to describe geometries. This will make use of the versioned GML parser (using v3 - which will conform with the simple features profile).

Attachments

atom.patch (20.8 KB) - added by tschaub 5 years ago.
adds an atom format
format-atom.patch (36.9 KB) - added by sgillies 4 years ago.
atom-format-2.patch (44.2 KB) - added by sgillies 4 years ago.
Read and write Atom content as text or DOM nodes.
atom.js (24.5 KB) - added by pwr 4 years ago.
revised class to handle projection for kml placemark nodes
1366.patch (45.7 KB) - added by tschaub 3 years ago.
atom format

Change History

  Changed 5 years ago by crschmidt

  • milestone set to 2.7 Release

Please make the GMLSF a subclass, probably named GML.SimpleFeatures (instead of GMLSF), subclassing from Format.GML instead of from Format.XML, and let me know when it's done, and I'll look at it.

(Hopefully, you only have to override one or two functions: It wasn't clear to me what actually changed, so I can't be sure yet.)

  Changed 5 years ago by tschaub

Working on this now (see Atom.js). I'd like to use this new style for XML parsers (see SLD). I'll add the versioning bit later. The idea is to make it easier to extend parsers with custom functionality (by creating custom reader and writer functions) and to share parsing code between formats (filter, sld, gml, etc). Also, this style of parser visits each node at most once - too many getElementsByTagNameNS calls will sink us. Requires a handful of additions to XML.js.
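
A minimal sketch of the reader-table style described above, with plain objects standing in for DOM nodes. The `readers` table, `readNode`, `readChildNodes`, and the node shape are assumptions for illustration and are not taken from the attached patch.

```javascript
// Hypothetical node shape: {ns, name, text, children}. Real code would walk DOM nodes.
var ATOM_NS = "http://www.w3.org/2005/Atom";

// Reader functions grouped by namespace URI, then by local name.
var readers = {};
readers[ATOM_NS] = {
    feed: function(node, obj) {
        obj.entries = [];
        readChildNodes(node, obj);
    },
    entry: function(node, obj) {
        var entry = {};
        readChildNodes(node, entry);
        obj.entries.push(entry);
    },
    title: function(node, obj) {
        obj.title = node.text;
    }
};

// Counts how many nodes were dispatched, to show each is visited once.
var visits = 0;

// Dispatch a single node to its reader, if one is registered.
function readNode(node, obj) {
    visits++;
    var group = readers[node.ns];
    var reader = group && group[node.name];
    if (reader) {
        reader(node, obj);
    }
    return obj;
}

// Walk the direct children once; each reader decides whether to descend.
function readChildNodes(node, obj) {
    var children = node.children || [];
    for (var i = 0, len = children.length; i < len; ++i) {
        readNode(children[i], obj);
    }
}

var doc = {ns: ATOM_NS, name: "feed", children: [
    {ns: ATOM_NS, name: "title", text: "Example feed"},
    {ns: ATOM_NS, name: "entry", children: [
        {ns: ATOM_NS, name: "title", text: "First entry"}
    ]}
]};
var result = readNode(doc, {});
```

Because dispatch is a table lookup per node, a format could share or override individual readers without rescanning the document for each element name.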

  Changed 5 years ago by crschmidt

  • priority changed from minor to critical

  Changed 5 years ago by euzuro

  • state set to Needs More Work

  Changed 5 years ago by euzuro

  • version changed from 2.5 to 2.7 RC1

Floating to RC1... vector-behavior patches will go in on RC2.

  Changed 5 years ago by tschaub

Not ready for review yet, just placing this here in case my machine gets hit by a bus over lunch.

The addition of the Atom format comes with a couple changes to the xml format. These will be of use (and will reduce code duplication) in the other (new style) parsers. I've put together tests for the XML changes. Tests for Atom to come (GML changes will be handled in #1639).

  Changed 5 years ago by tschaub

  • description modified
  • summary changed from Atom and GML Simple Features Formats to Atom Format

  Changed 5 years ago by tschaub

The updated patch only contains atom specific changes now. Required changes to the xml format are attached to #1722. This still produces gml with the parser in the trunk (gml 2 largely - so not in conformance with the simple features profile). So, this is still waiting for #1639.

Changed 5 years ago by tschaub

adds an atom format

  Changed 5 years ago by tschaub

This depends on #1639. This also does not yet conform to http://www.ietf.org/rfc/rfc4287.txt, so I'm leaving this "needs work." Not much, but I can't finish it now.

  Changed 5 years ago by euzuro

  • milestone changed from 2.7 Release to 2.8 Release

  Changed 4 years ago by crschmidt

Seems like this is pretty close, perhaps we want to actually pull it into 2.8?

  Changed 4 years ago by crschmidt

  • milestone changed from 2.8 Release to 2.9 Release

No love, bumping to 2.9.

  Changed 4 years ago by sgillies

  • version changed from 2.7 RC1 to SVN

For 2.9, I've attached a new OpenLayers.Format.Atom class and tests (all passing). It implements pretty much all of RFC 4287 except atom:source. Parsed Atom metadata goes into an "atom" namespace in feature attributes (feature.attributes.atom). atom:title and atom:summary are also copied to feature.attributes.title and feature.attributes.description.
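
A rough sketch of the attribute layout described above, using a plain object in place of a parsed entry. The helper name `toFeatureAttributes` and the fields other than title and summary are assumptions for illustration; the patch and its tests are the reference for the actual structure.

```javascript
// Hypothetical helper: builds the attribute layout from already-parsed
// Atom metadata (a plain object here, not a DOM node).
function toFeatureAttributes(atom) {
    // All Atom metadata lives under the "atom" key.
    var attributes = {atom: atom};
    // atom:title and atom:summary are additionally copied to top-level attributes.
    if (atom.title !== undefined) {
        attributes.title = atom.title;
    }
    if (atom.summary !== undefined) {
        attributes.description = atom.summary;
    }
    return attributes;
}

var attributes = toFeatureAttributes({
    id: "urn:example:entry-1",        // assumed field, for illustration
    title: "Somewhere",
    summary: "A place",
    updated: "2009-01-01T00:00:00Z"   // assumed field, for illustration
});
```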

The class uses the GML.v3 format to write out geometries and read children of georss:where elements. GeoRSS simple elements are read using code adapted from the GeoRSS format. It doesn't use the new XML parsing framework (readers and writers), but doesn't need to right now since there's not much to share with other formats.

Changed 4 years ago by sgillies

  Changed 4 years ago by sgillies

  • state changed from Needs More Work to Review

Review and commit, please.

  Changed 4 years ago by pwr

Would it not be better if <content> were configurable rather than tied to GeoRSS? For example, if it could be a KML placemark, then the format could also handle Google's new Maps Data API feeds.

  Changed 4 years ago by sgillies

Attached is a new patch adding write support for Atom content types other than "text", with tests of "text" and "application/vnd.google-earth.kml+xml".

Changed 4 years ago by sgillies

Read and write Atom content as text or DOM nodes.

  Changed 4 years ago by sgillies

The most recent patch now allows us to read and write Atom content as text or DOM nodes. Where RFC 4287 says content must be text, the code expects attributes.atom.content.value to be a string. Where the RFC says content may contain child elements, the code treats attributes.atom.content.value as a node. OpenLayers doesn't (afaik) have a hierarchical feature model, so it's up to the user to (using pwr's KML example) encode features to KML nodes before adding them to an Atom entry's content. See the tests for an example.
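
To illustrate the text-versus-node distinction, here is a simplified sketch based on my reading of the atom:content "type" rules in RFC 4287 (section 4.1.3). The helper name `contentValueKind` is hypothetical, the check covers only the "+xml" and "/xml" suffixes rather than every XML media type, and the patch may draw the line differently.

```javascript
// Hypothetical helper: returns "text" when content.value would be a string
// and "node" when the content may carry child elements.
function contentValueKind(type) {
    // A missing type attribute is treated as "text".
    var t = (type || "text").toLowerCase();
    if (t === "text" || t === "html") {
        return "text";
    }
    // xhtml content and XML media types may contain child elements.
    if (t === "xhtml" || /(\+xml|\/xml)$/.test(t)) {
        return "node";
    }
    // Remaining media types (text/* as text, others base64-encoded) are strings.
    return "text";
}

var kinds = {
    plain: contentValueKind("text"),
    kml: contentValueKind("application/vnd.google-earth.kml+xml"),
    missing: contentValueKind(undefined)
};
```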

  Changed 4 years ago by pwr

But that's still dependent on GeoRSS, as it expects geometries to be in georss:where nodes, which won't be the case for other content types.

As I see it, GeoRSS is not part of the Atom spec, so doesn't belong in Format.Atom. Atom is a 'wrapper' around content in another format, a format within a format, so I would have thought that Format.Atom would parse only the Atom elements and pass any content in other formats to the appropriate sub-format to parse.

There are additional complications in Google's feeds, as they also use the gd: namespace; gd:etag is needed for versioning on updates, for example. Not sure where that belongs: a separate Format.GD perhaps?

  Changed 4 years ago by pwr

To take this a bit further, I attach sgillies's class amended to handle feature content as either georss or kml placemark (I'm assuming they won't both be present ;-) ). There are 2 new options, georssContent and kmlContent. The latter instantiates a format (this may need expanding to cover things like optionally extracting style), and then uses that for read/write. If there are other possible content formats, it might be better to make this more generic, i.e. have a contentType option which determines which format should be used.
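
The more generic contentType idea could look roughly like the sketch below. The `contentParsers` registry and `parseContent` function are hypothetical names, and the parsers are stubs; a real version would presumably hand the payload to an actual format instance.

```javascript
// Hypothetical registry mapping an Atom content type to a parser for the payload.
var contentParsers = {
    "application/vnd.google-earth.kml+xml": function(payload) {
        // Stub: a real parser might delegate to a KML format's read method.
        return {source: "kml", payload: payload};
    }
};

// Pick a parser by content type; fall back to the caller-supplied default
// (for example, the existing georss:where handling) when none is registered.
function parseContent(contentType, payload, fallback) {
    var parser = contentParsers[contentType];
    return parser ? parser(payload) : fallback(payload);
}

// Stub fallback used when no parser is registered for the type.
function georssFallback(payload) {
    return {source: "georss", payload: payload};
}

var fromKml = parseContent("application/vnd.google-earth.kml+xml", "<Placemark/>", georssFallback);
var fromOther = parseContent("text", "plain text", georssFallback);
```

Adding support for another content type would then mean registering one more entry rather than adding another boolean option.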

I have left the georss logic as is, but IMO it should use Format.GeoRSS. parseLocations(), for example, is essentially the same as GeoRSS.createGeometryFromItem(), and to me it doesn't make sense duplicating this logic here.

I have tested this with Google's Maps Data api (using Protocol.HTTP via a proxy).

follow-up: ↓ 21   Changed 4 years ago by sgillies

If I've read the section above http://code.google.com/apis/maps/documentation/mapsdata/developers_guide_protocol.html#RetrievingFeatures right, this feature of yours is likely to break when Google starts putting multiple placemarks or folders in entry/content. In my own app, I will definitely be putting KML folders of placemarks in entry/content and then approximating the locations within the entry's georss:where.

But yes, it's pragmatic to try to marshal Google's feature feed entries into OpenLayers features, and the format switch feels right for now.

in reply to: ↑ 20 ; follow-up: ↓ 22   Changed 4 years ago by pwr

Replying to sgillies:

If I've read the section above http://code.google.com/apis/maps/documentation/mapsdata/developers_guide_protocol.html#RetrievingFeatures right, this feature of yours is likely to break when Google starts putting multiple placemarks or folders in entry/content.

What do you think will break? At the moment, the 'end product' isn't clear, as what Google have so far released is only a preliminary version, but a placemark in the feed should correspond to an OL feature, and OL's current Format should work whether there's only 1 placemark or 5000, no?

There is a limitation with Format.KML.write(), which at the moment only outputs name and description, not any of the attributes. I would assume that Google will use extended attributes in a future version for querying.

in reply to: ↑ 21   Changed 4 years ago by pwr

ok, I see what you're saying. It implies that an entry might at some point contain more than one placemark. However, if an entry continues to correspond to a (Google Earth) feature, that can't happen as kml is defined at the moment, as a placemark is a (Google Earth) feature with a geometry. I might raise this issue on Google's group.

  Changed 4 years ago by elemoine

  • keywords foss4g09 added

  Changed 4 years ago by elemoine

  • state changed from Review to Needs Discussion

I wanted to review this patch during the FOSS4G code sprint, but I actually don't know what the state of this patch is. Is it actually ready for review? pwr mentioned that the Atom format should use the GeoRSS format when georssContent is set, which would make sense to me. sgillies, pwr, please tell us where we stand with this patch and I'll look at it again. Thanks.

  Changed 4 years ago by elemoine

  • keywords foss4g09 removed

  Changed 4 years ago by sgillies

A while back I sent an email to dev@openlayers that never seemed to make it. It explained why I didn't use the "GeoRSS" format. In a nutshell: because the "GeoRSS" format is really mostly about RSS 2.0, not GeoRSS:

--

I've been slowly and intermittently working on Format.Atom, an RFC 4287-conforming Atom syndication format reader/writer for OpenLayers. There's a working patch attached to

 http://trac.openlayers.org/ticket/1366

It has pretty good test coverage and all tests pass; worth trying if you're working with this format.

Atom is open to extension in 2 different ways. You can put arbitrary data in an entry's atom:content element, and you can add extension elements to the entry (or feed) itself. GeoRSS extends Atom in the second way, adding extension elements (georss:where, georss:point, etc) to an entry.

Pwr (Paul Ramsey?) asks in  http://trac.openlayers.org/ticket/1366#comment:18 if Format.Atom shouldn't be passing GeoRSS namespace elements to Format.GeoRSS or, more generally, passing all extension elements to new OpenLayers formats. For an entry's payload, the data within atom:content, I think that's the right approach. An Atom entry might, as in the Google Maps Data API case, carry KML. That KML should be parsed by Format.KML. Note that Format.Atom doesn't do that for you right now because I don't want to presume how anybody wants to deserialize a placemark (or folder of placemarks) in the context of an Atom entry.

Now, Format.GeoRSS isn't about parsing GeoRSS extension elements in the Atom context, it's about parsing RSS 2.0 (with some hacks for Atom) -- and is a bit dated. We can't use it in the same way Format.Atom uses Format.GML. There's some potential for refactoring, I suppose, making a new Format.GeoRSS that is only concerned about geometries, but I don't have time for that now.

What to do about Atom extension elements from other namespaces like Google Data ("gd")? OpenLayers already has a framework for registering and finding formats by namespace, right? I suppose we should use that for everything other than GeoRSS.

  Changed 4 years ago by pwr

Atom's a bit of a special case, as it is not in itself a spatial format, so differs from the normal OL Format class. From my point of view, I was looking for a way of testing Google's Maps Data api, and with some minor tweaking this format fitted the bill.

Specifically on the Maps Data API, there are 2 further points:

1. Google are now talking of transmitting feature attributes as gd:customAttribute elements, in which case the format would have to be changed to handle those - I think this highlights how different implementations of atom are likely to require custom handling.

2. In reality, this format wouldn't be used much, as Google also offer js client libs which handle the atom parsing for you. They also use some jiggery-pokery with css urls and the like to get round the cross-domain restrictions; this seems to work well, even with largish datasets, and has the big advantage that you don't need a proxy to communicate with Google's servers. I have written a protocol class to handle that with OL, but am holding fire until Google release the new version they've been promising 'soon' for several months now . . .

This patch as it stands does work, so perhaps it can be committed now as a first draft with a note that it and the handling of georss and other elements could use some refactoring/improving in the future, as sgillies says above.

  Changed 4 years ago by pwr

This patch as it stands does work

hmm, I spoke too soon. The internal/externalProjection on my kml format is not being set, so coordinate transforms aren't being performed.

I'll try and get a fix out tomorrow.

Changed 4 years ago by pwr

revised class to handle projection for kml placemark nodes

follow-up: ↓ 30   Changed 4 years ago by pwr

new version submitted to handle coordinate transforms when creating the kml format obj. Assumes that in/externalProjection are set in the (atom) constructor options; setting them after creating the (atom) format won't have any effect as the (kml) format won't be changed.
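
The constructor-time limitation mentioned above can be shown with two stand-in classes. `AtomFormat` and `KmlFormat` here are hypothetical stubs, not the OpenLayers classes, and projections are plain strings for simplicity.

```javascript
// Hypothetical stand-in for the inner KML format.
function KmlFormat(options) {
    this.internalProjection = options.internalProjection;
    this.externalProjection = options.externalProjection;
}

// Hypothetical stand-in for the Atom format.
function AtomFormat(options) {
    options = options || {};
    this.internalProjection = options.internalProjection;
    this.externalProjection = options.externalProjection;
    // The inner format is created once, copying the projections as they are now.
    this.kmlFormat = new KmlFormat(options);
}

// Projections passed to the constructor reach the inner format.
var atom = new AtomFormat({
    internalProjection: "EPSG:900913",
    externalProjection: "EPSG:4326"
});

// Projections assigned afterwards do not: late.kmlFormat never sees them.
var late = new AtomFormat();
late.internalProjection = "EPSG:900913";
late.externalProjection = "EPSG:4326";
```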

sgillies, ISTM that you are transforming georss coordinates on input, but not output, no?

in reply to: ↑ 29   Changed 4 years ago by pwr

Replying to pwr: oh yes, and it also sets extractStyles, currently only a one-way process

  Changed 4 years ago by sgillies

Elemoine, I'm not inclined to incorporate pwr's atom.js (without tests) into my patch.

  Changed 4 years ago by pwr

well, I can certainly incorporate my proposal into the test script, but I've not used OL's test framework before so I'd have to look into it.

A question of priority I suppose. I don't see having the Maps Data handling in there as a high priority, so if people need atom/georss now I've no problem with leaving the kml handling out for the moment.

However, longer-term I do think the atom format needs to be more flexible. Google alone uses atom for 3 different types of spatial content that I'm aware of: Picasa uses georss, Maps Data uses kml placemark, and Base Data uses <g:location><g:latitude><g:longitude> nodes. Ideally, OL's atom format should cater for all of them.

  Changed 4 years ago by sgillies

Yes, let's leave the KML handling out and get this committed. I think Format.Atom could easily be extended by new precisely-suited formats (Format.GMapsData, Format.Picasa, Format.BaseData?) when the need arises for them.

  Changed 4 years ago by tschaub

To make this easier, please create a patch with svn diff using a working copy of the trunk.

  Changed 4 years ago by sgillies

Never mind the atom.js attachment. Much of it is already incorporated in http://trac.openlayers.org/attachment/ticket/1366/atom-format-2.patch. I've used "patch -p1" from the dir above lib ... works fine.

Changed 3 years ago by tschaub

atom format

  Changed 3 years ago by tschaub

  • state changed from Needs Discussion to Commit

Ok, here's a bit of what I've done to the atom-format-2.patch:

  • Made it work on IE. There were a number of things that kept this from working on IE. Among them, the first arg to substring has to be non-negative.
  • Parsed polygons correctly (s/i/j/)
  • Removed some global vars
  • Made internal/externalProjection work for read and write
  • Added options for entry and feed titles
  • Cached node list length in for loops
  • Renamed getChildValue to getFirstChildValue so it doesn't clobber the method of the same name on the XML format and instead uses it
  • Changed new Array(foo) to [foo] to guard against cases where foo is an integer (perhaps never, but better practice)
  • Used a single GML format instance instead of one per geometry (avoids creating hundreds of new ActiveXObject("Microsoft.XMLDOM") instances)
  • Refactored a bit to make it easier to move to new format style
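
Two of the items in the list above are plain JavaScript behaviour and can be shown in isolation. This is a standalone sketch with made-up variable names, not code from the patch.

```javascript
// new Array(n) with a single integer argument sets the length instead of
// storing the value, which is why the literal form is the safer habit.
var foo = 3;
var viaConstructor = new Array(foo); // length 3, no elements
var viaLiteral = [foo];              // length 1, holding the value 3

// Caching the length avoids re-reading it on every iteration, which matters
// for live DOM node lists; a plain array stands in for one here.
var items = ["a", "b", "c"];
var seen = [];
for (var i = 0, len = items.length; i < len; ++i) {
    seen.push(items[i]);
}
```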

At some point, I'd like to make this work more like the first patch I added here. But I understand this has languished long enough and I'm sure we've tried everybody's patience enough.

  Changed 3 years ago by tschaub

  • keywords atom georss gml removed
  • status changed from new to closed
  • state changed from Commit to Complete
  • resolution set to fixed

(In [9901]) Adding an Atom parser. Thanks sgillies for the patch (and patience). r=me (closes #1366)
