MapGuide RFC 10 - Make Schemas More Amenable to Interim Enhancements
This page contains a change request (RFC) for the MapGuide Open Source project. More MapGuide RFCs can be found on the RFCs page.
|RFC Template Version||1.0|
|Submission Date||Dec 21/2006|
|Last Modified||jasonbirch Timestamp|
|Author||Jon Curtis, Tom Fukushima|
|Assigned PSC guide(s)||n/a|
|Voting History||Jan 5, 2007|
|+1||Tom, Paul, Bruce, Andy|
We propose a change that will allow layer definition and feature source documents to contain extensions while remaining forwards and backwards compatible without changing the schema.
Currently, whenever an MGOS schema is updated, the repository must be upgraded to the new structure. Each schema upgrade requires running a program against the repository, which is an inconvenience to developers and users. It would help if such updates could be made without an upgrade. Then at some point (for example, when development nears completion), the accumulated changes would be rolled up into a single schema change.
We focus on the layer definition and feature source schemas for now because that is where we expect the most changes.
The layer definition and feature source schemas will be modified so that new data can be added to the XML documents without having to change the schema. The parsers will also be modified so that data they do not recognize (from a newer version) is ignored but retained. This means that if a document is created in version x and then edited and saved in version x-1, no information specific to version x is lost when it is reopened in a version x editor.
NOTE that even though the schema "doesn't change", any change that uses the techniques provided here will still need an RFC, since the schema will eventually have to be versioned to accommodate the change.
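The round trip described above can be sketched in a few lines. This is only an illustration in Python using the standard-library ElementTree module, not the MapGuide parsers themselves; the element names (Layer, ResourceId, Opacity, NewStuff), the sample resource path, and the helpers load_old() and save_old() are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Elements the hypothetical "version x-1" editor understands, in schema order.
KNOWN_TAGS = ("ResourceId", "Opacity")


def load_old(xml_text):
    """Parse like an older editor: read known elements, retain the rest as raw XML."""
    root = ET.fromstring(xml_text)
    known = {child.tag: child.text or "" for child in root if child.tag in KNOWN_TAGS}
    retained = "".join(
        ET.tostring(child, encoding="unicode")
        for child in root
        if child.tag not in KNOWN_TAGS
    )
    return known, retained


def save_old(known, retained):
    """Serialize known elements first, then write the retained XML back unchanged."""
    body = "".join(
        f"<{tag}>{escape(known[tag])}</{tag}>" for tag in KNOWN_TAGS if tag in known
    )
    return f"<Layer>{body}{retained}</Layer>"


# Document written by a hypothetical "version x" editor.
doc_x = (
    "<Layer><ResourceId>Library://Data/Parcels.FeatureSource</ResourceId>"
    "<Opacity>1.0</Opacity>"
    "<ExtendedData1><NewStuff>hello</NewStuff></ExtendedData1></Layer>"
)

known, retained = load_old(doc_x)
known["Opacity"] = "0.5"  # edit made in version x-1
doc_saved = save_old(known, retained)

# Version x reopens the saved document and looks for its own data.
new_stuff = ET.fromstring(doc_saved).findtext("ExtendedData1/NewStuff")
print(new_stuff)
```

The older editor never interprets ExtendedData1; it only carries the element along, which is why the edit to Opacity does not disturb it.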
Our ability to create a flexible schema which can be validated is limited by incomplete designs in the SAX parser and restrictions in the way Berkeley DB XML handles parsing errors. But within those limitations we have the following means of accomplishing our needs.
Each complex element in the Layer Definition and Feature Source schemas has a corresponding Model Object. For most of these objects we want to enable the possible addition of new elements to the end of their data sequences. (There are a few objects which we may be able to skip, as it seems unlikely that we would need to change them going forward.)
Telling the parser to ignore any potential new data is done by including the following element at the end of the complex element’s sequence:
<xs:element name="ExtendedData1" minOccurs="0">
  <xs:complexType>
    <xs:sequence>
      <xs:any maxOccurs="unbounded" processContents="lax" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
Note: To illustrate how this mechanism would work in the future, here is an example of adding one new element. The additional data will be validated by new parsers and ignored by older parsers. Note also that this addition provides another "space" for newer, future data. If NewStuff did not have minOccurs="0", all we would need to add is the <xs:any> tag, without another wrapper element.
<xs:element name="ExtendedData1" minOccurs="0">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="NewStuff" type="xs:string" minOccurs="0" />
      <xs:element name="ExtendedData2" minOccurs="0">
        <xs:complexType>
          <xs:sequence>
            <xs:any maxOccurs="unbounded" processContents="lax" />
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
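To make the chaining of extension elements concrete, the sketch below shows how two generations of parser might read the same instance document. It is an illustrative Python fragment, not MapGuide code; the helper read_level() and the element names Layer, Opacity and EvenNewerStuff are invented for the example.

```python
import xml.etree.ElementTree as ET


def read_level(parent, known_tags):
    """Split a parent's children into recognized values and retained raw XML."""
    values, retained = {}, []
    for child in parent:
        if child.tag in known_tags:
            values[child.tag] = child.text or ""
        else:
            retained.append(ET.tostring(child, encoding="unicode"))
    return values, "".join(retained)


# Instance document following the extended schema shown above.
# 'Layer', 'Opacity' and 'EvenNewerStuff' are made-up names.
doc = ET.fromstring(
    "<Layer><Opacity>1.0</Opacity>"
    "<ExtendedData1><NewStuff>hello</NewStuff>"
    "<ExtendedData2><EvenNewerStuff>42</EvenNewerStuff></ExtendedData2>"
    "</ExtendedData1></Layer>"
)

# Older parser: knows nothing inside ExtendedData1, so it keeps the whole element.
old_values, old_retained = read_level(doc, {"Opacity"})

# Newer parser: descends into ExtendedData1, reads NewStuff, keeps ExtendedData2.
ext1 = doc.find("ExtendedData1")
new_values, new_retained = read_level(ext1, {"NewStuff"})
```

Each generation recognizes one more level and treats everything below it as opaque, so the same pattern can repeat for later additions.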
In the Layer Definition and Feature Source schemas, there are approximately 35 places where we can add the extended data elements.
Note: The option of using <xs:any namespace="##other" … /> was also explored but rejected, because future schema changes could then only be added in separate XSD files. It is not possible to define new data from an "other" namespace within the same XSD file. Thus each new version would add an additional XSD file for each modified schema, and a single schema would need to be spread across multiple files.
In the interim, unrecognized data will be preserved by storing it in the corresponding Model Object.
There are 45 types of Model objects which are instantiated during the parsing of Feature Source and Layer Definition XML data, and as indicated above, we need to add support to approximately 35 of them. Each of these objects has a corresponding IO handler which is used by the parser to do the parsing and serialization. The following code changes will be required to support each extended data element for each model object.
To support the collection of unrecognized XML data, one IOUnrecognized handler class will be needed by each parser, the SAX2Parser for Layer Definitions and the FSDParser for Feature Source Definitions. When unrecognized elements are found in an IO class’ StartElement() method, it will instantiate the IOUnrecognized handler which will gather the XML data, keeping it in its original format. Upon completion of the containing IO class’ parsing, in EndElement(), that data will be transferred as a string to the model object.
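The capture step can be sketched with Python's standard xml.sax module, whose startElement()/endElement() callbacks play the role of StartElement()/EndElement() here. This is a simplified illustration, not the actual IOUnrecognized class: the class names UnrecognizedCapture and LayerHandler, the element names, and the set of known children are assumptions. The sketch also rebuilds equivalent XML rather than a byte-identical copy of the input.

```python
import xml.sax
from xml.sax.saxutils import escape, quoteattr

KNOWN_CHILDREN = {"ResourceId", "Opacity"}  # hypothetical known elements


class UnrecognizedCapture:
    """Hypothetical stand-in for IOUnrecognized: rebuilds the XML of one subtree."""

    def __init__(self):
        self.parts = []
        self.depth = 0

    def start(self, name, attrs):
        self.depth += 1
        attr_text = "".join(f" {k}={quoteattr(v)}" for k, v in attrs.items())
        self.parts.append(f"<{name}{attr_text}>")

    def chars(self, text):
        self.parts.append(escape(text))

    def end(self, name):
        self.parts.append(f"</{name}>")
        self.depth -= 1
        return self.depth == 0  # True once the captured subtree is closed

    def xml(self):
        return "".join(self.parts)


class LayerHandler(xml.sax.ContentHandler):
    """Simplified IO handler for a made-up 'Layer' element."""

    def __init__(self):
        super().__init__()
        self.known = {}
        self.unrecognized = ""  # string that would be handed to the model object
        self._capture = None
        self._current = None
        self._text = []
        self._pending = []

    def startElement(self, name, attrs):
        if self._capture is not None:
            self._capture.start(name, attrs)      # already inside unknown data
        elif name in KNOWN_CHILDREN:
            self._current, self._text = name, []
        elif name != "Layer":
            self._capture = UnrecognizedCapture()  # unknown element: start capturing
            self._capture.start(name, attrs)

    def characters(self, content):
        if self._capture is not None:
            self._capture.chars(content)
        elif self._current is not None:
            self._text.append(content)

    def endElement(self, name):
        if self._capture is not None:
            if self._capture.end(name):
                self._pending.append(self._capture.xml())
                self._capture = None
        elif name == self._current:
            self.known[name] = "".join(self._text)
            self._current = None
        elif name == "Layer":
            # End of the containing element: transfer the data as one string.
            self.unrecognized = "".join(self._pending)


SAMPLE = (
    "<Layer><ResourceId>Library://Data/Parcels.FeatureSource</ResourceId>"
    "<ExtendedData1><NewStuff>hello</NewStuff></ExtendedData1></Layer>"
)
handler = LayerHandler()
xml.sax.parseString(SAMPLE.encode("utf-8"), handler)
```

Tracking nesting depth in the capture object lets the handler know when the unknown subtree has ended, so normal parsing can resume with the next sibling.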
The round-trip is completed during serialization. As the parser is serializing, when it reaches the end of the elements for the object, it will retrieve the unrecognized data string from the model object and write it back to the XML. The reason why it is necessary that all new data (and <xs:any> elements) be placed at the end of sequences is so we know exactly where to write the unrecognized data back out.
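A minimal sketch of the write-back step follows, again in Python for illustration only. The LayerModel class, its unknown_xml field, and serialize_layer() are hypothetical names; the point is simply that the retained string is emitted after all known elements and just before the closing tag.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class LayerModel:
    """Hypothetical model object; 'unknown_xml' holds the retained XML string."""
    resource_id: str
    opacity: str
    unknown_xml: str = ""


def serialize_layer(model):
    """Write known elements in schema order, then the retained XML last."""
    return "".join([
        "<Layer>",
        f"<ResourceId>{escape(model.resource_id)}</ResourceId>",
        f"<Opacity>{escape(model.opacity)}</Opacity>",
        model.unknown_xml,  # always last: extensions sit at the end of the sequence
        "</Layer>",
    ])


layer = LayerModel(
    resource_id="Library://Data/Parcels.FeatureSource",
    opacity="1.0",
    unknown_xml="<ExtendedData1><NewStuff>hello</NewStuff></ExtendedData1>",
)
layer.opacity = "0.5"  # an edit to a known property
output = serialize_layer(layer)
```

Because the position of the extension data is fixed, the serializer needs no bookkeeping about where each unknown element originally appeared.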
So, for each <xs:any> tag, we need to add to the IO handler: detection of unrecognized elements in StartElement() and creation of the IOUnrecognized handler, passing of unrecognized data to the model object in EndElement(), and writing out the unrecognized data during serialization. Each corresponding model object will need to have one additional string property to store the unrecognized XML.
Note that the extended data section is only intended to be an interim area for new data. The new data will eventually be moved into a non-extended part of the schema.
There are no documentation or compatibility issues.
Documents will be created and tested to make sure that they in fact do retain all unrecognized data.