The very first task is to choose the XML parser that will be used as the foundation of the KML driver. There are many questions that have to be answered:
- SAX or DOM?
- Should XML validation of KML documents be provided (mandatory, optional, none)?
- etc.
There are 4 XML parsers under our consideration: Expat, Xerces, minixml and libXML.
Here is a very detailed comparison (made in 2003) of XML parsing libraries that may be helpful during the analysis.
Feature | Expat | Xerces | minixml | libXML |
Parser | SAX | SAX, DOM | DOM | DOM |
Validating | no | yes | no | yes (w/o reparsing) |
Encoding | UTF-8 | UTF-8/16,ASCII,latin1 | ASCII/UTF-8 | UTF-8/16,ASCII,latin1 |
Library size | 150KB | ~4MB | built-in | ~1MB |
Thread safe | yes | yes | yes | yes |
Used in GDAL part | OGDI | GML, ILI | many places | not yet |
Secure * | yes | no | | |
... | ... | ... | ... | ... |
* Google avoids Xerces because it considers it to be insecure in the face of hostile XML documents, but considers Expat safe, presumably based on an in-depth security review.
Speed
Speed comparison of different XML parsers
SAX or DOM
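Which technique should be used for reading the KML files?
- I (Jens) guess SAX would be best, because DOM keeps the whole document tree in memory, while SAX only passes events to the driver as the file is read
- SAX is more complex to use in the case of KML, because the driver has to keep track of the element nesting itself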
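To make the trade-off concrete, here is a minimal sketch of SAX-style reading with Expat. It uses Python's `xml.parsers.expat` binding only for brevity; the driver itself would use the Expat C API. The sample document, the function name `read_placemarks` and the choice to collect only `name` and `coordinates` are assumptions made for this illustration, not part of the driver design. Note how the code has to track the open elements itself, which is the extra complexity mentioned above.

```python
import xml.parsers.expat

# Hypothetical sample document, used only for this illustration.
KML_SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Point A</name>
      <Point><coordinates>10.0,50.0,0</coordinates></Point>
    </Placemark>
    <Placemark>
      <name>Point B</name>
      <Point><coordinates>11.5,48.1,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>
"""


def read_placemarks(kml_bytes):
    """Collect the name and coordinates of every Placemark, event by event."""
    placemarks = []
    stack = []      # names of the elements that are currently open
    text = []       # character data collected for the current element
    current = None  # the Placemark being built, or None outside of one

    def start_element(name, attrs):
        nonlocal current
        stack.append(name)
        if name == "Placemark":
            current = {"name": None, "coordinates": None}
        text.clear()

    def end_element(name):
        nonlocal current
        if current is not None:
            value = "".join(text).strip()
            # Only accept a <name> that is a direct child of <Placemark>.
            if name == "name" and len(stack) >= 2 and stack[-2] == "Placemark":
                current["name"] = value
            elif name == "coordinates":
                current["coordinates"] = tuple(float(v) for v in value.split(","))
            elif name == "Placemark":
                placemarks.append(current)
                current = None
        stack.pop()
        text.clear()

    def char_data(data):
        # Expat may deliver the text of one element in several chunks.
        text.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data
    parser.Parse(kml_bytes, True)  # True marks the final chunk of input
    return placemarks


if __name__ == "__main__":
    for placemark in read_placemarks(KML_SAMPLE):
        print(placemark["name"], placemark["coordinates"])
```

Only the Placemark currently being read is held in memory, so memory use does not grow with the size of the document tree the way it would with DOM. The input could also be passed to `Parse` in several chunks, with the final flag set only on the last call.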
Final
I will use Expat because of the decision for SAX, because of its speed, and because it is more secure than Xerces.