Opened 13 years ago

Closed 9 years ago

#3884 closed defect (wontfix)

GML XSD Reader does not seem to be working properly

Reported by: rmanwani Owned by: warmerdam
Priority: high Milestone:
Component: OGR_SF Version: 1.7.3
Severity: critical Keywords: gml reader
Cc:

Description

I am using GML Driver with Xerces 3.1 library with VC++ 10.0 compiler. The method GMLReader::ParseXSD is returning False for a valid GML XSD file which is attached. Is XSD file really incorrect or GML Reader has some issue? Further, the method PreScanForSchema is throwing exception '0xC0000005: Access violation writing location 0x00000000' in some method of probably Xerces library.

Attachments (6)

gml.xsd (2.1 KB ) - added by rmanwani 13 years ago.
gml.xml (683.0 KB ) - added by rmanwani 13 years ago.
Cambridge.xml (2.2 KB ) - added by rmanwani 13 years ago.
Cambridge.xsd (3.5 KB ) - added by rmanwani 13 years ago.
Schools.xml (4.5 KB ) - added by rmanwani 13 years ago.
Schools.xsd (2.8 KB ) - added by rmanwani 13 years ago.


Change History (24)

by rmanwani, 13 years ago

Attachment: gml.xsd added

by rmanwani, 13 years ago

Attachment: gml.xml added

comment:1 by Even Rouault, 13 years ago

The GML XSD parser has been substantially improved (although it is still limited) in the development version (the future GDAL 1.8.0), and I've verified that it can now correctly parse the gml.xsd you provided.

However, I'm a bit surprised by the crash you see. Could you try compiling a GDAL 1.8.0 snapshot? You can download one at http://trac.osgeo.org/gdal/changeset/21305/trunk?old_path=%2F&format=zip

comment:2 by rmanwani, 13 years ago

Hi Rouault,

Thanks for your inputs.

I have downloaded the GDAL 1.8.0 snapshot.

Please guide me on how this changeset should be used in conjunction with version 1.7.3 installed on my machine.

Regards, R G Manwani

comment:3 by Even Rouault, 13 years ago

I'm not sure I really understand your question. There's nothing special about compiling GDAL 1.8.0; it should be pretty similar to what you did to compile 1.7.3. Unzip the file somewhere, make the appropriate changes in nmake.opt, and compile. Then adjust the PATH and GDAL_DATA environment variables to point to the appropriate directories and enjoy ...

in reply to:  3 comment:4 by rmanwani, 13 years ago

Replying to rouault:

I'm not sure I really understand your question. There's nothing special about compiling GDAL 1.8.0; it should be pretty similar to what you did to compile 1.7.3. Unzip the file somewhere, make the appropriate changes in nmake.opt, and compile. Then adjust the PATH and GDAL_DATA environment variables to point to the appropriate directories and enjoy ...

I understand that it is not a full-fledged set of files, i.e. not a complete version. Only a few files appear in the zip, and I do not see a directory structure like that of the 1.7.3 set. Do I have to overwrite 1.7.3 with the files in the archive, or is it self-sufficient?

comment:5 by Even Rouault, 13 years ago

Ah, ok... The name of the link is indeed a bit misleading, but it is in fact a full set of files (a .zip of ~ 13 MB). You have the trunk/autotest subdirectory for the autotest suite and the trunk/gdal subdirectory with the sources.

comment:6 by rmanwani, 13 years ago

I have tested with the GDAL 1.8.0 development version and it is indeed working properly. Any idea when it is going to be released? Also, if there are any issues with this development version, how can tickets be raised?

comment:7 by rmanwani, 13 years ago

The reader is probably recognizing the one class (polygon) present in the schema file. But I am still observing a crash subsequently in the Xerces code. The stack trace is reproduced below:

msvcr100d.dll!_VEC_memcpy(void * dst, void * src, int len) + 0x55 bytes C

dbt7d.dll!0112ed44()

[Frames below may be incorrect and/or missing, no symbols loaded for dbt7d.dll]

!GMLHandler::startElement(const char * pszName, void * attr) Line 549 + 0x11 bytes C++

!GMLXercesHandler::startElement(const wchar_t * const uri, const wchar_t * const localname, const wchar_t * const qname, const xercesc_3_1::Attributes & attrs) Line 81 + 0x19 bytes C++

xerces-c_3_1.dll!xercesc_3_1::SAX2XMLReaderImpl::startElement(const xercesc_3_1::XMLElementDecl & elemDecl, const unsigned int elemURLId, const wchar_t * const elemPrefix, const xercesc_3_1::RefVectorOf<xercesc_3_1::XMLAttr> & attrList, const unsigned long attrCount, const bool isEmpty, const bool isRoot) Line 787 C++

xerces-c_3_1.dll!xercesc_3_1::IGXMLScanner::scanStartTagNS(bool & gotData) Line 2641 C++

xerces-c_3_1.dll!xercesc_3_1::IGXMLScanner::scanNext(xercesc_3_1::XMLPScanToken & token) Line 387 C++

!GMLReader::NextFeature() Line 442 + 0x2f bytes C++

!OGRGMLLayer::GetNextFeature() Line 147 + 0x27 bytes C++

comment:8 by rmanwani, 13 years ago

I replaced the Xerces library with the Expat library. An exception occurs there too, though not for a memory access but for an attempt to allocate a huge amount of memory. I also tried the Expat library with files that have no schema; in that case I do not face any issue.

The issue seems to be with the presence of XSD files.

comment:9 by Even Rouault, 13 years ago

Mmm, issues with both Xerces and Expat; this is really odd. I suspect something is wrong in the way you build GDAL or its dependencies. Perhaps you could check with Tamas Szekeres' daily builds at http://vbkto.dyndns.org/sdk/ . You can try downloading release-1600-gdal-mapserver.zip and see if you have the same issues (it uses Xerces 2.8 AFAIR). The corresponding source package is release-1600-dev.zip.

comment:10 by rmanwani, 13 years ago

I cannot use release 1.6, as I am using some OGR formats that are available only from 1.7.3 onwards.

comment:11 by Even Rouault, 13 years ago

No, 1600 = MSVC v16.00 = Visual Studio 2010. The -development packages are 1.8.0dev snapshots.
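To make the naming convention concrete, here is a minimal sketch that reads the compiler number out of a package name such as release-1600-gdal-mapserver.zip and maps it to a Visual Studio product. Both function names are hypothetical helpers written for this illustration, and the code assumes the number always sits between the first and second hyphen; neither is part of GDAL or of the SDK packages.

```cpp
#include <string>

// Hypothetical helper: returns the number found between the first and second
// '-' of a package name (e.g. 1600 for "release-1600-dev.zip"), or -1 if the
// name does not follow that assumed pattern.
inline int MscVerFromPackageName(const std::string &osName)
{
    const std::string::size_type nFirst = osName.find('-');
    if (nFirst == std::string::npos)
        return -1;
    const std::string::size_type nSecond = osName.find('-', nFirst + 1);
    if (nSecond == std::string::npos || nSecond == nFirst + 1)
        return -1;

    int nValue = 0;
    for (std::string::size_type i = nFirst + 1; i < nSecond; ++i)
    {
        const char ch = osName[i];
        if (ch < '0' || ch > '9')
            return -1;  // not a purely numeric field
        nValue = nValue * 10 + (ch - '0');
    }
    return nValue;
}

// Hypothetical helper: maps a few well-known _MSC_VER values to product
// names. The list is deliberately incomplete.
inline std::string VisualStudioNameFromMscVer(int nMscVer)
{
    switch (nMscVer)
    {
        case 1400: return "Visual Studio 2005";
        case 1500: return "Visual Studio 2008";
        case 1600: return "Visual Studio 2010";
        default:   return "unknown";
    }
}
```

With these helpers, VisualStudioNameFromMscVer(MscVerFromPackageName("release-1600-dev.zip")) yields "Visual Studio 2010", which matches the reading above: the 1600 is a compiler version, not a GDAL release number.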

comment:12 by Even Rouault, 13 years ago

Resolution: worksforme
Status: new → closed

I've just tried to compile GDAL trunk with Visual Studio Express 2010 and Xerces 3.1 downloaded from http://archive.apache.org/dist/xml/xerces-c/Xerces-C_3_1_0/binaries/xerces-c-3.1.0-x86-windows-vc-9.0.zip . Everything runs fine, both with the .xsd and without it.

in reply to:  12 comment:13 by rmanwani, 13 years ago

Replying to rouault:

I've just tried to compile GDAL trunk with Visual Studio Express 2010 and Xerces 3.1 downloaded from http://archive.apache.org/dist/xml/xerces-c/Xerces-C_3_1_0/binaries/xerces-c-3.1.0-x86-windows-vc-9.0.zip . Everything runs fine, both with the .xsd and without it.

I modified the method GMLXercesHandler::GetAttributes, changing the datatype of osRes from CPLString to std::string. Everything works fine for me now! The string was not initialized, and the code was crashing at the statement 'osRes += ""'.

comment:14 by Even Rouault, 13 years ago

Hum, this is weird. You shouldn't have to do that. CPLString derives from std::string (in port/cpl_string.h), so a CPLString object should be correctly initialized at its instantiation...
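The reasoning here can be checked with a small sketch. DerivedString below is a hypothetical stand-in that publicly derives from std::string, as CPLString is said to do; it is not the real class from port/cpl_string.h, and BuildAttributeText is an invented function, not the actual GetAttributes code. Under those assumptions, a default-constructed object is an empty, valid string and appending to it with += is well defined.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a class that publicly derives from std::string.
// NOT the real CPLString.
class DerivedString : public std::string
{
public:
    DerivedString() : std::string() {}
    DerivedString(const char *psz) : std::string(psz) {}
};

// Invented example of the "default-construct, then append" pattern that an
// accumulator such as osRes follows.
inline DerivedString BuildAttributeText(const char *pszName,
                                        const char *pszValue)
{
    assert(pszName != nullptr && pszValue != nullptr);

    DerivedString osRes;  // base-class constructor runs: osRes is empty
    osRes += "";          // appending an empty literal leaves it unchanged
    osRes += pszName;
    osRes += "=\"";
    osRes += pszValue;
    osRes += "\"";
    return osRes;
}
```

If a pattern like this crashes in a real build, the language rules themselves are unlikely to be the cause, which fits the earlier suspicion that something in the build or its dependencies is off.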

by rmanwani, 13 years ago

Attachment: Cambridge.xml added

by rmanwani, 13 years ago

Attachment: Cambridge.xsd added

by rmanwani, 13 years ago

Attachment: Schools.xml added

by rmanwani, 13 years ago

Attachment: Schools.xsd added

in reply to:  14 ; comment:15 by rmanwani, 13 years ago

Priority: normal → high
Resolution: worksforme (cleared)
Status: closed → reopened

Replying to rouault:

Hum, this is weird. You shouldn't have to do that. CPLString derives from std::string (in port/cpl_string.h), so a CPLString object should be correctly initialized at its instantiation...

I do not have much understanding of CPLString. Hence, I used std::string to check my suspicion about the bug, and it worked. Now that it is not crashing, I am testing with more samples. I have attached two samples (both data and XSD files), and the parser is not able to parse the XSD files properly. One of the XSD files has 5 classes but the parser returns only 1. Similarly, the parser returns fewer classes than are available in the XSD. This issue needs to be addressed. I have tested with GDAL 1.8 using the Xerces 3.1 you pointed to. I checked with ogrinfo also.

in reply to:  15 comment:16 by rmanwani, 13 years ago

Replying to rmanwani:

Replying to rouault:

Hum, this is weird. You shouldn't have to do that. CPLString derives from std::string (in port/cpl_string.h), so a CPLString object should be correctly initialized at its instantiation...

I do not have much understanding of CPLString. Hence, I used std::string to check my suspicion about the bug, and it worked. Now that it is not crashing, I am testing with more samples. I have attached two samples (both data and XSD files), and the parser is not able to parse the XSD files properly. One of the XSD files has 5 classes but the parser returns only 1. Similarly, the parser returns fewer classes than are available in the XSD. This issue needs to be addressed. I have tested with GDAL 1.8 using the Xerces 3.1 you pointed to. I checked with ogrinfo also. Even without the XSD, its behavior is not correct: it returns only 1 class for Schools.xml when there are actually 5 element types.

comment:17 by Even Rouault, 13 years ago

Yes, you are indeed running into "known" limitations of the driver (completely unrelated to the CPLString issue, which I don't understand). Those 2 .XSD files are too complex to be understood by the XSD parser of the GML driver, which can currently only understand .XSD files restricted to the GML "simple features" profile.

Cambridge.xml has a "flat" structure. It is just Cambridge.xsd that is too complex for OGR (which can only detect the Mountain type). So if you remove Cambridge.xsd, OGR will be able to understand the structure of the XML by doing a first analysis pass over it.

As for Schools.xml, the issue is deeper. This file contains features inside features. The first level of feature is SchoolDistrict, and a SchoolDistrict can contain School or College features. There's no convenient way of modeling that in the OGR feature model... With its limited understanding of the associated .XSD, OGR can only understand the School and College types. If you remove the .XSD, OGR will restrict itself to the SchoolDistrict level and will more or less aggregate the various School/College features that are inside.
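The modeling problem can be illustrated with a short sketch. All types and function names below are hypothetical and exist only for this illustration; they are not OGR classes and do not reproduce what the GML driver actually emits. The sketch shows two ways of squeezing "features inside features" into flat records: one record per inner feature (roughly the School/College view), or one record per outer feature with the inner ones aggregated into a single field (roughly the SchoolDistrict view).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical nested model: a district contains institutions.
struct Institution { std::string osType; std::string osName; };
struct District { std::string osName; std::vector<Institution> aoInstitutions; };

// Hypothetical flat record: one layer name plus simple string fields.
struct FlatRecord { std::string osLayer; std::string osName; std::string osExtra; };

// Option 1: one record per inner feature; the parent survives only as a field.
inline std::vector<FlatRecord> FlattenPerChild(const std::vector<District> &aoDistricts)
{
    std::vector<FlatRecord> aoOut;
    for (std::size_t i = 0; i < aoDistricts.size(); ++i)
    {
        const District &oDistrict = aoDistricts[i];
        for (std::size_t j = 0; j < oDistrict.aoInstitutions.size(); ++j)
        {
            FlatRecord oRec;
            oRec.osLayer = oDistrict.aoInstitutions[j].osType;  // "School" or "College"
            oRec.osName = oDistrict.aoInstitutions[j].osName;
            oRec.osExtra = oDistrict.osName;                    // parent name
            aoOut.push_back(oRec);
        }
    }
    return aoOut;
}

// Option 2: one record per outer feature; inner names are joined into one
// field, so their individual structure is lost.
inline std::vector<FlatRecord> FlattenPerParent(const std::vector<District> &aoDistricts)
{
    std::vector<FlatRecord> aoOut;
    for (std::size_t i = 0; i < aoDistricts.size(); ++i)
    {
        FlatRecord oRec;
        oRec.osLayer = "SchoolDistrict";
        oRec.osName = aoDistricts[i].osName;
        for (std::size_t j = 0; j < aoDistricts[i].aoInstitutions.size(); ++j)
        {
            if (j > 0)
                oRec.osExtra += ", ";
            oRec.osExtra += aoDistricts[i].aoInstitutions[j].osName;
        }
        aoOut.push_back(oRec);
    }
    return aoOut;
}
```

Either choice drops something: the first loses the district as a feature in its own right, and the second loses the separate School and College records. That trade-off is the sense in which a flat feature model has no convenient representation for this file.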

comment:18 by Even Rouault, 9 years ago

Resolution: wontfix
Status: reopened → closed

I don't think there's really any immediate action to be taken on this. Closing.
