Opened 17 years ago
Last modified 17 years ago
#1534 closed defect
OGR GML reader fails if file has UTF-8 BOM prefix — at Version 3
Reported by: | rogerjames99 | Owned by: | Mateusz Łoskot |
---|---|---|---|
Priority: | normal | Milestone: | 1.4.1 |
Component: | OGR_SF | Version: | unspecified |
Severity: | normal | Keywords: | UTF BOM GML |
Cc: | warmerdam |
Description
The function OGRGMLDataSource::Open in ogrgmldatasource.cpp fails if the GML file has a UTF-8 encoded Unicode BOM (byte order mark) at the start of the file. This is valid UTF-8 (see RFC 3629, section 6) and should be accepted. Xerces handles this sequence correctly. The code below is a modification to this function that allows for it.
It may be better to remove the "Test Open" functionality altogether and just let Xerces worry about whether the XML is well-formed.

    int OGRGMLDataSource::Open( const char * pszNewName, int bTestOpen )
    {
        FILE *fp;
        char szHeader[1000];

    /* -------------------------------------------------------------------- */
    /*      Open the source file.                                           */
    /* -------------------------------------------------------------------- */
        fp = VSIFOpen( pszNewName, "r" );
        if( fp == NULL )
        {
            if( !bTestOpen )
                CPLError( CE_Failure, CPLE_OpenFailed,
                          "Failed to open GML file `%s'.",
                          pszNewName );

            return FALSE;
        }

    /* -------------------------------------------------------------------- */
    /*      If we aren't sure it is GML, load a header chunk and check      */
    /*      for signs it is GML                                             */
    /* -------------------------------------------------------------------- */
        if( bTestOpen )
        {
            char *szPtr = szHeader;

            VSIFRead( szHeader, 1, sizeof(szHeader), fp );
            szHeader[sizeof(szHeader)-1] = '\0';

    /* -------------------------------------------------------------------- */
    /*      Check for a UTF-8 BOM and skip if found                         */
    /* -------------------------------------------------------------------- */
            if( ((unsigned char)szPtr[0] == 0xEF)
                && ((unsigned char)szPtr[1] == 0xBB)
                && ((unsigned char)szPtr[2] == 0xBF) )
                szPtr += 3;

            if( szPtr[0] != '<'
                || strstr(szPtr,"opengis.net/gml") == NULL )
            {
                VSIFClose( fp );
                return FALSE;
            }
        }
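One caveat with the excerpt above: it terminates only the last byte of the 1000-byte buffer, so for a file shorter than that, the `strstr` search can run over bytes that were never read. A minimal standalone sketch of a pre-check that avoids this, assuming the caller passes the number of bytes actually read, might look like the following. `LooksLikeGML` is a hypothetical helper name used for illustration, not an existing OGR function, and the sketch deliberately uses only the standard library rather than the VSI/CPL calls.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of OGR): returns true if the first `len`
// bytes of a file look like GML, tolerating an optional UTF-8 BOM.
bool LooksLikeGML(const char *data, std::size_t len)
{
    // Skip a UTF-8 byte order mark (EF BB BF) if one is present.
    if (len >= 3 &&
        static_cast<unsigned char>(data[0]) == 0xEF &&
        static_cast<unsigned char>(data[1]) == 0xBB &&
        static_cast<unsigned char>(data[2]) == 0xBF)
    {
        data += 3;
        len -= 3;
    }

    // After the optional BOM, the content must start with '<'.
    if (len == 0 || data[0] != '<')
        return false;

    // Search only within the bytes that were actually read.
    const std::string header(data, len);
    return header.find("opengis.net/gml") != std::string::npos;
}
```

In `Open`, the return value of the read call would supply `len`, so a short file or a failed read is rejected instead of being scanned past its end. This keeps the cheap pre-check in place without starting the XML parser.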
Change History (4)
by , 17 years ago
Attachment: | ogrgmldatasource.cpp added |
---|
comment:2 by , 17 years ago
Component: | default → OGR_SF |
---|
comment:3 by , 17 years ago
Cc: | added |
---|---|
Description: | modified (diff) |
Milestone: | → 1.4.1 |
Owner: | changed from | to
Priority: | low → normal |
Severity: | minor → normal |
Roger,
Could you attach a smallish file with this marker in it?
Mateusz,
Could you fix this in trunk and the 1.4 branch? We *do* want to keep the pre-checks, but they need to be safer. We don't want to start up Xerces for every file passed to OGROpen().
Please add a test for this in the test suite.
by , 17 years ago
Attachment: | smallsample.gml added |
---|
Sorry about the typos and munged source. I have attached the modded file from the 1.4.0 source.