Opened 11 years ago

Closed 6 years ago

Last modified 5 years ago

#3043 closed enhancement (fixed)

Allow specification of intended driver on GDALOpen

Reported by: clundgren Owned by: warmerdam
Priority: normal Milestone: 2.0.0
Component: GDAL_Raster Version: unspecified
Severity: normal Keywords: gdalopen
Cc: warmerdam, Even Rouault, hobu, antonio

Description

GDALOpen does not allow specification of an intended driver, which can lead to problematic behavior in some cases.

For example, consider writing an NITF file (e.g. "bob.ntf") using the Create() method to a directory containing a similarly-named ENVI file (e.g. "bob.hdr"). The NITF Create() method first writes part of the file, then attempts to open this using GDALOpen. If it turns out that the ENVI driver is also loaded and occurs first in the driver manager's list, this partial NITF file will actually be opened as an ENVI file (due to the presence of the .hdr file).

It would be nice if GDALOpen had an optional parameter to specify an intended driver, and that relevant places throughout GDAL that clearly should be using a specific driver are modified to provide this parameter.

Attachments (5)

open_by_drivername.patch (3.9 KB) - added by ilucena 9 years ago.
Open by driver name
open_by_drivername_v2.patch (4.0 KB) - added by ilucena 9 years ago.
Open by driver name
open_by_drivername_v3.patch (4.0 KB) - added by ilucena 9 years ago.
Open by driver name
open_by_drivername_v4.patch (4.3 KB) - added by ilucena 9 years ago.
This version uses the "driver=driver-name,file-name" convention
open_by_drivername_v5.patch (5.2 KB) - added by ilucena 9 years ago.

Download all attachments as: .zip

Change History (21)

comment:1 Changed 11 years ago by warmerdam

Type: defectenhancement

This request would pretty much need to be pursued as an RFC.

In the case mentioned, the ENVI .hdr file is going to cause problems next time someone tries to open the file so fixing the createcopy case isn't really going to resolve everything anyways.

Changed 9 years ago by ilucena

Attachment: open_by_drivername.patch added

Open by driver name

comment:2 Changed 9 years ago by ilucena

Cc: warmerdam Even Rouault hobu added
Keywords: gdalopen added

Could you please give a try to that patch to see if that works as intended. I ran GDAL autotest with those changes and I got no test failure.

The way it works is that you can pass a driver name at the beginning of the fileName parameter string in that format "driver_name:file_name", ex.:

gdalinfo nitf:two_images_jp2.ntf

It will avoid the driver probing, going direct to the driver you have informed. I tried to avoid misinterpretation of file system driver, like in "R:" and if for some reason your driver fails to recognize the file, then GDALOpen goes on and runs the driver probing. That solution also works for driver that already respond for that same drive-name entry on fileName.

If that makes sense I think we could go ahead and post a RFC. What do you think Frank?

comment:3 Changed 9 years ago by antonio

Cc: antonio added

comment:4 Changed 9 years ago by Even Rouault

I have reviewed (but not actually tested) the patch, and here are my comments :

  • The "GDALAccess eAccess" argument is not necessary. I see you need it to create GDALOpenInfo oOpenInfoClone( &pszColon[1], eAccess ); , but I think you could just write GDALOpenInfo oOpenInfoClone( &pszColon[1], oOpenInfo.eAccess ); instead.
  • The pszFilename and pszColon variables in the new GDALOpenInternal() could likely be const char* and not char* . That would save a cast.
  • Perhaps it would be good to revalidate that pszColon is not NULL (even if I agree that with your code it cannot happen)
  • poDriver->pfnOpen can be NULL for a few drivers (like the KMLSUPEROVERLAY), so it might be good to validate it and return if it is NULL.
  • The coexistence of 2 GDALOpenInternal() methods is a bit confusing. Perhaps renaming the old GDALOpenInternal() to GDALOpenWithAllowedDrivers(), and the one you propose to GDALOpenGuessDriverFromName() might make things clearer. I'm not very good at naming however ;-)

At first, I thought that the current GDALOpenInternal() (with the papszAllowedDrivers argument) could be just what it's needed, but that would require patching the GDAL utilities and add a new option to specify the driver name instead. So your solution might be more pragmatic.

comment:5 Changed 9 years ago by ilucena

Very good points. I would also add that we should change the trigger order to:

    if ( oOpenInfo.bStatOK == false && strstr( pszFilename, ":" ) != NULL )

...avoid crossing long path strings looking for colons in vain. Those long string are most likely to be existent files names anyway, without driver name.

About the name. I agree with you, names matter. In fact we already have 2 versions of GDALOpenInternal. So that would be the third one:

/* Internal method for now. Might be subject to later revisions */
GDALDatasetH GDALOpenInternal( const char * pszFilename, GDALAccess eAccess,
                               const char* const * papszAllowedDrivers);
GDALDatasetH GDALOpenInternal( GDALOpenInfo& oOpenInfo,
                               const char* const * papszAllowedDrivers);
GDALDatasetH GDALOpenInternal( GDALOpenInfo& oOpenInfo );

But it is still very much an internal call. So maybe that was the intention of the original author. If we call it GDALOpenByDriver(), for example, people would request it to exposed in the C API with *driver* and *filename* as parameter. The parameter GDALOpenInfo would be odd. But that is all just speculation, of course. I can go either way.

I will update a new patch with your suggestion shortly.

Thanks.

Changed 9 years ago by ilucena

Attachment: open_by_drivername_v2.patch added

Open by driver name

comment:6 Changed 9 years ago by ilucena

Patch update is available.

comment:7 Changed 9 years ago by Even Rouault

you can drop the (char*) cast in const char* pszColon = (char*) strstr( pszFilename, ":" );

Changed 9 years ago by ilucena

Attachment: open_by_drivername_v3.patch added

Open by driver name

comment:8 Changed 9 years ago by ilucena

I just created the RFC 36 proposing that change:

http://trac.osgeo.org/gdal/wiki/rfc36_open_by_drivername

comment:9 Changed 9 years ago by Even Rouault

At that point, I think you should write an email to gdal-dev to open the public discussion on the RFC.

comment:10 Changed 9 years ago by ilucena

I added the code here to make it easier to read it.

GDALDatasetH GDALOpenInternal( const char * pszFilename, GDALAccess eAccess,
                               const char* const * papszAllowedDrivers)
{
    GDALOpenInfo oOpenInfo( pszFilename, eAccess );

    /* Parse 'driver_name:' on dataset identification */

    if ( oOpenInfo.bStatOK == false && strstr( pszFilename, ":" ) != NULL  )
    {
        GDALDataset *poDS = (GDALDataset*) GDALOpenInternal( oOpenInfo );
        if ( poDS )
        {
            return poDS;
        }
    }

    return GDALOpenInternal(oOpenInfo, papszAllowedDrivers);
}

/* Parse the driver name from inside the oOpenInfo.pszFilename
   ("driver-name:file-name") and tries to open the dataset with
   that specific driver. */
GDALDatasetH GDALOpenInternal( GDALOpenInfo& oOpenInfo )
{
    VALIDATE_POINTER1( oOpenInfo.pszFilename, "GDALOpen", NULL );

    GDALDriverManager *poDM = GetGDALDriverManager();
    CPLLocaleC oLocaleForcer;
    CPLErrorReset();
    CPLAssert( NULL != poDM );

    GDALAccess eAccess = oOpenInfo.eAccess;

    GDALDataset *poDS = NULL;

    /* Parse driver name */

    const char* pszFilename = oOpenInfo.pszFilename;
    const char* pszColon    = strstr( pszFilename, ":" );

    if ( pszColon == NULL )
    {
        return NULL;
    }

    char* pszDriver = CPLScanString( pszFilename, ( pszColon - pszFilename ), FALSE, FALSE );

    /* Get driver object */

    GDALDriver *poDriver = poDM->GetDriverByName( pszDriver );

    CPLFree( pszDriver );

    if ( poDriver == NULL || poDriver->pfnOpen == NULL )
    {
        return NULL;
    }

    /* Try to open dataset without driver name */
    GDALOpenInfo oOpenInfoClone( &pszColon[1], eAccess );

    poDS = poDriver->pfnOpen( &oOpenInfoClone );

    if( poDS == NULL )
    {
        /* Try again using the driver name within the file name */
        poDS = poDriver->pfnOpen( &oOpenInfo );
    }

    if( poDS != NULL )
    {
        if( strlen(poDS->GetDescription()) == 0 )
            poDS->SetDescription( pszFilename );

        if( poDS->poDriver == NULL )
            poDS->poDriver = poDriver;

        if( CPLGetPID() != GDALGetResponsiblePIDForCurrentThread() )
            CPLDebug( "GDAL", "GDALOpen(%s, this=%p) succeeds as %s (pid=%d, responsiblePID=%d).",
                      oOpenInfo.pszFilename, poDS, poDriver->GetDescription(),
                      (int)CPLGetPID(), (int)GDALGetResponsiblePIDForCurrentThread() );
        else
            CPLDebug( "GDAL", "GDALOpen(%s, this=%p) succeeds as %s.",
                      oOpenInfo.pszFilename, poDS, poDriver->GetDescription() );

        return (GDALDatasetH) poDS;
    }

    return NULL;
}

Changed 9 years ago by ilucena

Attachment: open_by_drivername_v4.patch added

This version uses the "driver=driver-name,file-name" convention

comment:11 Changed 9 years ago by ilucena

This version uses the "driver=driver-name,file-name" convention. Those are the changes in gdaldataset.cpp:

GDALDatasetH GDALOpenInternal( const char * pszFilename, GDALAccess eAccess,
                               const char* const * papszAllowedDrivers)
{
    GDALOpenInfo oOpenInfo( pszFilename, eAccess );

    if ( oOpenInfo.bStatOK == false &&          // Not a simple existing file name
        EQUALN( pszFilename, "driver=", 7 ) )   // Contains the driver name token
    {
        return GDALOpenInternal( oOpenInfo );
    }

    return GDALOpenInternal(oOpenInfo, papszAllowedDrivers);
}

GDALDatasetH GDALOpenInternal( GDALOpenInfo& oOpenInfo )
{

    VALIDATE_POINTER1( oOpenInfo.pszFilename, "GDALOpen", NULL );

    GDALDriverManager *poDM = GetGDALDriverManager();
    CPLLocaleC oLocaleForcer;
    CPLErrorReset();
    CPLAssert( NULL != poDM );

    GDALDataset *poDS = NULL;

    /* Parse driver name and the rest of the string after the comma "," */

    const char* pszEqual = (char*) strstr( oOpenInfo.pszFilename, "=" );
    const char* pszComma = (char*) strstr( oOpenInfo.pszFilename, "," );

    if ( pszComma == NULL || pszEqual == NULL )
    {
        return NULL;
    }

    const char* pszFilename = pszComma + 1;
    char* pszDrivername = CPLScanString( pszEqual + 1, 
                          ( pszComma - pszEqual - 1 ), FALSE, FALSE );

    /* Get driver object */

    GDALDriver *poDriver = poDM->GetDriverByName( pszDrivername );

    CPLFree( pszDrivername );

    if ( poDriver == NULL || poDriver->pfnOpen == NULL )
    {
        return NULL;
    }

    /* Try to open dataset with the selected driver */

    GDALOpenInfo oOpenInfoClone( pszFilename, oOpenInfo.eAccess );

    poDS = poDriver->pfnOpen( &oOpenInfoClone );

    if( poDS != NULL )
    {
        if( strlen(poDS->GetDescription()) == 0 )
            poDS->SetDescription( pszFilename );

        if( poDS->poDriver == NULL )
            poDS->poDriver = poDriver;

        if( CPLGetPID() != GDALGetResponsiblePIDForCurrentThread() )
            CPLDebug( "GDAL", "GDALOpen(%s, this=%p) succeeds as %s (pid=%d, responsiblePID=%d).",
                      oOpenInfo.pszFilename, poDS, poDriver->GetDescription(),
                      (int)CPLGetPID(), (int)GDALGetResponsiblePIDForCurrentThread() );
        else
            CPLDebug( "GDAL", "GDALOpen(%s, this=%p) succeeds as %s.",
                      oOpenInfo.pszFilename, poDS, poDriver->GetDescription() );

        return (GDALDatasetH) poDS;
    }

    return NULL;
}

Changed 9 years ago by ilucena

Attachment: open_by_drivername_v5.patch added

comment:12 Changed 9 years ago by ilucena

Please note the new patch (version 5) where I am proposing a few changes to the original GDALOpenInternal function, instead of creating a new GDALOpenInternal.

Examples of use:

GDALDatasetH poDS = GDALOpen( "filename.tif", GA_ReadOnly );
GDALDatasetH poDS = GDALOpen( "driver=gtiff,filename.tif", GA_ReadOnly );

I revoke the previous attempt to accept entry like the following.

So that should *not* work and will return an error.

GDALDatasetH poDS = GDALOpen( "driver=gtiff_raw:filename.tif", GA_ReadOnly );
...
ERROR 4: `driver=gtiff_raw:filename.tif' does not exist in the file system,
and is not recognised as a supported dataset name.

By the way, we should remove that "in the file system" from that error message, since GDAL can now handle data from other sources than just files on file system. But if you are going to do that be aware that those error messages are used in some if statements on Python autotest scripts.

Anyway, for all the driver that consume "drivername:" inside the file name, we would do as it shows here:

GDALDatasetH poDS = GDALOpen( "driver=gtiff,gtiff_raw:filename.tif", GA_ReadOnly );

What is more consistent, IMHO.

comment:13 Changed 6 years ago by Even Rouault

Milestone: 2.0
Resolution: fixed
Status: newclosed

Implemented with GDALOpenEx() with the papszAllowedDrivers per RFC 46

comment:14 Changed 6 years ago by Mateusz Łoskot

Even,

So, this feature is in scope of two RFCs: 36 and 46. Shouldn't status of the former rfc36_open_by_drivername be changed to "Adopted", perhaps with a comment like "Adopted (as part of RFC 46)"?

comment:15 Changed 6 years ago by Even Rouault

hum, strictly speaking it is not really implemented in the terms of rfc36, but covers most of the technical points. "withdrawn, covered by RFC 46" would probably be more appropriate. I let Ivan decide since he is the author of rfc36.

comment:16 Changed 5 years ago by Even Rouault

Milestone: 2.02.0.0

Milestone renamed

Note: See TracTickets for help on using tickets.