RFC 31: OGR 64bit Integer Fields and FIDs

Authors: Frank Warmerdam
Contact: warmerdam@…
Status: Development

Summary

This RFC addresses steps to upgrade OGR to support 64bit integer fields and feature ids. Many feature data formats support wide integers, and the inability to pass these through OGR is causing a growing number of problems.

64bit FID

It is planned that feature ids will be handled internally as type "GIntBig" instead of "long". This will include the nFID field of the OGRFeature. The existing GetFID() and SetFID() methods on the OGRFeature use type long. It is difficult to change this without significant disruption to existing application code, so it is intended to introduce new methods to the OGRFeature class:

  GIntBig  OGRFeature::GetFID64();
  OGRErr   OGRFeature::SetFID64(GIntBig nFID );

The old methods will be deprecated in the documentation in favor of the new interfaces. However, they will continue to exist and will simply cast as needed. Note that on platforms where "long" is already 64 bits wide (as it is on most 64bit Unix-like operating systems, though not on 64bit Windows) there is little harm in applications continuing to use the old interfaces.
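As an illustration of "just cast as needed", here is a minimal sketch of how the old and new accessors could coexist. SketchFeature, the GIntBig typedef and the OGRErr stand-ins are assumptions made for this example; they are not the real OGRFeature declarations.

```cpp
// Stand-ins so the sketch compiles without GDAL headers (assumptions).
typedef long long GIntBig;
typedef int OGRErr;
const OGRErr OGRERR_NONE = 0;

// Hypothetical, stripped-down feature class; not the real OGRFeature.
class SketchFeature
{
    GIntBig nFID;  // the feature id is stored wide internally

  public:
    SketchFeature() : nFID( -1 ) {}

    // New 64bit accessors, as proposed above.
    GIntBig GetFID64() const { return nFID; }
    OGRErr  SetFID64( GIntBig nNewFID )
    {
        nFID = nNewFID;
        return OGRERR_NONE;
    }

    // Legacy accessors: kept for compatibility, they only cast.
    long   GetFID() const { return static_cast<long>( nFID ); }
    OGRErr SetFID( long nNewFID )
    {
        return SetFID64( static_cast<GIntBig>( nNewFID ) );
    }
};
```

Where long is narrower than GIntBig, the legacy GetFID() in this sketch would truncate a wide id, which is why new code would be expected to prefer the 64bit accessors.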

The OGRLayer class allows several operations based on the FID. The signature of these will be *altered* to accept GIntBig instead of long. In theory this should not require any changes to application code since long can be converted to GIntBig losslessly. However, all existing OGR drivers will require changes, including private drivers. This will also result in a backwards incompatible change in the C ABI.

    virtual OGRFeature *GetFeature( GIntBig nFID );
    virtual OGRErr      DeleteFeature( GIntBig nFID );
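The following sketch shows why callers need no source changes while drivers do. A caller holding a long can pass it to the widened method unchanged, but every driver override has to declare the GIntBig signature. SketchLayer, SketchMemLayer, SketchFeature and FetchWithLegacyFID are hypothetical names used only for this example.

```cpp
#include <map>

// Stand-in for GDAL's GIntBig (assumption for this sketch).
typedef long long GIntBig;

// Hypothetical minimal feature; not the real OGRFeature.
struct SketchFeature
{
    GIntBig nFID;
};

// Hypothetical base class carrying the altered signature.
class SketchLayer
{
  public:
    virtual ~SketchLayer() {}
    virtual const SketchFeature *GetFeature( GIntBig nFID ) = 0;
};

// Hypothetical driver layer: its override must use GIntBig as well.
class SketchMemLayer : public SketchLayer
{
    std::map<GIntBig, SketchFeature> oFeatures;

  public:
    void Add( GIntBig nFID ) { oFeatures[nFID] = SketchFeature{ nFID }; }

    // 'override' asks the compiler to confirm the signature matches the base.
    const SketchFeature *GetFeature( GIntBig nFID ) override
    {
        std::map<GIntBig, SketchFeature>::const_iterator oIter =
            oFeatures.find( nFID );
        return oIter == oFeatures.end() ? nullptr : &oIter->second;
    }
};

// Application code that still holds FIDs as long keeps compiling:
// the long argument widens to GIntBig implicitly and without loss.
const SketchFeature *FetchWithLegacyFID( SketchLayer &oLayer, long nFID )
{
    return oLayer.GetFeature( nFID );
}
```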

64bit Fields

New field types will be introduced for 64bit integers:

   OFTInteger64 = 12
   OFTInteger64List = 13

The OGRField union will be extended to include:

    GIntBig     Integer64;
    struct {
        int nCount;
        GIntBig *paList;
    } Integer64List;
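To make the layout concrete, the sketch below declares a cut-down union with the two new members and reads them back. SketchField and SumInteger64List are hypothetical; the real OGRField union has more members than are shown here.

```cpp
// Stand-in for GDAL's GIntBig (assumption for this sketch).
typedef long long GIntBig;

// Hypothetical cut-down union; only a few members are reproduced.
union SketchField
{
    int     Integer;
    double  Real;
    GIntBig Integer64;
    struct
    {
        int      nCount;
        GIntBig *paList;
    } Integer64List;
};

// Sum the entries of a field that is known to hold an Integer64List.
// The caller is responsible for knowing which union member is active.
GIntBig SumInteger64List( const SketchField &uField )
{
    GIntBig nSum = 0;
    for( int i = 0; i < uField.Integer64List.nCount; i++ )
        nSum += uField.Integer64List.paList[i];
    return nSum;
}
```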

The OGRFeature class will be extended with these new methods:

    GIntBig             GetFieldAsInteger64( int i );
    GIntBig             GetFieldAsInteger64( const char *pszFName );
    const GIntBig      *GetFieldAsInteger64List( const char *pszFName,
                                                 int *pnCount );
    const GIntBig      *GetFieldAsInteger64List( int i, int *pnCount );

    void                SetField( int i, GIntBig nValue );
    void                SetField( int i, int nCount, GIntBig * panValues );
    void                SetField( const char *pszFName, GIntBig nValue );
    void                SetField( const char *pszFName, int nCount,
                                  GIntBig * panValues );

Furthermore, the new interfaces will internally support setting/getting integer fields, and the integer field methods will support getting/setting 64bit integer fields so that one case can be used for both field types where convenient.
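A minimal sketch of this cross-type access follows. The RFC does not say what happens when a 64bit value does not fit in a 32bit field or getter; clamping to the int range is an assumption made here purely for illustration. SketchField, SketchFieldType and ClampToInt are hypothetical names.

```cpp
#include <limits>

// Stand-in for GDAL's GIntBig (assumption for this sketch).
typedef long long GIntBig;

// Hypothetical field types and storage; not the real OGRFieldType/OGRField.
enum SketchFieldType { SKETCH_INTEGER, SKETCH_INTEGER64 };

struct SketchField
{
    SketchFieldType eType;
    GIntBig         nValue;  // wide enough to hold either field type
};

// Clamp a wide value into the range of int (assumed overflow policy).
static int ClampToInt( GIntBig nValue )
{
    const GIntBig nMax = std::numeric_limits<int>::max();
    const GIntBig nMin = std::numeric_limits<int>::min();
    if( nValue > nMax ) return static_cast<int>( nMax );
    if( nValue < nMin ) return static_cast<int>( nMin );
    return static_cast<int>( nValue );
}

// 64bit getter: works on both field types, since an int always fits.
GIntBig GetFieldAsInteger64( const SketchField &oField )
{
    return oField.nValue;
}

// 32bit getter: also works on a 64bit field, clamping if needed.
int GetFieldAsInteger( const SketchField &oField )
{
    return ClampToInt( oField.nValue );
}

// 64bit setter: also works on a 32bit field, clamping if needed.
void SetField( SketchField &oField, GIntBig nValue )
{
    oField.nValue = ( oField.eType == SKETCH_INTEGER )
                        ? ClampToInt( nValue ) : nValue;
}
```

With helpers along these lines, a translator could use the 64bit getter and setter for every integer field and ignore whether the underlying field is 32 or 64 bits wide.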

Python / Java / C# / perl Changes

No thoughts yet on the impact to the various SWIG derived interfaces.

Utilities

ogr2ogr, ogrinfo and other utilities will be updated to support the new 64bit interfaces.

File Formats

As appropriate, existing OGR drivers will be updated to support the new interfaces. In particular an effort will be made to update the database driver interfaces to support 64bit integer columns for use as feature id, though I am not convinced we should create FID columns as 64bit by default when creating new layers as this may cause problems for other applications.

For prototyping purposes the Shapefile and PostGIS drivers have been updated to properly support 64bit integer fields.

Also, all drivers need to be updated to use GIntBig for the FID in the GetFeature() and DeleteFeature() interfaces.

Test Suite

The test suite will be moderately extended to test the new capabilities.

Compatibility Issues

Driver Code Changes

  • Most drivers supporting CreateField() likely ought to be extended to support OFTInteger64 as an integer field if nothing else is available (and if bApproxOK is TRUE).
  • Drivers reporting FIDs via debug statements, printf() calls, or sprintf()-like statements that format them for output will need updates to either cast the FID to long or use CPL_FRMT_GIB to format the FID. Failure to make these changes may result in code crashing.
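The two formatting options can be sketched as follows. To keep the example free of GDAL headers, SKETCH_FRMT_GIB is a stand-in for CPL_FRMT_GIB and is assumed to expand to "%lld" for a long long GIntBig; real code would use the macro supplied by the CPL headers. FormatFID and FormatFIDDowncast are hypothetical helpers.

```cpp
#include <cstdio>
#include <string>

// Stand-in for GDAL's GIntBig (assumption for this sketch).
typedef long long GIntBig;

// Stand-in for CPL_FRMT_GIB; assumed to be "%lld" on this platform.
#define SKETCH_FRMT_GIB "%lld"

// Option 1: format the full 64bit value with the matching format macro.
std::string FormatFID( GIntBig nFID )
{
    char szBuf[64];
    std::snprintf( szBuf, sizeof( szBuf ), "FID=" SKETCH_FRMT_GIB, nFID );
    return szBuf;
}

// Option 2: downcast explicitly and keep the old "%ld" format.
// This loses information when the FID does not fit in a long.
std::string FormatFIDDowncast( GIntBig nFID )
{
    char szBuf[64];
    std::snprintf( szBuf, sizeof( szBuf ), "FID=%ld",
                   static_cast<long>( nFID ) );
    return szBuf;
}
```

Passing a GIntBig to an unchanged "%ld" or "%d" format without a cast is the mismatch that the crash warning above refers to: the variadic call reads an argument of a different size than the one that was passed.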

Application Code

  • Application code may need to be updated to use GIntBig for FIDs in order to avoid warnings about downcasting.
  • Application code formatting FIDs using printf-like facilities may also need to be changed to downcast explicitly or to use CPL_FRMT_GIB.
  • Application code may need to add Integer64 handling in order to utilize wide fields.

Behavioral Changes

  • Wide integer fields that were previously treated as "real" by the shapefile driver will now be treated as Integer64. This will likely not work with some applications, and translation to other formats will often fail.
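For context on why the "real" treatment is being replaced despite this cost, a double carries only 53 bits of integer precision, so not every 64bit integer survives a trip through a real field. The sketch below checks this for a given value; RoundTripsThroughDouble is a hypothetical helper and the GIntBig typedef is a stand-in.

```cpp
// Stand-in for GDAL's GIntBig (assumption for this sketch).
typedef long long GIntBig;

// Returns true when nValue survives conversion to double and back unchanged.
bool RoundTripsThroughDouble( GIntBig nValue )
{
    const double dfValue = static_cast<double>( nValue );

    // 2^63 cannot be represented as a GIntBig, so converting it back
    // would be undefined behaviour; treat it as a failed round trip.
    if( dfValue >= 9223372036854775808.0 )
        return false;

    return static_cast<GIntBig>( dfValue ) == nValue;
}
```

Under IEEE 754 double precision, 9007199254740992 (2^53) round-trips, while 9007199254740993 (2^53 + 1) does not.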