FDO RFC 16 - FDO Provider for SQLite
This page contains a request for comments document (RFC) for the FDO Open Source project. More FDO RFCs can be found on the RFCs page.
|RFC Template Version||(1.0)|
|Last Modified||Traian Stanev 2008-03-29|
|Implementation Status||under development|
|Assigned PSC guide(s)||Jason Birch|
|+1||Greg, Orest, Jason, Mateusz, Haris, Bob, Frank|
- April 20, 2008 (jbirch) - Modified to explicitly specify the contents of the spatial_ref_sys table. Changed VARCHAR references to TEXT because VARCHAR is only a valid SQLite type through inference.
This RFC proposes a new FDO provider for accessing (spatial) data stored in SQLite database files.
SQLite is an efficient and popular way to store data. Support for accessing SQLite databases via FDO would expand the range of data sources supported by FDO. A SQLite format for storing spatial data would make a good interchange format, similar to, but more flexible than Shapefiles.
What makes a valid spatial SQLite database?
The goal of this simple specification is to allow access to SQLite databases in as open and standard way as possible. This document does not conform to the OGC Simple Features for SQL implementation specification (06-104r3), but it does leverage the specification where practical.
Here is the minimum set of characteristics expected in SQLite database:
- well-defined subset of the geometry_columns table (as per SF spec, section 220.127.116.11, 18.104.22.168); see below for specific column requirements,
- spatial_ref_sys table (as per SF spec, section 22.214.171.124, 126.96.36.199). The unique SRID and valid WKT spatial reference text MUST be defined for each unique coordinate system used in the database.
- geometry_format column extending the geometry_columns table, set to one of WKB, WKT, WKB12, WKT12 or FGF. If the column is not present or the value is null, WKB will be assumed.
- one or more tables to store the geometry and associated features. Although in general type does not matter in SQLite, BLOB type is recommended at least for WKB and FGF formats as it does not perform character translation on the data.
Note that this implementation does not currently include support for the enhanced multi-dimensional WKB and WKT formats that are defined in the OGC Simple Feature Access specification 06-103r3. The tags WKB12 and WKT12 are reserved for future use in support of this enhanced format. FGF refers to the FDO Binary Geometry Format.
SQLite databases require the following fields in their geometry_columns table. Other fields are permitted, but will not affect data access. For detailed explanations of all of the standard fields (everything but geometry_format), see the SFSQL 06-104r3 document, section 188.8.131.52.
|f_table_name||TEXT||name of a feature table with a geometry column.|
|f_geometry_column||TEXT||name of a geometry column in the feature table|
|geometry_type||INTEGER||integer value representing the form of geometry (1=POINT, 2=LINESTRING, etc). See Table 4 on Page 29-31 of the SFSQL 06-104r3 spec for a full list.|
|coord_dimension||INTEGER||dimensionality of the geometric features: 2, 3 or 4 [Addition - Ticket #715] 5 used for case when we use XYM (to make a difference between XYZ and XYM)|
|srid||INTEGER||foreign key into the spatial_ref_sys table|
|geometry_format||TEXT||One of "WKB", "WKT", "WKB12", "WKT12", "FGF"|
The suggested definition is:
CREATE TABLE geometry_columns ( f_table_name TEXT, f_geometry_column TEXT, geometry_type INTEGER, coord_dimension INTEGER, srid INTEGER, geometry_format TEXT )
SQLite databases require the following fields in their spatial_ref_sys table.
|srid||INTEGER||an integer value that uniquely identifies each Spatial Reference System within a database|
|auth_name||TEXT||the name of the standard or standards body that is being cited for this reference system. EPSG would be an example of a valid AUTH_NAME.|
|auth_srid||INTEGER||the ID of the Spatial Reference System as defined by the Authority cited in AUTH_NAME. 26910 would be an example of a valid EPSG srid (UTM Zone 10 NAD830).|
|srtext||TEXT||The Well-known Text Representation of the Spatial Reference System.|
The suggested definition is:
CREATE TABLE spatial_ref_sys ( srid INTEGER UNIQUE, auth_name TEXT, auth_srid INTEGER, srtext TEXT )
Proposed FDO Provider
The proposed SQLite provider will have the following features:
- Data stored and consumed in native SQLite format, and therefore accessible to other SQLite-based applications;
- Geometry will be serialized in FDO geometry format (FGF), WKB, or WKT and stored in a BLOB column. The type of geometry encoding will be specified in the geometry_columns table in a column named geometry_format. In addition, the provider may support point geometry with coordinates stored in non-geometric columns, via a connection string setting.
- A non-persistent spatial index usable for BBOX queries, built on the fly and valid for the duration of the FDO connection;
- Execution of arbitrary SQL commands;
- Metadata about feature geometry and coordinate systems will be stored in OGC geometry_columns and spatial_ref_sys tables;
- A tentative list of supported FDO commands : DescribeSchema, Select, SelectAggregates, Insert, Update, Delete, GetSpatialContexts, arbitrary SQL, ApplySchema, CreateDataStore;
FDO features the provider will initially *not* support:
- Feature class inheritance;
- Geometry filters other than BBOX;
- FDO Transactions -- but you can still manually execute BEGIN/COMMIT around other FDO commands;
- Any other exotic feature you can think of.
This is a new provider, so there will be no issues with backwards compatibility. We need to coordinate with the GDAL/OGR project (and any other data access libraries that express an interest) to make sure the metadata and geometry formats are usable cross-platform.
This section goes into some detail about how FDO constructs will be implemented with a SQLite back end.
- Spatial Context
In FDO schemas, the spatial context association of geometry properties is the string name of the spatial context. This will have to be mapped to an SRID integer, which is what the spatial_ref_sys table and geometry_columns table use to link geometry column to its spatial ref sys.
- FDO Data type mappings
SQLite only supports integer, real, string and blob data types. All integer FDO data types will map to SQLite integers. Double and Single will map to Real. DateTime? will map to a string, using ISO 8601 standard date time format. Decimal maps to Real. BLOB and geometry map to BLOBs and strings map to string.
- Insert command
The provider will feature the following peculiarity of the Fdo Insert command implementation: if you reuse the same FdoIInsert instance to insert several features, these will be inserted within the same SQLite transaction, which closes when the command object is disposed. Inserts are extremely slow in SQLite unless they are batched into a single transaction, and this behavior of insert will help bulk feature inserts.
- Select and Select Aggregate filters
Initially, FDO filters will be converted to SQL where clauses directly using the ToString() method. Therefore, any functions used within the clause need to be available or registered with the SQLite SQL engine. SQLite does have a way to bind external functions to the SQL engine, which is how we can implement most of the FDO expression functions (like sin, cos etc).
There will be code which detects and extracts BBOX spatial conditions from the filter before doing the above conversion to string. The provider will use its spatial index evaluate the spatial component of the filter. The SQLite SQL engine will be used for evaluating the remaining part of the filter.
This approach is the same as the one used in the OGR provider.
The provider will not be multi-threaded. For simulataneous access, to the same database, one would have to open two connections.
- Spatial Index
The provider will initially use an in-memory spatial index, built up during the first spatial filter request. Serialization of the spatial index is possible, but will not be initially implemented. Once we have spatial index serialization, multiple connections will be able to use the same index via memory mapping. Until then, each connection will have its own spatial index. Due to 5., this may end up consuming more memory than it should.
The initial implementation of ApplySchema will allow addition of new feature classes, but not modification of existing ones. If a feature class specified in the ApplySchema? schema already exists, it will be skipped.
This command will create a new sqlite database and also create empty spatial_ref_sys and geometry_columns table. If the database file already exists, the two tables will be created in the existing database.
- Autogenerated ID
The provider will not support composite keys. The recommended key will be auto-increment integer (32 or 64 bit). This will map directly to the internal SQLite row id. It will be possible to use other key types via the SQL UNIQUE constraint on the key column.
- SQLite library
Some modifications to and creative uses of SQLite's VDBE (SQL engine) will be necessary in order to improve read performance. Therefore the provider will use its own copy of SQLite. Note that this does not affect the file format in any way, just improves read performance when reading from within an FdoIFeatureReader.
There will be a small test suite that tests often used functionality. Arbitrary SQL execution will not be tested extensively.
- SpatialLite - a spatial extension for SQLite
- FGF Binary Specification
- OGC Simple Features for SQL (see 06-104r3)
- OGC Simple Feature Access (see 06-103r3)