= FDO RFC 16 - FDO Provider for SQLite = This page contains a request for comments document (RFC) for the FDO Open Source project. More FDO RFCs can be found on the [wiki:FDORfcs RFCs] page. == Status == ||RFC Template Version||(1.0)|| ||Submission Date||2008-03-18|| ||Last Modified||Traian Stanev 2008-03-29|| ||Author||Traian Stanev|| ||RFC Status||draft|| ||Implementation Status||under development|| ||Proposed Milestone||3.4.0.0|| ||Assigned PSC guide(s)||Jason Birch|| ||'''Voting History'''||2008-04-3|| ||+1||Greg, Orest, Jason, Mateusz, Haris, Bob, Frank|| ||+0|||| ||-0|||| ||-1|||| == Overview == This RFC proposes a new FDO provider for accessing (spatial) data stored in SQLite database files. == Motivation == SQLite is an efficient and popular way to store data. Support for accessing SQLite databases via FDO would expand the range of data sources supported by FDO. A SQLite format for storing spatial data would make a good interchange format, similar to, but more flexible than Shapefiles. == What makes a valid spatial SQLite database? == The goal of this simple specification is to allow access to SQLite databases in as open and standard way as possible. This document does ''not'' conform to the OGC Simple Features for SQL implementation specification ([http://www.opengeospatial.org/standards/sfs 06-104r3]), but it does leverage the specification where practical. Here is the minimum set of characteristics expected in SQLite database: * well-defined subset of the geometry_columns table (as per SF spec, section 7.1.3.2, 7.1.3.3); see below for specific column requirements, * spatial_ref_sys table (as per SF spec, section 7.1.2.2, 7.1.2.3). Valid WKT spatial reference systems must be defined for each unique coordinate system used in the database. * geometry_format column extending the geometry_columns table, set to one of WKB, WKT, WKB12, WKT12 or FGF. If not present, WKB will be assumed. * one or more tables to store the geometry and associated features. Although in general [http://www.sqlite.org/datatype3.html type does not matter in SQLite], BLOB type is recommended at least for WKB and FGF formats as it does not perform character translation on the data. Note that this implementation does ''not'' currently include support for the enhanced multi-dimensional WKB and WKT formats that are defined in the OGC Simple Feature Access specification [http://www.opengeospatial.org/standards/sfa 06-103r3]. The tags WKB12 and WKT12 are reserved for future use in support of this enhanced format. FGF refers to the [http://fdo.osgeo.org/files/fdo/docs/FDG_FDODevGuide/files/WSfacf1429558a55de8821c21057fbebc2b-789.htm FDO Binary Geometry Format]. == geometry_columns Subset == SQLite databases require the following fields in their geometry_columns table. Other fields are permitted, but will not affect data access. For detailed explanations of all of the standard fields (everything but geometry_format), see the SFSQL [http://www.opengeospatial.org/standards/sfs 06-104r3] document, section 7.1.3.3. || Column Name || Type || Notes || || f_table_name || VARCHAR || name of a feature table with a geometry column. || || f_geometry_column || VARCHAR || name of a geometry column in the feature table || || geometry_type || INTEGER || integer value representing the form of geometry (1=POINT, 2=LINESTRING, etc). See Table 4 on Page 29-31 of the SFSQL [http://www.opengeospatial.org/standards/sfs 06-104r3] spec for a full list. || || coord_dimension || INTEGER || dimensionality of the geometric features: 2, 3 or 4 || || srid || INTEGER || foreign key into the spatial_ref_sys table || || geometry_format || VARCHAR || One of "WKB", "WKT", "WKB12", "WKT12", "FGF" || The suggested definition is: {{{ CREATE TABLE geometry_columns ( f_table_name VARCHAR, f_geometry_column VARCHAR, geometry_type INTEGER, coord_dimension INTEGER, srid INTEGER, geometry_format VARCHAR ) }}} == Proposed FDO Provider == The proposed SQLite provider will have the following features: * Data stored and consumed in native SQLite format, and therefore accessible to other SQLite-based applications; * Geometry will be serialized in FDO geometry format (FGF), WKB, or WKT and stored in a BLOB column. The type of geometry encoding will be specified in the geometry_columns table in a column named geometry_format. In addition, the provider may support point geometry with coordinates stored in non-geometric columns, via a connection string setting. * A non-persistent spatial index usable for BBOX queries, built on the fly and valid for the duration of the FDO connection; * Execution of arbitrary SQL commands; * Metadata about feature geometry and coordinate systems will be stored in OGC geometry_columns and spatial_ref_sys tables; * A tentative list of supported FDO commands : !DescribeSchema, Select, !SelectAggregates, Insert, Update, Delete, !GetSpatialContexts, arbitrary SQL, !ApplySchema, !CreateDataStore; FDO features the provider will initially *not* support: * Feature class inheritance; * Geometry filters other than BBOX; * FDO Transactions -- but you can still manually execute BEGIN/COMMIT around other FDO commands; * Associations; * Constraints; * Any other exotic feature you can think of. == Implications == This is a new provider, so there will be no issues with backwards compatibility. We need to coordinate with the GDAL/OGR project (and any other data access libraries that express an interest) to make sure the metadata and geometry formats are usable cross-platform. == Implementation Plan == This section goes into some detail about how FDO constructs will be implemented with a SQLite back end. 1. Spatial Context In FDO schemas, the spatial context association of geometry properties is the string name of the spatial context. This will have to be mapped to an SRID integer, which is what the spatial_ref_sys table and geometry_columns table use to link geometry column to its spatial ref sys. 2. FDO Data type mappings SQLite only supports integer, real, string and blob data types. All integer FDO data types will map to SQLite integers. Double and Single will map to Real. DateTime will map to a string, using ISO 8601 standard date time format. Decimal maps to Real. BLOB and geometry map to BLOBs and strings map to string. 3. Insert command The provider will feature the following peculiarity of the Fdo Insert command implementation: if you reuse the same FdoIInsert instance to insert several features, these will be inserted within the same SQLite transaction, which closes when the command object is disposed. Inserts are extremely slow in SQLite unless they are batched into a single transaction, and this behavior of insert will help bulk feature inserts. 4. Select and Select Aggregate filters Initially, FDO filters will be converted to SQL where clauses directly using the !ToString() method. Therefore, any functions used within the clause need to be available or registered with the SQLite SQL engine. SQLite does have a way to bind external functions to the SQL engine, which is how we can implement most of the FDO expression functions (like sin, cos etc). There will be code which detects and extracts BBOX spatial conditions from the filter before doing the above conversion to string. The provider will use its spatial index evaluate the spatial component of the filter. The SQLite SQL engine will be used for evaluating the remaining part of the filter. This approach is the same as the one used in the OGR provider. 5. Multithreading The provider will not be multi-threaded. For simulataneous access, to the same database, one would have to open two connections. 6. Spatial Index The provider will initially use an in-memory spatial index, built up during the first spatial filter request. Serialization of the spatial index is possible, but will not be initially implemented. Once we have spatial index serialization, multiple connections will be able to use the same index via memory mapping. Until then, each connection will have its own spatial index. Due to 5., this may end up consuming more memory than it should. 7. !ApplySchema The initial implementation of !ApplySchema will allow addition of new feature classes, but not modification of existing ones. If a feature class specified in the ApplySchema schema already exists, it will be skipped. 8. !CreateDataStore This command will create a new sqlite database and also create empty spatial_ref_sys and geometry_columns table. If the database file already exists, the two tables will be created in the existing database. 9. Autogenerated ID The provider will not support composite keys. The recommended key will be auto-increment integer (32 or 64 bit). This will map directly to the internal SQLite row id. It will be possible to use other key types via the SQL UNIQUE constraint on the key column. 10. SQLite library Some modifications to and creative uses of SQLite's VDBE (SQL engine) will be necessary in order to improve read performance. Therefore the provider will use its own copy of SQLite. Note that this does not affect the file format in any way, just improves read performance when reading from within an FdoIFeatureReader. == Test Plan == There will be a small test suite that tests often used functionality. Arbitrary SQL execution will not be tested extensively. == Funding/Resources == == References == * [http://www.sqlite.org/ SQLite] * [http://www.gaia-gis.it/spatialite/ SpatialLite] - a spatial extension for SQLite * [http://fdo.osgeo.org/files/fdo/docs/FDG_FDODevGuide/files/WSfacf1429558a55de8821c21057fbebc2b-789.htm FGF Binary Specification] * [http://www.opengeospatial.org/standards/sfs OGC Simple Features for SQL] (see 06-104r3) * [http://www.opengeospatial.org/standards/sfa OGC Simple Feature Access] (see 06-103r3)