Changes between Version 3 and Version 4 of FDORfc52


Timestamp:
Aug 8, 2010, 3:58:05 PM (14 years ago)
Author:
traianstanev

  • FDORfc52

    v3 v4  
== Overview ==

    25 
    26 
    27 == Motivation ==
    28 
    29 
    30 
    31 == Proposed Solution ==
    32 
    33 
    34 
    35 == Implications ==
    36 
    37 
    38 
    39 == Test Plan ==
    40 
    41 
This RFC proposes a convenience wrapper API for FDO. The basic features of the API are:

* A mostly procedural API (unlike the object-oriented, command-based structure of core FDO).
  This significantly reduces object lifetime issues, removes the need for refcounting,
  and eases writing a possible managed wrapper, because a single object controls the database connection lifetime.
* Shortcuts for commonly used functionality, such as spatial queries, fetching extents, and feature counts.
* Automatic type conversion (a.k.a. duck typing). Unlike core FDO, the API converts automatically between compatible types
  (no more need to switch-case on FdoByte, FdoInt16, FdoInt32, FdoInt64, FdoDecimal just to get an integer).
* Thread-safe connection pooling/caching. This is a common feature that developers often have to implement
  from scratch.
* The possibility of providing an alternative backend to the API (for example, an OGR backend).
* The possibility of implementing common features, such as coordinate system transformations,
  in the wrapper layer, in a single common piece of code for all provider backends.
* Automatic backend data source resolution, similar to OGR
  (i.e. if the connection string is a path to an SHP file, the SHP provider is automatically loaded and used).


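The automatic type conversion listed above could look roughly like the following sketch. It is not taken from the attached headers: the `Value` class here is a hypothetical stand-in built on `std::variant` (C++17), and the set of stored types and the method names `AsInt64()` and `AsString()` are assumptions chosen only to illustrate how a caller can ask for an integer without switching on the stored type.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical stand-in for the wrapper's Value type; the real class in
// BaseWrap.h may be organized quite differently.
class Value {
public:
    using Storage = std::variant<std::int16_t, std::int32_t, std::int64_t,
                                 double, std::wstring>;

    explicit Value(Storage v) : data_(std::move(v)) {}

    // Convert whatever is stored to a 64-bit integer.
    std::int64_t AsInt64() const {
        return std::visit([](const auto& v) -> std::int64_t {
            using T = std::decay_t<decltype(v)>;
            if constexpr (std::is_same_v<T, std::wstring>)
                return std::stoll(v);                 // throws if not numeric
            else
                return static_cast<std::int64_t>(v);  // doubles are truncated
        }, data_);
    }

    // Convert whatever is stored to a wide string.
    std::wstring AsString() const {
        return std::visit([](const auto& v) -> std::wstring {
            using T = std::decay_t<decltype(v)>;
            if constexpr (std::is_same_v<T, std::wstring>)
                return v;
            else
                return std::to_wstring(v);
        }, data_);
    }

private:
    Storage data_;
};
```

With a type like this, the caller writes `value.AsInt64()` whether the column holds a 16-bit, 32-bit or 64-bit integer. How lossy conversions (for example, double to integer) should behave is a design decision this sketch does not try to settle.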
== High Level Architecture ==

There are four main objects in the API, reflecting the database-oriented nature of most FDO providers:
    * Database (top level object)
        * Table (a Database can have several Tables)
        * Row (a Table consists of many Rows)
        * Value (a Row consists of many column Values)

The Database is the top-level object, whose lifetime controls the lifetime of all other objects.
The API user only has to manage the lifetime of the Database object directly. A Database can be obtained from
and released to the thread-safe connection pool automatically using a RAII-style DbHandle object,
which avoids manual object disposal completely.

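To make the RAII idea concrete, here is a minimal sketch of a handle that returns its connection to a mutex-protected pool when it goes out of scope. It is not the attached DbPool.h implementation: the `Database` struct, the `IdleCount()` helper and all internals are hypothetical, and only the names `DbPool`, `DbHandle` and `GetConnection` are borrowed from the examples in this RFC.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the wrapper's Database class.
struct Database {
    explicit Database(std::wstring p) : path(std::move(p)) {}
    std::wstring path;
};

class DbPool {
public:
    class Handle;  // RAII handle, defined below

    static Handle GetConnection(const std::wstring& path);
    static std::size_t IdleCount(const std::wstring& path);  // for illustration

private:
    using IdleMap = std::map<std::wstring, std::vector<std::unique_ptr<Database>>>;
    static void Release(std::unique_ptr<Database> db);
    static std::mutex& Mutex() { static std::mutex m; return m; }
    static IdleMap& Idle() { static IdleMap idle; return idle; }
};

// Owns one connection and hands it back to the pool on destruction.
class DbPool::Handle {
public:
    explicit Handle(std::unique_ptr<Database> db) : db_(std::move(db)) {}
    Handle(Handle&&) = default;  // movable, not copyable
    ~Handle() { if (db_) DbPool::Release(std::move(db_)); }

    explicit operator bool() const { return db_ != nullptr; }
    Database* operator->() const { return db_.get(); }
    Database& operator*() const { return *db_; }

private:
    std::unique_ptr<Database> db_;
};

using DbHandle = DbPool::Handle;

inline DbPool::Handle DbPool::GetConnection(const std::wstring& path) {
    std::lock_guard<std::mutex> lock(Mutex());
    auto& idle = Idle()[path];
    if (!idle.empty()) {  // reuse a cached connection if one is idle
        std::unique_ptr<Database> db = std::move(idle.back());
        idle.pop_back();
        return Handle(std::move(db));
    }
    return Handle(std::make_unique<Database>(path));  // otherwise open a new one
}

inline void DbPool::Release(std::unique_ptr<Database> db) {
    std::lock_guard<std::mutex> lock(Mutex());
    const std::wstring key = db->path;  // copy the key before moving db
    Idle()[key].push_back(std::move(db));
}

inline std::size_t DbPool::IdleCount(const std::wstring& path) {
    std::lock_guard<std::mutex> lock(Mutex());
    auto it = Idle().find(path);
    return it == Idle().end() ? 0 : it->second.size();
}
```

In this sketch a caller never disposes of anything explicitly: leaving the scope of a `DbHandle` is what returns the connection to the pool, and a later `GetConnection()` for the same path reuses it.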
== Examples ==

The detailed API is available in the header files attached to this document. Here are some examples
that illustrate the basics.

=== Example 1: Read and print the contents of a data source ===

This code sample iterates over all features of all tables of a database
and prints the value of each column. Note the automatic data type conversion
to string and the simple access to each column by index (via operator[]).
Column value access by name is also possible using operator[].

void PrintFile(const wchar_t* path)
{
    printf("Data source: %ls \n", path);

    //get a connection from the pool
    DbHandle srcdb = DbPool::GetConnection(path);

    if (!srcdb)
        return;

    //for all feature classes
    for (int q=0; q<srcdb->Count(); q++)
    {
        Table& srctbl = (*srcdb)[q];

        //table name
        printf("Table: %ls\n", srctbl.Def().Name());

        //show the overall extent
        double ext[4];
        srctbl.GetExtent(ext);
        printf("Bounds: %.8g, %.8g, %.8g, %.8g\n", ext[0], ext[1], ext[2], ext[3]);

        //number of features
        printf("Total feature count: %lld\n", srctbl.GetRowCount());

        //no spatial reordering -- just run through a full table select
        srctbl.Query();

        while (srctbl.Next())
        {
            const Row& r = srctbl.At();

            printf("\tFeature: %lld\n", r.ID());

            for (int j=0; j<r.Def().ColumnCount(); j++)
            {
                //skip the fid which we printed above
                if (j == r.Def().IndexOfID())
                    continue;

                printf("\t\t%ls :\t%ls\n", r.Def()[j].Name(), r[j].AsString());
            }
        }

        srctbl.EndQuery(); //not strictly needed
    }
}

=== Example 2: Bulk copy from one data source to another (e.g. SHP to SQLite conversion) ===

This example copies the source data verbatim into the destination file.
Note that all that's needed is to call Insert() on the destination with the source
row. The Insert() call takes a flag indicating whether to preserve the source row's FID,
and optionally a pointer which is filled with the FID of the newly created feature in the
target data store.

void ConvertFDOToFDO(const wchar_t* src, const wchar_t* dst)
{
    printf("%ls ---> %ls\n\n", src, dst);

    //create target data store
    if (!DatabaseFdo::Create(dst))
        return;

    //open source and target connections
    DbHandle srcdb = DbPool::GetConnection(src);
    DbHandle dstdb = DbPool::GetConnection(dst);

    if (!srcdb)
        return;

    if (!dstdb)
        return;

    //copy the schema and coord sys defs
    dstdb->SetSchema(srcdb->Def());

    //for all feature classes
    for (int q=0; q<dstdb->Count(); q++)
    {
        Table& dsttbl = (*dstdb)[q]; //get destination table
        Table& srctbl = *(*srcdb)[dsttbl.Def().Name()]; //get corresponding source table by name

        int count = 0;

        //basic "select *" kind of query
        srctbl.Query();
        //double bbox[4] = { 0, 0, 10000, 10000 };
        //srctbl.Query(bbox); //query with bounding box

        while (srctbl.Next())
        {
            //directly insert source row into target table
            //without any transformation
            //(NULL: the new FID is not returned; true: preserve the source FID)
            dsttbl.Insert(srctbl.At(), NULL, true);

            count++;

            if (count % 10000 == 0) printf ("# processed: %d\n", count);
        }

        srctbl.EndQuery();

        printf ("\n\nTotal feature count : %d\n", count);
    }
}


== Performance Implications ==

As with every wrapper, there will be some performance overhead from using this wrapper instead of FDO directly.
For simple queries that access multiple columns, this overhead is not very large, but it can be significant
in cases where the caller needs only one or two columns out of a very wide row. This is fundamentally
because the wrapper API always pre-fills all column values before returning a row, something that
can be fixed at the price of added code complexity.

With OGR as the backend, the performance overhead is very small, because the proposed wrapper maps almost 1:1 to the
underlying OGR API. The main overhead with OGR is geometry conversion from WKB to FGF format.

The weakness described above (pre-filling of the row) could become a strength if a direct backend
is ever implemented, for example for SQLite. Under normal use cases, using FDO requires 2*N virtual function
calls to retrieve N column values of a row, while the wrapper API can do the same with a single virtual function call.
For very fast backends this can make a huge difference, in particular if column values are accessed by index (and not by name).



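The call-count argument above can be illustrated with a small, self-contained sketch. Both interfaces below are hypothetical and are not FDO or wrapper classes. The sketch assumes that the 2*N figure comes from one virtual null check plus one virtual typed getter per column, which is one plausible reading that the text above does not spell out.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical column-at-a-time reader: every column costs a virtual null
// check plus a virtual typed getter.
struct ColumnReader {
    virtual ~ColumnReader() = default;
    virtual bool IsNull(std::size_t col) = 0;
    virtual std::int64_t GetInt64(std::size_t col) = 0;
};

// Hypothetical row-at-a-time backend: one virtual call pre-fills the row.
struct RowBackend {
    virtual ~RowBackend() = default;
    virtual void FillRow(std::vector<std::int64_t>& out) = 0;
};

// Test doubles that count how many virtual calls they receive.
struct CountingColumnReader : ColumnReader {
    explicit CountingColumnReader(std::vector<std::int64_t> d) : data(std::move(d)) {}
    bool IsNull(std::size_t) override { ++calls; return false; }
    std::int64_t GetInt64(std::size_t col) override { ++calls; return data[col]; }
    std::vector<std::int64_t> data;
    std::size_t calls = 0;
};

struct CountingRowBackend : RowBackend {
    explicit CountingRowBackend(std::vector<std::int64_t> d) : data(std::move(d)) {}
    void FillRow(std::vector<std::int64_t>& out) override { ++calls; out = data; }
    std::vector<std::int64_t> data;
    std::size_t calls = 0;
};

// Reads n columns one by one: 2*n virtual calls.
inline std::int64_t SumPerColumn(ColumnReader& r, std::size_t n) {
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (!r.IsNull(i))
            sum += r.GetInt64(i);
    return sum;
}

// Reads the whole row at once: 1 virtual call, then plain indexing.
inline std::int64_t SumPrefilled(RowBackend& b) {
    std::vector<std::int64_t> row;
    b.FillRow(row);
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < row.size(); ++i)
        sum += row[i];
    return sum;
}
```

For a four-column row, the per-column path makes eight virtual calls and the pre-filled path makes one. The sketch also shows the cost side of the trade-off: `FillRow()` copies every column even when the caller needs only one.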
== Attachments ==

Attached is a rough implementation of the wrapper which supports the examples shown above.
For all base API definitions, refer to BaseWrap.h.
Working backends are included for both FDO (FdoWrap.h/cpp) and OGR (OgrWrap.h/cpp).
The FDO backend recognizes SHP, SDF and SQLite connections only.
The OGR backend recognizes all OGR-supported data sources, but does not yet implement data store creation. It does implement insert/update on existing data stores.
The connection pooling is implemented in DbPool.h/cpp.
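
As an illustration of how a backend might recognize a connection string that is simply a file path, the sketch below guesses the data source type from the file extension. It is not the logic in FdoWrap.cpp: the `Backend` enum, the `GuessBackend()` function and the particular extensions checked (especially those for SQLite) are assumptions made for this example.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical set of data source types, matching the three formats the
// FDO backend is said to recognize.
enum class Backend { Shp, Sdf, SQLite, Unknown };

// Guess the backend from the file extension of a path-style connection string.
inline Backend GuessBackend(const std::wstring& path) {
    const std::size_t dot = path.find_last_of(L'.');
    if (dot == std::wstring::npos)
        return Backend::Unknown;

    // Compare extensions case-insensitively.
    std::wstring ext = path.substr(dot + 1);
    for (wchar_t& c : ext)
        c = static_cast<wchar_t>(std::towlower(static_cast<std::wint_t>(c)));

    if (ext == L"shp") return Backend::Shp;
    if (ext == L"sdf") return Backend::Sdf;
    if (ext == L"sqlite" || ext == L"db") return Backend::SQLite;  // assumed extensions
    return Backend::Unknown;
}
```

A real implementation would probably also need to handle full connection strings and directories (an SHP connection can point at a folder of files), which this sketch ignores.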

== !Funding/Resources ==

This RFC is currently only a request for discussion and feedback on the design of the proposed API.