= FDO RFC 50 - Extend FDO API to Select from Multiple Classes and Select Joins = This page contains a request for comments document (RFC) for the FDO Open Source project. More FDO RFCs can be found on the [wiki:FDORfcs RFCs] page. == Status == ||RFC Template Version||1.0|| ||Submission Date||Jun. 02 2010|| ||Last Modified||Romica Dascalescu [[Timestamp]]|| ||Author||Romica Dascalescu|| ||Contibutors||Orest Halustchak, Greg Boone|| ||RFC Status||Draft|| ||Implementation Status||Pending|| ||Proposed Milestone||3.6.0.0|| ||Assigned PSC guide(s)||Greg Boone|| ||'''Voting History'''||(vote date)|| ||+1|||| ||+0|||| ||-0|||| ||-1|||| == Motivation == It is a common requirement for applications to join classes together so that they can process related information that is spread across more than one class, e.g. properties stored in one class and geometry in another class. This can be for modeling purposes such as when geometry in one class is shared between separate feature classes or when properties from another class are needed for rendering purposes. Currently this is achieved either by using client-side join processing as in MapGuide or for RDBMS sources, by defining views as the physical level and exposing them as feature classes at the FDO API level. While these techniques are useful, the first results in very poor performance compared to a native join if the classes come from the same data store and the data store can do native joins as is possible with RDBMS-based providers. Using views is common as well, but requires the user to do this at the physical data store level. Finally, users could use SQL pass-through for providers that support it but that has issues because again the user is dealing with physical schema rather than FDO classes and the result is not a feature class. This RFC adds the ability to create joined selects using the FDO API directly rather than using the workarounds described above. This adds more power and flexibility to the FDO API for applications that need that capability. This enhancement will be optional for providers and a capabilities function is included for an application to determine if a provider supports it. == Overview == Implementing this new select is optional for all providers, for now we can support this only for SQLite and in the future we can extend it to other providers as needed. We expect the RDBMS-based provider to support this over time. The idea is to add support in FDO API for selecting from multiple classes and get a feature reader back. We have a few options from which we can chose in order to achieve our goal: * 1. Add a new select command e.g. FdoIJoinSelect derived from FdoISelect, add a new command type and caller needs to ensure that command exist in the provider command capabilities before calling this interface. A small improvement here we can add two classes FdoIJoinSelect (the interface having only pure methods) and !FdoJoinSelect (having a default implementation of those methods). All providers can use the class with a base implementation and in the future in case we want to add a new method we just add a default implementation in !FdoJoinSelect, however this adds more complexity to the API since developer would be confused which class should use as base class in his implementation. * '''Pro''': will not change the FdoISelect command, all other providers do not have to change anything on their side, and only in case a provider wants to support this command it has to implement it and expose it as supported. * '''Cons''': We add a new command and a new type command, we already have a FdoIExtendedSelect for scrollable readers and users can be confused by those two commands. This will add more complexity in the API. * 2. Enhance FdoISelect, then create a !FdoSelect base class and add default implementers for !GetFeatureClassNames () and !GetJoinCriteria(). Then each provider would be changed to base their select command implementation on !FdoSelect instead of FdoISelect. Two new capability functions will be added so that applications can determine: if a provider supports the join criteria and supported join types, SupportsJoins() & !GetJoinTypes(). Again, if future enhancements are ever made to FdoISelect, the need to change each provider would be eliminated. * '''Pro''': this solution will not add more complexity to the API. * '''Cons''': We need to change all the providers leading to a big effort from everyone. * 3. Enhance FdoISelect, add two new methods ‘!GetFeatureClassNames()’ and ‘GetJoinCriteria()’ with a default implementation throwing an exception. Providers which want to implement these new methods just can override then and provide a detailed implementation. Even FdoISelect is declared as ‘interface’ we do not have interfaces in C++ and we can have all other abstract methods and only the new one with a default implementation. A good example here would be locking methods from select command: most of the provider do not support locking and have to provide an empty implementation (usually to throw an exception), so more code on provider side when we could just provide a default implementation in the base class. Following this path we just modify the base class and all other providers will not be changed since will inherit the default implementation. Looking at FdoISelect there are already two methods which provide default implementation: !AddRef() and Release(), so adding default implementation for the new methods will not be something totally new. * '''Pro''': this solution will not add more complexity to the API, we do not need to change any provider. * '''Cons''': None. == Proposed Solution == The last option seems to be the best solution since will not enforce other providers to add a blank implementation for those new methods and also will not add a new command. We avoid having two select commands one “extended” and one “join” this way we also eliminate a confusion which might show-up in commands naming. Also it opens a different approach to update a base class without changing all providers to add a blank implementation. We need to define supported cases and all necessary classes need to be added: * A. '''Selecting form multiple classes'''. We should be able to select from multiple classes and have filters applied on any class attributes. In order to be able to achieve that we can use FdoISelect::GetFeatureClassNames () and add all classes we want to select from. The filter must be based on properties from the selected classes and filter can be validated at run time when the whole select is built. In this collection we can have: * FdoIdentifier’s in case we just need to pass a class name. * FdoComputedIdentifier’s having a name and as expression value a FdoIdentifier, this way we can get aliases, e.g. PropName AS NewPropName. We already use computed identifiers in our API for expressions and we provide a name and an expression value. In this case it will be the same, since an alias is actually an expression: we provide an alias (name) and the real name of the property. * FdoClassComputedIdentifier’s which is a FdoISelect having a name. This option might not be supported by all RDBMS providers and it depends of the internal implementation. This class behaves like a computed class which will be evaluated at run time and really depends of the provider implementation. This new class will help to define sub-selects, see below case (C) We will use some SQL examples just to illustrate things we are trying to achieve, each RBDMS-based provider will translate FDO selects in server side SQL depending of the server capabilities. '''SELECT * FROM class1 AS p1, class2 AS p2 WHERE p1.ID=p2.FeatId AND SPATIAL_COND(p1.GEOM, GEOM_VAL);''' Below we added some C++ code on how caller can use this new improvement to achieve above select statement: {{{ FdoPtr sel = static_cast(conn->CreateCommand(FdoCommandType_Select)); FdoPtr fclasscoll = = sel->GetFeatureClassNames(); FdoPtr fcname1 = FdoIdentifier::Create(L"class1"); FdoPtr fcp1 = FdoComputedIdentifier::Create(L"p1", fcname1); fclasscoll->Add(fcp1); FdoPtr fcname2 = FdoIdentifier::Create(L"class2"); FdoPtr fcp2 = FdoComputedIdentifier::Create(L"p2", fcname2); fclasscoll->Add(fcp2); FdoPtr filter = FdoFilter::Parse(L"p1.ID=p2.FeatId AND Geometry INTERSECTS GeomFromText('CURVESTRING(...)')"); sel->SetFilter(filter); FdoPtr rdr = sel->Execute(); }}} '''Note:''' In the base class we have a method named GetFeatureClassName, in case caller will use it and the FdoIdentifier is not added into GetFeatureClassNames we can add it at Execute time. * B. '''Handle Joins''' : below we define new classes to define and use a join criteria used in the join select. {{{ class FdoJoinCriteria : public FdoIDisposable { /// \cond DOXYGEN-IGNORE protected: /// Constructs a default instance of a join criteria. FdoJoinCriteria(); /// Constructs an instance of a join criteria using the specified arguments. FdoJoinCriteria(FdoString* name, FdoIdentifier* joinClass, FdoJoinType joinType, FdoFilter* filter); /// Default destructor for join criteriab. virtual ~FdoJoinCriteria(); virtual void Dispose(); public: /// Constructs a default instance of a join criteria. FDO_API static FdoJoinCriteria* Create(); /// Constructs a full defined instance of a join criteria. FDO_API static FdoJoinCriteria* Create(FdoString* name, FdoIdentifier* joinClass, FdoJoinType joinType, FdoFilter* filter); /// Returns the alias name for the join criteria class FDO_API FdoString* GetName(); /// Sets the alias name for the join criteria class FDO_API void SetName(FdoString* value); /// Returns the identifier of the right join class FDO_API FdoIdentifier* GetJoinClass(); /// Sets the identifier of the right join class FDO_API void SetJoinClass(FdoIdentifier* value); /// Returns the join type, e.g. inner join, cross join, left join FDO_API FdoJoinType GetJoinType(); /// Sets the join type, e.g. inner join, cross join, left join FDO_API void SetJoinType(FdoJoinType value); /// Returns the join filter, e.g. LEFT JOIN rclass ON(filter) FDO_API FdoFilter* GetFilter(); /// Sets the join filter, e.g. LEFT JOIN rclass ON(filter) FDO_API void SetFilter(FdoFilter* value); private: wchar_t* m_name; FdoIdentifier* m_JoinClass; FdoJoinType m_JoinType; FdoFilter* m_Filter; }; /// Join type enumeration used to specify the join type enum FdoJoinType { /// default join type. FdoJoinType_None = 0x01, /// inner join FdoJoinType_Inner = 0x02, /// right outer join FdoJoinType_RightOuter = 0x04, /// left outer join FdoJoinType_LeftOuter = 0x08, /// full outer join FdoJoinType_FullOuter = 0x10, /// cross join FdoJoinType_Cross = 0x20 }; }}} We need to add two new methods in connection capabilities: supports join selects (to avoid an exception when a user will try to use these two new methods from FdoISelect) and also supported join types which can be a collection or a value as OR between values, since all join types cannot be more than we can fit in a int32 using OR e.g.: val1|val2. {{{ class FdoIConnectionCapabilities { virtual bool SupportsJoins(){ return false; } virtual FdoInt32 GetJoinTypes() {return FdoJoinType_None;} } }}} We also need a collection to be defined for join criteria: {{{ class FdoJoinCriteriaCollection : public FdoCollection { protected: /// \cond DOXYGEN-IGNORE virtual void Dispose(); public: /// Default constructor FDO_API static FdoJoinCriteriaCollection* Create(); /// Returns FdoJoinCriteria having the name. /// This will throw an exception in case item is not found FDO_API FdoJoinCriteria* GetItem(FdoString* name); /// Returns FdoJoinCriteria having the name or NULL. FDO_API FdoJoinCriteria* FindItem(FdoString* name); /// Returns FdoJoinCriteria having the index. FDO_API FdoJoinCriteria* GetItem(FdoInt32 index); }; }}} Join criteria will be used to generate join SQL commands, like: '''SELECT p.FeatId AS xFID,p.Geometry,pxy.Name,pxy.X,pxy.Y,pex.NameB,pex.Link FROM p '''[[BR]] '''INNER JOIN pxy ON(p.FeatId = pxy.FeatId)'''[[BR]] '''INNER JOIN pex ON(pex.FID = pxy.FeatId)'''[[BR]] '''WHERE p.FeatId >= 0;'''[[BR]] Below we added some C++ code on how caller can use this new improvement to achieve above select statement: {{{ FdoPtr sel = static_cast(conn->CreateCommand(FdoCommandType_Select)); FdoPtr fclasscoll = = sel->GetFeatureClassNames(); FdoPtr jcrit = sel->GetJoinCriteria(); FdoPtr fcpoint = FdoIdentifier::Create(L"p"); fclasscoll->Add(fcpoint); FdoPtr jcCls1 = FdoIdentifier::Create(L"pxy"); FdoPtr flcr1 = FdoFilter::Parse(L"p.FeatId=pxy.FeatId"); FdoPtr jc1 = FdoJoinCriteria::Create(L"pxy", jcCls1, FdoJoinType_Inner, flcr1); jcrits->Add(jc1); FdoPtr jcCls2 = FdoIdentifier::Create(L"pex"); FdoPtr flcr2 = FdoFilter::Parse(L"pex.FID=pxy.FeatId"); FdoPtr jc2 = FdoJoinCriteria::Create(L"pex", jcCls2, FdoJoinType_Inner, flcr2); jcrits->Add(jc2); FdoPtr rdr = sel->Execute(); }}} * C. '''Handle Sub-selects''' : below we define a new class to define sub-select used in the new select. {{{ class FdoComputedClassIdentifier : public FdoIdentifier { protected: /// Constructs a default instance of a computed class identifier. FdoComputedClassIdentifier(); /// Constructs an instance of a computed class /// identifier using the specified arguments. FdoComputedClassIdentifier(FdoString* name, FdoISelect* select); /// Default destructor for computed class identifier. virtual ~FdoComputedClassIdentifier(); virtual void Dispose(); public: /// Constructs a default instance of a computed class identifier. FDO_API static FdoComputedClassIdentifier* Create(); /// Constructs an instance of a computed class identifier /// using the specified arguments. FDO_API static FdoComputedClassIdentifier* Create( FdoString* name, FdoISelect* select); /// Gets the sub-select of the computed class identifier. FDO_API FdoISelect* GetSelect(); /// Sets the sub-select of the computed class identifier. FDO_API void SetSelect(FdoISelect* value); /// Returns the type of the expression used to identify the /// type and avoid dynamic casts FDO_API FdoItemExpressionType GetExpressionType() { return FdoItemExpressionType_ComputedClassIdentifier; } private: FdoISelect* m_subSelect; }; }}} Named sub-select will be used to generate SQL commands, like: '''SELECT * FROM point WHERE FeatId IN (SELECT pt.FeatId FROM pt WHERE pt.FeatId >= 2 AND SPATIAL_COND(pt.GEOM, GEOM_VAL));''' In above case we will have two selects having following filters: * Main select: “FeatId IN(:SUBSEL)” * Sub-select: “pt.FeatId >= 2 AND SPATIAL_COND(pt.GEOM, GEOM_VAL)” Below we added some C++ code on how caller can use this new improvement to achieve above select statement: {{{ FdoPtr sel = static_cast(conn->CreateCommand(FdoCommandType_Select)); FdoPtr fclasscoll = = sel->GetFeatureClassNames(); FdoPtr fcpoint = FdoIdentifier::Create(L"point"); fclasscoll->Add(fcpoint); FdoPtr subsel = static_cast(conn->CreateCommand(FdoCommandType_Select)); FdoPtr ssfclasscoll = = subsel->GetFeatureClassNames(); FdoPtr fcpt = FdoIdentifier::Create(L"pt"); ssfclasscoll->Add(fcpt); FdoPtr ssfilter = FdoFilter::Parse(L"pt.FeatId>=2 AND Geometry INTERSECTS GeomFromText('CURVESTRING(...)')"); subsel->SetFilter(ssfilter); FdoPtr fcSubSel = FdoComputedClassIdentifier::Create(L"SUBSEL1", subsel); fclasscoll->Add(fcSubSel); FdoPtr filter = FdoFilter::Parse(L"FeatId IN (:SUBSEL1)"); sel->SetFilter(filter); FdoPtr rdr = sel->Execute(); }}} In order to be able to support sub-selects for '''IN''' operator we need provide a way to pass it through FDO Filter and make the parser to recognize the filter, since FDO Filter parser don’t know SQL syntax. The best way to achieve that is to pass a parameter in the filter, e.g.: '''FeatId IN(:PARAM)'''. This parameter will be replaced by the provider at execution time by the sub-select. This way we improve IN() operator in FDO to handle sub-selects. As new additions to the FDO API we will have three new classes (FdoJoinCriteria, FdoJoinCriteriaCollection and FdoComputedClassIdentifier) and two enumerations (FdoItemExpressionType and FdoJoinType). We also need to add a new two new command capabilities, two new methods SupportsJoins and GetJoinTypes in order to be able to detect if those methods are supported by the provider (otherwise caller might get exceptions from the default implementation). We are planning at execute time to return a feature reader even there might be cases when the class returned by the reader might not have a primary key or a geometry column. In case select do not provide a select property list, all returned columns will be added to the feature class reader built at runtime, however in case user will request certain properties only those properties will be added to the feature class and even returned feature class might have no primary key or no geometry, depending of caller request. A feature reader is way more useful in applications. In the worst case we can add all primary keys from the selected classes to the result primary key of the class. Also in case there is no way to return a primary key user can add one of the property to the primary key collection of the returned class to have a primary key. The returned class will be read-only since there is no way to know what we can update. FdoISelectAggregates interface will be updated in the same manner in order to allow doing aggregates selects having sub-selects in the filter side. Update and delete commands will not be changed to allow such complex filters, however we can update those commands doing a similar RFC in the future if is needed. == Small improvements in FDO == * '''Handling FDO expressions to avoid dynamic casts''' In the current API in order to make a difference between FdoIdentifier and FdoComputedIdentifier we need to use dynamic cast and that’s not too fast. There is a simple way to avoid many dynamic casts by adding a new abstract method in FdoExpression named GetExpressionType(). All main derivate classes from FdoExpression need to return a type without adding a new parameter this way we avoid increasing the memory consumption by FDO expression objects. A new enumeration will be added, see below, and classes like FdoIdentifier will have to implement this method. {{{ /// The FdoExpressionType enumeration defines the expression types available. enum FdoItemExpressionType { /// Identifier e.g. PROP FdoItemExpressionType_Identifier, /// Identifier e.g. PROP*5 AS EXP FdoItemExpressionType_ComputedIdentifier, /// Identifier e.g. SELECT * FROM C1 AS CLS FdoItemExpressionType_ComputedClassIdentifier, /// Parameter e.g. :PROP FdoItemExpressionType_Parameter, /// Function e.g. AVG() FdoItemExpressionType_Function, /// Expression e.g. A+B FdoItemExpressionType_BinaryExpression, /// Expression e.g. -A FdoItemExpressionType_UnaryExpression, /// Data value e.g. 12, 12.4 'text' FdoItemExpressionType_DataValue, /// Geometry e.g. 12, 12.4 'text' FdoItemExpressionType_GeometryValue }; class FdoIdentifier : public FdoExpression { /// Returns FdoItemExpressionType FDO_API virtual FdoItemExpressionType GetExpressionType() { return FdoItemExpressionType_Identifier; } } }}} * '''Update the FDO Filter to handle custom names''' FDO filter parser needs to be updated since we support only Class.Property e.g.: "A.BB >=10". We need to support also formats like: ||Class."Property"|| ||"Class".Property|| ||"Class"."Property"|| This change is required in order to be able to support properies names having special characters, like: myClass."My Property" > 10. This change will not invlove any API change and will be done like a bug fix. == Managed FDO API == The FDO Managed Interfaces will be updated in a similar manner to reflect the proposed changes. == Provider Implementation == Initially, we need this for SQLite provider. Other provider developers can add it and use it as needed. The providers which decide to use this new select need to implement the new methods from the FdoISelect and return the right command capabilities. == Test Plan == Run existing FDO core unit tests and expand SQLite unit tests to test the proposed enhancements defined above. == Funding/Resources == Autodesk to provide resources / funding.