FDO RFC 23 - New GetSchemaNames, GetClassNames commands, and DescribeSchema hint

This page contains a request for change document (RFC) for the FDO Open Source project. More FDO RFCs can be found on the RFCs page.

Status

RFC Template Version: 1.0
Submission Date: July 8, 2008
Last Modified: Ronnie Louie Timestamp?
Author: Ronnie Louie
RFC Status: Adopted
Implementation Status: In Progress
Proposed Milestone: 3.4.0.0
Assigned PSC guide(s): Greg Boone
Voting History: July 24, 2008
+1: Greg, Mateusz, Jason, Orest, Haris
+0: Frank
-0:
-1:

Overview

This RFC proposes new APIs that retrieve a specific subset of the information that would otherwise be obtained by executing a full DescribeSchema command.

Motivation

Applications using FDO to interact with underlying datastores are often interested in retrieving a list of the available classes, and the set of properties available in each of those classes from the available FDO provider. This information is used to configure a connection to a provider, and to obtain supported geometries and properties from the requested datastore.

Currently this information is retrieved from executing a DescribeSchema command, which returns a full database schema, containing detailed information about every feature class available from that provider. For a typical real-world database deployment, containing hundreds of feature classes, it can take a very long time to retrieve all the data for a full schema.

Commonly required tasks, such as requesting a list of schema names, a list of class names, or the list of properties for specific classes, could be accomplished more efficiently by retrieving only the requested information.

Proposed Solution

The performance gain from the new GetSchemaNames and GetClassNames commands and from the modified DescribeSchema command will be most notable for the RDBMS-based FDO providers. Support for the new commands will be indicated by the provider's command capabilities object. Hence, only the RDBMS-based FDO providers will include FdoCommandType_GetSchemaNames and FdoCommandType_GetClassNames in the GetCommands() response when the provider's capabilities are interrogated. Likewise, the new behaviour in DescribeSchema will be supported in the RDBMS-based providers. For all non-RDBMS-based providers, there will be no change in the execution of the DescribeSchema command.

GetSchemaNames

To retrieve the names of the available schemas in a feature source without incurring the cost of a DescribeSchema, a new command, GetSchemaNames, will be added to the API. The interface for this command is shown below.

/// \brief
/// The FdoIGetSchemaNames interface defines the GetSchemaNames command, which
/// retrieves the list of names for the available schemas.
/// The Execute operation returns an FdoStringCollection object.
class FdoIGetSchemaNames : public FdoICommand
{
    friend class FdoIConnection;

public:

    /// \brief
    /// Executes the GetSchemaNames command and returns a 
    /// FdoStringCollection. 
    /// 
    /// \return
    /// Returns the string collection of the names of the available schemas.
    FDO_API virtual FdoStringCollection* Execute() = 0;
};

GetClassNames

To retrieve the class names without incurring the cost of a DescribeSchema, a new command, GetClassNames, will be added to the API for obtaining a list of class names for a given schema. If no schema is specified, the list will consist of all classes in the feature source.

/// \brief
/// The FdoIGetClassNames interface defines the GetClassNames command,
/// which retrieves the list of available class names. 
/// The Execute operation returns an FdoStringCollection object.
class FdoIGetClassNames : public FdoICommand
{
    friend class FdoIConnection;

public:
    /// \brief
    /// Gets the name of the schema for enumeration. This function is optional;
    /// if not specified, execution of the command will enumerate the classes in all schemas.
    /// 
    /// \return
    /// Returns the schema name
    /// 
    FDO_API virtual FdoString* GetSchemaName() = 0;

    /// \brief
    /// Sets the name of the schema for the enumeration. This function is optional; if not
    /// specified execution of the command will enumerate the classes in all schemas.
    /// 
    /// \param value 
    /// Input the schema name
    /// 
    /// \return
    /// Returns nothing
    /// 
    FDO_API virtual void SetSchemaName(FdoString* value) = 0;

    /// \brief
    /// Executes the GetClassNames command and returns a 
    /// FdoStringCollection. If the specified schema name does not exist,
    /// the Execute method throws an exception.
    /// 
    /// \return
    /// Returns the string collection of the fully qualified class names for the specified schema.
    FDO_API virtual FdoStringCollection* Execute() = 0;
};
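
The documented behaviour of the two new commands can be modelled in a few lines. The following is only a sketch against a hypothetical in-memory stand-in (FeatureSource, SchemaNames, and ClassNames are illustrative names, not FDO types or functions), and it assumes that fully qualified class names take the form "Schema:Class". It shows the optional schema name, the exception for an unknown schema, and the qualified names in the result.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a feature source (not an FDO type):
// schema name -> unqualified class names in that schema.
typedef std::map<std::string, std::vector<std::string> > FeatureSource;

// Models FdoIGetSchemaNames::Execute(): one entry per available schema.
std::vector<std::string> SchemaNames(const FeatureSource& source)
{
    std::vector<std::string> names;
    for (const auto& entry : source)
        names.push_back(entry.first);
    assert(names.size() == source.size());
    return names;
}

// Models FdoIGetClassNames::Execute(). An empty schemaName plays the role of
// "SetSchemaName was never called", so classes from all schemas are listed.
std::vector<std::string> ClassNames(const FeatureSource& source,
                                    const std::string& schemaName = std::string())
{
    // An unknown schema name is an error, as in the interface comments.
    if (!schemaName.empty() && source.find(schemaName) == source.end())
        throw std::invalid_argument("schema does not exist: " + schemaName);

    std::vector<std::string> names;
    for (const auto& entry : source) {
        if (!schemaName.empty() && entry.first != schemaName)
            continue;
        for (const std::string& cls : entry.second)
            names.push_back(entry.first + ":" + cls);  // assumed qualified form
    }
    return names;
}
```

In a real provider the names would come from datastore metadata rather than a map; the point is only that neither call needs to build full class definitions.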

DescribeSchema

To retrieve specific class definitions, the existing DescribeSchema command will be modified so that the caller can specify which classes appear in the class collection of the resulting schema. A new method, SetClassNames, will be added to the command for this purpose. If a schema name has been set on the command, the classes set by SetClassNames will be restricted to that schema. If no schema name has been set, fully qualified class names should be passed to SetClassNames. If no schema name has been set and the class names are not fully qualified, the result will contain the matching classes from all schemas. An exception will be thrown if the schema name set on the command does not match the schema named in a qualified class name.
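
The name-resolution rules in the preceding paragraph can be sketched as follows. This is an illustration only: ClassRequest and ResolveClassNames are hypothetical helpers, not part of the FDO API, and the "Schema:Class" qualified form is an assumption.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper type, not part of FDO: one resolved class request.
struct ClassRequest {
    std::string schema;     // empty means "match this class in every schema"
    std::string className;  // unqualified class name
};

// Resolves the names given to SetClassNames against the optional schema name
// set on the command (an empty schemaName means no schema was set).
std::vector<ClassRequest> ResolveClassNames(const std::string& schemaName,
                                            const std::vector<std::string>& classNames)
{
    std::vector<ClassRequest> resolved;
    for (const std::string& name : classNames) {
        const std::string::size_type sep = name.find(':');
        if (sep == std::string::npos) {
            // Unqualified: restricted to the command's schema if one is set,
            // otherwise matched against all schemas.
            resolved.push_back({schemaName, name});
            continue;
        }
        const std::string qualifier = name.substr(0, sep);
        // A qualifier that contradicts the command's schema name is an error.
        if (!schemaName.empty() && qualifier != schemaName)
            throw std::invalid_argument("schema name '" + schemaName +
                                        "' does not match class name '" + name + "'");
        resolved.push_back({qualifier, name.substr(sep + 1)});
    }
    // Every input name yields exactly one request.
    assert(resolved.size() == classNames.size());
    return resolved;
}
```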

Since this change involves an update to the DescribeSchema command interface, implementation of the SetClassNames method will be mandatory for all providers. However, due to resource constraints and the cost versus benefit of restricting the results to certain classes for each provider, it is proposed that the class names passed to this method serve only as a hint for the DescribeSchema command during execution. Depending on its own criteria and implementation schedule, a provider may ignore the hint and simply return the full schema. No capability will be exposed to advertise support for the new behaviour, and no error or exception will be thrown if a provider does not use the hint. Initially, the RDBMS-based providers will use the hint and restrict the returned schema to the specified classes, since those providers should exhibit performance improvements when the full schema does not need to be returned.

/// \brief
/// The FdoIDescribeSchema interface defines the DescribeSchema command, which
/// describes the feature schemas available from the connection. The DescribeSchema
/// command can describe a single schema or all schemas available from
/// the connection. The Execute operation returns an FdoFeatureSchemaCollection
/// object.
class FdoIDescribeSchema : public FdoICommand
{
    friend class FdoIConnection;

public:
    /// \brief
    /// Gets the name of the schema to describe. This function is optional;
    /// if not specified, execution of the command will describe all schemas.
    /// 
    /// \return
    /// Returns the schema name
    /// 
    FDO_API virtual FdoString* GetSchemaName() = 0;

    /// \brief
    /// Sets the name of the schema to describe. This function is optional; if not
    /// specified execution of the command will describe all schemas.
    /// 
    /// \param value 
    /// Input the schema name
    /// 
    /// \return
    /// Returns nothing
    /// 
    FDO_API virtual void SetSchemaName(FdoString* value) = 0;

    /// \brief
    /// Gets the names of the classes to retrieve. This is optional;
    /// if not specified, execution of the command will describe all classes.
    /// If the class name is not qualified, and the schema name is not specified,
    /// the requested class from all schemas will be described.
    /// The class names specified serve only as a hint.  Use of the hint
    /// during command execution is provider dependent.  Providers that 
    /// will not use the hint will describe the schema for all classes.
    /// 
    /// \return
    /// Returns the collection of class names
    /// 
    FDO_API virtual FdoStringCollection* GetClassNames() = 0;

    /// \brief
    /// Sets the names of the classes to retrieve. This is optional; if not
    /// specified, execution of the command will describe all classes.
    /// If the class name is not qualified, and the schema name is not specified,
    /// the requested class from all schemas will be described.
    /// The class names specified serve only as a hint.  Use of the hint
    /// during command execution is provider dependent.  Providers that 
    /// will not use the hint will describe the schema for all classes.
    /// 
    /// \param value 
    /// Input the collection of class names
    /// 
    /// \return
    /// Returns nothing
    /// 
    FDO_API virtual void SetClassNames(FdoStringCollection* value) = 0;

    /// \brief
    /// Executes the DescribeSchema command and returns a 
    /// FdoFeatureSchemaCollection. If a schema name is given that has 
    /// references to another schema, the dependent schemas will 
    /// be returned as well. If the specified schema name does not exist,
    /// the Execute method throws an exception.
    /// 
    /// \return
    /// Returns the schema collection representing the schema created.
    /// The element states for all elements will be set to FdoSchemaElementState_Unchanged.
    /// Each provider-specific implementation of Execute() can ensure 
    /// that this is the case by 
    /// calling FdoFeatureSchema::AcceptChanges() for each feature schema
    /// in the returned collection.
    /// 
    FDO_API virtual FdoFeatureSchemaCollection* Execute() = 0;
};
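
Because the class names are only a hint, a caller cannot assume that the returned schema contains just the requested classes. A minimal sketch of a defensive post-filter follows; ClassDef and KeepRequested are hypothetical names used for illustration, and the "Schema:Class" qualified form is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a class definition returned by DescribeSchema.
struct ClassDef {
    std::string qualifiedName;  // assumed "Schema:Class" form
};

// Keeps only the classes the caller asked for. This is safe whether the
// provider honoured the hint (nothing to drop) or ignored it and returned
// the full schema.
std::vector<ClassDef> KeepRequested(const std::vector<ClassDef>& returned,
                                    const std::vector<std::string>& requested)
{
    std::vector<ClassDef> kept;
    for (const ClassDef& cls : returned) {
        if (std::find(requested.begin(), requested.end(), cls.qualifiedName) != requested.end())
            kept.push_back(cls);
    }
    assert(kept.size() <= returned.size());
    return kept;
}
```

Writing the caller this way keeps application behaviour identical across providers; only the cost of the call differs.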

Discussion of Alternate Approach (but not proposed as part of this RFC)

An alternative solution is to modify DescribeSchema to make filtering of the schema by class names a fundamental part of the command, requiring all providers to fully implement the new behaviour. Although this approach makes for a cleaner API experience in terms of DescribeSchema usage and consistency amongst all providers, several issues make it infeasible at this time.

  1. Not all providers would benefit from the new behaviour in DescribeSchema. The main benefit would be the responsiveness of the command for the RDBMS providers. Some of the file-based providers may also benefit, but if caching is used there may not be any significant improvement. As such, it would not be worth the effort to implement in all providers when there is little to no gain. A solution should allow optional implementation on an as-needed basis.
  2. Updating the DescribeSchema command across all providers would need to be scheduled in a timely manner. Coordinating a mandatory command update with all provider writers (particularly from third parties) may pose significant challenges.
  3. Resources for implementing the change in all providers are limited for all concerned parties. Since implementation of the DescribeSchema changes would not be optional under this approach, it may create undue strain on the available resources of all interested parties.

By favouring optional use of a hint to restrict classes in DescribeSchema over mandatory implementation of support for class filtering, the effort can be focused where it is needed most. It also gives the other providers the flexibility to determine if and when use of the hint should be implemented, based on their own criteria.

Implications

Application code that currently uses DescribeSchema to enumerate schema or class names, or to retrieve class definitions, should be updated to use the new APIs for improved performance. The application developer should be aware that not all occurrences of DescribeSchema can be replaced, such as when a complete schema is required for a particular operation. Also, for providers that do not support the new commands, applications can fall back on DescribeSchema and walk through the results to retrieve the schema names, class names, and class definitions.
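
The capability check and fallback described above might be structured as in the sketch below. All types here (CommandType, Schema, Provider) are hypothetical stand-ins rather than the FDO classes; the commands list models the provider's command capabilities, and the "Schema:Class" qualified form is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for FDO types; only the shape of the logic matters.
enum CommandType { CommandType_DescribeSchema, CommandType_GetClassNames };

struct Schema {
    std::string name;
    std::vector<std::string> classNames;  // unqualified
};

// Builds "Schema:Class" names (assumed qualified form) from full schemas.
std::vector<std::string> QualifiedNames(const std::vector<Schema>& schemas)
{
    std::vector<std::string> names;
    for (const Schema& schema : schemas)
        for (const std::string& cls : schema.classNames)
            names.push_back(schema.name + ":" + cls);
    return names;
}

struct Provider {
    std::vector<CommandType> commands;  // models the command capabilities list
    std::vector<Schema> schemas;
    int describeSchemaCalls;            // counts expensive full describes

    Provider() : describeSchemaCalls(0) {}

    bool Supports(CommandType type) const {
        return std::find(commands.begin(), commands.end(), type) != commands.end();
    }

    // Models the full DescribeSchema, which every provider supports.
    std::vector<Schema> DescribeSchema() {
        ++describeSchemaCalls;
        return schemas;
    }

    // Models GetClassNames; callers must check Supports() first.
    std::vector<std::string> GetClassNames() const {
        assert(Supports(CommandType_GetClassNames));
        return QualifiedNames(schemas);
    }
};

// Uses the cheap command when advertised, otherwise walks the full schema.
std::vector<std::string> ListClassNames(Provider& provider)
{
    if (provider.Supports(CommandType_GetClassNames))
        return provider.GetClassNames();
    return QualifiedNames(provider.DescribeSchema());
}
```

Both paths return the same list, so the rest of the application does not need to know which one was taken.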

The existing DescribeSchema API, which retrieves the full schema for all the available feature classes, will continue to perform as before. Performance gains will only be realized when the new APIs are used to restrict the response to the data of interest when executed against supported providers.

Test Plan

Test existing and new APIs to validate correct functionality.

Funding/Resources

Autodesk

Last modified on Jul 28, 2008 10:51:22 AM