Opened 7 years ago
Closed 4 years ago
#6713 closed defect (wontfix)
WFS paging does not traverse next URI for advancing pages
Reported by: | gkm4d | Owned by: | warmerdam |
---|---|---|---|
Priority: | normal | Milestone: | closed_because_of_github_migration |
Component: | default | Version: | unspecified |
Severity: | normal | Keywords: | WFS, paging |
Cc: |
Description
OGC WFS 2.0.2 Interface Standard, Section 7.7.4.4 (http://docs.opengeospatial.org/is/09-025r2/09-025r2.html#76) defines the concept of response paging. Per the standard (emphasis is mine):
Response paging is accomplished using the previous and next parameters defined on the response collections ... The value of the previous or next attribute shall be server generated URIs that retrieves the corresponding set or results. The specific format of these URIs is implementation dependant
and,
The sequence of interactions with the server proceeds as follows:
a) A client sends the request to a server that supports paging.
b) The server responds with a wfs:FeatureCollection element containing the first 100 records in the result set. The next attribute is set so that the client can retrieve the next 100 features, but the previous attribute is not set since this is the first set of features in the response set.
c) The client traverses the next URI.
As best I can tell, GDAL completely ignores the next URI when advancing to the next page of results. Instead it generates a new URL that is effectively the current URL with an incremented STARTINDEX parameter. This client-generated URL is still valid from a WFS perspective and will in fact produce the next page of results, but the approach ignores any special handling that may have been incorporated into the server-generated next URL. For instance, the server-generated URL could include Vendor Specific Parameters (VSPs -- http://docs.opengeospatial.org/is/09-025r2/09-025r2.html#67) that allow for increased query performance when retrieving subsequent pages. GDAL would not require any specific knowledge of these VSPs; it would simply need to traverse the server-provided next URL without modification.
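To make the difference concrete, here is a minimal Python sketch of the two ways a client can arrive at the URL of the next page. It is not GDAL code. The function names are hypothetical, and it assumes the response root is a wfs:FeatureCollection carrying an unqualified next attribute, as in WFS 2.0.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def server_next_url(response_xml):
    """Return the server-generated `next` URI of the response, or None if absent."""
    root = ET.fromstring(response_xml)
    return root.get("next")


def client_next_url(current_url, count):
    """Rebuild the current URL with STARTINDEX advanced by `count` (client-side paging)."""
    parts = urlsplit(current_url)
    start = 0
    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key.upper() == "STARTINDEX":
            start = int(value)  # remember the current offset, drop the old parameter
        else:
            kept.append((key, value))
    kept.append(("STARTINDEX", str(start + count)))
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

The first function returns whatever the server advertised, including any vendor-specific parameters, without interpreting it. The second can only reproduce parameters the client already knows about.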
Change History (5)
comment:1 by , 7 years ago
Do you know of any WFS server that implements this feature in some way other than by creating a GetFeature request with &COUNT=xxx&STARTINDEX=yyy? You are right that the advertised next and previous links may point to a faster service, perhaps a cached one. I am a bit skeptical, but for a full GetFeature request without filters and with a server-side limit on count, it could make some services faster. However, if ResponseCacheTimeout is exceeded, GDAL could only quit, because it cannot know whether the missing next page could still be fetched with count and startindex. Data obtained through next/previous requests that are not simple count+startindex requests may also be sorted differently from what count+startindex returns.
comment:2 by , 7 years ago
I am not aware of a publicly available WFS implementation that relies on this feature. However, at my workplace we have developed an internal WFS implementation from the ground up and do leverage the next URI to improve our query performance on subsequent pages. We routinely see a 2-3x improvement in response time when traversing the next URL versus a naive &COUNT=xxx&STARTINDEX=yyy request.
You are right that sorting can be a complication, but this is an implementation detail and is managed server-side to ensure the correct range of data is returned page-to-page.
comment:3 by , 6 years ago
@gkm4d are you willing to propose a patch to implement this? (It should still allow using the count+startindex method if "next" is missing, and perhaps also if a configuration option is set, in case the server's "next" is not reliable.)
comment:4 by , 6 years ago
@rouault I'm afraid I'm not able to propose a patch for this. I'm working on a team standing up a new WFS implementation. One of our end users utilizes the WFS client capabilities from GDAL, and this is how we observed this deficiency with paging. I simply filed this ticket on their behalf so that the issue could be captured.
comment:5 by , 4 years ago
Milestone: | → closed_because_of_github_migration |
---|---|
Resolution: | → wontfix |
Status: | new → closed |
This ticket has been automatically closed because Trac is no longer used for GDAL bug tracking, since the project has migrated to GitHub. If you believe this ticket is still valid, you may file it to https://github.com/OSGeo/gdal/issues if it is not already reported there.
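For reference, the behaviour discussed in this thread (follow the server's next URI when present, otherwise fall back to count+startindex, with a switch to ignore next entirely) could be sketched roughly as below. This is an illustration in Python, not a proposed GDAL patch. The names iter_pages, fetch, fallback_url and use_server_next are all hypothetical, and stopping on numberReturned="0" is an assumption made to keep the loop short.

```python
import xml.etree.ElementTree as ET


def iter_pages(first_url, fetch, fallback_url=None, use_server_next=True):
    """Yield the body of each non-empty page, starting at first_url.

    fetch(url) -> str            caller-supplied (hypothetical), e.g. an HTTP GET
    fallback_url(url) -> str     caller-supplied (hypothetical), builds the
                                 count+startindex URL that follows `url`
    use_server_next              hypothetical switch: when False, the server's
                                 `next` attribute is ignored
    """
    url = first_url
    while url is not None:
        body = fetch(url)
        root = ET.fromstring(body)
        # Assumption: an empty page marks the end of the result set.
        if int(root.get("numberReturned", "0")) == 0:
            return
        yield body
        # Prefer the server-generated URI, traversed without modification.
        next_url = root.get("next") if use_server_next else None
        # Fall back to client-side paging when `next` is missing or disabled.
        if next_url is None and fallback_url is not None:
            next_url = fallback_url(url)
        url = next_url
```

With no fallback_url, iteration ends at the first page that has no next attribute. With one, the client issues one further count+startindex request and stops when that page comes back empty.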