Opened 13 years ago

Last modified 13 years ago

#3739 new defect

Need to define a means of limiting the number of features drawn....

Reported by: sdlime Owned by: sdlime
Priority: highest Milestone: 6.2 release
Component: MapServer C Library Version: unspecified
Severity: major Keywords:
Cc: dmorissette, assefa, tbonfort, warmerdam, chodgson

Description

With the emergence of vector renderers we need a means of limiting the number of features output. This will protect against the creation of huge files in memory that could possibly exhaust server resources.

Easiest solution would be to allow users to define some sort of limit on the number of features drawn. There is already a MAXFEATURES layerObj property that might be used. Currently the shapefile and PostGIS drivers make use of it. Problem is that that parameter is also in use by the WFS code. Ideas?

  • universally respect the MAXFEATURES parameter for all drawing and query operations?
  • do we need query vs. draw values?
  • should this be output format specific?
  • could the limit be hierarchical (output format takes precedence over layer)?

This all needs clarification as part of the 6.0 release.

Steve

Change History (17)

comment:1 by sdlime, 13 years ago

How does GeoServer handle this? --Steve

comment:2 by assefa, 13 years ago

this is a link related to the kml support in GeoServer (limiting the maxfeatures); http://docs.geoserver.org/latest/en/user/googleearth/features/kmlscoring.html

comment:3 by sdlime, 13 years ago

One idea would be to create a msLayerGetMaxFeatures() that could be used as necessary. Signature would be something like:

int msLayerGetMaxFeatures(layerObj *layer, ...)

Where the variable arguments would be the metadata names to check for values.

The function would return an integer containing the max number of features to process or -1 for all features. This way all the rules associated with this are in one spot. I'd propose a rule order like so:

1) layer metadata (in the order passed)
2) output format option (e.g. FORMATOPTION "maxfeatures=n")
3) layer->maxfeatures
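A minimal sketch of that rule order, assuming simplified stand-in types: `sketchLayer`, `kvPair`, `lookup()` and `sketchLayerGetMaxFeatures()` are hypothetical names for illustration, not MapServer code, and the format options are stored on the stand-in layer only to keep the example short. The metadata names are passed as a NULL-terminated variable argument list.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for this sketch; the real layerObj and
 * outputFormatObj are much larger and are not reproduced here. */
typedef struct { const char *key; const char *value; } kvPair;

typedef struct {
  const kvPair *metadata;      /* layer METADATA entries */
  int nummetadata;
  const kvPair *formatoptions; /* FORMATOPTION entries of the active output format */
  int numformatoptions;
  int maxfeatures;             /* layer MAXFEATURES, -1 when unset */
} sketchLayer;

/* Linear search for key; returns its value or NULL when absent. */
static const char *lookup(const kvPair *items, int n, const char *key) {
  for (int i = 0; i < n; i++)
    if (strcmp(items[i].key, key) == 0) return items[i].value;
  return NULL;
}

/* Returns the max number of features to process, or -1 for all features.
 * The variable arguments are metadata names, terminated by a NULL pointer. */
int sketchLayerGetMaxFeatures(const sketchLayer *layer, ...) {
  va_list names;
  const char *name, *value = NULL;

  assert(layer != NULL);

  /* 1) layer metadata, in the order passed */
  va_start(names, layer);
  while ((name = va_arg(names, const char *)) != NULL) {
    value = lookup(layer->metadata, layer->nummetadata, name);
    if (value) break;
  }
  va_end(names);
  if (value) return atoi(value);

  /* 2) output format option */
  value = lookup(layer->formatoptions, layer->numformatoptions, "maxfeatures");
  if (value) return atoi(value);

  /* 3) layer->maxfeatures */
  return layer->maxfeatures;
}
```

A caller would pass the metadata names it cares about followed by an explicitly typed terminator, for example `sketchLayerGetMaxFeatures(layer, "wfs_maxfeatures", (const char *)NULL)`.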

The desired OUTPUTFORMAT *must* be set ahead of the drawing or query function being called. I think that might require some changes on the CGI/query side of things.

Then all operations that loop through a set of features would call this new function and stop processing when max features is hit. Areas I can think of are:

mapquery.c - all msQueryBy...() functions
mapdraw.c - msDrawLayer()
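The "stop processing when max features is hit" idea could look roughly like the loop below. This is only a sketch: `sketchCursor`, `sketchNextShape()` and `sketchProcessFeatures()` are hypothetical stand-ins for a driver's feature iteration, not functions from mapquery.c or mapdraw.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature source that hands out ids 0..total-1. */
typedef struct { int next; int total; } sketchCursor;

/* Returns 1 and fills *id while features remain, else 0. */
static int sketchNextShape(sketchCursor *c, int *id) {
  if (c->next >= c->total) return 0;
  *id = c->next++;
  return 1;
}

/* Processes features until the source is exhausted or maxfeatures is
 * reached; maxfeatures == -1 means no limit. Returns the number processed. */
int sketchProcessFeatures(sketchCursor *cursor, int maxfeatures) {
  int id, count = 0;

  assert(cursor != NULL);

  /* The limit is tested before fetching, so no extra feature is consumed. */
  while ((maxfeatures < 0 || count < maxfeatures) && sketchNextShape(cursor, &id)) {
    /* a real loop would draw the shape or add it to the result set here */
    count++;
  }
  return count;
}
```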

These front the subsequent presentation functions (GML code, templating, etc...) so it seems like those portions of the code don't need to know about it. Question is how to deal with OWS services that wrap standard queries (e.g. WFS, WMS). I suppose we could pass the fact a query originates from an OGC service and look for OGC metadata?

Steve

comment:4 by sdlime, 13 years ago

Any thoughts on this? Anyone? Kind of relates to ticket #3561 as well.

Steve

comment:5 by assefa, 13 years ago

Hi Steve,

Sorry for the delay. Looking into what we have now:

  • the layer->maxfeatures is already used by certain drivers such as oracle and postgis (and maybe shape??) for limiting the draw/query at the driver level.
  • For wfs, when receiving a maxfeatures in the request, the layer->maxfeatures is set. It is also passed to the gml/ogr output functions to make sure that it is respected.

There is also a metadata ows/wfs_maxfeatures that the user can set to make sure that the maxfeatures requested is not above this value.

  • the doc on MAXFEATURES specifies "..the number of features that should be drawn for this layer in the CURRENT window ...". I am not sure how this is enforced for drivers that do nothing with the maxfeatures parameter.
  • there is also, related to this, the need in wfs to do pagination (startindex passed as a parameter). Right now this is handled by the gml/ogr outputs, which use the startindex parameter on the layer. The only driver that honors it natively is Oracle, I think.
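To make the two mechanisms above concrete, here is a small sketch of capping a requested maxfeatures by a configured ceiling and of selecting a startindex/maxfeatures page. Both function names are hypothetical, -1 is assumed to mean "not set", and startindex is assumed to be zero-based; none of this is taken from the MapServer sources.

```c
#include <assert.h>

/* Caps a requested maxfeatures by a configured ceiling (for example a
 * wfs_maxfeatures metadata value). -1 means "not set" for either input. */
int sketchEffectiveMaxFeatures(int requested, int ceiling) {
  if (ceiling < 0) return requested;
  if (requested < 0 || requested > ceiling) return ceiling;
  return requested;
}

/* Returns 1 if the feature at zero-based position index falls inside the
 * page that starts at startindex and holds at most maxfeatures features. */
int sketchInPage(int index, int startindex, int maxfeatures) {
  assert(index >= 0);
  if (startindex < 0) startindex = 0;
  if (index < startindex) return 0;
  return maxfeatures < 0 || index < startindex + maxfeatures;
}
```

An output function that pages on its own would have to walk and discard every feature before startindex, which is why a driver that applies both values natively should be faster.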

I think it makes sense for the drivers to keep honoring the maxfeatures and startindex, since we should gain in query/draw speed.

At this point (for the 6.0 release) I am in favor of only having the draw part respect the layer's maxfeatures parameter, which I believe addresses what triggered this bug ("prevent people from jamming the server by getting a vector output of a big kml"), and of pushing the work on queries until after this release.

comment:6 by assefa, 13 years ago

Do we do the draw part, or do we postpone this?

comment:7 by dmorissette, 13 years ago

Sorry for the lack of feedback on my part. I just don't feel like I have a clear enough understanding of all the implications to express a useful opinion.

At first sight it seems that the msLayerGetMaxFeatures() proposal should suffice to protect against server overload as a first step for 6.0, but maybe I'm missing something?

For sure we need to move quickly to get this in 6.0... and from what both of you wrote it sounds like without this, 6.0 servers could be easily overloaded. Or was the situation already the same with 5.6.x with respect to potential for server overload with very large outputs?

comment:8 by dmorissette, 13 years ago

See also ticket #2424 (Ability to limit number of query results per layer)

comment:9 by sdlime, 13 years ago

I think 6.0 is worse than 5.6 because a KML format exists by default, although perhaps PDF and SVG were liabilities in a similar fashion in the past. I know it's an issue with KML because the XML is rendered in-memory.

Right now the drivers that do respect maxfeatures do so in the driver code. We should at least make sure all of them do so (and set up tests to verify). That would limit the need for changes to query or drawing code. This should be the minimum we do...

One issue (I think) is that the limit you might set for map production (e.g. png) may very well be different than what you might set for vector output. That's why the template format option code has its own mechanism for limiting features in the [results...] or [feature...] tag, can't remember which.

Adding an output format option to set max features on a format level would help. I guess that value would have to be max features per layer. The output driver would need to modify layer->maxfeatures much like the WFS code does, I guess. The template output format could also support this (in addition to the tag-based approach). This would be a nice-to-have...

Steve

comment:10 by assefa, 13 years ago

I will add the limits for the kml driver to respect the maxfeatures and also set a default limit.

comment:11 by sdlime, 13 years ago

Assefa: I think the drivers should respect maxfeatures and the renderers should respect a value set in the output format. I'd like to make this a priority for beta 6...

Steve

comment:12 by chodgson, 13 years ago

Just noting that the fix for this will allow a cleaner solution for #3561.

Let me know if I can help on this one.

comment:13 by chodgson, 13 years ago

Cc: chodgson added

comment:14 by assefa, 13 years ago

I don't know if all the drivers support maxfeatures; I did not check. I think postgis, shp, and oracle support it. For the renderers, we could add a check in msDrawVectorLayer that tests whether a "maxfeaturestodraw" is set at the layer level, then the map level, then the output format option level. For the kml renderer, if the value is not explicitly set by the user, I will default to a reasonable value at the start of the drawing process. I will try this now.

comment:15 by assefa, 13 years ago

Committed in r11482 as described above. The KML driver will set the maxfeatures per layer to 1000 if not configured by the user. Needs to be documented.
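For readers without access to r11482, the lookup order described above (layer, then map, then output format option, with a renderer-supplied default) might be sketched as follows. This is not the committed code: `sketchMaxFeaturesToDraw()` is a hypothetical helper that takes the raw "maxfeaturestodraw" strings already fetched from each level.

```c
#include <assert.h>
#include <stdlib.h>

/* Each string argument is the raw "maxfeaturestodraw" value found at that
 * level, or NULL when the level does not set it. Returns the first value
 * found in layer, map, output format order, otherwise fallback
 * (-1 meaning no limit). */
int sketchMaxFeaturesToDraw(const char *layer_value, const char *map_value,
                            const char *format_value, int fallback) {
  const char *candidates[] = { layer_value, map_value, format_value };

  assert(fallback >= -1);

  for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++)
    if (candidates[i] != NULL) return atoi(candidates[i]);
  return fallback;
}
```

Under these assumptions a KML renderer would pass 1000 as the fallback, while a raster renderer could pass -1 to keep drawing everything.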

comment:16 by sdlime, 13 years ago

Milestone: 6.0 release → 6.2 release

Moving to 6.2, we'll need an RFC to deal with this across the board.

Steve

comment:17 by assefa, 13 years ago

kml docs updated r11564.
