Opened 13 years ago

Last modified 13 years ago

#4027 new enhancement

Document behavior and use of SQL “inner query” in mapfile DATA statement

Reported by: panzel Owned by: jmckenna
Priority: normal Milestone:
Component: Documentation - MapServer Version: unspecified
Severity: normal Keywords:
Cc: havatv

Description

By misadventure I’ve discovered that the inner query of a mapfile/LAYER/DATA statement is passed intact to the underlying Microsoft SQL Server 2008 database engine. This behavior is not documented at http://mapserver.org/input/vector/mssql.html. It would be nice see the behavior documented, in part as a commitment to this behavior being stable (so that I can use it with some confidence).

The path to discovery under Microsoft Windows: by intentionally creating a malformed DATA statement, and using a command line query such as "drawmap" or its successors, one can get an error message that shows the modification that MapServer/Mapscript applies to the original DATA text; this can lead to effective modification to add extra derived attributes (such as function results) to the data stream coming back to MapServer for rendering purposes.

Here’s an example of the path to discovery. Assume within a mapfile that an original well-formed DATA statement is modified to include a non-existent column, 'BOGUS_COLUMN'. The DATA statement might be:

DATA "shape from

(select shape, ObjectID, ElevFtRound from base.contour_vectors

where Interval = 250 or Interval = 500 OR BOGUS_COLUMN = 'whatever') as MinorLines using unique ObjectId using SRID=3857"

When part of a mapfile that is evaluated by drawmap, this data statement triggers a System.ApplicationException error originating from the underlying SQL Server engine:

Unhandled Exception: System.ApplicationException: msDrawMap(): Image handling error. Failed to draw layer named 'MedRes/Minor_250_500'.;msMSSQL2008LayerGetShape(): Query error. Error executing MSSQL2008 SQL statement: SELECT shape.STAsBinary(),convert(varchar(36), ObjectId) from

(select shape, ObjectID, ElevFtRound

from base. contour_vectors where Interval = 250 or Interval = 500 OR BOGUS_COLUMN = 'whatever'

) as MinorLines

WHERE shape.STIntersects(

geometry::STGeomFromText(

'POLYGON((-13234137.3720917 3976767.37231785,

-13083710.5126544 3976767.37231785, -13083710.5126544 4069489.50206934, -13234137.3720917 4069489.50206934, -13234137.3720917 3976767.37231785))',3857)) = 1

-[Microsoft][ODBC SQL Server Driver][SQL Server]Invalid column name 'BOGUS_COLUMN'.

at OSGeo.MapServer.mapObj.draw() at DrawMap.Main(String[] args)

Note that the body of the subquery parenthetical expression is sent intact to Sql Server. Hence any legal Sql Server 2008 subquery may be transmitted and executed on the Sql Server provider.

How is this knowledge useful? Let us assume that we want to have the returned data stream return additional attributes that do not exist in the source data. As an example, let’s say we want to draw cities and annotate with labels sized and prioritized as a function of each city’s population. The labeling might be specified with:

LABELITEM 'CITY_NAME' LABEL

Color '#000000' Font 'arial' PRIORITY [PopWeight] size [DoublePopWeight] buffer 12 OutlineColor '#ff0000' OutlineWidth 3 Position auto type truetype

END

The DATA statement might be:

DATA "ShapeGeometry from (select ShapeGeometry, objectID , city_name,

1 + cast(LOG10(city_pop + 1) as int) as PopWeight, 2*(1 + cast(LOG10(city_pop + 1) as int)) as DoublePopWeight from NAV_CITY as base_table )

as subqry using unique objectID using SRID=4269"

The knowledge that the subquery is evaluated by the underlying database engine allows for easy experimentation and optimization. In this case, given sufficient rights to the source data, one could either define a view that adds the calculated column values, or failing that, defined "user defined functions" (UDFs, in Microsoft’s "Transact-SQL" language) that could be referenced in the DATA statement:

DATA "ShapeGeometry from (select ShapeGeometry, objectID , city_name,

fnPopWeight(city_pop) as PopWeight, 2 * fnPopWeight(city_pop) as DoublePopWeight from NAV_CITY as base_table)

as subqry using unique objectID using SRID=4269"

By documenting the stability of the subquery, one can be comfortable committing design time to its use. This information might also be of use to users of other kinds of data sources, to the extent that it applies. In addition it might be good to know whether the behavior applies to any filters or expressions, or syntactically in part to text within attribute markers, "[attributeName]" (for example, can "attributeName" be an expression in the context of the underlying data provider?).

Change History (2)

comment:1 by panzel, 13 years ago

Type: defectenhancement

comment:2 by havatv, 13 years ago

Cc: havatv added

I guess that this should apply to all database connections. For PostGIS there are examples that indicate this behaviour, and the same applies to Oracle.

Could it be documented under LAYER DATA (like in the "For Oracle ..." part)?

If not, it would perhaps be better to document behaviour that is common to all database type connections separately (perhaps a new database connections document under data input that can be referenced from postgis, mssql, oracle, ...).

I think there is a need for advice from developers. How much is generic for database connections?

Note: See TracTickets for help on using tickets.