
Version 2 (modified by sdlime, 14 years ago)


RFC 59 - MapServer Expression Parser Overhaul

Overview

This is a draft RFC addressing 1) how the Bison/Yacc parser for logical expressions is implemented and 2) where in the MapServer code the parser can be used. This RFC could have broader impacts on query processing depending on additional changes at the driver level, specifically the RDBMS ones. Those changes don't have to occur for this to be a useful addition.

A principal motivation for this work is to support OGC filter expressions in a single pass in a driver-independent manner.

Existing Expression Parsing

The existing logical expression handling in MapServer works like so:

1) duplicate expression string
2) substitute shape attributes into string (e.g. '[name]' => 'Anoka')
3) parse with yyparse()

The parser internally calls yylex() for its tokens. Tokens are the smallest pieces of an expression.
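
As a rough illustration of step 2, here is a minimal sketch of substituting one attribute value into a copy of the expression string. The helper name substitute_item is hypothetical and is not part of the MapServer API; the real code handles every item in the layer and differs in detail. In the existing flow the returned string would then be handed to the lexer and yyparse(), once per feature.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (an assumption, not MapServer API): returns a newly
 * allocated copy of expr in which every "[item]" is replaced by value.
 * The caller frees the result. */
char *substitute_item(const char *expr, const char *item, const char *value)
{
  char pattern[64];
  size_t patlen, vallen, count = 0;
  const char *p;
  char *out, *dst;

  assert(expr != NULL && item != NULL && value != NULL);
  assert(strlen(item) < sizeof(pattern) - 2); /* room for '[' and ']' */

  snprintf(pattern, sizeof(pattern), "[%s]", item);
  patlen = strlen(pattern);
  vallen = strlen(value);

  /* Count occurrences so the output buffer can be sized up front. */
  for (p = strstr(expr, pattern); p != NULL; p = strstr(p + patlen, pattern))
    count++;

  /* Upper bound on the result length: original text plus all inserted values. */
  out = malloc(strlen(expr) + count * vallen + 1);
  if (out == NULL) return NULL;

  /* Copy the expression, swapping each pattern for the attribute value. */
  dst = out;
  p = expr;
  while (*p != '\0') {
    if (strncmp(p, pattern, patlen) == 0) {
      memcpy(dst, value, vallen);
      dst += vallen;
      p += patlen;
    } else {
      *dst++ = *p++;
    }
  }
  *dst = '\0';
  return out;
}
```

Calling substitute_item("('[name]' = 'Anoka')", "name", "Anoka") yields a fresh string with the attribute value in place of the bracketed item name. Both the allocation and the subsequent tokenizing are repeated for every feature tested.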

Advantages

  • it's simple and it works

Disadvantages

  • substitution limits expressions to strings, so complex types (e.g. shapes) cannot be handled
  • the substitution and the tokenizing of the resulting string must be repeated for every feature

Proposed Technical Changes

This RFC proposes a number of technical changes. The core change, however, involves updating the way logical expressions work. Additional changes capitalize on this to bring further capabilities to MapServer.

Core Parser Update

I propose moving to a setup where a logical expression is tokenized once (via our Flex-generated lexer) and then the Bison/Yacc parser works through the tokens (via a custom version of yylex() defined in mapparser.y) as necessary for each feature. This eliminates the substitution and tokenizing steps currently necessary and opens up possibilities for supporting more complex objects in expressions. Basically we'd hang a list/array of tokens off an expressionObj, populate it in msLayerWhichItems() and leverage the tokens as needed in the parser. The following new structs and enums are added to mapserver.h:

enum MS_TOKEN_LOGICAL_ENUM { MS_TOKEN_LOGICAL_AND=100, MS_TOKEN_LOGICAL_OR, MS_TOKEN_LOGICAL_NOT };
enum MS_TOKEN_LITERAL_ENUM { MS_TOKEN_LITERAL_NUMBER=110, MS_TOKEN_LITERAL_STRING, MS_TOKEN_LITERAL_TIME, MS_TOKEN_LITERAL_SHAPE };
enum MS_TOKEN_COMPARISON_ENUM {
  MS_TOKEN_COMPARISON_EQ=120, MS_TOKEN_COMPARISON_NE, MS_TOKEN_COMPARISON_GT, MS_TOKEN_COMPARISON_LT, MS_TOKEN_COMPARISON_LE, MS_TOKEN_COMPARISON_GE, MS_TOKEN_COMPARISON_IEQ,
  MS_TOKEN_COMPARISON_RE, MS_TOKEN_COMPARISON_IRE,
  MS_TOKEN_COMPARISON_IN, MS_TOKEN_COMPARISON_LIKE,
  MS_TOKEN_COMPARISON_INTERSECTS, MS_TOKEN_COMPARISON_DISJOINT, MS_TOKEN_COMPARISON_TOUCHES, MS_TOKEN_COMPARISON_OVERLAPS, MS_TOKEN_COMPARISON_CROSSES, MS_TOKEN_COMPARISON_WITHIN, MS_TOKEN_COMPARISON_CONTAINS,
  MS_TOKEN_COMPARISON_BEYOND, MS_TOKEN_COMPARISON_DWITHIN
};
enum MS_TOKEN_FUNCTION_ENUM { MS_TOKEN_FUNCTION_LENGTH=140, MS_TOKEN_FUNCTION_TOSTRING, MS_TOKEN_FUNCTION_COMMIFY, MS_TOKEN_FUNCTION_AREA, MS_TOKEN_FUNCTION_ROUND, MS_TOKEN_FUNCTION_FROMTEXT };
enum MS_TOKEN_BINDING_ENUM { MS_TOKEN_BINDING_DOUBLE=150, MS_TOKEN_BINDING_INTEGER, MS_TOKEN_BINDING_STRING, MS_TOKEN_BINDING_TIME, MS_TOKEN_BINDING_SHAPE };

typedef union {
  double dblval;
  int intval;
  char *strval;
  struct tm tmval;
  shapeObj *shpval;
  attributeBindingObj bindval;
} tokenValueObj;

typedef struct {
  int token;
  tokenValueObj tokenval;
} tokenObj;
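
To make the idea concrete, here is a minimal sketch of how a token array built once could be walked per feature by a yylex()-style function. It is not the proposed implementation: the sketch* names, the state struct, and the itemindex field (a stand-in for attributeBindingObj) are all assumptions, and a hand-written three-token check takes the place of the Bison grammar.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Subset of the token codes listed above, repeated so the sketch compiles alone. */
enum {
  MS_TOKEN_LITERAL_STRING = 111,
  MS_TOKEN_COMPARISON_EQ = 120,
  MS_TOKEN_BINDING_STRING = 152
};

/* Simplified stand-in for tokenValueObj (assumption): itemindex replaces
 * attributeBindingObj and indexes the feature's attribute values. */
typedef union {
  const char *strval;
  int itemindex;
} sketchValue;

typedef struct {
  int token;
  sketchValue tokenval;
} sketchToken;

/* Hypothetical per-evaluation state: the shared tokens plus a cursor. */
typedef struct {
  const sketchToken *tokens;
  size_t numtokens;
  size_t current;
  const char **values; /* attribute values of the feature being tested */
} sketchParseState;

/* Plays the role of the custom yylex(): returns the next token code, or 0 at
 * end of input, resolving attribute bindings against the current feature. */
int sketch_yylex(sketchParseState *state, sketchValue *lval)
{
  const sketchToken *t;

  assert(state != NULL && lval != NULL);
  if (state->current >= state->numtokens) return 0;

  t = &state->tokens[state->current++];
  if (t->token == MS_TOKEN_BINDING_STRING) {
    /* No string substitution or re-tokenizing: just look the value up. */
    lval->strval = state->values[t->tokenval.itemindex];
    return MS_TOKEN_LITERAL_STRING;
  }
  *lval = t->tokenval;
  return t->token;
}

/* Evaluates the fixed pattern  binding = 'literal'  for one feature. A real
 * grammar would be driven by the Bison parser; this only shows the token flow. */
int sketch_matches(const sketchToken *tokens, size_t n, const char **values)
{
  sketchParseState state;
  sketchValue lhs, op, rhs;

  state.tokens = tokens;
  state.numtokens = n;
  state.current = 0; /* rewind the cursor for each feature */
  state.values = values;

  if (sketch_yylex(&state, &lhs) != MS_TOKEN_LITERAL_STRING) return 0;
  if (sketch_yylex(&state, &op) != MS_TOKEN_COMPARISON_EQ) return 0;
  if (sketch_yylex(&state, &rhs) != MS_TOKEN_LITERAL_STRING) return 0;
  return strcmp(lhs.strval, rhs.strval) == 0;
}
```

With the three tokens for ('[name]' = 'Anoka') stored once, sketch_matches() can be called for each feature with that feature's values array; only the cursor is reset between features, and the expression string is never touched again.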

Backwards Compatibility Issues

Security Issues
