Changes between Initial Version and Version 1 of rfc52_strict_sql_quoting


Ignore:
Timestamp:
Jan 26, 2015, 9:41:20 AM (9 years ago)
Author:
Even Rouault
Comment:

Add RFC 52

Legend:

Unmodified
Added
Removed
Modified
  • rfc52_strict_sql_quoting

    v1 v1  
     1= RFC 52: Strict OGR SQL quoting =
     2
     3Author: Even Rouault[[BR]]
     4Contact: even dot rouault at spatialys dot com [[BR]]
     5
     6== Summary ==
     7
     8This RFC proposes that OGR SQL enforces strict quoting rules for SQL literals and identifiers
     9
     10== Details ==
     11
     12Currently the OGR SQL engine deals indifferently with single quote characters (') and double quote characters ("), although they can be used in 2 different contexts :
     13  * to specify string literals
     14  * to specify column or table names (when they need quoting)
     15
     16SQL 92 mandates that string literals are surrounded with single quote characters whereas quoted identifiers (column, table names) are surrounded with double quote charactes.
     17
     18From http://savage.net.au/SQL/sql-92.bnf.html:
     19{{{
     20<delimited identifier>    ::=   <double quote> <delimited identifier body> <double quote>
     21<character string literal>    ::=<quote> [ <character representation> ... ] <quote>. ]
     22}}}
     23
     24Current OGR behaviour is generally fine, except in some situations when specifying columns in a SELECT statement or in a WHERE expression. The OGR SQL engine tries to determine the intent with the following rule : if the quoted string matches a column name, then it is assumed to be a column identifier, otherwise a string litteral. But sometimes (see http://trac.osgeo.org/gdal/ticket/5428), there are situations where filters such as "MyField" = 'MYFIELD' are needed, and currently they always evaluate to TRUE.
     25
     26To avoid any ambiguities and more conformant to SQL, string literals should be single-quoted, and identifiers (column/table names) unquoted or surrounded by double-quote charachter if conflicting with reserved keywords.
     27
     28== Implementation ==
     29
     30A proposed implementation was available in ticket [http://trac.osgeo.org/gdal/ticket/4280 4280]. It would require some refreshing against current trunk.
     31
     32== Compatibility ==
     33
     34This change affects compatibility in the sense that users that relied on the current permissive behaviour will have to adapt their SQL expressions. This might be quite tricky to detect as it will not always cause obvious runtime failures. For example " SELECT 'i_thought_this_would_be_interpreted_as_a_column_name_but_now_it_is_a_string_litteral' FROM atable " will run without error but not produce the previously expected result. So applications will have to review how they build SQL filters ( SetAttributeFilter() ) and expressions ( ExecuteSQL() )
     35
     36Expressions fixed to be compliant with stricter rules will also work with older GDAL versions.
     37
     38A prominent warning should also be put on the OGR SQL documentation page to advertize the change of behaviour
     39
     40== Discussion ==
     41
     42Tests done with SQLite show that it is sometimes tolerant to misuse of quote characters, but in the reverse way as current OGR. So it will accept SELECT "non_existing_column" FROM 'atable' and interpret it as SELECT 'non_existing_column' FROM "atable". On the contrary, PostgreSQL will not accept it. I'd be rather inclined to follow PostgreSQL stricter approach to keep things simple and understandable: "string literals should be single-quoted, and identifiers (column/table names) unquoted or surrounded by double-quote charachter"
     43
     44== Testing ==
     45
     46The Python autotest suite will be fixed to still pass.
     47
     48== Voting history ==
     49
     50TBD