Ticket #2123 (closed defect: fixed)

Opened 6 years ago

Last modified 4 years ago

Layer filter does not work when values contain quotes

Reported by: unicoletti Owned by: sdlime
Priority: normal Milestone: 5.6 release
Component: MapServer C Library Version: svn-trunk (development)
Severity: normal Keywords:
Cc: dmorissette, aboudreault

Description

Setting a filter on a shp layer like the following :

'[NAME]'=='some name'

where the NAME attribute can contain values with quotes (single quote in my case) causes the following error (in Java mapscript):

msEvalExpression: Expression parser error. Failed to parse expression;msyyparse():

This seems to be a know problem as it is even documented here (towards the end):

 http://mapserver.gis.umn.edu/docs/howto/msexpressions

Attachments

issue2123.patch Download (2.5 KB) - added by unicoletti 6 years ago.
Proposed patch: works for me with shape files
issue2123-maputil.patch Download (0.8 KB) - added by unicoletti 6 years ago.
In case the quote is in the value stored in the shapefile
issue2123.2.patch Download (2.5 KB) - added by unicoletti 6 years ago.
Fix memory leaks because of strdup
issue2123-mapparser.patch Download (2.6 KB) - added by unicoletti 6 years ago.
Fix memory leaks because of strdup (forget previous, is an error)

Change History

  Changed 6 years ago by unicoletti

I have investigated on this issue and found that the cgi works (without raising an error) because it silently discards a parsing error that occurs at line 363 of maputil.c. Both (mapscript and cgi) will fail when the right hand expression contains a single quote.

I'm looking into mapparser.y if we can introduce an escape character.

Changed 6 years ago by unicoletti

Proposed patch: works for me with shape files

  Changed 6 years ago by unicoletti

Steve could you look at it? This is my first attempt at flex so it could be that I have made mistakes.

Inspired by:

 http://www.gnu.org/software/flex/manual/html_chapter/flex_11.html

Changed 6 years ago by unicoletti

In case the quote is in the value stored in the shapefile

Changed 6 years ago by unicoletti

Fix memory leaks because of strdup

Changed 6 years ago by unicoletti

Fix memory leaks because of strdup (forget previous, is an error)

  Changed 6 years ago by unicoletti

double free error becaus of free at maputil.c around line 360. Please remove it:

if (status != 0) {

free(tmpstr); <----- REMOVE THIS FREE msSetError(MS_PARSEERR, "Failed to parse expression", "msEvalExpression"); return MS_FALSE;

}

follow-up: ↓ 5   Changed 6 years ago by sdlime

Why not just write:

FILTER '"[NAME]" = "some name"'

instead? The outer single quotes are stripped off at parse time. Ultimately I think escaped chars is a good idea though. Will check this bug out over the weekend...

Steve

in reply to: ↑ 4 ; follow-up: ↓ 10   Changed 6 years ago by unicoletti

Replying to sdlime:

Why not just write: FILTER '"[NAME]" = "some name"' instead? The outer single quotes are stripped off at parse time.

Because valid matches will be discarded as well. For example:

' "[CITY]" == "Arqua' Petrarca" '

will never match even if in fact there is a record with such an attribute and value.

You reasoning is probably justified because there is a big difference in the way that the cgi and mapscript deal with parse errors raised in msEvalExpression: while the cgi silently discards the parse error caused by the filter failure, mapscript reads the error from the stack, raises an exception and the map generation aborts.

  Changed 6 years ago by unicoletti

  • status changed from new to closed
  • resolution set to fixed

Fixed in revision 6277.

  Changed 6 years ago by dmorissette

  • cc dmorissette added

I came across this by accident, this is a very nice enhancement and I think it should be documented somewhere for the users (MIGRATION_GUIDE.TXT, mapfile ref, etc) instead of just being silently marked fixed.

Note for those reading this ticket and wondering (like me) how the escape works, from a browse of the code it seems to use C-like escaping for "\\", "\"" and "\'" e.g.

"[CITY]" == "Arqua\' Petrarca"

  Changed 6 years ago by dmorissette

  • milestone set to 5.0 release

  Changed 6 years ago by unicoletti

I'm going to modify the relevant page on the website.

in reply to: ↑ 5   Changed 6 years ago by sdlime

Replying to unicoletti:

Replying to sdlime:

Why not just write: FILTER '"[NAME]" = "some name"' instead? The outer single quotes are stripped off at parse time.

Because valid matches will be discarded as well. For example: ' "[CITY]" == "Arqua' Petrarca" ' will never match even if in fact there is a record with such an attribute and value. You reasoning is probably justified because there is a big difference in the way that the cgi and mapscript deal with parse errors raised in msEvalExpression: while the cgi silently discards the parse error caused by the filter failure, mapscript reads the error from the stack, raises an exception and the map generation aborts.

I'll have to work up a test dataset and try it. I have a lake names here like O'Brien and I was pretty sure they worked fine with an expression like:

("[LAKENAME]" = "O'Brien")

Steve

follow-up: ↓ 12   Changed 6 years ago by sdlime

Also, does expression writing re-escape strings so msSaveMap will work ok?

Steve

in reply to: ↑ 11   Changed 6 years ago by unicoletti

  • status changed from closed to reopened
  • resolution fixed deleted

Replying to sdlime:

Also, does expression writing re-escape strings so msSaveMap will work ok?

I haven't tested, in fact I didn't even think about it. Looking into it asap.

  Changed 6 years ago by unicoletti

  • status changed from reopened to closed
  • resolution set to fixed

Looked at the code and it seems that msSaveMap should work because the escaping code is done each time the espression is evaluated and not when the map file is read. The unescaped original expression string is always held in:

expression->string

and strdupped in msEvalExpression each time. expression->string is then used by writeExpression

  Changed 6 years ago by sdlime

I'm wondering if we might consider unescape/escape functions in the future so you have to do that only once. Unescape when tokenizing and escaping when writing.

Steve

  Changed 4 years ago by sdlime

  • status changed from closed to reopened
  • resolution fixed deleted

  Changed 4 years ago by sdlime

  • version set to svn-trunk (development)
  • milestone changed from 5.0 release to 5.6 release

The solution here limits expression tokens to 1024 and that's showing up on the mailing list as a problem for a number of users with long expressions using the IN operator. No error occur, it's just that not all features that should be drawn are drawn. Depending on the expression the effects could be very subtle, just a few features not being drawn.

Would make several users happy if this were fixed somehow.

Steve

  Changed 4 years ago by sdlime

  • cc aboudreault added

Alan: Is this one you'd have time to look at?

Steve

  Changed 4 years ago by aboudreault

  • status changed from reopened to closed
  • resolution set to fixed

Fixed and committed in r9560.

  Changed 4 years ago by sdlime

Alan: Sweet. I can verify the fix works based on a test case I have here. I'm kinda surprised the right hand token in an expression like this:

'foo' in 'foo,bar'

even gets into that processing given that 'foo,bar' contains no internal quotes. I would have thought the rule:

<EXPRESSION_STRING>\"[\"]*\"i|\'[\']*\'i

would have caught those, although I have no idea what the i's are for in that expression.

Steve

  Changed 4 years ago by dmorissette

Here is the regex again without Trac formatting:

 <EXPRESSION_STRING>\"[^\"]*\"i|\'[^\']*\'i

  Changed 4 years ago by sdlime

You can ignore my last comment. The i is obviously for case insensitive strings. I take it we don't use those in the context of logical expressions...

Steve

Note: See TracTickets for help on using tickets.