Opened 8 years ago

Closed 8 years ago

#1491 closed defect (invalid)

python libs shouldn't parse freeform text fields with parse_key_val()

Reported by: hamish
Owned by: grass-dev@…
Priority: normal
Milestone: 7.0.0
Component: Python
Version: svn-trunk
Keywords: parsing
Cc:
CPU: All
Platform: All

Description

Hi,

raster.py and vector.py use core.py's parse_key_val() to evaluate the results of r.info and v.info, and report the result of kv[1] using = as the field separator.

r.info's title, vdatum, and units results, and most of v.info's extended metadata results, can contain free-form text, which may include = chars. Any such value would get cut off at the first = it contains.

vector_info_topo(v.info -t) looks safe to use it.

db.py contains one too, using : as the field separator. It is easy to imagine this failing if the database: string is on MS Windows and contains C:\; only the initial drive letter would get through.

raster3d.py's one should be ok, I don't spot any free-form text options in r3.info.

g.region -g in core.py should be safe too.

g.findfile in core.py won't be safe if the file= path contains a =. I suspect that's safe on MS Windows but not in UNIX?

thanks, Hamish

ps- r.info's and v.info's -g flags have been restored to their original "show region" meanings in trunk. Maybe r.info -s (show res) should get merged into -g. A bunch of the others could not be eval'd and had to be moved back out. After doing that it became obvious that leaving the rest was inconsistent and broke the meaning of what -g has historically been for, so those were backed out as well.

Change History (3)

comment:1 Changed 8 years ago by huhabla

One solution would be for parse_key_val() to split only at the first "=" when creating the key-value pair. For free-form text, the reporting module should put the value in double quotes, escaping any other double quotes located in the message content.

That way parse_key_val() would be able to detect quoted content, including content with new lines, as a single value.
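A rough sketch of how such a quoting scheme might look follows. Everything in it is an assumption made for illustration: quote_value() and parse_quoted_key_val() are hypothetical names, backslash escaping is just one possible convention, and no GRASS module is known to emit this format.

```python
import re

# key=value, where value is either a double-quoted string (backslash
# escapes allowed, may span several lines) or the rest of the line.
PAIR_RE = re.compile(
    r'^(?P<key>[^=\n]+)=(?:"(?P<quoted>(?:[^"\\]|\\.)*)"|(?P<bare>[^\n]*))$',
    re.MULTILINE | re.DOTALL,
)


def quote_value(text):
    """Hypothetical writer side: wrap free-form text in double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def parse_quoted_key_val(text):
    """Hypothetical reader side: split at the first '=', honour quotes."""
    result = {}
    for match in PAIR_RE.finditer(text):
        bare = match.group("bare")
        if bare is not None:
            value = bare.strip()
        else:
            # Undo the backslash escapes added by quote_value().
            value = re.sub(r"\\(.)", r"\1", match.group("quoted"),
                           flags=re.DOTALL)
        result[match.group("key").strip()] = value
    return result


# Made-up report text with '=', '"' and a newline inside the title.
title = 'depth = "z" in m\nsecond line'
report = "title=" + quote_value(title) + "\nunits=m\n"
parsed = parse_quoted_key_val(report)
```

Here parsed["title"] round-trips to the original multi-line string, while the unquoted units line is read as a plain value.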

Using several different flags, inconsistent between the modules, to print content in shell style is IMHO not a better solution.

comment:2 in reply to:  description ; Changed 8 years ago by glynn

Replying to hamish:

raster.py and vector.py use core.py's parse_key_val() to evaluate the results of r.info and v.info, and report the result of kv[1] using = as the field separator.

r.info's title, vdatum, and units results, and most of v.info's extended metadata results can contain free-form text, which can contain = chars. And so text containing that will get cut off and only give the result up to the first =.

parse_key_val() only splits at the first separator:

	kv = line.split(sep, 1)

This results in a list of at most two elements, with the first element containing everything before the first separator, and the second element containing everything after, even if it contains the separator character.
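To illustrate, here is a minimal sketch of that behaviour. parse_key_val_sketch() is a hypothetical stand-in written for this example, not the actual core.py code, and the sample strings are made up.

```python
def parse_key_val_sketch(text, sep="=", default=None):
    """Hypothetical stand-in for parse_key_val(); not the GRASS code."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at the first separator only; later ones stay in the value.
        kv = line.split(sep, 1)
        key = kv[0].strip()
        value = kv[1].strip() if len(kv) > 1 else default
        result[key] = value
    return result


# Free-form title containing '=' (made-up r.info-style output).
info = parse_key_val_sketch("title=slope = rise/run\nunits=m\nflag_only")

# ':' separator with a Windows-style path (made-up db-style output).
conn = parse_key_val_sketch("database: C:\\grassdata\\db", sep=":")
```

With this sketch, info["title"] keeps the whole "slope = rise/run" string and conn["database"] keeps the full path including the drive letter.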

Unless there are specific cases which either actually don't work or which are uncertain based upon the actual behaviour of parse_key_val(), this ticket should be closed as "invalid".

comment:3 in reply to:  2 Changed 8 years ago by hamish

Resolution: invalid
Status: new → closed

Replying to glynn:

parse_key_val() only splits at the first separator:

 	kv = line.split(sep, 1)

This results in a list of at most two elements, with the first element containing everything before the first separator, and the second element containing everything after, even if it contains the separator character.

Unless there are specific cases which either actually don't work or which are uncertain based upon the actual behaviour of parse_key_val(), this ticket should be closed as "invalid".

ah, that's good. I missed the ,1 and was assuming it was designed to work like G_tokenize(), with as many array items as number of seps+1.

Hamish
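For reference, the difference between the two readings of split() can be sketched in plain Python. The sample line is made up, and the comparison with G_tokenize() is only by analogy with the "number of seps+1" description above.

```python
line = "title=a=b=c"

# Without a limit: one item per separator, plus one.
all_parts = line.split("=")

# With maxsplit=1: at most two items, later separators stay in the value.
first_only = line.split("=", 1)

# str.partition() is another way to split at the first separator only.
head, found_sep, tail = line.partition("=")
```

Only the first form yields as many items as there are separators plus one; the other two leave the remainder of the line intact.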
