Opened 12 years ago

Closed 5 years ago

#4397 closed defect (wontfix)

issue in StringList serialization

Reported by: esseffe Owned by: warmerdam
Priority: normal Milestone: closed_because_of_github_migration
Component: OGR_SF Version: unspecified
Severity: normal Keywords:
Cc: a.furieri@…, Kyle Shannon

Description (last modified by Even Rouault)

I've noticed an issue in OGRFeature::GetFieldAsString?()

when a StringList? value is exported, the following serialized notation is adopted:

- value#1 = "A01"
- value#2 = "B02"
- serialized = "(2:A01,B01)"

this serialized format is generally fine (and is easily parsable by other apps), but will introduce fatal ambiguities when any COMMA is used by some value to be serialized:

- value#1 = "A,01"
- value#2 = "B,02"
- serialized = "(2:A,01,B,01)"

quite obviously, a serialized value like this can never be succesfully parsed so to retrieve the initial values.

suggested remedy: applying the same criteria used to mask quotes in SQL string, i.e. masking any COMMA found in the string value to be exported as COMMA-COMMA. This would produce the following output:

- value#1 = "A,01"
- value#2 = "B,02"
- serialized = "(2:A,,01,B,,01)"

a serialized string like this has no longer any ambiguity at all; and can be easily and safely parsed so to retrieve the original values.

please see the attached patch

best regards, Sandro Furieri


Hi Even,

so standing things there is a strong inconsistency in the SQLite [output] driver implementation.

I'll actually find several "serialized" values in any SpatiaLite DB created by ogr2ogr, and I'm completely unable to parse such values in a safe and consistent way. that's not all, such "serialized" values could be actually truncated, and there is no way to understand if some truncation occurred when readind the DB. Sorry, I didn't noticed this further issue. So, such "serialized" values into the DB are irremediably useless.

On the other side I notice that the SHP driver completely omits to create any attribute/column of the StringList, IntegerList or RealList type: may be this one could be a better (and more consistent) design choice for the SQLite driver as well ?

by Sandro

Attachments (1)

comma-patch.zip (1.1 KB ) - added by esseffe 12 years ago.

Download all attachments as: .zip

Change History (13)

by esseffe, 12 years ago

Attachment: comma-patch.zip added

comment:1 by Even Rouault, 12 years ago

I'm not convinced that this patch is really usefull. There's a risk of confusion between items in the list that are empty, which would also serialized as "" . But the main objection is that you shouldn't rely on GetFieldAsString() to get a string list, because in the case of lists, GetFieldAsString() limits the output to char szTempBuffer[TEMP_BUFFER_SIZE], that is to say 80 bytes. So the best approach is to use GetFieldAsStringList().

comment:2 by esseffe, 12 years ago

Description: modified (diff)

comment:3 by warmerdam, 12 years ago

I concur with Even. The GetFieldAsString() format for string lists was never intended to be guaranteed parsable. It is just intended to give the end user a brief concept of the field value. Note it is also truncated if long.

comment:4 by warmerdam, 12 years ago

Sandro,

I actually would be willing to contemplate a GetFieldAsXML() method that would always return a full representation of a field value in a form that declares the field type, etc. For instance for a string list it might return <StringList><Item>abc</Item><Item>def</Item></StringList>. Then drivers like Spatialite could use this for any fields that don't map so well such as binary, and the various list types.

comment:5 by esseffe, 12 years ago

Hi Frank,

this one is a really good idea. SQLite/SpatiaLite can store TEXT strings of unlimited length; so exporting xxxList values as XML formatted strings could actually allow further components to correctly retrieve the original values without any possible ambiguity or information suppression.

bye, Sandro

comment:6 by Even Rouault, 12 years ago

Description: modified (diff)

(I've tried to re-establish the first version of the description of the ticket that was lost in a previous edit and might be useful to understand what is discussed here.)

About StringList, yes, you've discovered that there are a bit a second-citizen class of field types. Very few drivers support them appropriately currently. I know PG driver support them in reading/writing because there's a matching type in PG data types. GeoJSON also. But that's almost about all. Other drivers may have read support, like GML. But Shapefile definitely not. DBF format has no adequate representation for such a concept.

comment:7 by esseffe, 12 years ago

All right; SQLite/SpatiaLite isn't able at all to support a matching data type, because only scalars are supported. But SQLite is able to store TEXT strings of unlimited length.

What do you think about inserting xxxList values in SQLite/SpatiaLite as XML strings returned by GetFieldAsXML() [as suggested by Frank] ?

If you agree with this I'll candidate myself to implement GetFieldAsXML() in the next few days, then patching the SQLite output driver accordingly to this ;-)

bye Sandro

comment:8 by Even Rouault, 12 years ago

I'm fine with sqlite supporting the field value as XML. I'm wondering : when reading back a DB, how will you distinguish user XML data from OGR XML serialization ? Will there be a hint at the sqlite table definition to know that (perhaps with sqlite3_column_decltype() ?), or will you need to read a record ?

About GetFieldAsXML(), this would perhaps require a small RFC to discuss and document the intended XML format. I also imagine that there would be a need for the matching SetFieldFromXML() ?

comment:9 by warmerdam, 12 years ago

I agree an RFC would be appropriate, and that a corresponding SetFieldFromXML() method would be needed. I imagine a bit late for 1.9.0.

comment:10 by Kyle Shannon, 11 years ago

Cc: Kyle Shannon added

comment:11 by Jukka Rahkonen, 9 years ago

RFC about GetFieldAsXML() / SetFieldFromXML() has not been done yet.

comment:12 by Even Rouault, 5 years ago

Milestone: closed_because_of_github_migration
Resolution: wontfix
Status: newclosed

This ticket has been automatically closed because Trac is no longer used for GDAL bug tracking, since the project has migrated to GitHub. If you believe this ticket is still valid, you may file it to https://github.com/OSGeo/gdal/issues if it is not already reported there.

Note: See TracTickets for help on using tickets.