Opened 8 years ago

Closed 6 years ago

#1633 closed defect (fixed)

Unable to display shapefile attribute table due to pipe characters in dbf field

Reported by: richardc
Owned by: grass-dev@…
Priority: critical
Milestone: 6.4.3
Component: wxGUI
Version: svn-releasebranch64
Keywords: database, separator
Cc:
CPU: Unspecified
Platform: Unspecified

Description (last modified by hamish)

Hi,

I'm receiving the following error on attempting to view the attribute table of shapefiles in GRASS 6.4.1.2 (on Ubuntu Lucid):

    item = layer, log = self.goutput)
  File "/usr/lib/grass64/etc/wxpython/gui_modules/dbm.py", line 653, in __init__
    self.__createBrowsePage()
  File "/usr/lib/grass64/etc/wxpython/gui_modules/dbm.py", line 698, in __createBrowsePage
    self.mapDBInfo, layer)
  File "/usr/lib/grass64/etc/wxpython/gui_modules/dbm.py", line 95, in __init__
    keyColumn = self.LoadData(layer)
  File "/usr/lib/grass64/etc/wxpython/gui_modules/dbm.py", line 247, in LoadData
    self.AddDataRow(i, record, columns, keyId)
  File "/usr/lib/grass64/etc/wxpython/gui_modules/dbm.py", line 286, in AddDataRow
    if self.columns[columns[j]]['ctype'] != types.StringType:
IndexError: list index out of range

Method: I import the shapefile as follows:

Layer Manager > File > Import vector data > Common import formats > Select ESRI shapefile > Browse and select file > Import

I then select the layer in 'Layer Manager' and click on the 'Show attribute table' icon. The message 'Please wait loading attribute data..' appears, followed by this error in the command console:

    if self.columns[columns[j]]['ctype'] != types.StringType:
IndexError: list index out of range

On closer inspection of the dbf file, some of the fields contain pipe (i.e., '|') characters. On removing these the attribute table can be displayed.

QGIS is able to display all data in the attribute table, including 'pipe' characters.

Example of problem fields in single column named 'VARNAME_1,C,150':

VARNAME_1,C,150

Bangkok|Krung Thep|Krung Thep Maha Nakhon|Phra Nakhon-Thonburi
Buri Rum
Chaxerngsao|Pad Rew|Paed Riu|Petrieu|Shajeun Dhrao
Chainat
Chantaburi|Muang Chan
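
The failure can be reproduced outside GRASS with a few lines of Python. This is only a minimal sketch, not the actual dbm.py code: it assumes the GUI receives each record as one line of text with fields joined by '|' and splits it on that character. The column names and the `split_record` helper are hypothetical.

```python
# Hypothetical column layout; not taken from the real table.
columns = ["cat", "VARNAME_1"]


def split_record(line, sep="|"):
    """Split one text record into field values (naive, no quoting)."""
    return line.split(sep)


# One record whose second field itself contains the separator.
record = "1|Chantaburi|Muang Chan"
values = split_record(record)

# The split yields more values than there are columns.
print(len(columns), len(values))

try:
    for j, value in enumerate(values):
        name = columns[j]  # fails once j reaches len(columns)
except IndexError as err:
    print("IndexError:", err)
```

Any loop that indexes the column list by value position will run off the end of the list as soon as one field contains the separator, which matches the traceback in the report.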

NOTE: issue posted originally at http://osgeo-org.1560.n6.nabble.com/Index-error-Unable-to-show-attribute-table-of-shapefile-tp4678952p4678952.html

Attachments (1)

db_mngr_12jun13.diff (2.1 KB) - added by hamish 6 years ago.
patch to fix #1633 in a less user visible way


Change History (20)

comment:1 Changed 7 years ago by annakrat

Component: Display → wxGUI
Keywords: database separator added

In GRASS 7 I added an option to choose the separator in the GUI settings. Instead of a strange error while loading the table, you should now get a dialog suggesting that you change the separator. Could someone test it before the backport?

Anna

comment:2 in reply to:  1 ; Changed 7 years ago by neteler

Replying to annakrat:

In GRASS 7 I added an option to choose the separator in the GUI settings.

For the record: r52712

Instead of a strange error while loading the table, you should now get a dialog suggesting that you change the separator. Could someone test it before the backport?

Tested; it works almost perfectly now, telling the user:

"Inconsistent number of columns in the table <vectorname>. Try to change field separator in GUI Settings, Attributes tab, Data browser section."

Perhaps it should then not reach the attribute manager window (since it is useless in that case) but just return or close the attribute manager window?
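
The kind of check behind that message can be sketched as follows. This is a hypothetical illustration, not the code from r52712: it assumes records arrive as text lines and simply compares each line's field count with the expected number of columns. The `check_column_count` helper and the sample records are made up.

```python
def check_column_count(lines, n_columns, sep="|"):
    """Return the indices of records whose field count differs from n_columns."""
    return [i for i, line in enumerate(lines)
            if len(line.split(sep)) != n_columns]


lines = ["1|Buri Rum", "2|Chantaburi|Muang Chan"]
bad = check_column_count(lines, n_columns=2)
if bad:
    # Report the problem instead of failing later with an IndexError.
    print("Inconsistent number of columns in records:", bad)
```

Detecting the mismatch up front lets the caller show a readable message and skip loading the data, rather than crash partway through filling the table.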

comment:3 in reply to:  2 Changed 7 years ago by annakrat

Replying to neteler:

Perhaps it should then not reach the attribute manager window (since it is useless in that case) but just return or close the attribute manager window?

It is easier just to remove the browse data page (it then looks like the case when you have a table with no data). The attribute manager can stay open because you may still want to change other things related to the table columns or connection. Done in r52730.

Anna

comment:4 Changed 7 years ago by annakrat

Resolution: fixed
Status: new → closed

Backported in r52815 and r52816, closing the ticket.

comment:5 Changed 7 years ago by hamish

Description: modified (diff)

comment:6 Changed 7 years ago by annakrat

Platform: Linux → Unspecified
Resolution: fixed deleted
Status: closed → reopened

Because the change causes bug #1706, I reverted it for Windows. Therefore I reopen this ticket.

Anna

comment:7 in reply to:  6 Changed 7 years ago by hellik

Replying to annakrat:

Because the change causes bug #1706, I reverted it for Windows. Therefore I reopen this ticket.

see http://trac.osgeo.org/grass/ticket/1706#comment:7

closing ticket?

Helmut

comment:8 Changed 7 years ago by annakrat

If I remember the problem correctly, this ticket is still valid for Windows. So on Windows you cannot load attribute table data with pipe characters. Maybe we can change the milestone?

comment:9 in reply to:  8 Changed 7 years ago by neteler

Version: 6.4.1 → svn-releasebranch64

Replying to annakrat:

If I remember the problem correctly, this ticket is still valid for Windows.

Confirmed: g.copy e.g. the "boundary_municip" NC map to myvector, then edit an entry in the "NAME" column and insert "|" there. It (apparently) does that, but then no longer shows the table. However, a message is shown suggesting to change the separator to a different character.

So on Windows you cannot load attribute table data with pipe characters.

Currently no, unless you select a different separator.

Maybe we can change the milestone?

To which would you change it? How about a condition: if on Windows, predefine a separator different from | by default? Which is safer?

comment:10 Changed 7 years ago by annakrat

There should be an easy workaround for this keeping | as default (r55719). We'll see tomorrow.

comment:11 in reply to:  10 Changed 7 years ago by annakrat

Replying to annakrat:

There should be an easy workaround for this keeping | as default (r55719). We'll see tomorrow.

It seems to work so I applied it to all branches. More testing welcome.

comment:12 Changed 7 years ago by hamish

Hi,

sorry I am coming to this a bit late. Could someone explain/provide the command line and parsing that the gui does which causes this? i.e. why is it a problem on Windows and nowhere else? is v.db.select outputting with fs='|', but '|' is also in the data, so too many columns in the output?

From a quick read of the report it sounds like as long as varchar() can contain all readable chars <=127, and fs= can be all readable chars <=127, then changing the fs= from one obscure char to another is always going to be a problem for some data which contains that char. How about if fs= some >128 char which varchar() doesn't support?

SEP=`echo -e "\xB7"`
v.db.select roads fs="$SEP"

then .split() on char 183? or might the varchar() string contain that too?
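
Whether the data can contain that byte depends on the encoding. The following sketch assumes the records are handled as raw bytes and that the attribute text is UTF-8 encoded; neither is stated in the thread, and the sample values are made up. It shows that byte 183 (0xB7) also occurs inside some multi-byte UTF-8 characters, so a byte-level split is not safe for such data.

```python
SEP = b"\xb7"  # byte 183, as produced by the shell example above

# A plain ASCII record splits into two fields as intended.
ascii_line = b"1" + SEP + b"baz|qux"
print(ascii_line.split(SEP))

# The letter U+0137 is encoded in UTF-8 as the two bytes C4 B7,
# so a byte-level split on B7 cuts the character in half.
utf8_line = b"2" + SEP + "Ma\u0137".encode("utf-8")
print(utf8_line.split(SEP))
```

With single-byte encodings such as Latin-1 the same byte is the "middle dot" character, which could also appear in real text.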

Is the problem because we are using text files as the intermediary format? and not like xargs's -0 to use the null char as the f.sep.?

or could the python csv library help? It's a similar tricky quoting problem for reading csv when the string can contain a comma. http://docs.python.org/2/library/csv.html
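
For illustration, this is how the csv module copes with a delimiter inside a field. It is a sketch under one important assumption: the producer must quote such fields, and nothing in this thread says that v.db.select does. The sample rows are made up.

```python
import csv
import io

rows = [["cat", "name"], ["1", "baz|qux"], ["2", "foo"]]

# Write with '|' as delimiter; the writer quotes any field containing it.
buf = io.StringIO()
csv.writer(buf, delimiter="|", quoting=csv.QUOTE_MINIMAL).writerows(rows)
text = buf.getvalue()
print(text)

# Reading back with the same delimiter restores the original fields.
parsed = list(csv.reader(io.StringIO(text), delimiter="|"))
print(parsed)
```

The round trip only works because the writer and the reader agree on the quoting rules; a plain `.split('|')` on the same text would still break the quoted field apart.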

?, Hamish

comment:13 in reply to:  12 Changed 6 years ago by hamish

Replying to hamish:

or could the python csv library help? It's a similar tricky quoting problem for reading csv when the string can contain a comma.

http://docs.python.org/2/library/csv.html

I couldn't remember the link for what I was looking for then, but I just ran across it.. using Perl to parse the text (naturally), not Python.

source:grass-addons/tools/csv_dequote.pl

Hamish

comment:14 Changed 6 years ago by hamish

maybe replicate what find -print0 + xargs -0 does and support fs=null for v.db.select?

sample data:

#easting,northing,text string
599490,4920855,foo
599590,4920755,bar
599690,4920655,baz|qux
599790,4920555,qux
599890,4920455,foo

v.in.ascii in=test_pipe.csv out=test_pipe fs=, x=1 y=2

another idea, less definitive but more likely to work with standard python string functions, is to take advantage of the fact that v.db.select's fs= option can take an unlikely string, not just a single char:

v.db.select test_pipe fs='{_sep_}'
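
On the parsing side, both ideas in this comment (a null separator and an unlikely multi-char string) come down to the same `str.split()` call. The sketch below simulates the tool output rather than calling v.db.select, and it assumes one record per line; `parse_records` and the category numbers are hypothetical.

```python
def parse_records(text, sep):
    """Split newline-separated records into lists of fields on `sep`."""
    return [line.split(sep) for line in text.split("\n")]


# Two rows from the sample data, with made-up category numbers.
rows = [["3", "599690", "4920655", "baz|qux"],
        ["4", "599790", "4920555", "qux"]]

for sep in ("\0", "{_sep_}"):
    # Simulate the text a tool would emit with this separator.
    text = "\n".join(sep.join(row) for row in rows)
    result = parse_records(text, sep)
    print(repr(sep), result == rows)
```

In both cases the pipe inside "baz|qux" survives, because the separator no longer collides with characters that plausibly occur in the data.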

Hamish

comment:15 Changed 6 years ago by hamish

Hi, attaching proposed patch to fix it by changing the field sep to a multi-char string which is highly unlikely to occur naturally. As long as we stay with an ascii flat file as the intermediary there will be a finite possibility that the fs string could be there in someone's data, but at least with the patch the chances of that go to near zero.

tested on devbr6 + linux, but not Windows.

Hamish

Changed 6 years ago by hamish

Attachment: db_mngr_12jun13.diff added

patch to fix #1633 in a less user visible way

comment:16 Changed 6 years ago by hamish

tested on Windows XP with nightly build of 6.5 from a couple days ago. works.

Hamish

comment:17 Changed 6 years ago by hamish

is the fieldSeparator used elsewhere, or could it be removed from gui_core/preferences.py?

Hamish

comment:18 in reply to:  17 ; Changed 6 years ago by annakrat

Replying to hamish:

is the fieldSeparator used elsewhere, or could it be removed from gui_core/preferences.py?

no, please go ahead

Thanks

comment:19 in reply to:  18 Changed 6 years ago by hamish

Resolution: fixed
Status: reopened → closed

Replying to annakrat:

no, please go ahead

ok, applied in all branches r56694-6.

(I kept side fixes like r52730 there)

please test.

Hamish
