Ticket #1633 (reopened defect)

Opened 14 months ago

Last modified 5 weeks ago

Unable to display shapefile attribute table due to pipe characters in dbf field

Reported by: richardc Owned by: grass-dev@…
Priority: critical Milestone: 6.4.3
Component: wxGUI Version: svn-releasebranch64
Keywords: database, separator Cc:
Platform: Unspecified CPU: Unspecified

Description (last modified by hamish)

Hi,

I'm receiving the following error on attempting to view the attribute table of shapefiles in GRASS 6.4.1.2 (on Ubuntu Lucid):

    item = layer, log = self.goutput)
  File "/usr/lib/grass64/etc/wxpython/gui_modules/dbm.py", line 653, in __init__
    self.__createBrowsePage()
  File "/usr/lib/grass64/etc/wxpython/gui_modules/dbm.py", line 698, in __createBrowsePage
    self.mapDBInfo, layer)
  File "/usr/lib/grass64/etc/wxpython/gui_modules/dbm.py", line 95, in __init__
    keyColumn = self.LoadData(layer)
  File "/usr/lib/grass64/etc/wxpython/gui_modules/dbm.py", line 247, in LoadData
    self.AddDataRow(i, record, columns, keyId)
  File "/usr/lib/grass64/etc/wxpython/gui_modules/dbm.py", line 286, in AddDataRow
    if self.columns[columns[j]]['ctype'] != types.StringType:
IndexError: list index out of range

Method: I import the shapefile as follows:

Layer Manager > File > Import vector data > Common import formats > Select ESRI shapefile > Browse and select file > Import

I then select the layer in the Layer Manager and click the 'Show attribute table' icon. The GUI shows 'Please wait loading attribute data..' and the following appears in the command console:

    if self.columns[columns[j]]['ctype'] != types.StringType:
IndexError: list index out of range

On closer inspection of the dbf file, some of the fields contain pipe (i.e., '|') characters. On removing these the attribute table can be displayed.

QGIS is able to display all data in the attribute table, including 'pipe' characters.

Example of problem fields in single column named 'VARNAME_1,C,150':

VARNAME_1,C,150

Bangkok|Krung Thep|Krung Thep Maha Nakhon|Phra Nakhon-Thonburi
Buri Rum
Chaxerngsao|Pad Rew|Paed Riu|Petrieu|Shajeun Dhrao
Chainat
Chantaburi|Muang Chan
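The failure mode can be reproduced in isolation. The following is a minimal sketch, not the actual dbm.py code: it assumes each record arrives as one text line that is split on '|', and the names `columns`, `record` and `column_for` are hypothetical.

```python
# Hypothetical illustration of the failure, not the real dbm.py code.
columns = ["cat", "VARNAME_1"]  # assumed: two columns expected per record
record = "1|Bangkok|Krung Thep|Krung Thep Maha Nakhon"

# The pipes inside the attribute value are treated as separators too.
fields = record.split("|")
extra_fields = len(fields) - len(columns)


def column_for(j):
    """Look up the column name for field j, as a per-field loop would."""
    return columns[j]


try:
    for j in range(len(fields)):
        column_for(j)
    error = None
except IndexError as exc:
    # Field index 2 has no matching column, so the lookup fails.
    error = str(exc)

print(len(fields), extra_fields, error)
```

With a single attribute value containing two pipes, the record yields four fields for two columns, and the column lookup fails in the same way as in the traceback above.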

NOTE: issue posted originally at  http://osgeo-org.1560.n6.nabble.com/Index-error-Unable-to-show-attribute-table-of-shapefile-tp4678952p4678952.html

Change History

follow-up: ↓ 2   Changed 9 months ago by annakrat

  • keywords database, separator added
  • component changed from Display to wxGUI

In GRASS 7 I added an option to choose the separator in the GUI settings. Instead of a strange error while loading the table, you should now get a dialog that suggests changing the separator. Could someone test it before the backport?

Anna

in reply to: ↑ 1 ; follow-up: ↓ 3   Changed 9 months ago by neteler

Replying to annakrat:

In GRASS 7 I added an option to choose the separator in the GUI settings.

For the record: r52712

Instead of a strange error while loading the table, you should now get a dialog that suggests changing the separator. Could someone test it before the backport?

Tested; it now works almost as intended, telling the user:

"Inconsistent number of columns in the table <vectorname>. Try to change field separator in GUI Settings, Attributes tab, Data browser section."

Perhaps it should then not open the attribute manager window at all (it is useless in that state), but just return or close the window?
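A check that leads to such a message could look roughly like the sketch below. It is an assumption about the general approach, not the code from r52712; `find_inconsistent_rows` and its parameters are hypothetical names.

```python
def find_inconsistent_rows(lines, n_columns, sep="|"):
    """Return the indices of records whose field count differs from n_columns."""
    return [
        i for i, line in enumerate(lines)
        if len(line.split(sep)) != n_columns
    ]


rows = ["1|Bangkok", "2|Pad Rew|Paed Riu", "3|Chainat"]
bad = find_inconsistent_rows(rows, n_columns=2)

if bad:
    # A GUI would show a dialog here instead of printing.
    print("Inconsistent number of columns in records: %s" % bad)
```

Validating the field count before populating the table lets the caller report the problem once, rather than failing with an IndexError partway through loading.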

in reply to: ↑ 2   Changed 9 months ago by annakrat

Replying to neteler:

Perhaps it should then not open the attribute manager window at all (it is useless in that state), but just return or close the window?

It is easier just to remove the browse data page (it then looks the same as a table with no data). The attribute manager can stay open, because you may still want to change other things related to the table columns or the connection. Done in r52730.

Anna

  Changed 9 months ago by annakrat

  • status changed from new to closed
  • resolution set to fixed

Backported in r52815 and r52816, closing the ticket.

  Changed 8 months ago by hamish

  • description modified

follow-up: ↓ 7   Changed 8 months ago by annakrat

  • status changed from closed to reopened
  • platform changed from Linux to Unspecified
  • resolution fixed deleted

Because the change causes bug #1706, I reverted it for Windows. Therefore I reopen this ticket.

Anna

in reply to: ↑ 6   Changed 2 months ago by hellik

Replying to annakrat:

Because the change causes bug #1706, I reverted it for Windows. Therefore I reopen this ticket.

see  http://trac.osgeo.org/grass/ticket/1706#comment:7

closing ticket?

Helmut

follow-up: ↓ 9   Changed 2 months ago by annakrat

If I remember the problem correctly, this ticket is still valid for Windows. So on Windows you cannot load attribute table data with pipe characters. Maybe we can change the milestone?

in reply to: ↑ 8   Changed 5 weeks ago by neteler

  • version changed from 6.4.1 to svn-releasebranch64

Replying to annakrat:

If I remember the problem correctly, this ticket is still valid for Windows.

Confirmed: use g.copy to copy, e.g., the "boundary_municip" NC map to myvector, then edit an entry in the "NAME" column and insert "|" there. The edit is (apparently) applied, but the table is then no longer shown. However, a message is shown suggesting that the separator be changed to a different character.

So on Windows you cannot load attribute table data with pipe characters.

Currently no, unless you select a different separator.

Maybe we can change the milestone?

To which would you change it? How about a condition: if on Windows, predefine a default separator different from |? Which is safer?

follow-up: ↓ 11   Changed 5 weeks ago by annakrat

There should be an easy workaround for this keeping | as default (r55719). We'll see tomorrow.

in reply to: ↑ 10   Changed 5 weeks ago by annakrat

Replying to annakrat:

There should be an easy workaround for this keeping | as default (r55719). We'll see tomorrow.

It seems to work so I applied it to all branches. More testing welcome.

  Changed 5 weeks ago by hamish

Hi,

Sorry, I am coming to this a bit late. Could someone explain/provide the command line and the parsing that the GUI does which causes this? I.e., why is it a problem on Windows and nowhere else? Is v.db.select outputting with fs='|' while '|' is also in the data, so that there are too many columns in the output?

From a quick read of the report it sounds like as long as varchar() can contain all readable chars <=127, and fs= can be any readable char <=127, then changing fs= from one obscure char to another is always going to be a problem for some data which contains that char. How about setting fs= to some >128 char which varchar() doesn't support?

SEP=`echo -e "\xB7"`
v.db.select roads fs="$SEP"

then .split() on char 183? or might the varchar() string contain that too?
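The split on char 183 can be tried at the byte level. This is an illustration only: `split_record` is a hypothetical helper, and it assumes the v.db.select output is read as raw bytes with byte 183 (what `echo -e "\xB7"` emits in bash) as the separator.

```python
SEP = b"\xb7"  # byte 183; assumed separator, matching the shell snippet above


def split_record(raw, sep=SEP):
    """Split one raw output line (bytes) on the separator byte."""
    return raw.split(sep)


# Pipes inside the value are now harmless.
ok = split_record(b"1\xb7Bangkok|Krung Thep")

# A value that itself contains byte 183 still yields an extra field.
broken = split_record(b"2\xb7A\xb7B")

print(ok, broken)
```

This only moves the problem to a rarer character: any value that contains byte 183 still breaks the split. A lone byte 0xB7 is also not valid UTF-8, so the output would have to be split before it is decoded.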

Is the problem that we are using text files as the intermediary format, rather than using the null char as the field separator the way xargs -0 does?

Or could the Python csv library help? It's a similar tricky quoting problem to reading CSV when a string can contain a comma.  http://docs.python.org/2/library/csv.html
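As a rough sketch of the csv idea (written for Python 3, standard library only): if both the writer and the reader apply csv quoting rules, '|' can stay as the delimiter even when values contain it. This assumes the producing side quotes its output, which plain v.db.select output may not do; the sample rows are illustrative.

```python
import csv
import io

rows = [
    ["1", "Bangkok|Krung Thep|Krung Thep Maha Nakhon"],
    ["2", "Chantaburi|Muang Chan"],
]

# Write with '|' as the delimiter; fields containing '|' get quoted.
buf = io.StringIO()
writer = csv.writer(buf, delimiter="|", quoting=csv.QUOTE_MINIMAL)
writer.writerows(rows)
text = buf.getvalue()

# Read back with the same dialect; quoted pipes are not treated as separators.
parsed = list(csv.reader(io.StringIO(text), delimiter="|"))

print(text)
print(parsed == rows)
```

The round trip preserves both columns per record because the quoting tells the reader which pipes are data and which are separators.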

?, Hamish
