Opened 12 years ago

Closed 9 years ago

#4798 closed defect (wontfix)

Column names in WIN1257 encoding caused problem

Reported by: vabastiil Owned by: warmerdam
Priority: normal Milestone:
Component: default Version: 1.9.0
Severity: normal Keywords:
Cc:

Description

When trying to convert from Mapinfo to Postgre database, there will be problem with "encoding specific" characters (like ö,ä,ü,õ) in column name.

PGCLIENTENCODING is set to WIN1257. Database encoding = WIN1257

There is no problems with "encoding specific" characters in columns and will be converted without trouble.

Attachments (1)

import.zip (1.6 KB ) - added by vabastiil 12 years ago.
Layer that causes problem

Download all attachments as: .zip

Change History (12)

comment:1 by Jukka Rahkonen, 12 years ago

Perhaps driver follows the ANSI standard for identifiers where such characters are forbidden? They are accepted as "Quoted or Delimited Identifiers" though but I would not consider it very bad if they were not supported. Anyway, it would be good to see from the documentation if "Quoted or Delimited Identifiers" are supported or not and, also as a sort of Quoted Identifiers, if user should do something special for making a difference between column names "CamelCase" and "camelcase", for example.

http://wiki.ispirer.com/sqlways/postgresql/identifiers

comment:2 by Even Rouault, 12 years ago

Is there a precise error message ? Could you attach the file that causes that problem ?

by vabastiil, 12 years ago

Attachment: import.zip added

Layer that causes problem

comment:3 by vabastiil, 12 years ago

There is no error message. I noticed that when i do sql query (after conversion from mapinfo to postgre) with "org.gdal.ogr.DataSource.ExecuteSQL" from information_schema.columns, i get names in as they should be. But if i check to the database - table names are messed up.

So, is that problem related with database driver?

in reply to:  2 comment:4 by vabastiil, 12 years ago

Replying to rouault:

Is there a precise error message ? Could you attach the file that causes that problem ?

There is no error message. I noticed that when i do sql query (after conversion from mapinfo to postgre) with "org.gdal.ogr.DataSource?.ExecuteSQL" from information_schema.columns, i get names in as they should be. But if i check to the database - table names are messed up.

So, is that problem related with database driver?

comment:5 by Even Rouault, 12 years ago

I tried the following :

PGCLIENTENCODING=WIN1257 ogr2ogr -f postgresql pg:dbname=autotest Margala_1997.TAB

(my database is in UTF-8)

and :

$ ogrinfo pg:dbname=autotest Margala_1997 -al

returns :

INFO: Open of `pg:dbname=autotest'
      using driver `PostgreSQL' successful.

Layer name: margala_1997
Geometry: Unknown (any)
Feature Count: 0
Layer SRS WKT:
PROJCS["unnamed",
    GEOGCS["unnamed",
        DATUM["Euref_98",
            SPHEROID["GRS 80",6378137,298.257222101],
            TOWGS84[0,0,0,-0,-0,-0,0]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433]],
    PROJECTION["Lambert_Conformal_Conic_2SP"],
    PARAMETER["standard_parallel_1",58],
    PARAMETER["standard_parallel_2",59.33333333],
    PARAMETER["latitude_of_origin",57.51755393056],
    PARAMETER["central_meridian",24],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",6375000],
    UNIT["Meter",1.0]]
FID Column = ogc_fid
Geometry Column = wkb_geometry
jrknr: Real (0.0)
nimi: String (49.0)
kaart: String (4.0)
p: String (2.0)
k: String (2.0)
a: String (2.0)
vaatleja: String (77.0)
märgalanr: String (8.0)
kktyyp: String (44.0)
põõsarinne: String (2.0)
pvalem: String (30.0)
pvanus: String (1.0)
pyldilme: String (1.0)
lamapuit: String (1.0)
ktyved: String (1.0)
raiumine: String (2.0)
karjatamine: String (2.0)
niitmine: String (2.0)
põlemine: String (1.0)
kuivendamine: String (2.0)
tallamine: String (1.0)
ehitused: String (1.0)
sihid: String (1.0)
elektriliinid: String (1.0)
teed: String (1.0)
taliteed: String (1.0)
turbavarumine: String (1.0)
karjäärid: String (1.0)
saastamine: String (1.0)
prügistamine: String (1.0)
veerežiim: String (3.0)
väärtuslik: String (5.0)
mitmekesisus: String (3.0)
teaduslikult: String (3.0)
didaktika: String (1.0)
kasu: String (4.0)
looduslikkuse_hinnang: String (3.0)
kokkuvõte: String (3.0)

So all works fine.

in reply to:  5 comment:6 by vabastiil, 12 years ago

Replying to rouault:

I tried the following :

PGCLIENTENCODING=WIN1257 ogr2ogr -f postgresql pg:dbname=autotest Margala_1997.TAB

(my database is in UTF-8)

and :

$ ogrinfo pg:dbname=autotest Margala_1997 -al

returns :

INFO: Open of `pg:dbname=autotest'
      using driver `PostgreSQL' successful.

Layer name: margala_1997
Geometry: Unknown (any)
Feature Count: 0
Layer SRS WKT:
PROJCS["unnamed",
    GEOGCS["unnamed",
        DATUM["Euref_98",
            SPHEROID["GRS 80",6378137,298.257222101],
            TOWGS84[0,0,0,-0,-0,-0,0]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433]],
    PROJECTION["Lambert_Conformal_Conic_2SP"],
    PARAMETER["standard_parallel_1",58],
    PARAMETER["standard_parallel_2",59.33333333],
    PARAMETER["latitude_of_origin",57.51755393056],
    PARAMETER["central_meridian",24],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",6375000],
    UNIT["Meter",1.0]]
FID Column = ogc_fid
Geometry Column = wkb_geometry
jrknr: Real (0.0)
nimi: String (49.0)
kaart: String (4.0)
p: String (2.0)
k: String (2.0)
a: String (2.0)
vaatleja: String (77.0)
märgalanr: String (8.0)
kktyyp: String (44.0)
põõsarinne: String (2.0)
pvalem: String (30.0)
pvanus: String (1.0)
pyldilme: String (1.0)
lamapuit: String (1.0)
ktyved: String (1.0)
raiumine: String (2.0)
karjatamine: String (2.0)
niitmine: String (2.0)
põlemine: String (1.0)
kuivendamine: String (2.0)
tallamine: String (1.0)
ehitused: String (1.0)
sihid: String (1.0)
elektriliinid: String (1.0)
teed: String (1.0)
taliteed: String (1.0)
turbavarumine: String (1.0)
karjäärid: String (1.0)
saastamine: String (1.0)
prügistamine: String (1.0)
veerežiim: String (3.0)
väärtuslik: String (5.0)
mitmekesisus: String (3.0)
teaduslikult: String (3.0)
didaktika: String (1.0)
kasu: String (4.0)
looduslikkuse_hinnang: String (3.0)
kokkuvõte: String (3.0)

So all works fine.

Thank you for quick response!

It seems to be ok so far. But now.. When im trying to convert that layer back to mapinfo, i get errors.

ogr2ogr -f MapInfo File Margala_1997_2.TAB "PG:host=localhost.." margala_1997

ogrinfo Margala_1997_2.TAB -al

returns:

INFO: Open of `Margala_1997_2.TAB'
      using driver `MapInfo File' successful.



Layer name: Margala_1997_2
Geometry: Unknown (any)
Feature Count: 0
Extent: (-2.9999999999999996E7, -1.4999999999999998E7) - (2.9999999999999996E7, 1.4999999999999998E7)
Layer SRS WKT:
PROJCS["unnamed",
    GEOGCS["unnamed",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563],
            TOWGS84[0,0,0,-0,-0,-0,0]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433]],
    PROJECTION["Lambert_Conformal_Conic_2SP"],
    PARAMETER["standard_parallel_1",59.33333333333334],
    PARAMETER["standard_parallel_2",58],
    PARAMETER["latitude_of_origin",57.51755393055556],
    PARAMETER["central_meridian",24],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",6375000],
    UNIT["Meter",1.0]]
jrknr: Real (0.0)
nimi: String (49.0)
kaart: String (4.0)
p: String (2.0)
k: String (2.0)
a: String (2.0)
vaatleja: String (77.0)
m?_rgalanr: String (8.0)
kktyyp: String (44.0)
p?_?_sarinne: String (2.0)
pvalem: String (30.0)
pvanus: String (1.0)
pyldilme: String (1.0)
lamapuit: String (1.0)
ktyved: String (1.0)
raiumine: String (2.0)
karjatamine: String (2.0)
niitmine: String (2.0)
p?_lemine: String (1.0)
kuivendamine: String (2.0)
tallamine: String (1.0)
ehitused: String (1.0)
sihid: String (1.0)
elektriliinid: String (1.0)
teed: String (1.0)
taliteed: String (1.0)
turbavarumine: String (1.0)
karj?_?_rid: String (1.0)
saastamine: String (1.0)
pr?_gistamine: String (1.0)
veere?_iim: String (3.0)
v?_?_rtuslik: String (5.0)
mitmekesisus: String (3.0)
teaduslikult: String (3.0)
didaktika: String (1.0)
kasu: String (4.0)
looduslikkuse_hinnang: String (3.0)
kokkuv?_te: String (3.0)

And you may do little checking - after converting that mapinfo file to postgresql database - use some database management tool to check out what happened in there. I see that table names are messed up! But when i check that postgresql layer with ogrinfo, i see that everything is OK. (same result that you inserted)

comment:7 by Even Rouault, 12 years ago

The issue when converting back to mapinfo is a different issue: the mapinfo driver likely has likely no particular support for encoding.

As far as the issue with postgresql database management tool, I'd rather incriminate them. I've checked with the psql command line utility, and column name look fine :

$ psql -d autotest
psql (8.4.13)
Type "help" for help.

autotest=> \d margala_1997;

                                          Table "public.margala_1997"
        Column         |         Type          |                           Modifiers                            
-----------------------+-----------------------+----------------------------------------------------------------
 ogc_fid               | integer               | not null default nextval('margala_1997_ogc_fid_seq'::regclass)
 wkb_geometry          | geometry              | 
 jrknr                 | double precision      | 
 nimi                  | character varying(49) | 
 kaart                 | character varying(4)  | 
 p                     | character varying(2)  | 
 k                     | character varying(2)  | 
 a                     | character varying(2)  | 
 vaatleja              | character varying(77) | 
 märgalanr             | character varying(8)  | 
 kktyyp                | character varying(44) | 
 põõsarinne            | character varying(2)  | 
 pvalem                | character varying(30) | 
 pvanus                | character varying(1)  | 
 pyldilme              | character varying(1)  | 
 lamapuit              | character varying(1)  | 
 ktyved                | character varying(1)  | 
 raiumine              | character varying(2)  | 
 karjatamine           | character varying(2)  | 
 niitmine              | character varying(2)  | 
 põlemine              | character varying(1)  | 
 kuivendamine          | character varying(2)  | 
 tallamine             | character varying(1)  | 
 ehitused              | character varying(1)  | 
 sihid                 | character varying(1)  | 
 elektriliinid         | character varying(1)  | 
 teed                  | character varying(1)  | 
 taliteed              | character varying(1)  | 
 turbavarumine         | character varying(1)  | 
 karjäärid             | character varying(1)  | 
 saastamine            | character varying(1)  | 
 prügistamine          | character varying(1)  | 
 veerežiim             | character varying(3)  | 
 väärtuslik            | character varying(5)  | 
 mitmekesisus          | character varying(3)  | 
 teaduslikult          | character varying(3)  | 
 didaktika             | character varying(1)  | 
 kasu                  | character varying(4)  | 
 looduslikkuse_hinnang | character varying(3)  | 
 kokkuvõte             | character varying(3)  | 


You should perhaps try having your database in UTF-8 instead of WIN1257 and just use PGCLIENTENCODING to do the conversion from MapInfo to PostgreSQL ?

comment:8 by Jukka Rahkonen, 12 years ago

Hi,

I apologize but because this is character encoding trouble perhaps linking to a document written in Finnish is acceptable. I believe that vabastiil who comes probably from Estonia can read the document and try the example number 2 by using GML as an interim format and report back how it went. Actually, document contains full ogr commands and examples should be pretty understandable for all non-Finnish speakers too. http://latuviitta.org/documents/Mapinfo_GDAL_ogr2ogr_ja_UTF-8.pdf

comment:9 by vabastiil, 12 years ago

Ugh, so, it might also mapinfo driver that is causing problems when converting back?

Actually i use gdal inside my java code. I tried to use PGCLIENTENCODING=UTF8 and database UTF8

now all column names are correct, but now there are problems with content of columns that contain characters ö,ä,ü,õ. (there is no problems with column content when im using PGCLIENTENCODING=WIN1257)

ERROR 1: INSERT command for new feature failed. ERROR: invalid byte sequence for encoding "UTF8": 0xf5676576 HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".

Mapinfo file is in WIN1257 encoding, maybe that is causing it? :/

Problem: (There was character 'õ' in the content of one column.)

Well.. one step forward and two steps back :) I must.. keep looking for solution :)

jratike80, thank you for that link, i will dig into it

comment:10 by Jukka Rahkonen, 9 years ago

Vabastiil, I suggest to close this ticket for example as "wontfix" and to create a new one "MapInfo File driver should support different character encodings". This ticket has now such a fuzzy title that nobody will ever pick it for making a fix.

comment:11 by Jukka Rahkonen, 9 years ago

Resolution: wontfix
Status: newclosed

Closing this ticket because user vabastiil did not react.

Note: See TracTickets for help on using tickets.