Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#1965 closed defect (fixed)

turkish locale breaks postgis

Reported by: plq Owned by: pramsey
Priority: medium Milestone: PostGIS 2.0.2
Component: postgis Version: 2.0.x
Keywords: Cc: plq, mcayland

Description

plq@iskembe ~  $ createdb -U postgres -T template0 -E utf8 --lc-collate=tr_TR.utf8  --lc-ctype=tr_TR.utf8 some_db
echo create extension postgix | psql -U postgres some_db
plq@iskembe ~  $ echo create extension postgis | psql -U postgres some_db
CREATE EXTENSION
plq@iskembe ~  $ echo "create table some_table(geom geometry(point, 4326))" | psql -U postgres some_db
ERROR:  Invalid geometry type modifier: point
LINE 1: create table some_table(geom geometry(point, 4326))

If you don't have the Turkish locale installed, generate it first:

$ echo tr_TR.UTF-8 UTF-8 >> /etc/locale.gen
$ locale-gen

Change History (13)

comment:1 by plq, 12 years ago

plq@iskembe ~  $ psql -U postgres some_db
psql (9.1.5)
Type "help" for help.

some_db=# select postgis_Full_version();
NOTICE:  Function postgis_topology_scripts_installed() not found. Is topology support enabled and topology.sql installed?
                                                                       postgis_full_version
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 POSTGIS="2.0.1 r9979" GEOS="3.3.3-CAPI-1.7.4" PROJ="Rel. 4.7.1, 23 September 2009" GDAL="GDAL 1.9.1, released 2012/05/15" LIBXML="2.8.0" LIBJSON="UNKNOWN" RASTER
(1 row)

some_db=# select version();
                                                          version                                                          
---------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.1.5 on x86_64-pc-linux-gnu, compiled by x86_64-pc-linux-gnu-gcc (Gentoo 4.6.3 p1.3, pie-0.5.2) 4.6.3, 64-bit
(1 row)

some_db=#

comment:2 by plq, 12 years ago

Cc: plq added

anything else you need, let me know.

comment:3 by plq, 12 years ago

I'm not sure whether they'd be relevant, but here are the cluster settings:

The database cluster will be initialized with locales
  COLLATE:  C
  CTYPE:    en_US.utf8
  MESSAGES: C
  MONETARY: C
  NUMERIC:  C
  TIME:     C
The default database encoding has accordingly been set to UTF8.
The default text search configuration will be set to "english".

fixing permissions on existing directory /home/postgresql/9.1/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 24MB
creating configuration files ... ok
creating template1 database in /home/postgresql/9.1/data/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating collations ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok

comment:4 by pramsey, 12 years ago

Hm, works with fr_CA, so it's not a problem with alternate locales in general…

comment:5 by pramsey, 12 years ago

Still works for me with this call:

createdb -U pramsey -T template0 -E utf8 --lc-collate=tr_TR  --lc-ctype=tr_TR some_db

tr_TR.utf8 was considered an unknown locale on my system

comment:6 by plq, 12 years ago

I also have it working with fr_CA.utf8

Is your system Ubuntu? tr_TR.UTF-8 might be the correct locale string there.

comment:7 by pramsey, 12 years ago

Yes, that locale string (tr_TR.UTF-8) worked (OSX), but the table create also still works, so I'm unable so far to duplicate this.

comment:8 by plq, 12 years ago

So far this looks Linux-specific (or Gentoo-specific), then. I can reproduce this on other machines. Unless you have any other pointers for me, I'll fetch the code and fire up gdb.

comment:9 by pramsey, 12 years ago

See what's going on in gserialized_typmod_in.

comment:10 by pramsey, 12 years ago

Cc: mcayland added

OK, replicated on CentOS. On Linux, under the tr_TR locale, the value of toupper('i') seems to be 'i' instead of 'I'! So 'point' becomes 'POiNT'! Also curious: when I try to work around the problem with

create table some_table(geom geometry(POINT, 4326))

the modifier "POINT" still arrives at the geometry_type_from_string function as "point"! So somewhere in PgSQL the damn thing is already being forced to *lower*!

The behavior of toupper seems like a Linux quirk, unless the upper-case value of 'i' in the Turkish locale really is 'i' (while all the other Latin characters *are* being sent to their corresponding uppers?)
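
The observation above can be probed outside PostgreSQL with a few lines of C. This is only a sketch, not code from PostGIS: `upper_in_locale` is a hypothetical helper name, and it assumes the locale being tested (for example tr_TR.UTF-8) has been generated on the machine; if it has not, the helper returns -1. One plausible reading of the result seen here is that Turkish upper-cases 'i' to the dotted capital 'İ', which does not fit in a single byte under UTF-8, so a byte-oriented toupper has nothing to return but the input.

```c
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of PostGIS): return toupper(c) as evaluated
 * under the named LC_CTYPE locale, or -1 if that locale cannot be selected.
 * The previous LC_CTYPE setting is restored before returning. */
int upper_in_locale(const char *locale_name, int c)
{
    const char *current = setlocale(LC_CTYPE, NULL);
    char *saved = NULL;
    int result = -1;

    /* Copy the current locale name: later setlocale calls may overwrite it. */
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }

    if (setlocale(LC_CTYPE, locale_name) != NULL) {
        result = toupper((unsigned char) c);
        if (saved != NULL)
            setlocale(LC_CTYPE, saved);
    }

    free(saved);
    return result;
}
```

Comparing `upper_in_locale("C", 'i')` with `upper_in_locale("tr_TR.UTF-8", 'i')` on an affected Linux box should show whether the C library is the source of the 'POiNT' spelling.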

comment:11 by pramsey, 12 years ago

Resolution: fixed
Status: new → closed

Fixed in 2.0 at r10679, and in trunk at r10678. And I got to learn all about the "turkish I". http://www.i18nguy.com/unicode/turkish-i18n.html
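
The committed change itself is not shown in this ticket. As a rough sketch of the general technique (upper-casing type names without consulting the process locale), something like the following would behave the same under any LC_CTYPE. The names `ascii_toupper` and `ascii_upper_copy` are hypothetical and chosen for illustration; they are not the functions added in r10678/r10679.

```c
#include <stddef.h>

/* Hypothetical helper: upper-case an ASCII letter without consulting the
 * process locale. Assumes an ASCII-compatible character set, where 'a'..'z'
 * are contiguous. Anything outside 'a'..'z' is returned unchanged. */
int ascii_toupper(int c)
{
    return (c >= 'a' && c <= 'z') ? c - ('a' - 'A') : c;
}

/* Copy src into dst (at most dst_size - 1 bytes), upper-casing ASCII
 * letters on the way. Always NUL-terminates when dst_size > 0.
 * Returns dst. */
char *ascii_upper_copy(char *dst, size_t dst_size, const char *src)
{
    size_t i = 0;

    if (dst_size == 0)
        return dst;

    for (; src[i] != '\0' && i + 1 < dst_size; i++)
        dst[i] = (char) ascii_toupper((unsigned char) src[i]);

    dst[i] = '\0';
    return dst;
}
```

With this sketch, `ascii_upper_copy(buf, sizeof buf, "point")` yields "POINT" whether the database was created with tr_TR or any other locale, because the mapping never goes through toupper.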

comment:12 by plq, 12 years ago

Hey, thanks for the fix. the function name should probably be dump_toupper and not dump_toupper though.

comment:13 by plq, 12 years ago (in reply to comment:12)

Replying to plq:

Hey, thanks for the fix. the function name should probably be dump_toupper and not dump_toupper though.

haha. I mean it should be dumb_toupper.
