Opened 18 years ago

Closed 17 years ago

#1921 closed defect (fixed)

Problem with Curved labels and multibyte characters

Reported by: vulukut@… Owned by: sdlime
Priority: high Milestone: 5.0 release
Component: MapServer C Library Version: 4.10
Severity: normal Keywords:
Cc: benjcarson@…, kristian.thy@…, craign@…

Description (last modified by sdlime)

Curved labels don't take multi-byte and/or special characters into account and
do not show the right characters: for example, "þ,ç,ý,ð" are rendered as
characters like "Ã,§,Ö,±".
Here is the related part of the map file:
   LABEL
        TYPE TRUETYPE
        FONT "fritqat"
        SIZE 7
        MINSIZE 4
        MAXSIZE 256
        POSITION cc
        OFFSET 0 0 
        ANGLE FOLLOW
        BUFFER 10
        MINDISTANCE -1
        MINFEATURESIZE -1
        COLOR 0 0 0
        OUTLINECOLOR 254 254 254
        ANTIALIAS TRUE
        PARTIALS FALSE
        FORCE FALSE
        ENCODING "ISO-8859-9"
      END

the related row of the font file:
fritqat                         Vera.ttf

On a side note, these characters display normally in labels that don't follow
lines.

Attachments (2)

curvedlabel.jpg (14.1 KB ) - added by vulukut@… 18 years ago.
example screenshot of the bug
curved-bug-1921-screenshot.jpg (11.0 KB ) - added by novorado 17 years ago.
1921 screenshot with Cyrillic data


Change History (18)

by vulukut@…, 18 years ago

Attachment: curvedlabel.jpg added

example screenshot of the bug

comment:1 by sdlime, 18 years ago

Cc: dmorissette@… added
Status: new → assigned
Version: 4.8 → 4.10

comment:2 by dmorissette, 18 years ago

Milestone: 5.0 release
Too late for 4.10.0. Setting target milestone to 5.0.

Depending on the amount of changes involved with the fix we may or may not
backport to 4.10.x

comment:3 by benjcarson@…, 18 years ago

Cc: benjcarson@… added

comment:4 by kristian.thy@…, 17 years ago

Cc: kristian.thy@… added

comment:5 by craign@…, 17 years ago

Cc: craign@… added
I've experienced the same issue on MS4W 2.2.2 (MapServer 4.10.0) as well as 
MS4W 2.2.3 (MapServer 4.10.1). I believe it is the same issue as my Unicode/
UTF8 stored characters (VISCII and BIG-HKSCS) are being corrupted whenever I 
set the ANGLE to FOLLOW but it works fine on AUTO.

Is there any chance that this bug could be fixed earlier than version 5.0?

comment:6 by sdlime, 17 years ago

Benj: Do you have any thoughts on where to start with this? I can pursue the 
fix but could use a jump start.

Steve

comment:7 by dmorissette, 17 years ago

I didn't look deep into this, but I suspect the curved label code will need to
be modified to use iconv() to read characters from the string instead of just
assuming that each byte is a separate char:

http://www.gnu.org/software/libiconv/documentation/libiconv/iconv.3.html

A good start would be to look at how iconv() is used in msGetEncodedString()
(mapgd.c). It might be as simple as borrowing some of that code.
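
As a rough illustration of the iconv() approach suggested above, here is a minimal sketch that converts a label from its declared ENCODING to UTF-8 before any per-character processing. The helper name to_utf8 is hypothetical and this is not the code in msGetEncodedString(); the 4x output buffer is an assumed generous bound, not a measured one.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not MapServer code): convert `src` from the charset
 * named by `encoding` into a newly allocated UTF-8 string.
 * Returns NULL on failure; the caller frees the result. */
char *to_utf8(const char *src, const char *encoding)
{
    assert(src != NULL && encoding != NULL);

    iconv_t cd = iconv_open("UTF-8", encoding);
    if (cd == (iconv_t)-1)
        return NULL; /* unknown or unsupported charset */

    size_t in_left = strlen(src);
    /* Assumption: 4 output bytes per input byte is enough for common
     * encodings, since a UTF-8 code point is at most 4 bytes long. */
    size_t out_size = in_left * 4 + 1;
    char *out = malloc(out_size);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in_ptr = (char *)src; /* iconv() takes char **, input is not modified */
    char *out_ptr = out;
    size_t out_left = out_size - 1; /* reserve room for the terminator */

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) {
        free(out);
        return NULL; /* invalid or incomplete input sequence */
    }

    *out_ptr = '\0';
    return out;
}
```

With ENCODING "ISO-8859-9", the single byte 0xE7 (ç) becomes the two UTF-8 bytes 0xC3 0xA7, which is why code that treats each byte as one character can end up drawing two wrong glyphs along the curve.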


comment:8 by sdlime, 17 years ago

Dan: Could we borrow the entire function to convert the string to its encoded
form before processing by the curved label functions? I'm not familiar with
iconv myself. If I understand this, and I probably don't, then I'm kinda
surprised the regular processing, especially placement (e.g. POSITION CC),
works. Without passing the encoded string to the gd/freetype size computation
functions I can't see how we're getting the right sizes for placement. 

Could the character set conversion be done before a label is added to the cache?

Also, anyone got a good test case?

Steve

comment:9 by craign@…, 17 years ago

I'm sure I could supply you with a layer and map file demonstrating the 
problem. Should I e-mail it to you?

comment:10 by sdlime, 17 years ago

Actually unless it's huge you can attach it here. I can only handle 5Mb
attachments at work. Alternatively you can ftp it to
ftp.dnr.state.mn.us/pub/incoming and let me know where you drop it.

Steve

comment:11 by sdlime, 17 years ago

Description: modified (diff)

Folks: I'm wanting to work on this but have no sample data to test with. I need:

  • ttf font
  • line data with multibyte chars
  • sample mapfile

Can anyone provide that?

Steve

in reply to:  11 comment:12 by novorado, 17 years ago

Replying to sdlime:

Hi Steve,

How can I contact you by email to provide relevant data with Cyrillic characters? I believe the fix should be easy: the encoding is being lost when the FOLLOW option runs the algorithm that makes the label non-linear (curved).

Best regards, Dmitry


by novorado, 17 years ago

1921 screenshot with Cyrillic data

comment:13 by sdlime, 17 years ago

Getting there. With older code if you draw text with the extended character sets you'll notice that placement is off because the iconv isn't being run before placement is computed. I committed a fix for that piece so now POSITION is correct. Paths don't work yet though. Perhaps the storage model we're using isn't sufficient...

Steve

comment:14 by dmorissette, 17 years ago

An update on this bug: Steve committed some changes to deal with multibyte chars in the ANGLE FOLLOW code. Those changes worked but relied on setlocale. In r6242 we then added two new functions, msGetNextUTF8Char() and msGetNumUTF8Chars(), in mapstring.c. They are independent of setlocale and are used in mapprimitive.c and mapgd.c.

This is mostly fixed in SVN trunk now, just need more testing.
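
For readers unfamiliar with the idea, the following is a minimal sketch of locale-independent UTF-8 stepping of the kind described in this comment. The names utf8_next_char and utf8_num_chars are hypothetical, and the code is an assumption about the general technique, not the actual msGetNextUTF8Char() / msGetNumUTF8Chars() from r6242.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not the MapServer implementation.
 * Returns the byte length (1-4) of the UTF-8 sequence starting at s,
 * judged from its lead byte, or 0 at the end of the string. An invalid
 * lead byte counts as one byte so the caller always makes progress. */
static size_t utf8_seq_len(const char *s)
{
    unsigned char c = (unsigned char)*s;
    if (c == 0) return 0;
    if (c < 0x80) return 1;           /* ASCII */
    if ((c & 0xE0) == 0xC0) return 2; /* 110xxxxx */
    if ((c & 0xF0) == 0xE0) return 3; /* 1110xxxx */
    if ((c & 0xF8) == 0xF0) return 4; /* 11110xxx */
    return 1;                         /* stray continuation or invalid byte */
}

/* Copy the next character from *in into buf (at least 5 bytes, always
 * NUL-terminated), advance *in past it, and return the number of bytes
 * copied (0 at the end of the string). */
size_t utf8_next_char(const char **in, char *buf)
{
    assert(in != NULL && *in != NULL && buf != NULL);

    size_t len = utf8_seq_len(*in);
    size_t i;
    /* Stop early if the string is truncated in the middle of a sequence. */
    for (i = 0; i < len && (*in)[i] != '\0'; i++)
        buf[i] = (*in)[i];
    buf[i] = '\0';
    *in += i;
    return i;
}

/* Count characters (not bytes) in a UTF-8 string. */
size_t utf8_num_chars(const char *s)
{
    size_t n = 0;
    char buf[5];
    while (utf8_next_char(&s, buf) > 0)
        n++;
    return n;
}
```

A curved-label loop built on this would place one glyph per returned character, so a string such as "şç" (4 bytes in UTF-8) yields 2 placements rather than 4.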

comment:15 by dmorissette, 17 years ago

Steve, have you had a chance to test with my modifs (SVN trunk)? Can this bug be closed?

comment:16 by sdlime, 17 years ago

Resolution: fixed
Status: assigned → closed

Works with my simple test case. Would be nice to get confirmation from other folks. Closing for now...

Steve
