Ticket #2237 (closed defect: fixed)

Opened 6 years ago

Last modified 6 years ago

msGetEncodedString does not work with wide strings

Reported by: hobu Owned by: sdlime
Priority: normal Milestone: 5.0 release
Component: MapServer C Library Version: svn-trunk (development)
Severity: normal Keywords:
Cc: dmorissette, warmerdam

Description

I have been unable to get msGetEncodedString to work when for drawing UTF-16 data out of ArcSDE. I went digging in the FDO code and found a macro called convert_wide_to_utf8 which seems to do the trick.

 http://trac.osgeo.org/fdo/browser/trunk/Utilities/Common/Inc/FdoCommonStringUtil.h#L210

I'm not sure if the problem is really msGetEncodedString or that I shouldn't be passing it wide strings.

Have we tested msGetEncodedString very much since the changes in r6246 and r6244 were made?

            case SE_NSTRING_TYPE:
                shape->values[i] = (char *)malloc(itemdefs[i].size*sizeof(char)+1);
                memset(shape->values[i], 0, itemdefs[i].size*sizeof(char)+1);
                wide = (SE_WCHAR *)malloc(itemdefs[i].size*2*sizeof(SE_WCHAR)+1);
                memset(wide, 0, itemdefs[i].size*2*sizeof(SE_WCHAR)+1);
                status = SE_stream_get_nstring( sde->connPoolInfo->stream,
                                                (short) (i+1),
                                                wide);
    int convFailed = MS_TRUE;
    if (NULL != wide)
    {
        printf("Wide was not null!!n");
        size_t szStr = wcslen (wide)+1;
        printf("Size was %d\n", szStr);
        wide_string = (char*)alloca (szStr * 6);
        iconv_t rOpen = iconv_open("UTF-8", "UTF-16");
        if ((iconv_t)-1 != rOpen)
        {  
           size_t iSize_Out = szStr * 6;
           size_t iSize_In = sizeof (wchar_t)*szStr;
           printf("Size in: %d  Size out: %d", iSize_In, iSize_Out);
           char* p_utf8 = wide_string;
           const wchar_t* p_w = wide;
           size_t rConv = iconv(rOpen, (char**)&p_w, &iSize_In, &p_utf8,&iSize_Out);
           if ((size_t)-1 != rConv &&  iSize_Out != (szStr * 6))
               convFailed = MS_FALSE;
           iconv_close(rOpen);
        }
    }
    if (convFailed)
        wide_string = NULL;
    if (NULL==wide_string) printf("utf8 string was null!!!");

               strcpy(shape->values[i], wide_string);

Change History

Changed 6 years ago by hobu

  • priority changed from highest to normal
  • summary changed from msGetEncodedString is busted to msGetEncodedString does not work with wide strings

Changed 6 years ago by hobu

I have added msConvertWideStringToUTF8 in r6602. Please review this function carefully, as my skills in these types of things are usually very poor. It was adapted from the FDO macro above.

Changed 6 years ago by dmorissette

  • cc dmorissette added; dmorrisette removed

Changed 6 years ago by hobu

danmo says in irc:

(1) it looks odd to have two very different implementations in the old function and in your new one, and (2) the old function should probably be renamed for consistency

Changed 6 years ago by dmorissette

For the record, the original implementation of msGetEncodedString() was based on code contributed by the guys from Orkney (Japan) in ticket #858.

Changed 6 years ago by sdlime

Nothing in MapServer works with wide chars (wchar_t). Are there any licensing issues with FDO?

Steve

Changed 6 years ago by hobu

ArcSDE does now :)

I took the FDO macro and rewrote quite significantly to fit our style and deal with errors in the way we typically do. I don't believe there is any provenance issue with it, but I'll defer to higher powers on that. If you guys think we should yank it, I'll start over.

Something should be done to bring the two functions together, as Daniel says. I'm willing to do the work, but not willing to say what should be done :)

Changed 6 years ago by tomkralidis

I get the following warnings on FC6 against mapstring.c:

mapstring.c: In function 'msGetEncodedString': mapstring.c:1062: warning: passing argument 2 of 'iconv' from incompatible pointer type mapstring.c: In function 'msConvertWideStringToUTF8': mapstring.c:1119: warning: dereferencing type-punned pointer will break strict-aliasing rules mapstring.c:1119: warning: passing argument 2 of 'iconv' from incompatible pointer type

Changed 6 years ago by hobu

Cleaned up two of them in r6662. I don't know how to fix the type-punned pointer complaint gracefully. Any ideas?

Changed 6 years ago by dmorissette

  • cc warmerdam added

If I'm not mistaken, I don't think we can directly copy LGPL code from FDO into MapServer which is MIT/X11 (unless you get special permission from the copyright owner to do so?).

What are our options then? A simple solution would be to rewrite msConvertWideStringToUTF8 based on the code currently in msGetEncodedString (adapted to deal with wchar_t). msGetEncodedString may not be as clean but doesn't suffer from any licensing issue and it's known to work. That would also solve the issue of keeping both functions consistent.

Adding Frank to CC to get his thoughts on the potential licensing issue.

Changed 6 years ago by warmerdam

Indeed, we cannot directly copy code from MapGuide? or other LGPL sources without special relicensing permission. In this case though I think a mapserver version could be pretty easily written "from scratch" using the mapguide code as an example for the right way to use the various underlying functions and types.

Changed 6 years ago by hobu

I'm no expert, but I think the current incarnation of msConvertWideSTringToUTF8 is now quite different than the ADSK macro on which it was modeled. Theirs was a macro, did stack allocation with alloca, and did hardly any error checking. The biggest functional difference is this attempt converts the entire string at once instead of character at a time like msGetEncodedString does (modeled after the ADSK macro). I don't think there's an LGPL problem here in this case, but as I've said, I'll defer to others' opinions.

Changed 6 years ago by warmerdam

It appears that at this point the code is vaguely inspired by the ADSK macros but not derivative in a copyright sense. So no issue.

Changed 6 years ago by dmorissette

  • status changed from new to closed
  • resolution set to fixed

Let's close this ticket for 5.0.

I have created #2266 to bring both functions closer to each other in 5.2.

Note: See TracTickets for help on using tickets.