Opened 2 years ago

Closed 18 months ago

#2845 closed defect (fixed)

FAILED Server Tests - Int32ToLocaleSpecificString (Util.cpp) versus MgIsLegalUTF8 (ConvertUTF.c)

Reported by: pcardinal Owned by: jng
Priority: medium Milestone: 4.0
Component: Mapping Service Version:
Severity: minor Keywords: Unbreakable blank space utf-8 server test MgIsLegalUTF8
Cc: pcardinal External ID:

Description

FAILED Server test MappingService

GetMultiPlot


d:\MgDev_4.0\Server\src\UnitTesting\TestMappingService.cpp(338)
............................................................................
d:\MgDev_4.0\Server\src\UnitTesting\TestMappingService.cpp(419): FAILED:
explicitly with message:

Invalid argument(s):

[1] = "const string&"

The string is invalid and cannot be converted.

GetPlotUsingOverriddenCenterAndScale


d:\MgDev_4.0\Server\src\UnitTesting\TestMappingService.cpp(478)
.............................................................................
d:\MgDev_4.0\Server\src\UnitTesting\TestMappingService.cpp(520): FAILED:
explicitly with message:

Invalid argument(s):

[1] = "const string&"

The string is invalid and cannot be converted.

GetPlotUsingExtents


d:\MgDev_4.0\Server\src\UnitTesting\TestMappingService.cpp(528)
...........................................................................
d:\MgDev_4.0\Server\src\UnitTesting\TestMappingService.cpp(570): FAILED:
explicitly with message:

Invalid argument(s):

[1] = "const string&"

The string is invalid and cannot be converted.

Under Windows, the digit grouping symbol in many localisations (and in the international system) is the non-breaking space (Unicode code point U+00A0, decimal 160, stored as the single byte 0xA0 in Windows code pages such as Windows-1252). Example: 200 000 000

With French localisation, the function Int32ToLocaleSpecificString generates a string that uses the non-breaking space as the digit grouping symbol. The integer 200000000 becomes the string "200 000 000":
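The grouping behaviour can be reproduced without a French locale installed. The following is a minimal sketch, not the MapGuide implementation: `NbspGrouping` and `formatWithNbspGrouping` are hypothetical names, and the facet simply assumes groups of three digits separated by the byte 0xA0.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet that mimics the grouping described in this ticket:
// groups of three digits separated by the single byte 0xA0.
struct NbspGrouping : std::numpunct<char> {
    char do_thousands_sep() const override { return '\xA0'; }
    std::string do_grouping() const override { return "\3"; }
};

// Illustrative stand-in for Int32ToLocaleSpecificString (assumption: the
// real function relies on the process locale instead of a custom facet).
std::string formatWithNbspGrouping(int value) {
    std::ostringstream out;
    // The std::locale object takes ownership of the facet and deletes it.
    out.imbue(std::locale(std::locale::classic(), new NbspGrouping));
    out << value;
    return out.str();
}
```

Calling `formatWithNbspGrouping(200000000)` yields an 11-byte string whose bytes at index 3 and 7 are 0xA0, which matches the layout reported in this ticket.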

[0] 50 '2' char
[1] 48 '0' char
[2] 48 '0' char
[3] -96 ' ' char
[4] 48 '0' char
[5] 48 '0' char
[6] 48 '0' char
[7] -96 ' ' char
[8] 48 '0' char
[9] 48 '0' char
[10] 48 '0' char
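The value -96 in the dump is the byte 0xA0 viewed as a signed `char`. A small sketch of how such a string can be inspected byte by byte (the helper name `toHexBytes` is an assumption made for illustration):

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Returns the bytes of `s` as space-separated two-digit hex values.
std::string toHexBytes(const std::string& s) {
    std::string out;
    char buf[4];
    for (char c : s) {
        // Go through unsigned char so that -96 is rendered as A0 rather
        // than a sign-extended value.
        unsigned value = static_cast<unsigned char>(c);
        std::snprintf(buf, sizeof buf, "%02X", value);
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

For the three bytes '2', 0xA0, '0' this helper returns "32 A0 30".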

However, the function MgIsLegalUTF8 returns false for this string because the non-breaking space character is stored as the single byte 0xA0, which is outside the range of the one-byte UTF-8 encoding.

https://www.fileformat.info/info/unicode/char/00a0/index.htm

1-byte characters in UTF-8 are ASCII Characters 0-127.
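To illustrate the point about one-byte characters, here is a sketch of a generic UTF-8 encoder (the name `encodeUtf8` is hypothetical and this is not code from ConvertUTF.c). It assumes the input is a valid Unicode scalar value and does not check for surrogates.

```cpp
#include <cassert>
#include <string>

// Encodes one code point as UTF-8. Sketch only: assumes `cp` is a valid
// scalar value; surrogates are not rejected.
std::string encodeUtf8(char32_t cp) {
    assert(cp <= 0x10FFFF);
    std::string out;
    if (cp < 0x80) {                       // 1 byte: ASCII 0-127
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: lead byte 0xC2-0xDF
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

With this encoder, U+00A0 becomes the two bytes 0xC2 0xA0, so a lone 0xA0 byte is a code-page encoding of the character rather than its UTF-8 form.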

Relevant line of code in MgIsLegalUTF8:

case 1: if (*source >= 0x80 && *source < 0xC2) return false;

(case 1 is for the one-byte UTF-8 encoding)

Attachments (1)

Capture0.JPG (27.0 KB) - added by pcardinal 2 years ago.
.\mgserver.exe test - results


Change History (4)

comment:1 by jng, 2 years ago

If we add an explicit test for the Unicode NBSP character in MgIsLegalUTF8, does the test pass?

by pcardinal, 2 years ago

Attachment: Capture0.JPG added

.\mgserver.exe test - results

comment:2 by pcardinal, 2 years ago

In function MgIsLegalUTF8, with

case 1: if ((*source >= 0x80 && *source < 0xC2) && (*source != 0xA0)) return false;

instead of

case 1: if (*source >= 0x80 && *source < 0xC2) return false;

the test does pass.
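The difference between the two conditions can be seen in isolation. This sketch copies only the two `case 1` expressions quoted above into standalone predicates; the function names are hypothetical and the rest of MgIsLegalUTF8 is not reproduced.

```cpp
#include <cassert>

// Mirrors the original `case 1` condition: true means the byte is rejected.
constexpr bool rejectedByOriginalCheck(unsigned char b) {
    return b >= 0x80 && b < 0xC2;
}

// Mirrors the patched `case 1` condition: same range, except 0xA0.
constexpr bool rejectedByPatchedCheck(unsigned char b) {
    return (b >= 0x80 && b < 0xC2) && (b != 0xA0);
}

// The single byte 0xA0 is the only input on which the two checks differ.
static_assert(rejectedByOriginalCheck(0xA0), "original rejects 0xA0");
static_assert(!rejectedByPatchedCheck(0xA0), "patched accepts 0xA0");
```

The exception is narrow: every other byte in the range 0x80 to 0xC1 is still rejected, and ASCII bytes are accepted by both versions.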

comment:3 by jng, 18 months ago

Owner: set to jng
Resolution: fixed
Status: new → closed

In 10005:

Make MgIsLegalUTF8 work under French Localization. Patch by Pierre Cardinal.

Fixes #2845
