23 | | The CPLString class will now be assumed to be a potentially multi-byte |
24 | | encoded string, but with no effort within the CPLString to keep track of |
25 | | the encoding it is in. This is left to the higher level code for now. |
26 | | |
27 | | However, the CPLString is extended with some convenient mechanisms for |
28 | | recoding, and for conversion of UTF-8 to/from "wchar_t" (aka UCS-2). |
29 | | |
30 | | It is stressed that normal initialization of a CPLString from "const char *" |
31 | | does not attempt to do any conversions to/from UTF-8. This rule is kept, |
32 | | in part to minimize string processing costs for the common case. When encoding |
33 | | is believed to be an issue the calling code must keep track. |
| 23 | The following three C-callable functions will be introduced for recoding strings, and for converting between wchar_t (wide character) and char (multi-byte) formats: |
37 | | // Convert the internal string to a new encoding. |
38 | | char* CPLString::recode( const char *pszSrcEncoding, const char *pszDstEncoding ); |
39 | | |
40 | | // Set value based on input encoded string with CPLString set to UTF-8. |
41 | | // This is equivalent to normal setting, and then a recode() with a destination |
42 | | // encoding of "UTF-8" and thus is just for convenience. |
43 | | |
44 | | CPLString &CPLString::SetAsUTF8( const char *pszInput, const char *pszEncoding = "" ); |
45 | | |
46 | | // Set value based on input encoded string with CPLString set to UTF-8. |
47 | | // This is equivalent to normal setting, and then a recode() with a destination |
48 | | // encoding of "UTF-8" and thus is just for convenience. |
49 | | |
50 | | CPLString &CPLString::SetAsUTF8( const wchar_t *pszInput, const char *pszEncoding = "UCS-2" ); |
51 | | |
52 | | // Construct UTF-8 string object from array of wchar_t elements. |
53 | | CPLString::CPLString( const wchar_t*pszInput, const char *pszEncoding = "UCS-2" ); |
54 | | |
55 | | // Returns a wchar_t string which becomes the ownership responsibility of |
56 | | // the caller (free with CPLFree()). It is assumed the CPLString is UTF-8. |
57 | | wchar_t *CPLString::GetAsWChar( const char *pszDstEncoding = "UCS-2" ); |
| 29 | char *CPLRecodeFromWChar( const wchar_t *pwszSource, |
| 30 | const char *pszSrcEncoding, |
| 31 | const char *pszDstEncoding ); |
| 32 | wchar_t *CPLRecodeToWChar( const char *pszSource, |
| 33 | const char *pszSrcEncoding, |
| 34 | const char *pszDstEncoding ); |
60 | | I have specifically avoided additional constructors or casting operators |
61 | | to avoid any possible overloading ambiguities or complication in maintaining |
62 | | extra state in the CPLString. Such services can be added in the future based |
63 | | on the above methods if desired. |
| 37 | In each case the returned string is zero terminated, as is the input string, and the returned string should be deallocated with CPLFree(). |
| 38 | In case of error the returned string will be NULL, and the function will issue a CPLError(). The functions will be marked with CPL_DLL and considered part of the public GDAL/OGR API, for use by applications as well as internally. |
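The ownership and error contract described above (newly allocated, zero-terminated result; NULL on failure; caller releases the buffer) can be illustrated with a minimal, self-contained sketch. The function `sketch_latin1_to_utf8()` is a hypothetical stand-in written only for this illustration and is not part of the proposed API: it handles a single fixed conversion (ISO-8859-1 to UTF-8), uses plain `malloc()`/`free()` where the proposed functions would use CPLFree(), and does not report errors through CPLError().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a recode call (not a GDAL function):
 * converts a zero-terminated ISO-8859-1 string to UTF-8.
 * Returns a newly allocated, zero-terminated string that the caller
 * must release with free(), or NULL on failure. */
char *sketch_latin1_to_utf8(const char *src)
{
    if (src == NULL)
        return NULL;

    size_t n = strlen(src);

    /* Each ISO-8859-1 byte expands to at most two UTF-8 bytes. */
    char *dst = malloc(2 * n + 1);
    if (dst == NULL)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)src[i];
        if (c < 0x80) {
            /* ASCII range is copied unchanged. */
            dst[j++] = (char)c;
        } else {
            /* 0x80..0xFF become a two-byte UTF-8 sequence. */
            dst[j++] = (char)(0xC0 | (c >> 6));
            dst[j++] = (char)(0x80 | (c & 0x3F));
        }
    }
    dst[j] = '\0';
    return dst;
}
```

A caller following the same pattern checks the result for NULL before use and frees it exactly once when done, which mirrors how the proposed recoding functions are meant to be paired with CPLFree().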