| 23 | | The CPLString class will now be assumed to be a potentially multi-byte |
| 24 | | encoded string, but with no effort within the CPLString to keep track of |
| 25 | | the encoding it is in. This is left to the higher level code for now. |
| 26 | | |
| 27 | | However, the CPLString is extended with some convenient mechanisms for |
| 28 | | recoding, and for conversion of UTF-8 to/from "wchar_t" (aka UCS-2). |
| 29 | | |
| 30 | | It is stressed that normal initialization of a CPLString from "const char *" |
| 31 | | does not attempt to do any conversions to/from UTF-8. This rule is kept, |
| 32 | | in part to minimize string processing costs for the common case. When encoding |
| 33 | | is believed to be an issue the calling code must keep track. |
| | 23 | The following three C callable functions will be introduced for recoding strings, and for converting between wchar_t (wide character) and char (multi-byte) formats: |
| 37 | | // Convert the internal string to a new encoding. |
| 38 | | char* CPLString::recode( const char *pszSrcEncoding, const char *pszDstEncoding ); |
| 39 | | |
| 40 | | // Set value based on input encoded string with CPLString set to UTF-8. |
| 41 | | // This is equivalent to normal setting, and then a recode() with a destination |
| 42 | | // encoding of "UTF-8" and thus is just for convenience. |
| 43 | | |
| 44 | | CPLString &CPLString::SetAsUTF8( const char *pszInput, const char *pszEncoding = "" ); |
| 45 | | |
| 46 | | // Set value based on input encoded string with CPLString set to UTF-8. |
| 47 | | // This is equivalent to normal setting, and then a recode() with a destination |
| 48 | | // encoding of "UTF-8" and thus is just for convenience. |
| 49 | | |
| 50 | | CPLString &CPLString::SetAsUTF8( const wchar_t *pszInput, const char *pszEncoding = "UCS-2" ); |
| 51 | | |
| 52 | | // Construct UTF-8 string object from array of wchar_t elements. |
| 53 | | CPLString::CPLString( const wchar_t*pszInput, const char *pszEncoding = "UCS-2" ); |
| 54 | | |
| 55 | | // Returns a wchar_t string which becomes the ownership responsibility of |
| 56 | | // the caller (free with CPLFree()). It is assumed the CPLString is UTF-8. |
| 57 | | wchar_t *CPLString::GetAsWChar( const char *pszDstEncoding = "UCS-2" ); |
| | 29 | char *CPLRecodeFromWChar( const wchar_t *pwszSource, |
| | 30 | const char *pszSrcEncoding, |
| | 31 | const char *pszDstEncoding ); |
| | 32 | wchar_t *CPLRecodeToWChar( const char *pszSource, |
| | 33 | const char *pszSrcEncoding, |
| | 34 | const char *pszDstEncoding ); |
| 60 | | I have specifically avoided additional constructors or casting operators to |
| 61 | | avoid any possible overloading ambiguities or complication in maintaining |
| 62 | | extra state in the CPLString. Such services can be added in the future based |
| 63 | | on the above methods if desired. |
| | 37 | In each case the returned string is zero-terminated, as is the input string, and the returned string should be deallocated with CPLFree(). |
| | 38 | In case of error the returned string will be NULL, and the function will issue a CPLError(). The functions will be marked with CPL_DLL and considered part of the public GDAL/OGR API for use by applications as well as for internal use. |
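The calling contract described above (zero-terminated input, a newly allocated zero-terminated result owned by the caller, NULL on error) can be illustrated with a small standalone sketch. This is not the GDAL implementation: the `sketch_*` names are hypothetical, the encodings are fixed to UTF-8 on the char side, only code points up to U+FFFF are handled, overlong sequences are not rejected, and plain `malloc()`/`free()` stand in for the CPL allocator so the example compiles without GDAL. No error is reported beyond the NULL return.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for the char -> wchar_t direction (UTF-8 input).
 * Returns a zero-terminated buffer the caller must free(), or NULL on
 * malformed or unsupported input. */
static wchar_t *sketch_utf8_to_wchar(const char *src)
{
    size_t len = strlen(src);
    /* One output element per input byte is always enough, plus terminator. */
    wchar_t *dst = malloc((len + 1) * sizeof(wchar_t));
    size_t i = 0, o = 0;

    if (dst == NULL)
        return NULL;
    while (i < len) {
        unsigned char c = (unsigned char)src[i];
        unsigned long cp;
        size_t extra, k;

        if (c < 0x80)                { cp = c;        extra = 0; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
        else { free(dst); return NULL; }  /* 4-byte or invalid lead byte */

        for (k = 1; k <= extra; k++) {
            /* The terminating zero is never a continuation byte, so a
             * truncated sequence is caught here without reading past it. */
            unsigned char b = (unsigned char)src[i + k];
            if ((b & 0xC0) != 0x80) { free(dst); return NULL; }
            cp = (cp << 6) | (b & 0x3F);
        }
        dst[o++] = (wchar_t)cp;
        i += extra + 1;
    }
    dst[o] = L'\0';
    return dst;
}

/* Hypothetical stand-in for the wchar_t -> char direction (UTF-8 output).
 * Same ownership and error contract as above. */
static char *sketch_wchar_to_utf8(const wchar_t *src)
{
    size_t n = wcslen(src);
    char *dst = malloc(n * 3 + 1);  /* at most 3 bytes per supported code point */
    size_t i, o = 0;

    if (dst == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        unsigned long cp = (unsigned long)src[i];
        if (cp < 0x80) {
            dst[o++] = (char)cp;
        } else if (cp < 0x800) {
            dst[o++] = (char)(0xC0 | (cp >> 6));
            dst[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            dst[o++] = (char)(0xE0 | (cp >> 12));
            dst[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            free(dst);              /* outside the range this sketch supports */
            return NULL;
        }
    }
    dst[o] = '\0';
    return dst;
}

/* Caller-side pattern: check for NULL, use the result, then release it.
 * Returns 1 if the input survives a round trip unchanged, 0 otherwise. */
static int sketch_roundtrip_ok(const char *utf8)
{
    wchar_t *wide;
    char *back;
    int ok;

    assert(utf8 != NULL);
    wide = sketch_utf8_to_wchar(utf8);
    if (wide == NULL)
        return 0;
    back = sketch_wchar_to_utf8(wide);
    ok = (back != NULL && strcmp(back, utf8) == 0);
    free(back);
    free(wide);
    return ok;
}
```

With the functions proposed in this RFC, the caller-side pattern in `sketch_roundtrip_ok()` would presumably look the same, with the source and destination encoding names passed explicitly and `CPLFree()` used in place of `free()`.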