[WMS] Problems with default XML encoding being UTF-8
Subject: WMS Capabilities Encoding
Date: Mon, 04 Feb 2002 14:24:37 -0500
From: Daniel Morissette <morissette@dmsolutions.ca>
To: Stephen Lime <steve.lime@dnr.state.mn.us>
Steve,
Do you know if there is any reason why "UTF-8" was used as the default
encoding for WMS capabilities (in the XML declaration), or is it just a
coincidence?
<?xml version="1.0" encoding="UTF-8" ?>
We used libxml2 to parse some capabilities and have encountered problems
with a server that advertised encoding="UTF-8" but contained
characters with French accents in ISO-8859-1 encoding. libxml2
complained that these were not valid UTF-8 chars and aborted the
parsing. We're still looking for a way to tell libxml2 to shut up and
ignore the encoding, but that will resolve the issue only for our
specific application. Other WMS clients may run into this misleading
encoding value.
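
In the meantime, a client could check the bytes itself before handing
them to the parser. Here is a minimal sketch in plain C, assuming the
whole document is already in memory. looks_like_utf8() is a hypothetical
helper, not a libxml2 or MapServer function, and it checks only the byte
structure (lead and continuation bytes), not overlong forms or
surrogates:

```c
#include <stddef.h>

/* Hypothetical helper (not part of libxml2 or MapServer).
 * Returns 1 if buf[0..len) is structurally valid UTF-8, 0 otherwise.
 * Only lead/continuation byte patterns are checked; overlong forms
 * and surrogate code points are not rejected. */
int looks_like_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t extra;   /* number of continuation bytes expected */

        if (c < 0x80)
            extra = 0;                  /* plain ASCII */
        else if ((c & 0xE0) == 0xC0)
            extra = 1;                  /* 2-byte sequence */
        else if ((c & 0xF0) == 0xE0)
            extra = 2;                  /* 3-byte sequence */
        else if ((c & 0xF8) == 0xF0)
            extra = 3;                  /* 4-byte sequence */
        else
            return 0;                   /* stray continuation or bad lead */

        if (extra > len - i - 1)
            return 0;                   /* sequence truncated at end */

        for (size_t k = 1; k <= extra; k++) {
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;               /* not a continuation byte */
        }
        i += extra + 1;
    }
    return 1;
}
```

An ISO-8859-1 "é" is the single byte 0xE9, which looks like the start of
a 3-byte UTF-8 sequence. The bytes that follow it are ordinary ASCII, so
the check fails, which is essentially what libxml2 is complaining about.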
I am thinking about changing the default encoding value in mapwms.c to
"ISO-8859-1" for a few reasons:
1- It is essentially what Windoze uses by default for Western languages
(Windows-1252 differs from it only in the 0x80-0x9F range), so users
are more likely to be using that encoding by default when editing
mapfiles. It is also very common on other platforms.
2- It's a pure 8-bit encoding, as opposed to UTF-8, which uses a single
byte for ASCII chars and multi-byte sequences for extended chars (which
is why we get conversion errors in our case). So in theory
labelling any file as ISO-8859-1 (even UTF-8 files) should be less
likely to produce parser errors, although mislabelled UTF-8 text would
then display with the wrong characters.
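
To illustrate point 2: since every ISO-8859-1 byte is exactly one
character, converting such text to UTF-8 is mechanical and cannot fail.
A rough sketch, where latin1_to_utf8() is a hypothetical helper and the
caller is assumed to provide an output buffer of at least 2 * len bytes:

```c
#include <stddef.h>

/* Hypothetical helper: converts ISO-8859-1 bytes to UTF-8.
 * The caller must provide room for at least 2 * len bytes in out.
 * Returns the number of bytes written (no terminating NUL is added). */
size_t latin1_to_utf8(const unsigned char *in, size_t len,
                      unsigned char *out)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = in[i];

        if (c < 0x80) {
            out[n++] = c;                                   /* ASCII: copied as-is */
        } else {
            out[n++] = (unsigned char)(0xC0 | (c >> 6));    /* lead byte */
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));  /* continuation byte */
        }
    }
    return n;
}
```

With this, the ISO-8859-1 byte 0xE9 ("é") becomes the two bytes
0xC3 0xA9. The reverse direction is the one that can fail, because not
every byte sequence is valid UTF-8.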
Do you see any problem with this change? I would make the same change
in mapgml.c.
I was also thinking of adding a "wms_charset" (or "wms_encoding") web
metadata item to allow users to specify an encoding different from
ISO-8859-1.
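
Roughly, the fallback logic I have in mind would look like the sketch
below. format_xml_decl() is a hypothetical name, and the metadata value
is passed in as a plain string; how it actually gets looked up from the
mapfile is left out here:

```c
#include <stdio.h>

/* Hypothetical sketch, not actual mapwms.c code.
 * Writes the XML declaration into buf, using the encoding taken from
 * the metadata value if one was given, or ISO-8859-1 otherwise.
 * Returns the value returned by snprintf(). */
int format_xml_decl(char *buf, size_t size, const char *wms_encoding)
{
    const char *enc = "ISO-8859-1";     /* proposed default */

    if (wms_encoding != NULL && wms_encoding[0] != '\0')
        enc = wms_encoding;             /* user override from metadata */

    return snprintf(buf, size,
                    "<?xml version=\"1.0\" encoding=\"%s\" ?>", enc);
}
```

Passing NULL or an empty string gives the ISO-8859-1 declaration, and
passing "UTF-8" reproduces the current behaviour.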
What do you think?
Daniel