Opened 11 years ago

Closed 11 years ago

#1231 closed defect (fixed)

[patch] Geopublish doesn't properly send accented chars in the title

Reported by: landry
Owned by: geonetwork-devel@…
Priority: major
Milestone: v2.10.0 RC0
Component: General
Version: v2.8.0RC2
Keywords:
Cc:

Description

An issue I've been tracking down for some weeks: if my metadata title contains accented chars, they're not properly sent to GeoServer when publishing.

Espaces Naturels Sensibles - Département du Cantal - 2013

becomes this in the GeoServer layer title:

Espaces Naturels Sensibles - D?partement du Cantal - 2013

GeoServer itself properly supports POST/PUT'ing UTF-8 encoded XML (i.e. curl -X -H"Content-type: text/xml" -d@/tmp/t.xml http://localhost:9080/wxs/rest/workspaces/public/datastores/CG15_2013_ENS/featuretypes/CG15_2013_ENS, with /tmp/t.xml containing UTF-8 chars, sets a correctly accented title/abstract).

So the issue is on the GeoNetwork side. As far as I can tell, when GET'ing geoserver.publisher (action=CREATE) upon publish, the chars are URL-encoded:

GET /geocat/srv/fre/geoserver.publisher?_dc=1362049081712&metadataId=355&metadataUuid=a6a5a793-b9d1-4351-974a-fffd20644614&metadataTitle=Espaces%20Naturels%20Sensibles%20-%20%5CnD%C3%A9partement%20du%20Cantal%20-%202013

But once you're in the publisher backend, in Exec() of web/src/main/java/org/fao/geonet/services/publisher/Do.java, the metadataTitle doesn't contain UTF-8 chars anymore.

Log.debug(MODULE, "metadataTitle="+metadataTitle);

=>

DEBUG [geonetwork.GeoServerPublisher] - metadataTitle=Espaces Naturels Sensibles - D?partement du Cantal - 2013

So at some point the encoding is lost before PUT'ing/POST'ing xml to GeoServer.

Attachments (1)

0003-Create-StringEntityRequest-with-the-proper-contentTy.patch (1.3 KB ) - added by landry 11 years ago.
Create StringEntityRequest with the proper contentType and encoding


Change History (4)

comment:1 by landry, 11 years ago

There is definitely something weird going on: my Tomcat has URIEncoding properly set to UTF-8 in server.xml, yet using URLDecoder.decode on the query string returns ?? for accented letters. A standalone example in a simple Java file works:

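import java.net.URLDecoder;

class t {
    public static void main(String[] args) {
        try {
            // Decode the percent-encoded argument as UTF-8 and print it.
            System.out.println(URLDecoder.decode(args[0], "UTF-8"));
        } catch (Exception e) {
            // Report failures instead of silently swallowing them.
            e.printStackTrace();
        }
    }
}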

java t foo=bar%C3%A9%C3%A0
foo=baréà

After adding some debug code to jeeves' ServiceRequestFactory.java:

Log.debug(Log.REQUEST,"Decoded queryString "+ URLDecoder.decode(req.getQueryString(), "UTF-8"));
Log.debug(Log.REQUEST, "Query string  : "+ req.getQueryString());

[jeeves.request] - Query string  : foo=bar%C3%A9%C3%A0
[jeeves.request] - Decoded queryString foo=bar??
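One way output like this can arise even when decoding succeeded is an output sink (console or log file) whose encoding cannot represent the characters. The sketch below illustrates that possibility with plain JDK classes. It is not a diagnosis of the Jeeves logger, and SinkEncodingDemo and render are hypothetical names introduced only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, not part of GeoNetwork or Jeeves.
class SinkEncodingDemo {

    /** Writes s through a PrintStream using sinkEncoding and returns what the sink received. */
    static String render(String s, String sinkEncoding) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            PrintStream out = new PrintStream(buf, true, sinkEncoding);
            out.print(s);
            out.flush();
            return buf.toString(sinkEncoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + sinkEncoding, e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Same query string as in the log lines above.
        String decoded = URLDecoder.decode("foo=bar%C3%A9%C3%A0", "UTF-8");

        // An ASCII-only sink replaces each unmappable char with '?'.
        System.out.println("US-ASCII sink: " + render(decoded, "US-ASCII"));

        // The string itself still holds both accented chars.
        System.out.println("char count   : " + decoded.length());
    }
}
```

If the sink is the problem, the decoded string is intact and only its rendering is lossy, so checking the string's length or bytes is more reliable than reading the log.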

comment:2 by landry, 11 years ago

Summary: Geopublish doesn't properly send accented chars in the title → [patch] Geopublish doesn't properly send accented chars in the title

Finally nailed it, it was a tricky one. After comparing the String length and converting it to a byte array, I realized the UTF-8 chars were still properly in the string, just not rendered in the debug log. So the problem was further down the call stack, and the StringRequestEntity constructor was the culprit: apparently it defaults to encoding with the default HTTP content charset (ISO-8859-1).
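The length-and-bytes check described above can be sketched roughly as follows. This is a minimal illustration, not the debug code actually used. StringInspector, hex and utf8Hex are hypothetical names, and the two sample strings are hard-coded stand-ins for the real metadataTitle.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of GeoNetwork.
class StringInspector {

    /** Formats bytes as space-separated lowercase hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Returns the UTF-8 bytes of s as hex, independent of any console or log encoding. */
    static String utf8Hex(String s) {
        return hex(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String intact = "D\u00e9partement"; // holds a real U+00E9
        String damaged = "D?partement";     // holds a literal question mark

        for (String s : new String[] { intact, damaged }) {
            int byteCount = s.getBytes(StandardCharsets.UTF_8).length;
            System.out.println(s.length() + " chars, " + byteCount + " bytes: " + utf8Hex(s));
        }
    }
}
```

An intact string shows the two-byte sequence c3 a9 and has more UTF-8 bytes than chars. A string that really lost its accent shows a single 3f and has equal counts.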

The following diff makes sendREST encode all the XML it sends as UTF-8:

--- a/web/src/main/java/org/fao/geonet/services/publisher/GeoServerRest.java
+++ b/web/src/main/java/org/fao/geonet/services/publisher/GeoServerRest.java
@@ -661,7 +658,7 @@ public class GeoServerRest {
                        }
                        if (postData != null) {
                                ((PutMethod) m).setRequestEntity(new StringRequestEntity(
-                                               postData));
+                                               postData, contentType, "UTF-8"));
                        }
                } else if (method.equals(METHOD_DELETE)) {
                        m = new DeleteMethod(url);
@@ -669,7 +666,7 @@ public class GeoServerRest {
                        m = new PostMethod(url);
                        if (postData != null) {
                                ((PostMethod) m).setRequestEntity(new StringRequestEntity(
-                                               postData));
+                                               postData, contentType, "UTF-8"));
                        }
 
                } else {

And the layer definition in GeoServer correctly gets all the accented chars, which are properly displayed in the WMS GetCapabilities document.
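To illustrate why the charset passed to the request entity matters, here is a small sketch of what different charsets do to the same character when it is turned into bytes. It uses only plain JDK calls, not Commons HttpClient or GeoServerRest. CharsetBytes and its methods are hypothetical names, and which charset the old code path actually ended up using is not confirmed here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of GeoNetwork.
class CharsetBytes {

    /** Formats bytes as space-separated lowercase hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Encodes s with the given charset and returns the resulting bytes as hex. */
    static String encode(String s, Charset cs) {
        return hex(s.getBytes(cs));
    }

    /** Decodes raw bytes the way a receiver expecting UTF-8 would. */
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9";

        System.out.println("UTF-8      : " + encode(eAcute, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 : " + encode(eAcute, StandardCharsets.ISO_8859_1));
        System.out.println("US-ASCII   : " + encode(eAcute, StandardCharsets.US_ASCII));

        // A lone ISO-8859-1 byte is not valid UTF-8, so a UTF-8 receiver cannot recover the char.
        String misread = decodeAsUtf8(eAcute.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println("misread code point: " + Integer.toHexString(misread.codePointAt(0)));
    }
}
```

Only the UTF-8 encoding produces bytes that a receiver parsing UTF-8 XML will read back as the original character, which is consistent with passing the charset explicitly in the patch.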

by landry, 11 years ago

Create StringEntityRequest with the proper contentType and encoding

comment:3 by ianwallen, 11 years ago

Resolution: fixed
Status: new → closed

Commit in master ad83740d136a7f7423388b785489e3560a9b0db8
