
Bolsena 2011 - draft agenda

This page was written before the event. See Bolsena 2011 for the final agenda.

  1. Community
    1. Committee
    2. Copyright notice in code and Licensing
    3. Contributors
    4. Community tools
    5. Community mini codesprint
  2. Documentation
  3. GeoNetwork
    1. Discussion on architecture updates
    2. New features & fixes
    3. Wishlist
    4. R&D
    5. Presentations

The Bolsena OSGeo Hacking Event is being held for the 4th time. This page lists some ideas that can be discussed within the GeoNetwork faction.

Please add any ideas you have here!

View from the monastery

Community

Committee

  • Update AB?
  • Propose PSC update?
    • Rather than remove PSC members who contributed in the past but no longer do, perhaps we could create a project reference group so that they remain formally associated with the project? (We don't want to lose contact with them and we want them to have a role.)
  • Propose advisory board update?
    • The organisations currently represented on the advisory board could be expanded or changed to include others. Members of the PSC could identify and submit names of representatives from these organisations for consideration by the PSC.
    • This may be a way of getting organisations that want changes (e.g. those that submitted the long list to Bolsena last year) to get involved and commit $$ to make their list of requirements happen.
  • See & update:

Copyright notice in code and Licensing

(Added by Simon 19/06/2011)

Currently the code carries a variety of copyright statements: most are (C) UN FAO, and some new code has (C) GeoNetwork. The problem is that only the copyright holder has the power to act against violations of the copyright or the GPL licence, so the copyright needs to be assigned to a legal entity that has the rights, the resources and the interest to do so. Presumably the FAO is no longer interested in doing this, and the non-existent GeoNetwork legal entity definitely cannot :-). Because the code is GPL licensed, we could assign copyright to the Free Software Foundation as per http://www.gnu.org/licenses/why-assign.html.

To do this we need:

  • a list of past and present contributors
  • assignment of copyright from all contributors and copyright holders
  • all future contributions to assign copyright to the FSF? (add to the conditions that new committers have to accept? committer conditions)

Motivation:

  • recently at least one commercial operation has begun distributing a modified version of GeoNetwork - are they obeying the GPL licence conditions that the source code be distributed with their modified version of GeoNetwork?

GPL Licence problems:

Others are cloning the GeoNetwork code, making improvements and have not made (and seem unwilling to make) their improvements available to the community in the form of source code. It's unfortunate that neither GPLv2 nor the GNU Affero GPLv3 can guarantee that these modifications are returned to the community. GNU Affero is stricter as it has a few more conditions on how source code has to be made available, but it still allows the person/organisation making the modifications to make them available to the public for a charge (i.e. $$).

Any other comments? (I'm definitely not an expert on any of this and I can see quite a few people probably thinking this is all too hard :-) - if so then perhaps we should abandon the GNU GPL licence (requires agreement from contributors) and use some other licence, e.g. Apache?)

Contributors

(Added by Simon 19/06/2011)

I'd like to see a list of contributors maintained on http://geonetwork-opensource.org/contributors and I'd be prepared to help out in assembling and maintaining such a list. The list of contributors and their organisations could also be shown by the installer splash screen? The contributors page could have information similar to that provided on http://geoserver.org/display/GEOS/Contributors.

Community tools

  • Community tools (sourceforge.net, ML, ...) should be accessible/manageable by at least 2 persons from the community (check that)

Community mini codesprint

  • 2011 was the first time we've had a mini-codesprint, which is an event taking place some months before the traditional Bolsena meeting. In spite of the rather sparse attendance by GeoNetwork developers, some great results were achieved, such as good improvements to the Lucene analysis, impressive performance gains in search results display, and general refactoring of dusty code, amongst other things. Thanks to François Prunayre for conceiving and hosting this event. Let's do it every year, and if you weren't there in 2011, make sure you're there in 2012.

Documentation

  • Update GeoNetwork logo? There is some discussion that GeoNetwork does not really have a logo: a strong, graphical image that could act as a powerful symbol for GeoNetwork. The current graphics on the default banner can hardly be considered a 'logo', and there is no need at all to have the project's name in the logo.
  • Name change: isn't it better to call GeoNetwork "GeoNetwork", rather than "GeoNetwork open source"? The "open source" suffix adds nothing meaningful and is rather non-standard for a name, as it's not capitalized. If size does matter, it's not in projects' names!
  • Publish French documentation on the website
  • Include a section in the developers manual on schemaPlugins - use the skeletal slides from http://geonetwork.globaldial.com/testdownloads/GeoNetworkSchemaPlugins.pdf as an outline. This needs to be done for 2.7.

GeoNetwork

  • Ongoing projects?

(Added by Simon 19/06/2011)

  • An example ongoing project is an oceanographic information system involving GeoNetwork and the Kepler Science Workflow project (https://kepler-project.org). We're using Francois' new guiwidgets as the human interface and CSW+ as the programmatic interface between Kepler and GeoNetwork. The development work is not particularly interesting to GeoNetwork developers as it involves adding a CSW client to the Kepler workflow toolset (which will be added to the Kepler module repository once I get round to finishing off the proposal). However, it does demonstrate a way in which the metadata in a GeoNetwork catalog can be searched using CSW to build data source actors in Kepler (e.g. data from a database table). It also raises the issue of how a metadata author might store a subset query within the online resource that links to the data, or whether such a query should be constructed from the service metadata record describing the service that delivers the data. The next step is to allow Kepler users to create a metadata record that will describe and contain a linkage to their science workflow and upload that to GeoNetwork. The presentation can be found at http://geonetwork.globaldial.com/testdownloads/NOISPresentationJune2011.pdf.

  • Another ongoing project is the redevelopment of the Australian Spatial Data Directory (ASDD) (see http://www.mymaps.gov.au for a test). This is also using Francois' new guiwidgets sandbox. The idea is to redevelop the old ASDD into a new GeoNetwork-based tool. The guiwidgets approach was chosen because it effectively reduces the number of technologies that a user interface designer needs to know to Javascript and a little XML. The metadata records (approx 21,000 at present) are harvested from organisations around Australia using the Z3950 harvester. The project uses 2.7 for schema plugin support and guiwidgets.
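
As a rough illustration of the CSW search mentioned in the Kepler project above, the sketch below builds a CSW 2.0.2 GetRecords request as a plain key-value URL. It is not the Kepler client itself: the function name, the endpoint URL in the usage note and the search term are all assumptions made for the example, and only the request parameters come from the CSW 2.0.2 key-value encoding.

```python
from urllib.parse import urlencode


def build_getrecords_url(endpoint, text, max_records=10):
    """Build a CSW 2.0.2 GetRecords URL for a free-text search.

    `endpoint` is a hypothetical catalogue URL supplied by the caller;
    `text` is matched against the AnyText queryable with a CQL filter.
    """
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "brief",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        # '%' is the CQL wildcard; urlencode escapes it for the URL.
        "constraint": f"AnyText like '%{text}%'",
        "maxRecords": str(max_records),
    }
    return endpoint + "?" + urlencode(params)
```

For example, `build_getrecords_url("http://example.org/geonetwork/srv/en/csw", "ocean")` (a made-up endpoint) yields a URL that a workflow tool could fetch and parse to list candidate data sources.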

Discussion on architecture updates

Components

Agreement on new or existing dependencies

  • Security layer require :
  • Javascript: Heikki thinks that jQuery is in many ways superior to Ext.js, particularly in its programming model and the plethora of readily available plugins for jQuery. Yet we need Ext because of GeoExt. Can we not isolate use of Ext to the mapviewer for as long as there is no jQuery equivalent to GeoExt, and use jQuery everywhere else?

New features & fixes

  • Security layer & GAAP
  • JS Widgets :
    • file upload: where we have file upload functions, it would be nice to provide drag & drop instead of the old, old filechooser -- something like Gmail has. Of course with graceful degradation for users of certain browsers. A very good one is jQuery File Upload
  • i18n: it's a little hell to add a new translation of GeoNetwork. There is no single place where you map translated strings to i18n keys -- no, all sorts of files including js, sql and xslt files need to be updated. There's no documentation on all the places you need to touch. We shouldn't write that documentation now either: instead we should make it much easier to add a new translation for GeoNetwork.

  • Schema plugins - currently support plugging in metadata schema/profile composed of xslts, XSDs etc (ie. whatever is in a GeoNetwork schema directory) - also needs to support adding jeeves services eg. services to search and retrieve taxonomic metadata from the Australian Plant and Fauna databases for Marine Community Profile or to retrieve keywords from an online Oceanographic thesaurus (neither of these databases use standard interfaces but the searches are important to Marine Community Profile members) - suggestions/thoughts on how to add services to jeeves dynamically?
  • Tracking changes to metadata - need to know who, what and when changes are made to metadata. How to do this in GeoNetwork? Perhaps a database-driven approach which uses a metadata history table that is populated by a trigger on the metadata table? Some comparison of metadata records from different versions would still be necessary to determine changes, because in its simplest form the history table would hold a complete copy of the metadata record.
  • A rewrite of the harvester module/component is required at some stage (see proposal http://trac.osgeo.org/geonetwork/wiki/refactoring_harvesters). However, for 2.8 this is a lot of work to get done between now and July, when the RC process should start (if we are to release 2.8 in August?), especially if we want to keep the documentation up to date. Any word on whether this is still proposed for 2.7?
  • ...
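
To make the history table idea above a little more concrete, here is a minimal sketch using an in-memory SQLite database. It is only an illustration under stated assumptions: the table names, column names and function names are hypothetical and do not reflect the actual GeoNetwork schema, and SQLite stands in for whichever database a deployment uses. Triggers copy each inserted or updated record into the history table, and a line-based comparison shows what changed between two versions.

```python
import difflib
import sqlite3

# Hypothetical schema: 'metadata' holds the current record, and
# 'metadata_history' receives a full copy on every insert or update.
SCHEMA = """
CREATE TABLE metadata (
    id INTEGER PRIMARY KEY,
    data TEXT NOT NULL,
    changed_by TEXT NOT NULL
);
CREATE TABLE metadata_history (
    history_id INTEGER PRIMARY KEY AUTOINCREMENT,
    metadata_id INTEGER NOT NULL,
    data TEXT NOT NULL,
    changed_by TEXT NOT NULL,
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER metadata_insert_history AFTER INSERT ON metadata
BEGIN
    INSERT INTO metadata_history (metadata_id, data, changed_by)
    VALUES (NEW.id, NEW.data, NEW.changed_by);
END;
CREATE TRIGGER metadata_update_history AFTER UPDATE ON metadata
BEGIN
    INSERT INTO metadata_history (metadata_id, data, changed_by)
    VALUES (NEW.id, NEW.data, NEW.changed_by);
END;
"""


def open_catalog():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def versions(conn, metadata_id):
    """Return (changed_by, changed_at, data) rows, oldest first."""
    return conn.execute(
        "SELECT changed_by, changed_at, data FROM metadata_history "
        "WHERE metadata_id = ? ORDER BY history_id",
        (metadata_id,),
    ).fetchall()


def changed_lines(old, new):
    """Return only the removed ('- ') and added ('+ ') lines."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return [line for line in diff if line[:2] in ("- ", "+ ")]
```

The history rows answer "who" and "when"; `changed_lines` applied to two consecutive versions answers "what". A real implementation would probably compare the XML structure rather than text lines.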

Wishlist

  • Maybe some of these will get funded....which is perhaps the biggest wish of them all!

Wish ID | Wish | Explanation | Anyone else interested?
1 | faceted search | Search based on stats for terms indexed in Lucene. An example of faceted search is http://well-formed-data.net/experiments/elastic_lists/ and support for faceted search is most likely the aim of the proposal by Francois http://trac.osgeo.org/geonetwork/wiki/NarrowYourSearchWidget - see also SOLR above |
2 | search customisation | Saving and replaying searches |
3 | fragments/subtemplate editing | Already the subject of proposal http://trac.osgeo.org/geonetwork/wiki/proposals/SubTemplates |
4 | tracking changes to metadata | See initial ideas for a simple history table above |
5 | investigate ESRI GeoPortal and GeoNetwork interoperability | This would be useful not only for GeoNetwork developers in terms of looking at how someone else has implemented a catalog, but also to do a feature comparison |
6 | Better documentation of GeoNetwork API | Would be very useful for those wishing to get started on making enhancements and modifications |
7 | Support for schemaPlugins | Added in proposal http://trac.osgeo.org/geonetwork/wiki/pluginprofiles but may need additional work to support services that relate to a schema/profile - see above |
8 | Support for time zones in temporal search terms | Calendar widget needs to include timezone specification |
9 | Better support for selecting default editor tab | Although this appears to be supported in config-gui.xml, strange things happen if you switch off the simple mode editor for example (this is actually a bug I think) |
10 | Harvested records are not obvious in search results | Harvested status should appear in search results as a visual indicator |
11 | Massive operations on thumbnails | Assign a thumbnail to, or delete a thumbnail from, one or more records |
12 | Editor tabs to be redesigned to suggest entry workflow | Should suggest a workflow through the metadata rather than an arbitrary grouping of elements based on the standard - an example of what is desired can be seen in the ANZMETLite editor (written in VB.Net) - download from http://geonetwork.globaldial.com/testdownloads/ANZMETLite_MCP_ALA.zip (warning: will only work under Windows) |
13 | More thought on how to integrate GeoNetwork with existing business systems in an organisation | Particularly important if GeoNetwork is seen as a tool for managing an organisation's metadata as opposed to simply publishing an organisation's metadata to the rest of the world |
14 | Harvest History mechanism for existing harvester manager | See proposal at http://trac.osgeo.org/geonetwork/wiki/HarvestingHistory which has been implemented in the ANZMEST sandbox - trunk support depends on whether the proposal to rewrite the harvesters and harvester interface is to be completed for 2.7 or not |
15 | Harvest Z3950 configuration from ISO19119 service metadata records | Rather than include a fixed configuration file describing Z3950 services, this configuration should be built by harvesting ISO19119 records and creating the config file - implemented in ANZMEST sandbox |
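
The faceted search wish (number 1 above) boils down to counting how often each indexed term occurs in the current result set and letting the user narrow the results by picking a term. The sketch below shows that idea on plain Python dictionaries. It is an assumption-laden stand-in, not Lucene or the NarrowYourSearchWidget proposal: the record layout, the `keyword` field and both function names are invented for the example.

```python
from collections import Counter


def facet_counts(records, field):
    """Count in how many records each term of `field` occurs.

    Each record is a dict mapping a field name to a list of terms
    (a hypothetical layout). Results are sorted by descending count,
    then alphabetically, so the output is deterministic.
    """
    counts = Counter()
    for record in records:
        # set() so a term repeated within one record is counted once.
        counts.update(set(record.get(field, [])))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def narrow(records, field, term):
    """Keep only the records that carry `term` in `field`."""
    return [r for r in records if term in r.get(field, [])]
```

Calling `facet_counts` again on the output of `narrow` gives the counts for the refined result set, which is what a "narrow your search" panel would display after each click.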

R&D

  • Ontologies & thesaurus
    • Custom Lucene Analyzer
    • Thesaurus browser widget: like ArborJS?
  • CSW 3.0
  • Hadoop, NoSQL: useful to investigate for GN's future?
    • Some time was spent at Bolsena 2010 on investigating one NoSQL approach - Apache CouchDB (see presentation at http://geonetwork.globaldial.com/testdownloads/GeoNetworkCouchDB.pdf) - it has many interesting and attractive features for developers and does many things that GeoNetwork does. This presentation provided the opportunity to discuss the NoSQL direction with various interested parties in AU. The feedback depended on where and how they were trying to integrate GeoNetwork within their organisation:
      • Feedback on the NoSQL direction from those that want to use GeoNetwork to manage their metadata in relational databases is not so good: almost all of these potential users of GeoNetwork (at least those I've spoken to in AU anyway!) want GeoNetwork to evolve toward using an object-relational mapping, as they believe that would allow them to integrate GeoNetwork with the metadata they already have in their relational databases.
      • Feedback on the NoSQL direction from those that are happy to continue managing metadata using their existing tools and databases but want to publish metadata to GeoNetwork because of its interoperability and standards support: they probably don't care, as they are happy to use components like the WFS GetFeature harvester to build metadata records from the metadata in their relational databases.
    • It depends on where we want, or where we see, GeoNetwork fitting within existing organisations.

Presentations

confirmed:

  • GAAP Security module, by Jose García & Heikki Doeleman

requested (who wants to do it?) :

  • SOLR in practice
