Topics for GeoNetwork discussion in Bolsena 2010
author: Heikki Doeleman
For the 3rd time, the Bolsena OSGeo Hacking Event is going to be held. This page lists some ideas that can be discussed within the GeoNetwork faction.
Please add any ideas you have here!
Also, it would be nice if we could have a number of presentations, like we did last year. Tell us about your projects, or about some interesting technology or tool, or whatever you want. Please volunteer your presentation proposals on this page too.
discussion topics
Community & Documentation
| Id | Priority (trac #) | Topics | Comments | People interested | In Discussion |
|---|---|---|---|---|---|
| 1 | | Javadoc | Let's clean up the existing Javadoc and add new where it is missing. It'd be good to familiarize yourself with how Javadoc works before doing this; e.g. there should be no blank line between the Javadoc and the method it is about; the first sentence should end in a period; and things like that. | heikki: 4 | Create a template first |
| 4 | 1 | Javadoc | Automatically add the Javadoc pages to this wiki, updated from a Hudson build process? For all of the branches? | heikki: 4, Jeroen: major | |
| 6 | 1 | Community | This wiki is a bit of a mess, in my opinion. I think it would be good if we could put maybe 3 people in charge to firstly, clean it up and better structure it; and secondly, to try to keep it that way. | heikki: 2, Jeroen: major | General website not maintained so much. Merging the 2 websites could be a better option, e.g. try Sphinx, look for a way to convert docbook to Sphinx |
| 11 | 1 (#234) | Community | Would it be an idea to appoint Language Managers for each of the supported translations? They would form the International Internationalization Committee (IIC, or CII in French) and they're summoned to maintain the i18n files for their language, before each new release. This might even be arranged in an OSGEO-wide manner. | heikki: 3 | Contact points |
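As a starting point for the Javadoc template mentioned under item 1, here is a minimal sketch of a comment that follows the two conventions named there: the first sentence ends in a period, and no blank line separates the comment from the method. The class `TitleCleaner` and its method are hypothetical and exist only to illustrate the layout; they are not part of the GeoNetwork code base.

```java
/**
 * Utility methods for cleaning up metadata titles.
 *
 * <p>Hypothetical class, used only to illustrate Javadoc layout.
 */
final class TitleCleaner {

    private TitleCleaner() {
        // Static utility class; not meant to be instantiated.
    }

    /**
     * Collapses runs of whitespace in a title and trims both ends.
     *
     * <p>The first sentence above ends in a period, so the Javadoc tool can
     * use it as the summary line. No blank line separates this comment from
     * the method it documents.
     *
     * @param title the raw title; must not be {@code null}
     * @return the title with single spaces between words
     * @throws NullPointerException if {@code title} is {@code null}
     */
    static String normalize(String title) {
        return title.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(normalize("  Soil   map of  Lazio "));
    }
}
```

A template along these lines could be agreed on first and then applied class by class, which keeps the clean-up reviewable in small pieces.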
Meeting points to be discussed
| Id | Priority (trac #) | Topics | Comments | People interested | In Discussion |
|---|---|---|---|---|---|
| 5 | | Coding rules | Some people really like working with patches; other people prefer using short-lived SVN branches for a similar purpose. Can we all agree on doing it one way or the other? | heikki: 4 | To be discussed |
| 13 | | RIA, Framework | Any relevant software going on that might be useful for GeoNetwork? Think of MapProxy, Chiba and GeoJQuery. Other ones? | heikki: 2, Just, Francois, Jose | Decide which JS libs to use. Add libraries to use and why in proposal template |
| 14 | | Editor enhancement | In the NGR project, a modification to the code around the editor called Inflation and Vacuum is implemented, that makes it much easier to create valid metadata from scratch. In essence it takes the function of update-fixed-info.xsl (which also tries to do some automatic adjustments to help things along) a whole seven miles further. What do the developers think of this? (I'll provide documentation sometime soon). | heikki: 3 | Demonstration |
GeoNetwork
| Id | Priority (trac #) | Topics | Comments | People interested | In Discussion |
|---|---|---|---|---|---|
| 2 | | harvesters | Let's move the harvesters' configuration from the "settings" table to its own, first-class-citizen table. Now, if you have many harvesters, it is nigh impossible to find anything in "settings". | heikki: 1, jose: critical, francois: major | No project on-going on that |
| 3 | 2 | harvesters | Related to the previous topic: rewrite the harvesters' client-side code to remove unrequired ajax stuff. Just make "normal" forms for harvester maintenance, avoiding ajax unless it is really required for some functionality. | heikki: 1, jose: critical, francois: major | |
| 8 | 1 (#232) | Database | Can we agree that we'll provide SQL scripts to create the database, and SQL scripts to fill it with sample data? And let's phase out those DDF files and the unfortunate GAST altogether? And that we provide update SQL scripts with new versions of GeoNetwork, both for changes to database schema and for content (like, Settings !) ? | heikki: 3 | Create a SQL script to do the migration from one release to another. Trigger on startup a migration task if DB version is older than the running GeoNetwork |
| 9 | 2 | Release Strategy | Can we release GeoNetwork 3.0 (with the CSW/ebRIM interface)? Maybe we can have simultaneous "current releases" in both the GN2.x and GN3.x lineages, as do for example Lucene and Tomcat? | heikki: 3 | Integrate to main trunk. Make a 2.4.2+ebRim release |
| 10 | 1 (#234) | Database | Does anyone like the function of the installer that it overwrites your JDBC credentials with randomly generated values? I certainly don't, as my DB lives very much longer than the many GeoNetwork installations I always do, so I have to edit config.xml every time. How about removing that? | heikki: 2 | Another option is to make the overwrite an optional installer pack. This has been done in some code in Australia - needs to be committed to trunk. |
| 12 | 2 (#233) | Code Refactoring | The class DataManager.java and its sister XMLSerializer.java are in particularly bad shape, in my opinion. There are literally dozens of public methods that all do more or less the same thing. Of course it's not clearly documented why they are all there or when to use which. Would it be too drastic to propose that we keep 1 single public method for each of the functions createMetadata, updateMetadata, validateMetadata, etc. ? | heikki: 1, jose: critical | Make the Javadoc first |
| | 2 | REST services and provide a Jeeves JSON outputs | | | |
| 16 | | Editor enhancement (XForms) | GeoNetwork needs a range of metadata editors and the XForms Editor (from geonetworkui sandbox) should be available as part of this range. An XForms engine is an alternative technology that potentially hides details of HTML and JavaScript from developers. (The usefulness of the XForms editor will be determined to a large extent by how well it works across browsers and how responsive it is.) What does the "potentially hides details" bit actually mean? That's just wishful thinking, and adding XForms means yet another complicated technology for developers to master. Justification/Action: Develop XForms interface as providing a user friendly interface with the flexibility to meet the needs of different users. How does it relate to Chiba? | heikki: 3 | No integration planned for the time being. No more work in the sandbox ? |
| 17 | | Feedback / Enhancement | GeoNetwork needs a range of metadata editors and the ANZMet Lite (a wizard based editor available for download from here) should be part of the toolkit. ANZMet Lite needs to be open sourced under (GPL) to be distributed with GeoNetwork. Comments: If the web interface were improved, the need for ANZMet Lite would be reduced. There is a need for “offline” metadata creation when researchers or data collectors are not connected to the Internet – this is where ANZMet Lite has unique value. Why not improve the existing GeoNetwork editor (see geocat.ch editor, merge some of the features into the trunk)? Justification/Action: Add ANZMet Lite as a user friendly, Wizard based PC editing interface with the flexibility to meet the needs of different users. Simon Pigot has already added GeoNetwork upload/download to ANZMet Lite. | heikki: 4 | You can find the information to interface an editor with GeoNetwork in the xml services section of the Administration manual |
| 18 | | API enhancements | GeoNetwork services and JavaScript API need to be documented so that the user interface can be replaced and/or the existing functionality reused or customized. A different user interface skin should be easy to apply. The new Jeeves test framework offers an opportunity to document the inputs and outputs of the services. Action/Justification: The existing JavaScript API (web/geonetwork/scripts/core) needs to be documented and extended – existing code that doesn’t use the API needs to be refactored. Note: GeoNetwork xml services documentation exists in manual. | heikki: 2 | Guiwidgets sandbox provides a Javascript only user interface that can be installed as a separate module. It uses the ext-js javascript library which is well documented already and openlayers for map windows. |
| 19 | | Code cleaning (client part) | The technologies that are used in the user interface are not homogeneous: XSLT, HTML and JavaScript are often mixed and hard to separate - this makes development and modification of the user interface difficult - but given the current architecture of GeoNetwork, a complete separation into components based on implementation language is impossible. Action: Separate the HTML, XML and JavaScript from each other so that a skilled interface designer does not need to know all three technologies to change the interface. | heikki: 4, jose: major | Guiwidgets sandbox provides a Javascript only user interface that can be installed as a separate module. When this is integrated most if not all of the XSLT that produces HTML can be deprecated. |
| 20 | | XML fragment | Reusing fragments of metadata (XML) – “object reuse”. Fragments are implemented in various sandboxes. Metadata records can be composed from fragments using XLinks and there is an XLinks URL Resolver. Community action needs to be consolidated through the fragments proposal. Many organisations would like GeoNetwork to be able to harvest fragments from relational databases as they often generate full metadata records from relational databases using custom software. If the database information changes, these records then need to be re-harvested. Some organisations would also like to be able to edit the fragments in GeoNetwork and return them to the database from which they were harvested. Action/Justification: Integrating fragments of metadata that are managed in an external system (i.e. relational database, authentication directory). There is a mechanism for implementation for metadata fragment harvesting from relational databases via a WFS in the BlueNetMEST sandbox. This work needs to be consolidated with work in the geocat.ch and geosource sandboxes and added to the trunk. This work should also be extended to allow metadata fragments in the relational database to be updated after editing in GeoNetwork. Harvesting of fragments from authentication directories (eg. LDAP) should be added. | | All implemented except for harvesting from authentication directories. |
| 21 | | Metadata versioning | GeoNetwork needs some form of version control to track changes made to a metadata record over time. Action/Justification: This can be done inside the database without needing to externalise the metadata records. That way you can index and search on the old versions as well, if desired. Alternatively it could be done externally using perhaps a Java interface to subversion or through an interface to existing enterprise document management systems or perhaps using a different database approach for the documents eg. CouchDB. See also this approach to versioning WFS content? | heikki: 3 | Nice to have - maybe a collaboration between WMO and some Australian groups will produce this as WMO are also interested. |
| 22 | | Community | Some aspects of project planning for GeoNetwork are not visible to those outside the project steering committee. Action: Continue to adopt and implement OSGeo best practice (e.g. GeoServer). | heikki: 2 | Developer mailing list, IRC, trac and especially proposals are beginning to show more of this detail. |
| 23 | | Documentation / Community | Documentation for ‘Implementing GeoNetwork into your organisation’ should be provided. Rather than changing the perspective of the current documentation to "how to" from "it does", perhaps you can have different documentation for different audiences. The “how to” section of the Trac is very useful. Action: As the “how to” section of the OSGeo GeoNetwork trac site expands, it could be linked into the documentation. | heikki: 2 | Migration to sphinx is taking place - this will not only provide a more attractive presentation of the documentation, it will also allow text from these pages to be more easily included/linked. |
| 24 | | Index enhancement | GeoNetwork’s current Lucene field / index names and the mapping of metadata fields to these Lucene field names are ad hoc. This has the potential to prevent search interoperability between catalogues. Action: GeoNetwork should use an established mapping such as the geo profile of Z3950 (including attributes, data and relations) to define Lucene field names and the mapping from metadata elements to Lucene fields for all metadata schemas. | heikki: 2 | One interim option is to create a mapping table in the documentation but it is agreed that the names should be standardized (see ticket #409). |
| 25 | | Validation enhancement | XSD and Schematron Validators return errors that are meaningless to most users. Ability to customise the error messages easily would be useful. Action: Code containing XSD validation messages needs to be modified to include alternative or additional messages to those already in use. Schematron diagnostics specified in rules should be made more useful to users. setErrorHandler already in use - could be modded to support more meaningful messages? Francois has updated schematron to schematron validation and reporting language. | heikki: 1 | Improve XSD error reporting - functions to do this have been committed to 2.7 - see ticket #441 |
| 26 | | Documentation | GeoNetwork requires a generic capability for element help, code list choices and suggestions to be linked to metadata guidelines provided with profiles/standards. Action: GeoNetwork to call documentation components from external sources (e.g. mouse over tool tips from profile/standard and code list documentation). | | Partially done in NGR by Jose. Ongoing |
| 27 | | Metadata categories | GeoNetwork categories are not related to metadata content – should be configurable from content. Action: GeoNetwork should be able to configure dynamic categories from a Lucene field. Eg. An administrator could create category names as unique values of the Lucene field name purpose (which might be mapped to gmd:purpose for ISO) – records would belong to the category described by purpose. Cf. also discussion on dynamic categories ie. categories that are placeholders for a saved search. | heikki: 4 | Categories are local to the GeoNetwork instance. This could be achieved within the index-fields XSLT mechanism. |
| 28 | | Tag cloud | GeoNetwork currently does not manage its own tag cloud / Folksonomy. Action: GeoNetwork could optionally manage these things internally rather than using a third party social networking site. Ticket 96 suggests a way of doing this | heikki: 4 | Good idea ! |
| 29 | | Harvester / Network-crawling | Network-crawling for geo-resources. Action: GeoNetwork needs to continue to be aware of and exploit initiatives for automatic harvesting of metadata from geo-resources. Eg. Metadata extraction tools such as the Talend Spatial Data Integrator suite etc | heikki: 3, jose: major | Documenting services that could be used by extraction tools and providing harvesters such as the OGC WFS Feature harvester are steps in this ongoing task. |
| 30 | | Metadata identifier | GeoNetwork lacks the ability to consistently reproduce a unique identifier for the same geo resource (e.g. same dataset stored in two different locations) and/or use persistent identifier services. This is somewhere along the range from "easy enough" to "very difficult", need to spell out the precise details of the set of features you have in mind. Action: GeoNetwork needs to be able to generate, store and use metadata identifiers (eg, gmd:fileIdentifier) as well as data identifiers using the current stand alone UUID, but also (for data objects) MD5 (including what the checksum was generated from) and identifiers from external persistent identifier services (It should be possible to obtain persistent identifiers for both metadata and data from external persistent identifier services). | heikki: 4 | Accepted but no progress yet. |
| 31 | | Interoperability / resources discovery | Better inter-application interoperability. GeoNetwork needs to rethink the interoperability with the emerging FOSS such as the way that OpenLayers is designing / redeveloping its interface. e.g. use of GeoExt; e.g. GeoNetwork needs to provide simple mechanisms to allow discovered resources to be exploited and utilised in complementary open source software; e.g. drag-and-drop discovered resources into OpenLayers or GeoServer. Action: Better intra-application interoperability. GeoNetwork needs to coordinate the discovery of resources with the publication of those same resources in FOSS such as GeoServer. | heikki: 2, jose: major | Two issues: the user interface is most likely to be addressed through projects such as the javascript guiwidgets sandbox. Automated publishing of metadata and datasets to GeoServer has been implemented and committed to 2.7 (see ticket #159) |
| 32 | | Data management | GeoNetwork assumes resources that are tagged as data for download in gmd:protocol are local. Action: GeoNetwork needs to allow for the fact that data tagged as data for download may not be local. | heikki: 1 | You can switch off GeoNetwork interpretation of gmd:protocol in 2.7 but further refinement is required. |
| 33 | | Remote search | Remote search across a number of sites returns a pre-selected number of hits from all remote sites (pre-selected number is a search option) – it should return these hits from each site. Action: Presentation of pre-selected number of hits from each remote site – may require more delving into JZKit. | heikki: 3 | Is it possible to make it abstract in order to extend to other protocols ? Possibly - awaits further investigation of JZKit3 |
| 34 | | Remote search | Presentation of returned hits from remote sites may be very slow because search is limited by the speed of the slowest site. Action: Presentations of first returned hits from first responding remote site should not have to wait on the slowest site – may require more delving into JZKit. | heikki: 3 | This 'feature' is not present in JZKit3 and remote search has been restored in version 2.7 (see RemoteSearchForm+TabbedSearchForms). |
| 35 | | Configuration | There are too many configuration files in too many places eg. repositories.xml.tem and not all configuration options are supported by the existing admin interfaces. Action: Continue to consolidate configuration options in the system configuration interface. | heikki: 2 | This is ongoing. |
| 36 | | Web map client | There is no documentation for the implementation of alternative web map clients to Intermap and this makes it appear that the process is far harder than it actually is. Given the enthusiasm for an OpenLayers-based interface, what "interface" there currently is will probably soon be rapidly-evolving - if not replaced completely. Action: Document the interface that GeoNetwork uses to call a web map client so that sites can substitute their own. | heikki: 1, jose: critical | Intermap has been replaced with OpenLayers - the functions that manipulate the OpenLayers client still need to be documented. |
| 37 | | Distributed search | Current capability of GeoNetwork to use distributed searching is given a low priority and not being developed when compared with the local search. Action: More consideration is required towards distributed searches and proper attention should be given to it. | heikki: 2 | Restored in 2.7 - see RemoteSearchForm+TabbedSearchForms |
| 38 | | Distributed CSW search | Distributed CSW searches are not available. Action: All OGC CSW standards and specifications should be implemented. | heikki: 1 | May be related to issue 33 |
| 39 | | XML validation | Potential for remotely accessed information to be malicious. Action: GeoNetwork should validate all XML inputs and responses (eg. as it does for CSW) and check expected MIME types e.g. you ask for a GIF, you get a GIF. And indicate / reject non-conforming content with a warning? | heikki: 1 | No progress as yet. |
| 40 | | Perfs enhancement (XSL) | GeoNetwork does too much expensive processing of XML documents with XSLT. Action: Continue to seek out and remove unnecessary XSLT processing. | heikki: 4 | Ongoing - note that caching of compiled XSLT makes this less expensive |
| 41 | | Configuration | The way that GeoNetwork handles timeouts to remote requests is not configurable. Action: In GeoNetwork, timeout on remote requests e.g. WMS, should be configurable via the administration interface. | heikki: 2 | |
| 42 | | Project management / sandbox strategy | Developments in "sand boxes" are not pushed back into the trunk in a timely manner. Action: The PSC should publish and enforce tighter processes relating to sandboxes. If possible, all sand box developments should be pushed back into the trunk in a predetermined time period (this should be a condition of being granted permission to set up a sandbox). If the sand box feature can't be pushed into the trunk because the trunk code doesn't have the capability (e.g. Pluggable profiles, pluggable skins) then priority should be given to developing that capability in the trunk so that the sand box feature can be included into the trunk (relates to project management comment/observation above). | heikki: 1 | |
| 43 | | Customization | GeoNetwork is not distributed with multiple skins and it does not allow pluggable skins. Action: GeoNetwork should be released with multiple skins that can be optionally selected and are pluggable. These skins should be easily modified for an organisation’s needs and not be contained within the XSL or Java code. | heikki: 3 | See javascript guiwidgets sandbox. This is an example and the generic guiwidgets skin is already included as an option in the ANZMEST installer. |
| 44 | | Installer / Application server | There is no (installer) option to choose Tomcat as an alternative to Jetty. Comment: The current situation reflects GeoNetwork’s origins, particularly its funding bodies. Adding Tomcat and supporting it would require fixing some current defects - a good thing. But it would be a lot of work to maintain it, in particular, it would significantly increase the time required for testing and release preparation. Action: GeoNetwork should use the existing BlueNet MEST Tomcat configuration to provide an option within the installer to choose Tomcat instead of Jetty as the servlet container. Jetty should continue to be the default. | heikki: 4 | Done in the maven migration |
| 45 | | Parent/child policy | Parent/child/sibling bidirectional navigation for metadata records. Finding the parent or child of a given record is painful. Action: Use of parent/child/sibling metadata records in the search results as a way to cope with varying levels of record granularity. For example, listing all children under the parent and presenting this within a collapsed tree GUI component. Perhaps provide a way to limit results to only parents and toggle this option on/off. | heikki: 3 | Not displayed in search results but in the metadata top right corner |
| 46 | | Search | Community is seeking a way to deal with varying granularity of metadata records, such that fine scale records don’t swamp fewer broad scale records. Many fine scale records (highly granular) make the metadata system more powerful (useful). Being forced to limit granularity only as a work around for basic search result presentation/visualisation would be a shame. This issue is not unique to GeoNetwork. | | |
| 47 | | Vocab / Thesaurus | Support for external vocabulary services. Vocab services are becoming more common and an ability to connect to a custodian's vocab service would be beneficial and reduce duplication and creation of stale vocabs/thesauruses in GN. Action: An interface is required to query for vocab definitions from external sources. | heikki: proposes OWL/ebRIM integration, see http://geonetwork.tv/owl/ | These interfaces are slowly becoming available - see for example the taxonomic species name searching for the Marine Community Profile records in ANZMEST |
| 51 | | Hierarchical keywords | Keywords from external vocabularies should utilize hierarchical broader/narrower structures to ease searching capability. | heikki: proposes http://geonetwork.tv/owl/, mathieu (major) | |
| 48 | | Reusable Objects | Reusable (Controlled) Objects, allow fields to be reusable. Currently, if a user were to enter multiple records, for each record that user would have to re-enter their “owner” details. Worse, if that person’s details were to change, they remain the same in old records which they have edited. The person’s details should be held as a managed object which all records reference. This would allow the updating of details to be reflected in each record that uses them - see the fragment harvesting of contact info above. | heikki: 3 | See composed metadata records proposal committed in 2.5 (see ticket #201) |
| 49 | | HTTPS support | HTTPS support. Currently all logins to GeoNetwork are going unsecured through HTTP and the GN configuration doesn’t allow the use of HTTPS, enabling account sniffing attacks. | heikki: 2 | URLs are hard coded with http in a number of places in GeoNetwork - see ticket #448 |
| 50 | | CRS management | EPSG code data from external service. Action: At the moment EPSG codes have to be entered manually, but external online services are available with that data. GN should utilize this. | | Done in 2.5 |
| 52 | | Indexation enhancement | Using Apache Tika to index content from files attached to metadata records in GeoNetwork? | | |
| 53 | | Indexation enhancement | Replace Lucene interface in GN with Apache Solr? | heikki (blocker), mathieu (major) | |
| 54 | | Schema management | SchemaManager - redesign proposed by Mathieu: * use org.geonetwork.utils.xsd.XSD in the project "geonetwork-services-ebrim" to read schemas and query contents for driving the editor * GN uses a number of schemas for validation purposes eg. in OAI, could these be managed by schemamanager so that they do not need to be retrieved from net? * sometimes a document may introduce a new schema eg. ListSets response in OAI harvester can introduce the oai_dc schema eg. when harvesting jOAI - if we need to validate these responses then creating a validator with a file based schema will cause the validation to fail as the schema is not present on disk; an alternative is to create a validator with no file based schema, which means that all schemas will be obtained from the net, but use an entityResolver object to intercept those which are local so as to avoid unnecessary retrieval, or perhaps use a Java Cache System (JCS) instance to cache all schemas locally like XLinks? | heikki: 3 | Done in 2.7 |
| 55 | | Multi-editing | Attempting to introduce the ability to edit more than one document makes existing trunk interface confusing eg. editing documents in tabs both of which have login details. Things get out of sync - what to do? Maybe something like BlueNetMEST which is based on one window - the main search window (tabs for remote, advanced and mapviewer) with search results - editing/viewing by clicking on title opens editor/viewer in new window (multiediting is supported), clicking any of the menu options on main screen uses modalbox dialog and separate windows to keep search interface and results window untouched, log out/log in closes all editor/viewer windows, if editing in progress log out not allowed - this is not perfect but might be a way of thinking about how to introduce things like multiediting to trunk. | heikki: 5 | Done in 2.7 |
| 56 | | Customization | Improve CSS management, clean CSS file and references to unused styles, replacing tables by divs, discuss on ThemeCustomization | heikki: 1, jose: critical, francois: major | |
| 58 | | Settings management | While at it can we change the code so that you can save settings from the GUI even if not all expected settings are present in your database? | heikki: 1 | |
| 59 | | SMTP | Converge mail configuration (feedback + wmc.mailcontext use same config) | heikki: 1 | Replacing Intermap should fix that |
| 60 | | SMTP | Enable configuration of smtp servers that require authentication, i.e. configure authenticate/secure y/n, username, password. | heikki: 1 | #239 |
| 61 | | CatalogSearcher is not an extension of MetaSearcher | LuceneSearcher, UnusedSearcher and Z3950Searcher are all extensions of MetaSearcher but CatalogSearcher for CSW is not. This is ok until CatalogSearcher is put into the session as Geonet.Session.SEARCH_RESULT where up until now only the extensions of MetaSearcher were expected. Because CatalogSearcher is not a MetaSearcher the nice polymorphism/dynamic binding used for example when closing the searcher (get a MetaSearcher and do searcher.close()) is broken and we need to do ugly if (object instanceof LuceneSearcher) then .... else etc etc etc. Action: make CatalogSearcher an extension of MetaSearcher or put it somewhere else in the session. | | #238 |
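To make the first option in item 61 concrete, here is a rough sketch of what closing the session's searcher could look like once CatalogSearcher shares the MetaSearcher base type. The classes below are simplified stand-ins that only borrow the names from the item; `SearchSession`, the key string, and all method bodies are assumptions for illustration and do not reflect the actual GeoNetwork or Jeeves code.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for the common searcher base type. */
abstract class MetaSearcher {
    private boolean closed;

    /** Releases whatever this searcher holds; subclasses may override. */
    void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

/** Stand-in for the Lucene-based searcher. */
class LuceneSearcher extends MetaSearcher { }

/** In this sketch the CSW searcher joins the same hierarchy. */
class CatalogSearcher extends MetaSearcher { }

/** Hypothetical session wrapper; the real session API is not shown here. */
final class SearchSession {
    // Hypothetical key standing in for Geonet.Session.SEARCH_RESULT.
    static final String SEARCH_RESULT = "search.result";

    private final Map<String, Object> attributes = new HashMap<>();

    void setSearcher(MetaSearcher searcher) {
        attributes.put(SEARCH_RESULT, searcher);
    }

    /** Closes the stored searcher, if any, without per-class branching. */
    boolean closeSearcher() {
        Object stored = attributes.remove(SEARCH_RESULT);
        if (stored instanceof MetaSearcher) {
            ((MetaSearcher) stored).close(); // dynamic binding picks the right close()
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SearchSession session = new SearchSession();
        MetaSearcher csw = new CatalogSearcher();
        session.setSearcher(csw);
        session.closeSearcher();
        System.out.println("closed: " + csw.isClosed());
    }
}
```

The point of the sketch is only that one check against the base type replaces the chain of `instanceof` tests; whether CatalogSearcher can reasonably implement the rest of the MetaSearcher contract is the part that needs discussion.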
Architecture
| Id | Priority (trac #) | Topics | Comments | People interested | In Discussion |
|---|---|---|---|---|---|
| 57 | 1 | Maven migration | Move to Maven as described here: Maven | Mathieu (major), heikki: 1, jose: major, francois: major | |
| 15 | | Framework / Refactoring | Whether it's going to happen soon or not, I still think it good to repeat the subject of what to choose if we ever get to a drastic make-over of current GeoNetwork code, especially in terms of (1) GUI: use Wicket? GWT? (2) MVC: use Struts? Spring MVC? (3) Persistence: use Hibernate? use JPA/EJB3? (4) Web Services: use Axis2? Jax-RS? | heikki: 3, jose: major | |
Comments
- From 16 to 58 : Topics extracted from Australia/New Zealand Community GeoNetwork Feedback
- 59 and 60 : added on request of Gavin Fleming
- 61 added by Simon
- More 'way out' stuff :-)?
- Verbal annotations for YouTube videos - verbal annotations for metadata records?
- CouchDB to hold metadata documents, couchdb-lucene to build lucene indexes, geocouch for spatial queries? Could couchdb be worth investigating further?
- GeoNetwork should be to metadata and data management as iTunes is to managing music or Time Machine is to backup? I.e. we have a great engine; what about building the 'killer' interface? Who would do this?
presentation proposals
- Australia/NZ Community GeoNetwork Feedback: Simon Pigot - see attached document (basic points extracted to discussion topics)
Last modified on 03/24/11 11:01:06