Version 2 (modified by mcoudert, 13 years ago)

--

Refactor harvesting

Date: 2011/03/21
Contact(s): Mathieu, Julien
Last edited:
Status: draft
Assigned to release: probably 2.7
Resources:

Overview

Here is a proposal for refactoring the harvesting module. The module is quite old and contains a lot of duplicated code (e.g. the "align" methods). The client side is also complex: it relies on many Ajax queries, client-side XSL transformations, and outdated JavaScript libraries.

Proposal Type

  • Type: GUI Change, Core Change
  • App: GeoNetwork
  • Module: Harvester, Kernel, Data Manager

Voting History

  • Vote proposed by X on Y, result was +/-n (m non-voting members).

Motivations

Remove duplicate code. Use up-to-date libraries. Store harvesting tasks in a dedicated model.

Proposal

Both the server side and the client will be refactored around an object model and JSON objects. We have tried to keep the new model as generic as possible so that any new harvester can extend it easily.
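As a rough illustration of what a JSON view of a harvesting task could look like, here is a minimal sketch. It is written in Python only for brevity; the proposal itself relies on Java and Jackson. Field names loosely mirror the HarvestingTask columns proposed in this page, while the class layout, the `configuration` keys (`url`, `prefix`) and the sample values are assumptions made for the example, not the actual model.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, Optional


@dataclass
class HarvestingTask:
    """Hypothetical in-memory view of one harvesting task."""
    uuid: str
    name: str
    harvesting_type: str
    validation_mode: str
    status: str
    category_id: int
    is_recurrent: bool = False
    recurrent_period: Optional[int] = None
    # Harvester-specific settings are kept as generic key/value pairs,
    # so a new harvester type does not require a change to the core model.
    configuration: Dict[str, str] = field(default_factory=dict)


def to_json(task: HarvestingTask) -> str:
    """Serialize a task to the JSON string exchanged with the client."""
    return json.dumps(asdict(task), sort_keys=True)


def from_json(payload: str) -> HarvestingTask:
    """Rebuild a task from its JSON representation."""
    return HarvestingTask(**json.loads(payload))


# Sample values below are invented for illustration.
example = HarvestingTask(
    uuid="task-uuid-1",
    name="demo",
    harvesting_type="oaipmh",
    validation_mode="none",
    status="inactive",
    category_id=1,
    configuration={"url": "http://example.org/oai", "prefix": "oai_dc"},
)
print(to_json(example))
```

Keeping the harvester-specific part in a plain key/value map is one way to read "as generic as possible": the same JSON shape can carry any harvester's settings.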

The new model uses dedicated database tables to store information about harvesters (HarvestingTask, HarvestingTaskResult...).

So far, we have built a new ExtJS-based interface for the OAI-PMH harvester only, and only this harvester has been moved to the new model. Work remains to migrate all the other harvesters and to get them working in trunk.

CREATE TABLE HarvestingTask
  (
    id                 int,
    uuid               varchar(250)   not null,
    name               varchar(32)    not null,
    harvestingType     varchar(32)    not null,
    validationMode     varchar(32)    not null,
    isrecurrent        char(1)        default 'n' not null,
    recurrentPeriod    int,
    lastRun            varchar(24),
    backup             varchar(32),
    status             varchar(32)    not null,
    isSynchronization  char(1)        default 'n' not null,
    isIncremental      char(1)        default 'n' not null,
    categoryid         int            not null,

    primary key(id),

    foreign key(categoryid) references Categories(id),

    unique(uuid)
  );

CREATE TABLE HarvestingTaskResult
  (
    harvestingTaskResultId  int,
    dateResult              varchar(24)    not null,
    total                   int,
    added                   int,
    updated                 int,
    unchanged               int,
    locallyRemoved          int,
    unknownSchema           int,
    unretrievable           int,
    badFormat               int,
    doesNotValidate         int,
    ignored                 int,
    errors                  text,
    harvestingTaskId        int,

    primary key(harvestingTaskResultId),
    foreign key(harvestingTaskId) references HarvestingTask(id)
  );

CREATE TABLE HarvestingTaskConfiguration
  (
    configurationId   int,
    attr              varchar(24)    not null,
    val               varchar(250)   not null,
    harvestingTaskId  int,

    primary key(configurationId),
    foreign key(harvestingTaskId) references HarvestingTask(id)
  );

The Metadata table also gets a foreign key to HarvestingTask:

    foreign key(harvestingTask) references HarvestingTask(id),
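To show how these tables fit together, here is a small sketch that loads a trimmed-down copy of the schema into an in-memory SQLite database and reads it back. SQLite, the reduced column set, the helper names (`load_configuration`, `latest_result`), the sample rows and the ISO 8601 format used for `dateResult` are all assumptions made for the example; the proposal does not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Trimmed copy of the proposed tables (only a few columns kept).
conn.executescript("""
CREATE TABLE HarvestingTask (
    id int, uuid varchar(250) not null, name varchar(32) not null,
    harvestingType varchar(32) not null, status varchar(32) not null,
    primary key(id), unique(uuid)
);
CREATE TABLE HarvestingTaskResult (
    harvestingTaskResultId int, dateResult varchar(24) not null,
    total int, added int, errors text, harvestingTaskId int,
    primary key(harvestingTaskResultId),
    foreign key(harvestingTaskId) references HarvestingTask(id)
);
CREATE TABLE HarvestingTaskConfiguration (
    configurationId int, attr varchar(24) not null,
    val varchar(250) not null, harvestingTaskId int,
    primary key(configurationId),
    foreign key(harvestingTaskId) references HarvestingTask(id)
);
""")

# One task, its harvester-specific settings as attr/val rows, and two runs.
conn.execute(
    "INSERT INTO HarvestingTask VALUES (1, 'task-uuid-1', 'demo', 'oaipmh', 'active')"
)
conn.executemany(
    "INSERT INTO HarvestingTaskConfiguration VALUES (?, ?, ?, ?)",
    [(1, "url", "http://example.org/oai", 1), (2, "prefix", "oai_dc", 1)],
)
conn.executemany(
    "INSERT INTO HarvestingTaskResult VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "2011-03-20T10:00:00", 10, 10, None, 1),
     (2, "2011-03-21T10:00:00", 12, 2, None, 1)],
)


def load_configuration(conn, task_id):
    """Return the attr/val rows of one task as a dictionary."""
    rows = conn.execute(
        "SELECT attr, val FROM HarvestingTaskConfiguration"
        " WHERE harvestingTaskId = ?",
        (task_id,),
    )
    return dict(rows.fetchall())


def latest_result(conn, task_id):
    """Return (dateResult, total, added) of the most recent run, or None."""
    # dateResult is a string column; ordering works here only because the
    # sample dates use a sortable ISO 8601 format (an assumption).
    return conn.execute(
        "SELECT dateResult, total, added FROM HarvestingTaskResult"
        " WHERE harvestingTaskId = ? ORDER BY dateResult DESC LIMIT 1",
        (task_id,),
    ).fetchone()


print(load_configuration(conn, 1))
print(latest_result(conn, 1))
```

The attr/val layout of HarvestingTaskConfiguration is what lets each harvester type keep its own settings without adding columns to HarvestingTask.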

Backwards Compatibility Issues

New libraries added

  • Jackson, for JSON handling: http://jackson.codehaus.org/
  • Guava, for utilities on Java objects: http://code.google.com/p/guava-libraries/

Risks

Participants

  • Mathieu, Julien
  • OpenWIS project.