Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#4297 closed enhancement (wontfix)

add support for "history" metadataitem which keeps track of all GDAL command line operations

Reported by: etourigny Owned by: etourigny
Priority: normal Milestone:
Component: Utilities Version: unspecified
Severity: normal Keywords: netcdf, metadata, history
Cc: matt.wilkie@…, warmerdam

Description

A useful feature implemented in most netcdf files and applications is the use of a "history" attribute which logs every operation done on a file. It appears that ESRI offers a similar mechanism.

It would be very useful to have such a feature within GDAL. However, current code lacks a mechanism to record a command.

As part of improvements to the netcdf driver, I have implemented this in the driver's CreateCopy() function. GDALGeneralCmdLineProcessor() is called by every application, so it is a suitable place to save the command line without modifying any existing application. Attached is a patch to gdal.h and gdal_misc.cpp that adds two new functions, GDALSetCmdLine() and GDALGetCmdLine(), and modifies GDALGeneralCmdLineProcessor().
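To make the idea concrete, here is a minimal self-contained sketch of the mechanism described above. It is not the attached patch: the namespace, function names and std::string storage are illustrative assumptions (the patch itself uses static char buffers), and GeneralCmdLineProcessor() here is only a stand-in for GDALGeneralCmdLineProcessor().

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the proposed GDALSetCmdLine()/GDALGetCmdLine().
// Names and storage are assumptions of this sketch, not the attached patch.
namespace sketch {

// Process-wide storage for the recorded command line.
static std::string &StoredCmdLine()
{
    static std::string osCmdLine;
    return osCmdLine;
}

// Joins argv into one space-separated string and remembers it.
void SetCmdLine(int argc, const char *const *argv)
{
    assert(argc >= 0);
    std::string &os = StoredCmdLine();
    os.clear();
    for (int i = 0; i < argc; ++i)
    {
        if (i > 0)
            os += ' ';
        os += argv[i];
    }
}

// Returns the last recorded command line (empty if none was recorded).
const std::string &GetCmdLine()
{
    return StoredCmdLine();
}

// Stand-in for a general command line processor that every utility calls:
// recording the command here means no individual utility has to change.
int GeneralCmdLineProcessor(int argc, const char *const *argv)
{
    SetCmdLine(argc, argv);
    return argc;
}

}  // namespace sketch
```

A driver could then call sketch::GetCmdLine() when writing its output and decide whether to store the result in a "history" metadata item.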

An open question is how the metadata can be saved to file without having to rewrite every application and/or every driver. Perhaps it could be embedded into GDALDataset or GDALDriver, with per-driver options allowing the history to be saved? There are also cases that do not warrant saving the command line, for example calls to gdalinfo.

Any hints or comments are welcome.

Attachments (3)

patch-addcmdline.txt (3.4 KB) - added by etourigny 5 years ago.
patch for implementation of SetCmdLine() and GetCmdLine()
code-addcmdline-netcdf.txt (2.1 KB) - added by etourigny 5 years ago.
code for the implementation in netcdfdataset.cpp
patch-addcmdline2.txt (2.9 KB) - added by etourigny 5 years ago.


Change History (10)

Changed 5 years ago by etourigny

Attachment: patch-addcmdline.txt added

patch for implementation of SetCmdLine() and GetCmdLine()

comment:1 Changed 5 years ago by etourigny

Create() and CreateCopy() are obvious entry points, but each driver has its own implementation of them, so there is no generic code to hook into.

How about when an existing file is modified? Perhaps GDALOpen() in update mode would be a suitable place?

What tests / variables can be used to check whether the dataset can update its own metadata (i.e. without PAM, so as not to create a PAM file)?

comment:2 Changed 5 years ago by Even Rouault

I'm not sure we really need to add a generic mechanism to add history. Personally, it seems like a marginal need, and sometimes even an undesirable one. I can imagine contexts where the data producer doesn't want to indicate the tool with which they produced the data. So I think this should be an optional choice, presumably a creation option (or configuration option if you also want to support the GDALOpen(,GA_Update) case) : REGISTER_HISTORY or whatever appropriate name that you could find ;-)

To answer your last question, there's no way to know if a dataset can update metadata, unless we introduce a new capability (something similar to DCAP_VIRTUALIO) that drivers would advertise. There are not so many drivers that support metadata updating, and often it comes at a cost (for TIFF, changing the metadata requires rewriting the whole IFD, which leads to wasted bytes, as the algorithm that allocates space is pretty dumb and will lose the space of the previous IFD).
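As a rough illustration of the opt-in suggested above, the sketch below only records history when an option is explicitly set to YES. It is self-contained and does not use GDAL: reading the option with std::getenv(), the option name REGISTER_HISTORY (only floated as a possible name in this comment), the helper names, and the newest-entry-first ordering are all assumptions of the sketch.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helpers; none of these names exist in GDAL.
namespace sketch {

// Treats only an explicit "YES" as opting in; unset or anything else is off.
bool IsHistoryEnabled(const char *pszValue)
{
    return pszValue != nullptr && std::string(pszValue) == "YES";
}

// Convenience wrapper that reads the option from the environment.
// A real implementation would presumably use GDAL's configuration options.
bool IsHistoryEnabledFromEnv()
{
    return IsHistoryEnabled(std::getenv("REGISTER_HISTORY"));
}

// Returns the new value of the history item. When disabled, the existing
// history is returned untouched. When enabled, the new entry is placed
// first, one entry per line.
std::string UpdateHistory(const std::string &osExisting,
                          const std::string &osEntry, bool bEnabled)
{
    assert(!osEntry.empty());
    if (!bEnabled)
        return osExisting;
    if (osExisting.empty())
        return osEntry;
    return osEntry + "\n" + osExisting;
}

}  // namespace sketch
```

With the option left unset, the history is unchanged, so a data producer who does not want to reveal the tool used gets that behaviour by default.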

comment:3 Changed 5 years ago by etourigny

OK I understand. I will look into a Configuration option so that it doesn't happen without the user explicitly asking for it. I'll just add it to netcdf for now, probably gtiff also.

Do you object if I apply this patch to trunk? It adds 2 functions to gdal.h and 2 static vars (char[1024]) in gdal_misc.cpp. I can also change it so that it uses only 1 variable. I should also add short documentation for the functions.

comment:4 Changed 5 years ago by Even Rouault

You might want to ask FrankW's opinion on this. Also I'm wondering if the whole stuff wouldn't require a small RFC so that all aspects of the picture are well captured, as it involves additions in core stuff and in various drivers.

I'd note that you should remove any C++ stuff (int bComplete=FALSE) from gdal.h. It would not work for C code that includes gdal.h.

comment:5 Changed 5 years ago by etourigny

Cc: warmerdam added

Right sorry, I will remove the conditional test, and only store one variable.

Frank, can you please comment on this?

My intention is to add 2 functions to gdal.h, store the command line in a static (hidden) variable in GDALGeneralCmdLineProcessor(), and optionally (with a config option) add the command line to a metadata item when calling Create(), CreateCopy() and GDALOpen(,GA_Update).

If you want I can write some code for gtiff and prepare a small RFC.

Changed 5 years ago by etourigny

Attachment: code-addcmdline-netcdf.txt added

code for the implementation in netcdfdataset.cpp

Changed 5 years ago by etourigny

Attachment: patch-addcmdline2.txt added

comment:6 Changed 5 years ago by etourigny

Resolution: wontfix
Status: new → closed

Will not implement this, as it causes security issues. For example, if a user opened a database connection with a user name and password on the command line, that information would be saved to the netcdf history metadata.

comment:7 Changed 5 years ago by etourigny

The driver saves a summary line when Create() or CreateCopy() is called, with no details except the output file.
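A summary line of that kind might be built as in the sketch below. The exact format the driver writes is not given in this ticket, so the layout, the function name and the parameters are all assumptions. The point is only that the line is built from the operation name and the output file, never from the raw command line, so connection strings and passwords cannot leak into it.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper; the real driver's format may differ.
namespace sketch {

// Builds a history entry from a caller-supplied timestamp, the operation
// name ("Create" or "CreateCopy") and the output file name only.
std::string SummaryLine(const std::string &osTimestamp,
                        const std::string &osOperation,
                        const std::string &osOutputFile)
{
    assert(!osOutputFile.empty());
    return osTimestamp + ": GDAL " + osOperation + "( " + osOutputFile +
           ", ... )";
}

}  // namespace sketch
```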
