Opened 7 years ago

Closed 15 months ago

#4903 closed enhancement (fixed)

[PATCH] Support for writing RasterAttributeTables by default for HFA/Imagine

Reported by: Robert Coup Owned by: Robert Coup
Priority: normal Milestone:
Component: GDAL_Raster Version: 1.9.2
Severity: normal Keywords: imagine rat rasterattributetable
Cc: thjc

Description

ERDAS Imagine (HFA) .img files can have RasterAttributeTable data, which is read via GDAL and the GDALRasterAttributeTable class. But HFADataset::CreateCopy() doesn't write out RAT data by default (via SetDefaultRAT()), so gdal_translate -of HFA some.img another.img doesn't preserve RAT data.

Attachments (11)

4903.diff (1.1 KB) - added by Robert Coup 7 years ago.
Initial patch
4903.1.diff (3.4 KB) - added by Robert Coup 7 years ago.
second patch, makes gdalwarp copy RATs as well when creating the output file.
4903.3.diff (8.1 KB) - added by Robert Coup 7 years ago.
Support for gdal_translate when not using CreateCopy?() (ie. via VRT), and preserving RATs in gtiff->gtiff conversions
4903.4.diff (10.8 KB) - added by thjc 7 years ago.
0001-Better-support-for-writing-RATs-4903-4903.5.diff.patch (329.8 KB) - added by thjc 7 years ago.
0001-Better-support-for-writing-RATs-4903.6.txt (32.7 KB) - added by thjc 7 years ago.
patchset v6
4903.7.diff (26.6 KB) - added by cdestigter 6 years ago.
4903.8.patch (40.0 KB) - added by cdestigter 5 years ago.
New patch updated to apply to latest trunk
4903.9.patch (39.7 KB) - added by cdestigter 5 years ago.
Removes a std::cout
4903.10.patch (39.5 KB) - added by cdestigter 5 years ago.
clean for latest trunk
4903.11.patch (40.0 KB) - added by stephend 5 years ago.
Update for latest trunk, KEA driver support

Download all attachments as: .zip

Change History (25)

Changed 7 years ago by Robert Coup

Attachment: 4903.diff added

Initial patch

comment:1 Changed 7 years ago by Robert Coup

Status: newassigned

Patch against trunk attached.

Now the only differences in gdalinfo output after an .img > .img conversion using gdal_translate are:

# no overviews (expected) # the type of the 'Histogram' field is 1 (float) instead of 0 (integer) # precision changes on float fields (eg. 0.65 -> 0.6509803921568628)

None of which are real problems afaict.

I'm getting some more intensive testing done by Imagine users on a wide range of datasets, and I'll add unit tests before committing this.

Changed 7 years ago by Robert Coup

Attachment: 4903.1.diff added

second patch, makes gdalwarp copy RATs as well when creating the output file.

Changed 7 years ago by Robert Coup

Attachment: 4903.3.diff added

Support for gdal_translate when not using CreateCopy?() (ie. via VRT), and preserving RATs in gtiff->gtiff conversions

comment:2 Changed 7 years ago by Even Rouault

A few comments about 4903.3.diff :

  • for gdalwarp, resampling methods other than nearest will alter original pixel values, possibly creating new ones. It is doubtful than copying the source RAT makes sense in that case
  • for the VRT driver, serialization/deserialization into/from the VRT XML should also be implemented

Changed 7 years ago by thjc

Attachment: 4903.4.diff added

comment:3 Changed 7 years ago by thjc

attached patch v4 that drops RAT's when using non nearest/mode transformations in gdalwarp if the layer is thematic.

General discussion around serialisation was that this can be put off until there is a request for the feature. In particular large RAT's may not work well in the VRT XML (see rfc 40)

comment:4 Changed 7 years ago by Even Rouault

4903.4.diff looks better. But I still have additional thoughts :

1) I think that the explicit copying of RAT is not necessary in the GTiff driver. It looks like it is already done by the PAM mechanism (gcore/gdalpamrasterband.cpp lines 613-624) thanks to the poDS->CloneInfo?( poSrcDS, GCIF_PAM_DEFAULT & ~GCIF_MASK ); call at line 8521. I've just tried "gdal_translate ../autotest/gcore/data/byte.img out.tif" with current trunk and found that the PAM out.tif.aux.xml contains the histogram as a RAT (so RATs are already serialized in PAM, which could cause problem for large RATs...). Not sure if this applies to the HFA driver too.

2) In gdal_translate.cpp and gdalwarp.cpp, there might be cases where RAT copying is inappropriate. For example for an histogram RAT, if you don't copy the whole extent of the source raster, then the source histogram isn't relevant. In gdal_translate.cpp, in CopyBandInfo?(), there's a flag that is passed to know if statistics can be copied. It could be used also in conjunction of examining the RAT nature to determine if copying it is appropriate. In gdalwarp, there's no such mechanism, but generally all operations done in gdalwarp will likely invalidate an existing histogram.

3) What would be great is to add new test cases for regression purpose in autotest/utilities/test_gdal_translate.py and autotest/utilities/test_gdalwarp.py to ensure that the RAT is copied (or not) during those operations.

comment:5 Changed 7 years ago by thjc

1) I will test this and confirm, Rob recalls that hfa->tiff->hfa didn't work without this code, maybe there is something subtle there, or maybe trunk has changed since his testing.

2) This starts to get quite complex, but here are some thoughts...

We have a few different cases...the first major pivot is whether the RAT is related to a thematic or an athematic layer. RAT data associated with a thematic layer is much more britle and is more likely to be useless after transformations. In order to simplify this I think we could promote thematic/athematic to a first class property and build it into the gdal_rat structure.

athematic (or continuous) RAT layers are pretty robust. Most transformations (including color interpolation) should still work out meaningful to some extent. Colour space scaling potentially messes with the data, but even then if its a relative scale it can still make sense...so I think we should preserve the RAT data for athematic layers.

RATs for thematic layers are fine with cropping etc, you may end up with some redundant data but the rest should be fine. colour interpolation with nearest (or median) should also be fine, so some gdalwarp operations still make sense. Any scaling or non nearest colour interpolation will of invalidate an athematic RAT.

Then we come to histograms, they are invalidated by any operation except 1 to 1 translation to another format. This is true whether it is thematic or athematic. However, we should be able to recalculate the histogram data anyway.

So to summarise:

A) track thematic/athematic in the RAT class

B) discard non histogram RAT for thematic bands where colour level changes occur (scaling, or non nearest interpolation)

C) discard histogram data for anything except 1:1 translations, and recalculate for formats that have built in histogram concept (i.e. HFA). (The alternative here is to recalculate for all output formats if we had histogram to start with...)

D) never discard non histogram RAT data for athematic layers (it might be useful, we can't really judge)

3) I will work on some test cases.

comment:6 Changed 7 years ago by thjc

Cc: thjc added

comment:7 Changed 7 years ago by thjc

okay, new patchset (number 5)

1) Your right, seems to work fine with this addition removed, so its removed now.

2) Added RAT Table Type and rules around discarding data in gdalwarp/gdal_translate

3) Added some test cases for gdal_translate. Existing test cases for gdalwarp seem to cover the basics of whether statistics are kept or not already...

comment:8 Changed 7 years ago by Even Rouault

I think that poNewRAT is leaked in gdal_translate.cpp

Would be cool to add @since GDAL 2.0 in the doxygen of the new methods/functions.

For the sake of readability of the changeset, I'd prefer the regenerated Python bindings not to be part of it (or stick with SWIG 1.3.40 for now)

I think it would be appropriate to advertize on the mailing list about the proposed addition of the RAT Table Type to have more eyes on that. Are we sure that thematic is the appropriate default value ? The doc could likely be extended to explain a bit more about those concepts.

Changed 7 years ago by thjc

patchset v6

comment:9 Changed 7 years ago by thjc

Fixed leak, cleaned SWIG files out and added GDAL 2.0 comments.

Changed 6 years ago by cdestigter

Attachment: 4903.7.diff added

comment:10 Changed 6 years ago by cdestigter

New patch fixes a bug in SetDefaultRAT() (it wrote the RAT to disk but didn't actually set band->poDefaultRAT )

comment:11 Changed 6 years ago by Even Rouault

Note that RFC 40 patches have landed in trunk now. Your patch probably needs "some" adaptation.

Changed 5 years ago by cdestigter

Attachment: 4903.8.patch added

New patch updated to apply to latest trunk

Changed 5 years ago by cdestigter

Attachment: 4903.9.patch added

Removes a std::cout

Changed 5 years ago by cdestigter

Attachment: 4903.10.patch added

clean for latest trunk

comment:12 Changed 5 years ago by Jukka Rahkonen

Summary: Support for writing RasterAttributeTables by default for HFA/Imagine[PATCH] Support for writing RasterAttributeTables by default for HFA/Imagine

Thanks for the contribution. I added PATCH to the summary for making it easier to find the patch for local compilation before someone finds time for making a review and integration.

comment:13 Changed 5 years ago by Even Rouault

Robert is a committer so this patch is hopefully not stuck

Changed 5 years ago by stephend

Attachment: 4903.11.patch added

Update for latest trunk, KEA driver support

comment:14 Changed 15 months ago by Even Rouault

Resolution: fixed
Status: assignedclosed

Implemented per https://github.com/OSGeo/gdal/pull/743 for GDAL 2.4

Note: See TracTickets for help on using tickets.