Opened 17 years ago

Closed 17 years ago

Last modified 17 years ago

#1757 closed enhancement (fixed)

[PATCH] Add support for A.TOC (NITF driver)

Reported by: Even Rouault
Owned by: warmerdam
Priority: normal
Milestone: 1.5.0
Component: GDAL_Raster
Version: svn-trunk
Severity: normal
Keywords: NITF A.TOC
Cc: warmerdam

Description

This patch adds support for the 'A.TOC' file used for NITF datasets. The A.TOC is a table of contents of sub-datasets. Each sub-dataset is a matrix of single NITF files.

Basically, this patch brings the same functionality that was available with the RPF driver in OGDI. However, I found that the OGDI RPF driver suffers from bugs that are already fixed in GDAL itself (and OGDI development is not very active anymore anyway), so I wasn't very enthusiastic about fixing them again in OGDI. The code for parsing A.TOC is directly transposed from the OGDI source code.

This patch supports A.TOC files with or without the NITF header.

gdalinfo /home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/A.TOC

gives

Driver: NITF/National Imagery Transmission Format
Size is 7680, 6144
Coordinate System is:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        TOWGS84[0,0,0,0,0,0,0],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9108"]],
    AXIS["Lat",NORTH],
    AXIS["Long",EAST],
    AUTHORITY["EPSG","4326"]]
Origin = (-86.802030456852791,41.379310344827587)
Pixel Size = (0.001783419381543,-0.001346105816721)
Subdatasets:
  SUBDATASET_1_NAME=NITF_TOC_ENTRY:CADRG_1M_2_0:/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/A.TOC
  SUBDATASET_1_DESC=CADRG:1M:2:0
Corner Coordinates:
Upper Left  ( -86.8020305,  41.3793103) ( 86d48'7.31"W, 41d22'45.52"N)
Lower Left  ( -86.8020305,  33.1088362) ( 86d48'7.31"W, 33d 6'31.81"N)
Upper Right ( -73.1053696,  41.3793103) ( 73d 6'19.33"W, 41d22'45.52"N)
Lower Right ( -73.1053696,  33.1088362) ( 73d 6'19.33"W, 33d 6'31.81"N)
Center      ( -79.9537000,  37.2440733) ( 79d57'13.32"W, 37d14'38.66"N)

Now you can get info on a specific subdataset:

gdalinfo NITF_TOC_ENTRY:CADRG_1M_2_0:/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/A.TOC

gives

Driver: VRT/Virtual Raster
Size is 7680, 6144
Coordinate System is:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        TOWGS84[0,0,0,0,0,0,0],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9108"]],
    AXIS["Lat",NORTH],
    AXIS["Long",EAST],
    AUTHORITY["EPSG","4326"]]
Origin = (-86.802030456852791,41.379310344827587)
Pixel Size = (0.001783419381543,-0.001346105816721)
Metadata:
  FILENAME_0=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/000GJ00B.ON2
  FILENAME_1=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/000GK00B.ON2
  FILENAME_2=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/000GL00B.ON2
  FILENAME_3=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/000GM00B.ON2
  FILENAME_4=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/000GN00B.ON2
  FILENAME_5=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/000CN00B.ON2
  FILENAME_6=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/000CP00B.ON2
  FILENAME_7=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/000CQ00B.ON2
  FILENAME_8=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/000CR00B.ON2
  FILENAME_9=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/000CS00B.ON2
  FILENAME_10=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/0008S00B.ON2
  FILENAME_11=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/0008T00B.ON2
  FILENAME_12=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/0008U00B.ON2
  FILENAME_13=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/0008V00B.ON2
  FILENAME_14=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/0008W00B.ON2
  FILENAME_15=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/0004W00B.ON2
  FILENAME_16=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/0004X00B.ON2
  FILENAME_17=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/0004Y00B.ON2
  FILENAME_18=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/0004Z00B.ON2
  FILENAME_19=/home/even/downloads/mapserver/CADRG/CADRG_L22/RPF/0005000B.ON2
Corner Coordinates:
Upper Left  ( -86.8020305,  41.3793103) ( 86d48'7.31"W, 41d22'45.52"N)
Lower Left  ( -86.8020305,  33.1088362) ( 86d48'7.31"W, 33d 6'31.81"N)
Upper Right ( -73.1053696,  41.3793103) ( 73d 6'19.33"W, 41d22'45.52"N)
Lower Right ( -73.1053696,  33.1088362) ( 73d 6'19.33"W, 33d 6'31.81"N)
Center      ( -79.9537000,  37.2440733) ( 79d57'13.32"W, 37d14'38.66"N)
Band 1 Block=128x128 Type=Byte, ColorInterp=Red
Band 2 Block=128x128 Type=Byte, ColorInterp=Green
Band 3 Block=128x128 Type=Byte, ColorInterp=Blue
Band 4 Block=128x128 Type=Byte, ColorInterp=Alpha
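
The subdataset names shown above have the shape NITF_TOC_ENTRY:<entry>:<path to A.TOC>. As a minimal sketch of how such a name could be taken apart (the struct and function names here are hypothetical and are not taken from the patch), the string can be split at the first two colons only, so that a path containing ':' itself is left intact:

```cpp
#include <string>

// Hypothetical holder for the two parts of a subdataset name.
struct TocEntryName {
    std::string entry;  // e.g. "CADRG_1M_2_0"
    std::string path;   // path to the A.TOC file
};

// Splits "NITF_TOC_ENTRY:<entry>:<path>" at the first two colons only, so a
// path that itself contains ':' (for example a Windows drive letter) survives.
// Returns false if the string does not have that shape.
bool ParseTocEntryName(const std::string& name, TocEntryName& out) {
    const std::string prefix = "NITF_TOC_ENTRY:";
    if (name.compare(0, prefix.size(), prefix) != 0)
        return false;
    const std::string::size_type sep = name.find(':', prefix.size());
    if (sep == std::string::npos || sep == prefix.size() ||
        sep + 1 == name.size())
        return false;  // missing entry name or missing path
    out.entry = name.substr(prefix.size(), sep - prefix.size());
    out.path = name.substr(sep + 1);
    return true;
}
```

Splitting on every colon would be simpler, but it would break as soon as the A.TOC path contains a drive letter.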

I've tested it with the CADRG datasets available at http://www.aeroplanner.com/cadrg/ (A.TOC files without NITF header) and with other CADRG datasets (A.TOC files with NITF header). I have no A.TOC available for other kinds of NITF datasets (CIB, ... ?). Hope that it'll work for them too.


Now, if you look at the source code, you will notice that it uses a mechanism of 'proxy datasets' instead of putting the raster bands directly into the VRT. Some big sub-datasets can be made of several hundred NITF images, and opening them all at once could hit the limit on open file descriptors, hence the need for this mechanism.
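
To illustrate the idea only, here is a small self-contained sketch of a pool that keeps a bounded number of frames open and closes the least recently used one when it is full. FramePool and OpenFrame are hypothetical names invented for this example; the patch's actual proxy classes may be organised quite differently, and a real implementation would hold dataset handles rather than just filenames.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an opened NITF frame; a real driver would hold
// a dataset handle (and therefore a file descriptor) here.
struct OpenFrame {
    std::string filename;
};

// Keeps at most maxOpen frames open, closing the least recently used one
// when a new frame is requested and the pool is full.
class FramePool {
public:
    explicit FramePool(std::size_t maxOpen) : maxOpen_(maxOpen) {
        assert(maxOpen > 0);
    }

    // Returns the open frame for filename, opening it on demand. The
    // reference stays valid only until that frame is evicted.
    OpenFrame& Acquire(const std::string& filename) {
        auto it = index_.find(filename);
        if (it != index_.end()) {
            // Already open: move it to the front (most recently used).
            lru_.splice(lru_.begin(), lru_, it->second);
            return lru_.front();
        }
        if (lru_.size() == maxOpen_) {
            // Pool full: "close" the least recently used frame.
            index_.erase(lru_.back().filename);
            lru_.pop_back();
            ++closeCount_;
        }
        lru_.push_front(OpenFrame{filename});
        index_[filename] = lru_.begin();
        ++openCount_;
        return lru_.front();
    }

    std::size_t CurrentlyOpen() const { return lru_.size(); }
    std::size_t OpenCount() const { return openCount_; }
    std::size_t CloseCount() const { return closeCount_; }

private:
    std::size_t maxOpen_;
    std::list<OpenFrame> lru_;  // front = most recently used
    std::unordered_map<std::string, std::list<OpenFrame>::iterator> index_;
    std::size_t openCount_ = 0;
    std::size_t closeCount_ = 0;
};
```

With a pool like this, the number of simultaneously open files is bounded by the pool size regardless of how many frames the TOC entry references, at the cost of reopening frames that were evicted.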

Attachments (2)

gdal_svn_nitf_a_dot_toc.patch (78.0 KB) - added by Even Rouault 17 years ago.
gdal_svn_trunk_rasterio_optim.patch (3.5 KB) - added by Even Rouault 17 years ago.
Optimization for GDALCopyWords


Change History (17)

comment:1 by warmerdam, 17 years ago

Milestone: 1.5.0
Status: new → assigned

Hmm, this patch will require some review....

comment:2 by Even Rouault, 17 years ago

Updated with an enhanced version of the patch that adds the metadata items 'NITF_SERIES_ABBREVIATION' and 'NITF_SERIES_NAME' to identify the precise type of the map (mainly for CADRG), for example whether it's a JOG, ONC or JNC map.

comment:3 by Even Rouault, 17 years ago

I have a question about VRTDataset and GetDescription().

I want to create an in-memory VRTDataset for the mosaic of individual NITF images. So I create it with:

poVirtualDS = poDriver->Create( "",
                                    sizeX * entry->nHorizFrames,
                                    sizeY * entry->nVertFrames,
                                    0, GDT_Byte, NULL);

So I was expecting that it would have no name and that there wouldn't be any attempt to write a physical .VRT file. However, I get the following warning:

ERROR 1: Failed to write .vrt file in FlushCache().

This warning shouldn't happen if GetDescription() returned "" as I was expecting. I finally discovered that GDALOpen contains the following code:

        poDS = poDriver->pfnOpen( &oOpenInfo );
        if( poDS != NULL )
        {
            if( strlen(poDS->GetDescription()) == 0 )
                poDS->SetDescription( pszFilename );
    
            if( poDS->poDriver == NULL )
                poDS->poDriver = poDriver;
    
            
            CPLDebug( "GDAL", "GDALOpen(%s) succeeds as %s.",
                      pszFilename, poDriver->GetDescription() );
    
            return (GDALDatasetH) poDS;
        }

so my VRTDataset ends up with a description which is the filename passed to the Open method of the driver... My question is: how can I force an empty description for my VRT? I realize that I may be doing something unexpected: I return a VRTDataset from the Open method of a driver that is supposed to return a NITFDataset...

I was thinking about a nasty workaround that would consist of setting a fake description on my dataset instead of an empty one, for example "<VRTDataset"... (see VRTDataset::FlushCache). But that's very ugly, so I prefer to live with this warning until someone comes up with a better idea.

comment:4 by Even Rouault, 17 years ago

Hmm, after writing things down, I suddenly have a better idea... In the end, it's not that bad that my VRTDataset description is the filename passed to GDALOpen. What I don't want is for the VRTDataset to be written. So what about adding a method to VRTDataset,

void SetWritable(int bWritable) { this->bWritable = bWritable; }

with bWritable set to TRUE by default in the constructor. In my case I would then just have to call SetWritable(FALSE), and FlushCache() would test the value of the flag. What do you think?
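
A minimal sketch of the proposed behaviour, using a simplified stand-in class rather than the real VRTDataset (SketchVRTDataset, MarkDirty and WriteCount are hypothetical names used only for this illustration, and the real FlushCache() does more than this):

```cpp
#include <string>

// Simplified stand-in for a VRT-like dataset; not the real GDAL class.
class SketchVRTDataset {
public:
    explicit SketchVRTDataset(const std::string& description)
        : description_(description) {}

    // Writable by default, as in the proposal above.
    void SetWritable(bool writable) { writable_ = writable; }
    void MarkDirty() { dirty_ = true; }
    const std::string& GetDescription() const { return description_; }

    // Returns true if a .vrt file would have been written.
    bool FlushCache() {
        // The flag is checked before any attempt to write, so a read-only
        // dataset keeps its description without producing a file or an error.
        if (!dirty_ || !writable_ || description_.empty())
            return false;
        dirty_ = false;
        ++writeCount_;  // a real implementation would serialize to disk here
        return true;
    }

    int WriteCount() const { return writeCount_; }

private:
    std::string description_;
    bool writable_ = true;
    bool dirty_ = false;
    int writeCount_ = 0;
};
```

The point of the flag is that the description can stay equal to the A.TOC filename while the write step is skipped entirely.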

comment:5 by warmerdam, 17 years ago

Even,

I agree with your followup analysis. Feel free to add the SetWritable() method and support for it in VRTDataset. We definitely want the dataset to have the filename of the TOC file, as application code expects to be able to fetch the description of a dataset and use that to reopen it later.

Ideally you should derive a GDALRPFTOCDataset class from VRTDataset so that you can provide some customized behavior for this format. In fact, I think all this TOC support belongs in a distinct RPFTOC driver rather than wedged into nitfdataset.cpp. Of course this other closely related driver could live in the gdal/frmts/nitf directory.

comment:6 by Even Rouault, 17 years ago

I've updated the patch with the addition of SetWritable in VRTDataset and I use it in my code. It fixes my issue efficiently and simply. This was useful too because I discovered that I wasn't including the right version of VRTDataset.h: I was including the one installed in /usr/include... Oops... GNUmakefile and makefile.vc now add "-I../vrt" to the include flags.

Your proposal of making a dedicated TOC driver sounds nice. I'll probably try to go in that direction.

comment:7 by Even Rouault, 17 years ago

Reworking the patch was not that difficult and it makes things much clearer.

So there are 3 new files:

  • RPFTOCFile.c : decoding of the A.TOC file
  • RPFTOCLib.h : header
  • RPFTOCDataset.cpp :
    • A specific GDAL driver is created
    • RPFTOCDataset extends GDALPamDataset
    • RPFTOCSubDataset extends VRTDataset

NITFDataset.cpp now has only a few modifications, basically to prevent A.TOC files in NITF format from being handled by the NITF driver.

by Even Rouault, 17 years ago

comment:8 by warmerdam, 17 years ago

Even,

Excellent - go ahead and commit the NITF / TOC changes.

Could you explain the rasterio.cpp change? I'm not clear on what this case is supposed to do.

comment:9 by Even Rouault, 17 years ago

The change in rasterio.cpp is just an optimization. I ran gdal_translate on an A.TOC subdataset and profiled it with kcachegrind and sysprof. That showed that a significant share of CPU time (around 20%) was spent in the GDALCopyWords call made at the beginning of VRTSourcedRasterBand::IRasterIO:

        for( iLine = 0; iLine < nBufYSize; iLine++ )
        {
            GDALCopyWords( &dfWriteValue, GDT_Float64, 0, 
                           ((GByte *)pData) + nLineSpace * iLine, 
                           eBufType, nPixelSpace, nBufXSize );
        }

But there's still room for optimization in the driver itself. I think that decoding the 4 bands at the same time could help reduce the CPU time spent in my ::Expand method (currently around 40-60%). To be investigated...

comment:10 by Even Rouault, 17 years ago

Patch committed in trunk in r11920 (except the optimization in rasterio.cpp).

comment:11 by Even Rouault, 17 years ago

In r11967, I've committed a change where paletted mode is the default instead of RGBA. Here's the explanation:

    /* In most cases, all the files inside a TOC entry share the same */
    /* palette and we could use it for the VRT. */
    /* In other cases like for CADRG801_France_250K (TOC entry CADRG_250K_2_2), */
    /* the file for Corsica and the file for Sardegna do not share the same palette */
    /* however they contain the same RGB triplets and are just ordered differently */
    /* So we can use the same palette */
    /* In the unlikely event where palettes would be incompatible, we can use the RGBA */
    /* option through the config option RPFTOC_FORCE_RGBA */

Performance is much better on large datasets when using paletted mode instead of RGBA.
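
As a rough illustration of what "same RGB triplets, ordered differently" means in practice, the sketch below checks whether every colour of one palette exists in a reference palette and, if so, builds an index remapping table. This is only one way such a compatibility check could look; BuildPaletteRemap and Rgb are hypothetical names and this is not the code committed in r11967.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using Rgb = std::array<std::uint8_t, 3>;

// Fills `remap` so that remap[i] is the index in `reference` of the colour
// stored at index i of `other`. Returns false if some colour of `other` is
// missing from `reference`, i.e. the palettes are incompatible and a fallback
// such as RGBA expansion would be needed.
bool BuildPaletteRemap(const std::vector<Rgb>& reference,
                       const std::vector<Rgb>& other,
                       std::vector<std::size_t>& remap) {
    std::map<Rgb, std::size_t> indexOf;
    for (std::size_t i = 0; i < reference.size(); ++i)
        indexOf.emplace(reference[i], i);  // keeps the first index of duplicates

    remap.clear();
    remap.reserve(other.size());
    for (const Rgb& colour : other) {
        const auto it = indexOf.find(colour);
        if (it == indexOf.end())
            return false;
        remap.push_back(it->second);
    }
    return true;
}
```

When the function returns true, pixel indices from the second file can be translated through the table and a single palette can serve the whole mosaic.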

comment:12 by Even Rouault, 17 years ago

Resolution: fixed
Status: assigned → closed

comment:13 by Even Rouault, 17 years ago

Cc: warmerdam added

Frank, when committing the RPFTOC driver, I left out an optimization in rasterio.cpp whose value you were wondering about. It can in fact be made more general (when nSrcPixelOffset==0), and I'm attaching a patch (gdal_svn_trunk_rasterio_optim.patch) with an updated version that offers the same speed improvements.

VRTSourcedRasterBand::IRasterIO and VRTDerivedRasterBand::IRasterIO begin by initializing the destination buffer with the noData value. Without the patch, this goes through the general case of GDALCopyWords, which for each destination pixel fetches the source pixel (always the same one here!), checks that the value is in the range of acceptable values for the destination type, casts it and writes it. The optimization relies on the fact that all of these steps except the write can be done once, outside the loop.
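
The idea can be sketched for the Byte destination case as follows. This is an illustration under simplifying assumptions, not the attached patch: ClampToByte and FillBytesWithConstant are hypothetical helpers, and the rounding rule used here is not guaranteed to match what GDALCopyWords does.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Converts a double to a byte with rounding and clamping to [0, 255].
// This is the per-pixel work that only needs doing once for a constant source.
inline std::uint8_t ClampToByte(double value) {
    if (std::isnan(value))
        return 0;
    const double rounded = std::floor(value + 0.5);
    return static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, rounded)));
}

// Fills `count` byte pixels spaced `pixelSpace` bytes apart with one value.
void FillBytesWithConstant(double value, std::uint8_t* dst,
                           std::size_t pixelSpace, std::size_t count) {
    const std::uint8_t converted = ClampToByte(value);  // hoisted out of the loop
    if (pixelSpace == 1) {
        std::memset(dst, converted, count);  // contiguous: one bulk write
        return;
    }
    for (std::size_t i = 0; i < count; ++i)
        dst[i * pixelSpace] = converted;  // only the write remains per pixel
}
```

The general path repeats the range check and cast for every pixel; here they run once per call, which is where the saving comes from when the source value never changes.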

Let the figures speak:

I ran "time gdal_translate NITF_TOC_ENTRY:CADRG_ONC_1M_2_0:CADRG/CADRG_L22/RPF/A.TOC tmp.tif" repeatedly (this dataset can be downloaded at http://www.aeroplanner.com/cadrg/Samples/CADRG_L22.zip), so that I/O caching has a similar effect on all the runs and 'time' reports results within a small range around a mean value.

A typical run without the optimization gives: real 0m2.941s, user 0m2.404s, sys 0m0.360s

A typical run with the optimization gives: real 0m0.710s, user 0m0.260s, sys 0m0.348s

On a much larger CADRG dataset I have, without the optimization: real 0m51.779s, user 0m33.302s, sys 0m2.976s

With the optimization: real 0m29.884s, user 0m6.496s, sys 0m4.324s
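
For reference, these figures correspond to a wall-clock speedup of roughly 4.1x on the small dataset and 1.7x on the large one, and to a reduction in user CPU time of roughly 9x and 5x respectively. A small sketch of that arithmetic, with hypothetical helper names, parsing the "<minutes>m<seconds>s" strings as printed by 'time':

```cpp
#include <cstdlib>
#include <string>

// Parses the "<minutes>m<seconds>s" format printed by the shell's `time`
// (e.g. "0m2.941s") into seconds. Returns -1.0 on malformed input.
double ParseTimeSeconds(const std::string& text) {
    const std::string::size_type m = text.find('m');
    if (text.empty() || m == std::string::npos || m == 0 || text.back() != 's')
        return -1.0;
    const double minutes = std::strtod(text.substr(0, m).c_str(), nullptr);
    // Skip the 'm' and drop the trailing 's'.
    const double seconds =
        std::strtod(text.substr(m + 1, text.size() - m - 2).c_str(), nullptr);
    return minutes * 60.0 + seconds;
}

// Ratio of the time before the optimization to the time after it.
// Both arguments are assumed to be well-formed, non-zero timings.
double Speedup(const std::string& before, const std::string& after) {
    return ParseTimeSeconds(before) / ParseTimeSeconds(after);
}
```

For example, Speedup("0m2.941s", "0m0.710s") gives the wall-clock ratio for the small dataset.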

What do you think?

by Even Rouault, 17 years ago

Optimization for GDALCopyWords

comment:14 by warmerdam, 17 years ago

Even,

I'm sorry, but I don't have time to review this patch for at least a few days. So instead, if you are comfortable with it, please go ahead and apply it in trunk.

comment:15 by Even Rouault, 17 years ago

gdal_svn_trunk_rasterio_optim.patch committed in trunk in r12243
