Opened 11 years ago

Closed 10 years ago

Last modified 10 years ago

#3295 closed defect (fixed)

Accessing some JPEG2000 files takes an extraordinary amount of memory.

Reported by: warmerdam Owned by: warmerdam
Priority: high Milestone: 1.7.2
Component: GDAL_Raster Version: unspecified
Severity: normal Keywords: jp2kak
Cc: gaopeng


(as reported to the kakadu_jpeg2000 mailing list by myself)

I have a 230MB jpeg2000 file with which I am experiencing unexpectedly high memory use when I decode it for viewing with Kakadu. The -record report on the file is:


I have found if I transcode it like this, viewing memory use is dramatically reduced:

kdu_transcode -i N05-E132.jp2 -o optimized.jpc
   'Cprecincts={128,128}' Corder=RPCL ORGgen_plt=yes

I find that "kdu_expand -i N05-E132.jp2 -o out.tif" promptly rises to around 395MB resident memory use with the unoptimized file, while the optimized file only rises to about 72MB. I assume the original jp2 was generated in an organization that is hostile for low-memory viewer access, but my core question is whether there are things I can do with my viewer application or even kdu_expand to reduce memory use on the original file.

Change History (8)

comment:1 Changed 11 years ago by warmerdam

Priority: normal → high
Status: new → assigned

David Taubman reports:

> Indeed, the image has no doubt been compressed without the
> use of precincts and with very large tiles so it is absolutely
> unsuitable for random access. I imagine it does not have any
> random access pointers inside it, but even if it does there cannot
> be sufficient of them to be useful because there are not enough
> distinct entities to point to.
> Interactive apps like kdu_show open the codestream in a persistent
> mode, which means that information which must be parsed
> sequentially will not be flushed from memory -- otherwise the
> interactive app would have to parse it all over again from scratch.
> For an image like this, however, you could set up a decompression
> engine to decompress/render just a single region of interest while
> keeping the codestream in non-persistent mode -- and then if
> you want another region, you can open the whole thing again from
> scratch, again in non-persistent mode. To see how effective this
> would be in keeping memory down, you could invoke kdu_expand
> on a reduced region of interest (-region argument) and see
> how much memory it consumes.
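One way to run the comparison suggested above is to record the peak resident memory of each kdu_expand invocation. The following is a minimal sketch for Unix-like systems only; the helper name `peak_child_rss` is hypothetical, and the commands to measure (with or without a -region argument) are left to the caller.

```python
import resource
import subprocess
from typing import Sequence


def peak_child_rss(command: Sequence[str]) -> int:
    """Run command to completion and return the peak resident set size
    recorded for the waited-for children of this process.

    Units are platform dependent: kilobytes on Linux, bytes on macOS.
    The value is a high-water mark over every child run so far, so start
    a fresh interpreter for each command being compared.
    """
    subprocess.run(command, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


# Example call, using the command line quoted earlier in this ticket:
# peak_child_rss(["kdu_expand", "-i", "N05-E132.jp2", "-o", "out.tif"])
```

Because the figure is a high-water mark, measuring the full-image run and the reduced-region run in separate processes keeps the two numbers independent.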

David Burken also indicated on the list that he has had success reducing memory use with use of the region decompressor on an image that appears to have the same configuration.

I'm going to experiment with the region decompressor, though I suspect we will be sacrificing speed for reduced memory footprint.

comment:2 Changed 11 years ago by warmerdam

In testing, I have confirmed that use of non-persistent reading keeps memory use manageable; however, it can come at a severe cost in performance. Basically the file is being reopened for each read pass.

I have implemented support for non-persistent reading in the JP2KAKDataset::DirectRasterIO() method as an option. Currently it is enabled if the JP2KAK_PERSIST configuration option is NO, or if it is unset and the image has more than 100 million pixels and Cuse_precincts is false. I *think* it is the lack of precincts in large images that is killing us, though I may need to clarify this with Dr. Taubman or via experiments.
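The selection rule described above can be sketched as follows. This is a small Python illustration rather than the driver's C++ code: the function name `prefer_non_persistent` and its parameters are hypothetical, and it assumes that only the literal value NO (case-insensitive) requests non-persistent reads when the option is set.

```python
from typing import Optional

# Threshold taken from the description above: "more than 100 million pixels".
LARGE_IMAGE_PIXELS = 100_000_000


def prefer_non_persistent(persist_option: Optional[str],
                          width: int,
                          height: int,
                          use_precincts: bool) -> bool:
    """Return True when non-persistent reading would be selected.

    persist_option models the JP2KAK_PERSIST configuration option
    (None means unset). Hypothetical helper, not the driver's code.
    """
    if persist_option is not None:
        # An explicit setting wins: only NO requests non-persistent reads.
        return persist_option.strip().upper() == "NO"
    # Unset: fall back to the heuristic for large images without precincts.
    return width * height > LARGE_IMAGE_PIXELS and not use_precincts
```

For the 43221 x 36021 image in this ticket, with the option unset and Cuse_precincts false, the sketch selects non-persistent reading; setting the option to YES would keep the codestream persistent regardless of image size.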

Currently this non-persistent approach is not used in the IReadBlock() method which I would like to try and restructure to go through DirectRasterIO() too at some point.

The changes for non-persistent reading are currently only in trunk (r18341). I would like to address the IReadBlock() issue before trying to migrate to 1.6-esri branch.

comment:3 Changed 10 years ago by warmerdam

A preliminary effort has been completed implementing IReadBlock() in terms of DirectRasterIO(). In the process I have removed "support" for Float32 bands, and the optimization of reading all RGB bands at once when the image is YCbCr encoded has been lost. The change is in trunk (r18659).

comment:4 Changed 10 years ago by warmerdam

I have added support for YCbCr images being read in one pass, and fixed a few bugs in trunk (r18660).

I have backported this driver in its entirety into the 1.6-esri branch (r18661). It is quite possible I have broken some things in this backport, though rudimentary checks show the new features seem to be operational.

Please let me know if there are problems.

comment:5 Changed 10 years ago by gaopeng

It works on one simple test image, but crashes on another, which uses the new feature. The problem image, N05-E132.jp2, is at ftp::GDAL@….

comment:6 Changed 10 years ago by warmerdam

I have reproduced a problem with this image:

warmerda@gdal64[9]% gdal_translate -outsize 1000 1000 N05-E132.jp2 out.tif
JP2KAK: Using 1 threads.
JP2KAK: Cuse_precincts=0, PreferNonPersistentReads=1
JP2KAK: order=LRCP
JP2KAK: ycc=true
JP2KAK: nResCount=6
GDALJP2Metadata: Got projection from GeoJP2 (geotiff) box: GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433],AUTHORITY["EPSG","4326"]]
GDAL: GDALOpen(N05-E132.jp2, this=0x61c460) succeeds as JP2KAK.
Input file size is 43221, 36021
0GDAL: GDALOpen(GTIFF_RAW:out.tif, this=0x63b4a0) succeeds as GTiff.
GDAL: GDALDatasetCopyWholeRaster(): 1000*1000 swaths, bInterleave=1
GDAL: GDALDefaultOverviews::OverviewScan()
JP2KAK: DirectRasterIO() for 0,0,43221,36021 -> 1350x1125 -> 1000x1000 (non-persistent)
gdal_translate: ../compressed/codestream.cpp:704: void kd_compressed_input::seek(kdu_long): Assertion `!throw_markers' failed.
Abort (core dumped)

Digging deeper.

comment:7 Changed 10 years ago by warmerdam

Milestone: 1.7.2
Resolution: fixed
Status: assigned → closed

Calling the set_fussy() method sets throw_markers which seems to make it illegal to do seeks under some circumstances. The set_fussy() (and set_resilient()) method does not appear to serve any purpose other than producing extra warnings for improper jpeg2000 streams, so I have disabled the call and now things seem to work for the problem image.

The change has been applied to trunk (r18787), 1.7-branch (r18788) and 1.6-esri branch (r18789).

comment:8 Changed 10 years ago by gaopeng

Tested, and it works great. Thanks.
