Opened 15 years ago

Closed 15 years ago

#2871 closed defect (invalid)

Histogram with Python 2.5 different than with 2.3.5

Reported by: Flewellyn
Owned by: hobu
Priority: high
Milestone:
Component: PythonBindings
Version: 1.6.0
Severity: major
Keywords:
Cc:

Description

I've recently begun migrating my Python scripts which use GDAL/OGR to Python 2.5, and I noticed that GetHistogram() returns a substantially different histogram for the same image under 2.5 than it does under 2.3.5. The file in question is a GeoTIFF, and in both cases I am calling GetHistogram() on the same raster band, band 1.

Under 2.3.5, this is the histogram I get:

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 15, 93, 243, 253, 464, 988, 1127, 754, 982, 1348, 1501, 1251, 1808, 2318, 2607, 2940, 2629, 2366, 2427, 2498, 2021, 1698, 999, 579, 206, 148, 52, 35, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 65504]

While, under 2.5, this is the result:

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 35, 65, 98, 159, 204, 113, 177, 168, 340, 393, 475, 586, 531, 545, 341, 291, 268, 252, 258, 186, 125, 64, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 12022]

I have attached the GeoTIFF in question to this report.

Attachments (1)

km0jxa38d4dwbae3.tif (297.6 KB ) - added by Flewellyn 15 years ago.
GeoTIFF used as input.

Change History (5)

by Flewellyn, 15 years ago

Attachment: km0jxa38d4dwbae3.tif added

GeoTIFF used as input.

comment:1 by Even Rouault, 15 years ago

I've called GetHistogram() on your dataset with Python 2.5.2, 2.4.5 and 2.3.3, and with all 3 versions, I get the same result as you with Python 2.5.

Are you sure that your Python 2.3 GDAL bindings are running against the same GDAL version as your Python 2.5 bindings? Are you sure that the Python 2.3 bindings were compiled properly, against the right headers?
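
A quick way to answer the version question is to print the interpreter and GDAL versions side by side under each Python installation. The snippet below is only a sketch: the helper name describe_environment is made up for illustration, and it assumes the bindings are importable either as osgeo.gdal or, in older layouts, as a top-level gdal module.

```python
import sys


def describe_environment():
    """Return a one-line summary of the Python and GDAL versions in use."""
    try:
        from osgeo import gdal  # package layout used by newer bindings
    except ImportError:
        try:
            import gdal  # assumption: older top-level module layout
        except ImportError:
            gdal = None

    if gdal is None:
        gdal_version = "not installed"
    else:
        # VersionInfo("RELEASE_NAME") reports the release string of the
        # GDAL library the bindings are linked against.
        gdal_version = gdal.VersionInfo("RELEASE_NAME")

    python_version = sys.version.split()[0]
    return "Python %s, GDAL %s" % (python_version, gdal_version)


print(describe_environment())
```

Running this once per interpreter should show whether both are really talking to the same GDAL release.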

comment:2 by Flewellyn, 15 years ago

Yikes. I went and checked again, and apparently the Python 2.3 installation was using GDAL 1.4.0! This is definitely not good.

I will have to figure out how and why GDAL's Histogram calculations changed from 1.4 to 1.6.

The problem is, from what I can see, the 1.4 version seems more "correct" to me, in that it matches better with what the density slice I use does. If I use the histogram returned by 1.6, the image ends up with "holes" that aren't classified.

comment:3 by Flewellyn, 15 years ago

I've done some more testing, and I believe the histogram calculations using GDAL 1.6.0 are incorrect.

Using Python, I loaded the image, grabbed raster band 1, and then grabbed the band data using ReadAsArray(). When I iterated over the array counting the values, I got a histogram identical to the one produced by the 1.4 version of GDAL.

I ran this script using both Python 2.3 with GDAL 1.4, and Python 2.5 with GDAL 1.6. The histograms I calculated using this exhaustive method are identical to the histogram produced by calling GetHistogram() from Python 2.3 with GDAL 1.4.
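
For reference, the exhaustive count described above could look roughly like the following. This is a sketch rather than the exact script I ran: the function name count_values is illustrative, and it assumes integer pixel values in the range 0 to 255 (which matches the 256-entry histograms in this ticket). It accepts any iterable of rows, so the 2-D array returned by ReadAsArray() can be passed in directly.

```python
def count_values(rows, buckets=256):
    """Count every pixel value, with no sampling or approximation."""
    counts = [0] * buckets
    for row in rows:
        for value in row:
            index = int(value)
            # Values outside 0..buckets-1 are ignored rather than clamped.
            if 0 <= index < buckets:
                counts[index] += 1
    return counts


# Hypothetical usage with a band opened through GDAL:
#     histogram = count_values(band.ReadAsArray())

# Small self-check on a hand-made 2 x 3 "image".
sample = [[0, 1, 1],
          [255, 255, 3]]
result = count_values(sample)
print(result[0], result[1], result[3], result[255])
```

The pure-Python loop is slow on large rasters, but it touches every pixel, which is the point of the comparison.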

comment:4 by Flewellyn, 15 years ago

Resolution: invalid
Status: new → closed

Okay, I found the problem.

Apparently, in the new version, the "approx_ok" parameter to GetHistogram() defaults to 1, instead of 0, as it used to. So when I was calling GetHistogram(), I was getting an approximation, when what I need is an exact calculation.

So all I have to do is change all my calls to GetHistogram() to specify "approx_ok=0", and I'm good to go.
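
In case it helps anyone else, the change amounts to something like the sketch below. The wrapper name exact_histogram is just for illustration; the only GDAL call involved is GetHistogram() with the approx_ok keyword mentioned above, and every other argument is left at its default.

```python
def exact_histogram(band):
    """Return the histogram of a raster band computed from every pixel."""
    # approx_ok=0 asks for an exact calculation instead of an
    # approximation; all other GetHistogram() arguments keep their defaults.
    return band.GetHistogram(approx_ok=0)


# Hypothetical usage, assuming the GDAL Python bindings are installed:
#     from osgeo import gdal
#     dataset = gdal.Open("km0jxa38d4dwbae3.tif")
#     histogram = exact_histogram(dataset.GetRasterBand(1))
```

Wrapping the call in one place means the keyword only has to be spelled out once instead of at every call site.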

I will mark this bug as closed.
