Ticket #930 (closed task: fixed)

Opened 2 years ago

Last modified 2 years ago

[raster] ST_SummaryStats

Reported by: dustymugs Owned by: dustymugs
Priority: medium Milestone: PostGIS 2.0.0
Component: raster Version: trunk
Keywords: history Cc:

Description

Just as ST_AsGDALRaster is the backend for ST_AsTIFF, ST_AsJPEG and ST_AsPNG, ST_SummaryStats is the backend for several summary statistics:

1. Count of the population/sample included in the stats

2. Mean (ST_Mean or is ST_Average better?)

3. Standard Deviation (ST_StdDev)

4. Min/Max (ST_MinMax)

The proposed variations are:

1. ST_SummaryStats(rast raster, nband int, ignore_nodata boolean) -> record

returns one record of five columns (count, mean, stddev, min, max)

nband: index of band to process on

ignore_nodata: if TRUE, any pixel whose value is nodata is ignored.

ST_SummaryStats(rast, 2, TRUE)

2. ST_SummaryStats(rast raster, nband int) -> record

assumes ignore_nodata = TRUE

ST_SummaryStats(rast, 2)

3. ST_SummaryStats(rast raster, ignore_nodata boolean) -> record

assumes band index = 1

ST_SummaryStats(rast, FALSE)

4. ST_SummaryStats(rast raster) -> record

assumes band index = 1 and ignore_nodata = TRUE

ST_SummaryStats(rast)
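As a rough illustration of what the full three-argument variant is meant to compute, here is a minimal Python sketch. The name `summary_stats`, the flat list of pixel values standing in for a band, and the choice of the population standard deviation are all assumptions made for this example; the ticket does not specify them, and this is not the PostGIS implementation.

```python
import math
from typing import NamedTuple, Optional, Sequence


class SummaryStats(NamedTuple):
    """The five columns proposed for the returned record."""
    count: int
    mean: Optional[float]
    stddev: Optional[float]
    min: Optional[float]
    max: Optional[float]


def summary_stats(band: Sequence[float],
                  nodata: Optional[float] = None,
                  ignore_nodata: bool = True) -> SummaryStats:
    """Hypothetical stand-in for ST_SummaryStats on one band's pixel values."""
    # Drop nodata pixels only when asked to and when the band defines a nodata value.
    if ignore_nodata and nodata is not None:
        values = [v for v in band if v != nodata]
    else:
        values = list(band)

    n = len(values)
    if n == 0:
        # Nothing left to summarize.
        return SummaryStats(0, None, None, None, None)

    mean = sum(values) / n
    # Population standard deviation (an assumption; the ticket does not say which form).
    stddev = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return SummaryStats(n, mean, stddev, min(values), max(values))
```

With a band of `[1, 2, 3, -9999]` and a nodata value of `-9999`, ignoring nodata summarizes three pixels, while `ignore_nodata=False` counts all four and lets the nodata value drag down the minimum and the mean.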

Four approximation functions are also proposed, sacrificing some accuracy for speed, especially on large rasters (10000 x 10000).

1. ST_SummaryStats(rast raster, nband int, ignore_nodata boolean, sample_percent double precision) -> record

sample_percent: a value between 0 and 1 indicating the fraction of the raster band's pixels to consider when computing the stats.

ST_SummaryStats(rast, 3, FALSE, 0.1)

ST_SummaryStats(rast, 1, TRUE, 0.5)

2. ST_SummaryStats(rast raster, ignore_nodata boolean, sample_percent double precision) -> record

assumes that nband = 1

ST_SummaryStats(rast, FALSE, 0.01)

ST_SummaryStats(rast, TRUE, 0.025)

3. ST_SummaryStats(rast raster, sample_percent double precision) -> record

assumes that nband = 1 and ignore_nodata = TRUE

ST_SummaryStats(rast, 0.25)

4. ST_SummaryStats(rast raster) -> record

assumes that nband = 1, ignore_nodata = TRUE and sample_percent = 0.1

ST_SummaryStats(rast)
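The trade-off behind sample_percent can be sketched in Python as follows. This is only an illustration under stated assumptions: the name `approx_summary_stats`, uniform random sampling, sampling before nodata filtering, the `seed` parameter, and rejecting a sample_percent of 0 are all choices made for the example, not details taken from the ticket or from PostGIS.

```python
import math
import random
from typing import Optional, Sequence, Tuple

# (count, mean, stddev, min, max)
Stats = Tuple[int, Optional[float], Optional[float], Optional[float], Optional[float]]


def approx_summary_stats(band: Sequence[float],
                         nodata: Optional[float] = None,
                         ignore_nodata: bool = True,
                         sample_percent: float = 0.1,
                         seed: Optional[int] = None) -> Stats:
    """Hypothetical sketch: summarize a random subset of a band's pixels."""
    # Assumption: 0 is rejected because an empty sample has no stats.
    if not 0 < sample_percent <= 1:
        raise ValueError("sample_percent must be in (0, 1]")

    pixels = list(band)
    if not pixels:
        return (0, None, None, None, None)

    # Assumption: draw a uniform random sample of the requested size.
    k = max(1, round(len(pixels) * sample_percent))
    sample = random.Random(seed).sample(pixels, k)

    # Assumption: nodata pixels are filtered after sampling.
    if ignore_nodata and nodata is not None:
        sample = [v for v in sample if v != nodata]

    n = len(sample)
    if n == 0:
        return (0, None, None, None, None)

    mean = sum(sample) / n
    stddev = math.sqrt(sum((v - mean) ** 2 for v in sample) / n)
    return (n, mean, stddev, min(sample), max(sample))
```

Because only a subset of pixels is inspected, the reported min can be higher and the reported max lower than the true extremes; a sample_percent of 1 reduces to the exact computation.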

New tickets for ST_Mean and ST_StdDev will be posted next.

Functions that can depend upon the basic stats (ST_Histogram and ST_Quantile) will be proposed later.

Attachments

st_summarystats.patch (27.5 KB) - added by dustymugs 2 years ago.
Incremental patch adding ST_SummaryStats function. ST_Band patch must be merged first.

Change History

Changed 2 years ago by pracine

"ignore_nodata boolean" should be replaced with "hasnodata boolean " to be more consistent with the way we already specify to take into account or ignore nodata values in ST_Intersects and eventually in ST_DumpAsPolygons and ST_Intersection. The logic, in this case, is inverted: when FALSE we do not take nodata values into account.

Users can also normally just do ST_SummaryStats(ST_SetBandNoDataValue(rast, NULL)) to get the same result.

We expect to have a similar set of functions taking a geometry (any kind: multipoint, lines, polygons) to limit the stats to the area of this geometry.

Thanks dustymugs

Changed 2 years ago by dustymugs

Replying to pracine:

"ignore_nodata boolean" should be replaced with "hasnodata boolean " to be more consistent with the way we already specify to take into account or ignore nodata values in ST_Intersects and eventually in ST_DumpAsPolygons and ST_Intersection. The logic, in this case, is inverted: when FALSE we do not take nodata values into account.

Thanks for the correction. I'll make the appropriate changes to what has been written.

We expect to have a similar set of functions taking a geometry (any kind: multipoint, lines, polygons) to limit the stats to the area of this geometry.

I'll get to those once I get the simple case complete.

  Changed 2 years ago by dustymugs

  • status changed from new to assigned

A set of ST_SummaryStats and ST_ApproxSummaryStats variations for processing coverages:

1. ST_SummaryStats(rastertable text, rastercolumn text, nband int, hasnodata boolean) -> double precision

ST_SummaryStats('tmax_2010', 'rast', 1, FALSE)

ST_SummaryStats('precip_2011', 'rast', 1, TRUE)

2. ST_SummaryStats(rastertable text, rastercolumn text, nband int) -> double precision

hasnodata is set to FALSE

ST_SummaryStats('tmax_2010', 'rast', 1)

3. ST_SummaryStats(rastertable text, rastercolumn text, hasnodata boolean) -> double precision

nband is set to 1

ST_SummaryStats('precip_2011', 'rast', TRUE)

4. ST_SummaryStats(rastertable text, rastercolumn text) -> double precision

nband is set to 1 and hasnodata is set to FALSE

ST_SummaryStats('tmin_2009', 'rast')

Variations for ST_ApproxSummaryStats are:

1. ST_ApproxSummaryStats(rastertable text, rastercolumn text, nband int, hasnodata boolean, sample_percent double precision) -> double precision

ST_ApproxSummaryStats('tmax_2010', 'rast', 1, FALSE, 0.5)

ST_ApproxSummaryStats('precip_2011', 'rast', 1, TRUE, 0.2)

2. ST_ApproxSummaryStats(rastertable text, rastercolumn text, nband int, sample_percent double precision) -> double precision

hasnodata is set to FALSE

ST_ApproxSummaryStats('tmax_2010', 'rast', 1, 0.5)

ST_ApproxSummaryStats('precip_2011', 'rast', 1, 0.2)

3. ST_ApproxSummaryStats(rastertable text, rastercolumn text, hasnodata boolean, sample_percent double precision) -> double precision

nband is set to 1

ST_ApproxSummaryStats('tmax_2010', 'rast', FALSE, 0.5)

ST_ApproxSummaryStats('precip_2011', 'rast', TRUE, 0.2)

4. ST_ApproxSummaryStats(rastertable text, rastercolumn text, sample_percent double precision) -> double precision

nband is set to 1 and hasnodata is set to FALSE

ST_ApproxSummaryStats('tmax_2010', 'rast', 0.5)

ST_ApproxSummaryStats('precip_2011', 'rast', 0.2)

5. ST_ApproxSummaryStats(rastertable text, rastercolumn text) -> double precision

nband is set to 1, hasnodata is set to FALSE and sample_percent is set to 0.1

ST_ApproxSummaryStats('tmax_2010', 'rast')

ST_ApproxSummaryStats('precip_2011', 'rast')

The mean returned in these functions is a weighted mean of the means of each raster tile. The standard deviation returned is the cumulative standard deviation of all raster tiles.
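One way to combine per-tile results into coverage-wide values is sketched below in Python. The function name `combine_tile_stats` and the (count, mean, stddev) tuple layout are hypothetical, and the pooled-variance formula assumes each tile reports a population standard deviation; the patch itself may accumulate the standard deviation differently.

```python
import math
from typing import Iterable, Optional, Tuple

TileStats = Tuple[int, float, float]  # (count, mean, stddev) for one tile


def combine_tile_stats(tiles: Iterable[TileStats]
                       ) -> Tuple[int, Optional[float], Optional[float]]:
    """Hypothetical sketch: merge per-tile stats into coverage-wide stats."""
    # Tiles with no counted pixels contribute nothing.
    tiles = [t for t in tiles if t[0] > 0]
    total = sum(n for n, _, _ in tiles)
    if total == 0:
        return (0, None, None)

    # Weighted mean of the tile means, weighted by each tile's pixel count.
    mean = sum(n * m for n, m, _ in tiles) / total

    # Pooled population variance: within-tile variance plus the squared
    # offset of each tile mean from the overall mean, weighted by count.
    variance = sum(n * (sd ** 2 + (m - mean) ** 2) for n, m, sd in tiles) / total
    return (total, mean, math.sqrt(variance))
```

For two tiles holding the values 1, 2, 3 and 4, 5, 6, 7, combining the per-tile stats gives a mean of 4 and a standard deviation of 2, the same values obtained by summarizing all seven pixels at once.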

Changed 2 years ago by dustymugs

Incremental patch adding ST_SummaryStats function. ST_Band patch must be merged first.

  Changed 2 years ago by dustymugs

Attached patch for ST_SummaryStats function. ST_SummaryStats is the base function for ST_Mean, ST_StdDev, ST_MinMax, ST_Histogram and ST_Quantile. This patch merges cleanly with r7145.

The patch for ST_Band must be merged before this patch.

  Changed 2 years ago by dustymugs

  • keywords history added
  • status changed from assigned to closed
  • resolution set to fixed

Added in r7148.

  Changed 2 years ago by dustymugs

  • milestone changed from PostGIS Raster Future to PostGIS 2.0.0