Opened 14 years ago

Closed 14 years ago

Last modified 14 years ago

#930 closed task (fixed)

[raster] ST_SummaryStats

Reported by: dustymugs Owned by: dustymugs
Priority: medium Milestone: PostGIS 2.0.0
Component: raster Version: master
Keywords: history Cc:

Description

Just as ST_AsGDALRaster is the backend for ST_AsTIFF, ST_AsJPEG and ST_AsPNG, ST_SummaryStats is the backend for several summary statistics:

  1. Count of the population/sample included in the stats
  2. Mean (ST_Mean or is ST_Average better?)
  3. Standard Deviation (ST_StdDev)
  4. Min/Max (ST_MinMax)

The proposed variations are:

  1. ST_SummaryStats(rast raster, nband int, ignore_nodata boolean) → record

returns one record of five columns (count, mean, stddev, min, max)

nband: index of the band to process

ignore_nodata: if TRUE, any pixel whose value is nodata is ignored.

ST_SummaryStats(rast, 2, TRUE)

  2. ST_SummaryStats(rast raster, nband int) → record

assumes ignore_nodata = TRUE

ST_SummaryStats(rast, 2)

  3. ST_SummaryStats(rast raster, ignore_nodata boolean) → record

assumes band index = 1

ST_SummaryStats(rast, FALSE)

  4. ST_SummaryStats(rast raster) → record

assumes band index = 1 and ignore_nodata = TRUE

ST_SummaryStats(rast)
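
To make the intended semantics of the five returned columns concrete, here is a minimal Python sketch, not the patch itself. It assumes a band is just a flat list of pixel values, that stddev is the population standard deviation (the ticket does not say which form is used), and the names summary_stats and SummaryStats are hypothetical.

```python
import math
from typing import NamedTuple, Optional, Sequence


class SummaryStats(NamedTuple):
    """Hypothetical stand-in for the five-column record described above."""
    count: int
    mean: Optional[float]
    stddev: Optional[float]
    min: Optional[float]
    max: Optional[float]


def summary_stats(pixels: Sequence[float],
                  nodata: Optional[float] = None,
                  ignore_nodata: bool = True) -> SummaryStats:
    """Compute count, mean, stddev, min and max for one band.

    Assumption: stddev is the population standard deviation.
    """
    if ignore_nodata and nodata is not None:
        # Drop every pixel whose value equals the band's nodata value.
        values = [p for p in pixels if p != nodata]
    else:
        values = list(pixels)

    count = len(values)
    if count == 0:
        # Nothing left to summarize.
        return SummaryStats(0, None, None, None, None)

    mean = sum(values) / count
    variance = sum((v - mean) ** 2 for v in values) / count
    return SummaryStats(count, mean, math.sqrt(variance), min(values), max(values))


band = [1, 2, 3, -9999, 6]
with_nodata_ignored = summary_stats(band, nodata=-9999, ignore_nodata=True)
with_nodata_counted = summary_stats(band, nodata=-9999, ignore_nodata=False)
```

With ignore_nodata = TRUE the nodata pixel is left out of every column; with FALSE it is counted like any other value, so it also drags the min and mean down.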

Four approximation functions are also proposed, sacrificing some accuracy for speed, especially on large rasters (e.g. 10000 x 10000).

  1. ST_SummaryStats(rast raster, nband int, ignore_nodata boolean, sample_percent double precision) → record

sample_percent: a value between 0 and 1 indicating the fraction of the raster band's pixels to consider when computing the statistics.

ST_SummaryStats(rast, 3, FALSE, 0.1)

ST_SummaryStats(rast, 1, TRUE, 0.5)

  2. ST_SummaryStats(rast raster, ignore_nodata boolean, sample_percent double precision) → record

assumes that nband = 1

ST_SummaryStats(rast, FALSE, 0.01)

ST_SummaryStats(rast, TRUE, 0.025)

  3. ST_SummaryStats(rast raster, sample_percent double precision) → record

assumes that nband = 1 and ignore_nodata = TRUE

ST_SummaryStats(rast, 0.25)

  4. ST_SummaryStats(rast raster) → record

assumes that nband = 1, ignore_nodata = TRUE and sample_percent = 0.1

ST_SummaryStats(rast)
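
The trade-off behind sample_percent can be sketched in Python as follows. This is only an illustration under stated assumptions: the ticket does not say how pixels are chosen, so uniform random sampling without replacement is assumed here, stddev is taken to be the population standard deviation, and approx_summary_stats and its seed parameter are hypothetical names.

```python
import math
import random
from typing import Optional, Sequence, Tuple

Stats = Tuple[int, Optional[float], Optional[float], Optional[float], Optional[float]]


def approx_summary_stats(pixels: Sequence[float],
                         sample_percent: float = 0.1,
                         nodata: Optional[float] = None,
                         ignore_nodata: bool = True,
                         seed: Optional[int] = None) -> Stats:
    """Return (count, mean, stddev, min, max) computed on a sample of pixels.

    Assumption: pixels are drawn uniformly at random without replacement.
    """
    if not 0 < sample_percent <= 1:
        raise ValueError("sample_percent must be in the range (0, 1]")

    population = list(pixels)
    if not population:
        return (0, None, None, None, None)

    # Always look at at least one pixel, never more than the whole band.
    sample_size = min(len(population), max(1, round(len(population) * sample_percent)))
    sample = random.Random(seed).sample(population, sample_size)

    if ignore_nodata and nodata is not None:
        sample = [p for p in sample if p != nodata]

    count = len(sample)
    if count == 0:
        return (0, None, None, None, None)

    mean = sum(sample) / count
    variance = sum((v - mean) ** 2 for v in sample) / count
    return (count, mean, math.sqrt(variance), min(sample), max(sample))


band = list(range(1000))
approx = approx_summary_stats(band, sample_percent=0.1, seed=42)
exact = approx_summary_stats(band, sample_percent=1.0, seed=42)
```

Only a tenth of the pixels are visited in the first call, so the result is cheaper but the reported min and max may miss the true extremes; sample_percent = 1.0 degenerates to the exact statistics.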

New tickets for ST_Mean and ST_StdDev will be posted next.

Functions that can depend upon the basic stats (ST_Histogram and ST_Quantile) will be proposed later.

Attachments (1)

st_summarystats.patch (27.5 KB) - added by dustymugs 14 years ago.
Incremental patch adding ST_SummaryStats function. ST_Band patch must be merged first.


Change History (7)

comment:1 by pracine, 14 years ago

"ignore_nodata boolean" should be replaced with "hasnodata boolean" to be more consistent with the way we already specify whether to take into account or ignore nodata values in ST_Intersects and eventually in ST_DumpAsPolygons and ST_Intersection. The logic, in this case, is inverted: when FALSE we do not take nodata values into account.

Users can also normally just do ST_SummaryStats(ST_SetBandNoDataValue(rast, NULL)) to get the same result.

We expect to have a similar set of functions taking a geometry (any kind: multipoint, lines, polygons) to limit the stats to the area of this geometry.

Thanks dustymugs

in reply to:  1 comment:2 by dustymugs, 14 years ago

Replying to pracine:

"ignore_nodata boolean" should be replaced with "hasnodata boolean" to be more consistent with the way we already specify whether to take into account or ignore nodata values in ST_Intersects and eventually in ST_DumpAsPolygons and ST_Intersection. The logic, in this case, is inverted: when FALSE we do not take nodata values into account.

Thanks for the correction. I'll make the appropriate changes to what has been written.

We expect to have a similar set of functions taking a geometry (any kind: multipoint, lines, polygons) to limit the stats to the area of this geometry.

I'll get to those once I get the simple case complete.

comment:3 by dustymugs, 14 years ago

Status: new → assigned

A set of ST_SummaryStats and ST_ApproxSummaryStats variations for processing coverages:

  1. ST_SummaryStats(rastertable text, rastercolumn text, nband int, hasnodata boolean) → double precision

ST_SummaryStats('tmax_2010', 'rast', 1, FALSE)

ST_SummaryStats('precip_2011', 'rast', 1, TRUE)

  2. ST_SummaryStats(rastertable text, rastercolumn text, nband int) → double precision

hasnodata is set to FALSE

ST_SummaryStats('tmax_2010', 'rast', 1)

  3. ST_SummaryStats(rastertable text, rastercolumn text, hasnodata boolean) → double precision

nband is set to 1

ST_SummaryStats('precip_2011', 'rast', TRUE)

  4. ST_SummaryStats(rastertable text, rastercolumn text) → double precision

nband is set to 1 and hasnodata is set to FALSE

ST_SummaryStats('tmin_2009', 'rast')

Variations for ST_ApproxSummaryStats are:

  1. ST_ApproxSummaryStats(rastertable text, rastercolumn text, nband int, hasnodata boolean, sample_percent double precision) → double precision

ST_ApproxSummaryStats('tmax_2010', 'rast', 1, FALSE, 0.5)

ST_ApproxSummaryStats('precip_2011', 'rast', 1, TRUE, 0.2)

  2. ST_ApproxSummaryStats(rastertable text, rastercolumn text, nband int, sample_percent double precision) → double precision

hasnodata is set to FALSE

ST_ApproxSummaryStats('tmax_2010', 'rast', 1, 0.5)

ST_ApproxSummaryStats('precip_2011', 'rast', 1, 0.2)

  3. ST_ApproxSummaryStats(rastertable text, rastercolumn text, hasnodata boolean, sample_percent double precision) → double precision

nband is set to 1

ST_ApproxSummaryStats('tmax_2010', 'rast', FALSE, 0.5)

ST_ApproxSummaryStats('precip_2011', 'rast', TRUE, 0.2)

  4. ST_ApproxSummaryStats(rastertable text, rastercolumn text, sample_percent double precision) → double precision

nband is set to 1 and hasnodata is set to FALSE

ST_ApproxSummaryStats('tmax_2010', 'rast', 0.5)

ST_ApproxSummaryStats('precip_2011', 'rast', 0.2)

  5. ST_ApproxSummaryStats(rastertable text, rastercolumn text) → double precision

nband is set to 1, hasnodata is set to FALSE and sample_percent is set to 0.1

ST_ApproxSummaryStats('tmax_2010', 'rast')

ST_ApproxSummaryStats('precip_2011', 'rast')

The mean returned in these functions is a weighted mean of the means of each raster tile. The standard deviation returned is the cumulative standard deviation of all raster tiles.
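
One way to read the weighted mean and cumulative standard deviation described above is the following Python sketch. It is an interpretation, not the patch's code: it assumes each tile contributes a (count, mean, stddev, min, max) tuple, that stddev is the population standard deviation, and that the tiles are merged with the usual pooled-variance formula. The name combine_tile_stats is hypothetical.

```python
import math
from typing import Iterable, Optional, Tuple

TileStats = Tuple[int, float, float, float, float]  # count, mean, stddev, min, max
Combined = Tuple[int, Optional[float], Optional[float], Optional[float], Optional[float]]


def combine_tile_stats(tiles: Iterable[TileStats]) -> Combined:
    """Merge per-tile statistics into coverage-wide statistics.

    Assumption: every stddev is a population standard deviation.
    """
    # Tiles with no valid pixels carry no information.
    tiles = [t for t in tiles if t[0] > 0]
    total = sum(t[0] for t in tiles)
    if total == 0:
        return (0, None, None, None, None)

    # Weighted mean of the tile means, weighted by pixel count.
    mean = sum(count * tile_mean for count, tile_mean, _, _, _ in tiles) / total

    # Each tile adds its own spread plus the offset of its mean from the overall mean.
    variance = sum(
        count * (stddev ** 2 + (tile_mean - mean) ** 2)
        for count, tile_mean, stddev, _, _ in tiles
    ) / total

    return (total, mean, math.sqrt(variance),
            min(t[3] for t in tiles), max(t[4] for t in tiles))


# Two tiles holding the pixel values [1, 2, 3] and [6].
tile_a = (3, 2.0, math.sqrt(2 / 3), 1.0, 3.0)
tile_b = (1, 6.0, 0.0, 6.0, 6.0)
coverage = combine_tile_stats([tile_a, tile_b])
```

Weighting by pixel count keeps a small edge tile from counting as much as a full tile, and the combined result matches what a single pass over all four pixels would give.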

by dustymugs, 14 years ago

Attachment: st_summarystats.patch added

Incremental patch adding ST_SummaryStats function. ST_Band patch must be merged first.

comment:4 by dustymugs, 14 years ago

Attached patch for ST_SummaryStats function. ST_SummaryStats is the base function for ST_Mean, ST_StdDev, ST_MinMax, ST_Histogram and ST_Quantile. This patch merges cleanly with r7145.

The patch for ST_Band must be merged before this patch.

comment:5 by dustymugs, 14 years ago

Keywords: history added
Resolution: fixed
Status: assigned → closed

Added in r7148.

comment:6 by dustymugs, 14 years ago

Milestone: PostGIS Raster Future → PostGIS 2.0.0