Ticket #903 (closed enhancement: fixed)

Opened 2 years ago

Last modified 2 years ago

[raster] ST_Reclass

Reported by: dustymugs
Owned by: dustymugs
Priority: medium
Milestone: PostGIS 2.0.0
Component: raster
Version: trunk
Keywords: history
Cc:

Description

Due to limitations in the JPEG (8BUI) and PNG (8BUI and 16BUI) raster formats regarding supported pixel/data types, a method must be provided that can convert a band of a larger data type to 8BUI, amongst other uses.

As proposed by Pierre...

ST_Reclass(rast raster, nband int, reclassexpr text, pixeltype text, [nband int, reclassexpr text, pixeltype text]...)

The above allows the function to reclass one or more bands at the same time.

The reclassexpr argument is a string like

'rangefrom:rangeto, [rangefrom:rangeto]'

where the ranges are 'int-int' or just 'int' (float or double if appropriate).
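A minimal sketch of how such a string breaks down, using only stock PostgreSQL string functions. This is not the parser from the patch; the sample expression is borrowed from a later comment in this ticket, and the alias names (t, pair) are made up for illustration.

```sql
-- Sketch only: split a reclassexpr into its rangefrom / rangeto halves.
-- Each comma-separated item is one "rangefrom:rangeto" pair.
SELECT trim(split_part(pair, ':', 1)) AS rangefrom,
       trim(split_part(pair, ':', 2)) AS rangeto
FROM regexp_split_to_table(
       '0-100:1-10, 101-500:11-150,501 - 10000: 151-254', ','
     ) AS t(pair);
```

The query yields one row per pair; splitting each half again on '-' would give the numeric bounds.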

So, you could reclass one band to three new R, G and B bands:

red: min(covmin, 0)-0:0,0-max(covmax, 0):0-255

green: min(covmin, 0)-0:200,0-max(covmax, 0):0-255

blue: min(covmin, 0)-0:255,0-max(covmax, 0)/2:0, max(covmax, 0)/2-max(covmax, 0):0-255

CREATE TABLE elevationcov AS
SELECT ST_Reclass(rast,
                  1, LEAST(covmin, 0)::text || '-0:0,0-' || GREATEST(covmax, 0)::text || ':0-255', '8BUI',
                  1, LEAST(covmin, 0)::text || '-0:200,0-' || GREATEST(covmax, 0)::text || ':0-255', '8BUI',
                  1, LEAST(covmin, 0)::text || '-0:255,0-' || (GREATEST(covmax, 0)/2)::text || ':0,' || (GREATEST(covmax, 0)/2)::text || '-' || GREATEST(covmax, 0)::text || ':0-255', '8BUI')
FROM mycoverage;

Attachments

st_reclass.patch (37.2 KB) - added by dustymugs 2 years ago.
Adds ST_Reclass
st_reclass.2.patch (37.5 KB) - added by dustymugs 2 years ago.
updated patch to fix logic bug
st_reclass.3.patch (37.5 KB) - added by dustymugs 2 years ago.
st_reclass.2.patch has a malformed hunk. use this one instead
st_reclass.4.patch (37.5 KB) - added by dustymugs 2 years ago.
final minor tweak of C comments
st_reclass.5.patch (38.2 KB) - added by dustymugs 2 years ago.
No changes made. Update diff due to incremental changes in patches that this one depends on
st_reclass.6.patch (38.1 KB) - added by dustymugs 2 years ago.
No changes made. Update diff due to incremental changes in patches that this one depends on
st_reclass.7.patch (37.8 KB) - added by dustymugs 2 years ago.
No changes made. Update diff due to incremental changes in ST_MinMax.
st_reclass.8.patch (38.0 KB) - added by dustymugs 2 years ago.
Code cleanup and refactored for memory handling changes

Change History

  Changed 2 years ago by dustymugs

  • version changed from 1.5.X to trunk
  • milestone changed from PostGIS 2.0.0 to PostGIS Raster Future

  Changed 2 years ago by dustymugs

I have two questions/concerns about the current concept.

1. ST_Reclass() with a variable number of function parameters would be ideal, but I don't believe PL/pgSQL is able to do that. It does support variadic arguments but nothing like C's argc and argv (as far as I can tell). Variadic arguments were added to PostgreSQL in 8.4, so I also wonder what the minimum version is that PostGIS supports. If the minimum is 8.2 or 8.3, variadic functions may not be an option.

Maybe we should consider two separate approaches, one less efficient and one more so that may not be quite as friendly...

ST_Reclass(rast raster, nband int, reclassexpr text, pixeltype text)

This would be fine if you're only going to reclassify one band but is absolutely inefficient for multiple bands...

ST_Reclass(rast, 1, SOME_EXPRESSION, '8BUI')

and

ST_Reclass(
  ST_Reclass(
    ST_Reclass(
      ST_Band(rast, ARRAY[1,1,1])
      , 1, SOME_EXPRESSION, '8BUI'
    )
    , 2, SOME_EXPRESSION, '8BUI'
  )
  , 3, SOME_EXPRESSION, '8BUI'
)

The second example is absolutely inefficient. So, a more efficient method could be

ST_Reclass(rast, reclassexpr_set text[])

where reclassexpr_set is a two-dimensional array like

ARRAY[
  [nband int, reclassexpr text, pixeltype text],
  [nband int, reclassexpr text, pixeltype text],
  ...
  [nband int, reclassexpr text, pixeltype text]
]

2. I have a question about the range in the reclassexpr.

If we look at the following, I wonder what min(covmin, 0)-0 and 0-max(covmax, 0) really mean.

min(covmin, 0)-0:0,0-max(covmax, 0):0-255

Does it mean...

min(covmin, 0) <= x < 0 and 0 <= x < max(covmax, 0)

or

min(covmin, 0) <= x <= 0 and 0 <= x <= max(covmax, 0)

or

min(covmin, 0) < x <= 0 and 0 < x <= max(covmax, 0)

something else?

So what happens when x is at the limit of the range?

Thoughts?

follow-up: ↓ 7   Changed 2 years ago by pracine

1) Could it not be VARIADIC of a composite type? e.g.:

CREATE TYPE reclassarg AS (

nband int, reclassexpr text, pixeltype text

);

CREATE FUNCTION ST_Reclass(rast raster, VARIADIC args reclassarg[])...

I don't know if this works. I think this comes to the same as your last ARRAY solution but I'm not sure.
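For reference, a VARIADIC parameter over an array of a composite type can be tried out in plain PostgreSQL without any raster code. The sketch below uses demo-only names (reclassarg_demo, count_reclassargs) so it cannot clash with anything PostGIS defines; it only counts the argument sets it receives and assumes a PostgreSQL version that allows parameter names inside SQL-language functions.

```sql
-- Demo-only composite type mirroring the proposed reclassarg.
CREATE TYPE reclassarg_demo AS (
  nband int,
  reclassexpr text,
  pixeltype text
);

-- VARIADIC collects all trailing arguments into one reclassarg_demo[] array.
CREATE FUNCTION count_reclassargs(VARIADIC args reclassarg_demo[])
RETURNS int AS $$
  SELECT array_length(args, 1);
$$ LANGUAGE sql IMMUTABLE;

-- Each ROW(...) is cast to the composite type; this call passes two sets.
SELECT count_reclassargs(
  ROW(1, '0-100:0-255', '8BUI')::reclassarg_demo,
  ROW(1, '0-100:0-128', '8BUI')::reclassarg_demo
);  -- returns 2
```

The same array can also be passed explicitly with the VARIADIC keyword at the call site, which is why this is close to the ARRAY approach discussed earlier.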

I don't think we mind too much about not supporting PostgreSQL 8.3 and 8.2.

2) I was planning "rangemin <= x < rangemax". To solve the rangemax issue when x = rangemax you could introduce the keyword "rangemax" saying "if you use the "rangemax" keyword for rangemax, '<=' is used instead of '<'" in the comparison. To be convenient and symmetrical you could also support the "rangemin" keyword even if we don't really need it.

Don't forget that we might want to reclass a range of values to NULL (nodata).

follow-up: ↓ 6   Changed 2 years ago by dustymugs

1. If we're not concerned about supporting versions of PostgreSQL less than 8.4, then the VARIADIC parameter of the reclassarg type would be perfect with the ROW() function.

2. I concur with the addition of RANGEMIN and RANGEMAX keywords. That should help with ensuring that all values of a band are accounted for. Thanks for reminding me about setting values to the NULL NODATA value.

I'm thinking of also adding two additional keywords BANDMIN and BANDMAX as placeholders for the min and max values returned from ST_MinMax. So, for the example

min(covmin, 0)-0:0,0-max(covmax, 0):0-255

the user doesn't have to store covmin and covmax somewhere (in table columns or a user-written function with variables) that needs to be passed to ST_Reclass. Rather,

min(BANDMIN, 0)-0:0,0-max(BANDMAX, 0):0-255

would have ST_Reclass internally substitute the min and max values from ST_MinMax.

I hope this makes sense.

  Changed 2 years ago by dustymugs

Now that I know that PostgreSQL 8.3 will not be supported in future versions of PostGIS, VARIADIC parameters are just fine.

Also, ignore my last part about BANDMIN and BANDMAX as the same can be achieved with CTEs.
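A rough sketch of the CTE approach, assuming a PostGIS build with raster support. It uses ST_SummaryStats rather than the ST_MinMax function discussed in this ticket, and it replaces the mycoverage table with a one-tile stand-in so the query is self-contained; the tile contents and the output nodata value of 0 are arbitrary choices for illustration.

```sql
WITH mycoverage AS (
  -- Stand-in for the ticket's mycoverage table: one 2x2 32BF tile filled with 100.
  SELECT ST_AddBand(ST_MakeEmptyRaster(2, 2, 0, 0, 1),
                    '32BF'::text, 100, -9999) AS rast
),
stats AS (
  -- Compute the band statistics once; s carries min and max fields.
  SELECT rast, ST_SummaryStats(rast, 1, true) AS s
  FROM mycoverage
)
-- Build the reclassexpr from the computed min/max instead of stored columns.
SELECT ST_Reclass(
         rast, 1,
         '[' || LEAST((s).min, 0)::text || '-'
             || GREATEST((s).max, 0)::text || ']:0-255',
         '8BUI', 0
       ) AS rast8
FROM stats;
```

The statistics pass still costs a full scan of each band, so this moves the work into the query rather than removing it.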

in reply to: ↑ 4   Changed 2 years ago by pracine

Replying to dustymugs:

I'm thinking of also adding two additional keywords BANDMIN and BANDMAX as placeholders for the min and max values returned from ST_MinMax. So, for the example min(covmin, 0)-0:0,0-max(covmax, 0):0-255 the user doesn't have to store covmin and covmax somewhere (in table columns or a user-written function with variables) that needs to be passed to ST_Reclass. Rather, min(BANDMIN, 0)-0:0,0-max(BANDMAX, 0):0-255 would have ST_Reclass internally substitute the min and max values from ST_MinMax.

The problem with this is that ST_MinMax is a slow process...

in reply to: ↑ 3   Changed 2 years ago by dustymugs

Replying to pracine:

2) I was planning "rangemin <= x < rangemax". To solve the rangemax issue when x = rangemax you could introduce the keyword "rangemax" saying "if you use the "rangemax" keyword for rangemax, '<=' is used instead of '<'" in the comparison. To be convenient and symmetrical you could also support the "rangemin" keyword even if we don't really need it.

In thinking about how to properly detail the range intervals, I think using keywords such as rangemax is crude. I'm thinking about notating the intervals similar to the math world.

1. [a-b] = a <= x <= b

2. (a-b] = a < x <= b

3. [a-b) = a <= x < b

4. (a-b) = a < x < b

#3 above would be the default evaluation of x in the range a-b. The use of square brackets and parentheses is optional, so the examples below would be permitted. Missing notations substitute the appropriate notation from #3 above.

[a-b = a <= x < b

(a-b = a < x < b

a-b] = a <= x <= b

a-b) = a <= x < b
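The bracket rules above can be checked with a small stand-alone function. This is only an illustration of the proposed semantics, not the code in the patch: range_contains is a made-up name, and the pattern handles non-negative bounds only to keep it short.

```sql
-- Hypothetical helper: does x fall inside one range token such as '(10-20]'?
-- A missing bracket defaults to '[' on the left and ')' on the right (rule #3).
CREATE FUNCTION range_contains(token text, x double precision)
RETURNS boolean AS $$
  SELECT (CASE WHEN m[1] = '(' THEN x >  m[2]::float8
               ELSE                 x >= m[2]::float8 END)
     AND (CASE WHEN m[4] = ']' THEN x <= m[3]::float8
               ELSE                 x <  m[3]::float8 END)
  FROM regexp_matches(
         token,
         '^ *([[(]?) *([0-9.]+) *- *([0-9.]+) *([])]?) *$'
       ) AS t(m);
$$ LANGUAGE sql IMMUTABLE;

SELECT range_contains('[0-10)', 10);  -- false: upper bound excluded
SELECT range_contains('0-10]', 10);   -- true: upper bound included
SELECT range_contains('(0-10', 0);    -- false: lower bound excluded
SELECT range_contains('0-10', 0);     -- true: default is [a-b)
```

A token that does not match the pattern produces no row, so the function returns NULL rather than raising an error.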

  Changed 2 years ago by pracine

Sounds very good to me.

  Changed 2 years ago by dustymugs

  • status changed from new to assigned

  Changed 2 years ago by dustymugs

In writing ST_Reclass, I noticed there was an inherent weakness regarding the user defining a nodata value for the reclassified band. As such, I believe the new type "reclassarg" needs to be expanded:

CREATE TYPE reclassarg AS (
  nband int,
  reclassexpr text,
  pixeltype text,
  nodataval double precision
);

By specifying a nodata value, ST_Reclass can automatically convert existing nodata values to the new nodata value, and the reclassexpr can convert other values to the nodata value.

Also, ST_Reclass should have several variations.

1. ST_Reclass(rast raster, VARIADIC reclassargset reclassarg[])

The default function

2. ST_Reclass(rast raster, nband int, reclassexpr text, pixeltype text, nodataval double precision)

Reclassify a single band using individual parameters rather than a reclassarg

ST_Reclass(rast, 1, '0-100:1-10, 101-500:11-150,501 - 10000: 151-254', '8BUI', 255)

3. ST_Reclass(rast raster, nband int, reclassexpr text, pixeltype text)

New band has no nodata value unlike in #2

ST_Reclass(rast, 1, '0-100:1-10, 101-500:11-150,501 - 10000: 151-254', '8BUI')

4. ST_Reclass(rast raster, reclassexpr text, pixeltype text)

Assume band index is 1 and has no nodata value

ST_Reclass(rast, '0-100:1-10, 101-500:11-150,501 - 10000: 151-254', '8BUI')
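For comparison with the nested single-band calls discussed earlier, here is a sketch of variation 1 with ROW() values cast to reclassarg. It assumes a PostGIS build with raster support that ships the four-field reclassarg type described above; the sample tile, the second and third expressions, and the nodata choices are invented for illustration.

```sql
WITH src AS (
  -- Illustrative 2x2 tile: one 16BUI band filled with 300, repeated three times.
  SELECT ST_Band(
           ST_AddBand(ST_MakeEmptyRaster(2, 2, 0, 0, 1),
                      '16BUI'::text, 300, 0),
           ARRAY[1, 1, 1]
         ) AS rast
),
reclassed AS (
  -- One call reclassifies all three bands; each ROW is one argument set.
  SELECT ST_Reclass(
           rast,
           ROW(1, '0-100:1-10, 101-500:11-150, 501-10000:151-254', '8BUI', 255)::reclassarg,
           ROW(2, '[0-500]:0-200', '8BUI', NULL)::reclassarg,
           ROW(3, '[0-500]:0-100', '8BUI', NULL)::reclassarg
         ) AS r
  FROM src
)
SELECT ST_NumBands(r) AS nbands,
       ST_BandPixelType(r, 1) AS band1_type
FROM reclassed;
```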

  Changed 2 years ago by pracine

I don't understand what the advantage is of having an extra argument for the nodata value over supporting "NULL" in reclassexpr. e.g.

ST_Reclass(rast, 1, '0-100:1-10, 101-500:11-150,501 - 10000: 151-254, 255: NULL', '8BUI')

instead of:

ST_Reclass(rast, 1, '0-100:1-10, 101-500:11-150,501 - 10000: 151-254', '8BUI', 255)

  Changed 2 years ago by dustymugs

The reason I added the nodata parameter is that when the new band is created (old band -> reclass -> new band), the attributes hasnodata and nodataval of the new rt_band can be assigned. Without that assignment, setting the new band's pixel value to NULL is ambiguous.

As for your example...

ST_Reclass(rast, 1, '0-100:1-10, 101-500:11-150,501 - 10000: 151-254, 255: NULL', '8BUI')

I'm assuming that the format for "255:NULL" is different from "0-100:1-10" and does not conform to rangefrom:rangeto. Is this correct? And that "255:NULL" means "The value 255 is the NODATA value for the output band"? If so, isn't this a bit confusing?

  Changed 2 years ago by pracine

For me rangefrom and rangeto can be range 'x-y:a-b' but also single values. e.g. 'x-y:a' but also 'x:a'. So in my head "255:NULL" fits into this scheme. "NULL" is just a special value with the constraint that there can not be two rangeto set to NULL.

follow-up: ↓ 15   Changed 2 years ago by dustymugs

Is the 255 in the context of the old band or the new reclassified band? So, is the 255 a value in the old band reclassified as NULL in the new band? Or is the 255 a value in the new band that is considered as NULL/NODATA?

in reply to: ↑ 14   Changed 2 years ago by pracine

Replying to dustymugs:

Is the 255 in the context of the old band or the new reclassified band?

Everything before the colon is always in the context of the old band. Everything after is in the context of the new.

So, is the 255 a value in the old band reclassified as NULL in the new band?

Yes

Or is the 255 a value in the new band that is considered as NULL/NODATA?

No

So you can do '0-200:NULL' to reclassify most of the values to nodata.

It is true that this syntax does not allow you to specify a new nodata value for the new raster but I think this is not the role of ST_Reclass since we have ST_SetBandNodataValue for that. We could also add your nodataval parameter... But still it is convenient to be able to refer to the nodata value by 'NULL'.

There is still the issue when there was no nodata value set in the old band and we do not use your nodataval parameter but still, we set some values to 'NULL'. Not supporting 'NULL' would solve that. Back to square one... Probably your proposition is better then.

Just make sure we support 'x-y:a'...

  Changed 2 years ago by dustymugs

The reclassexpr supports:

1. x-y:a-b

2. x-y:a

3. x:a
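One way to read the three forms is as a single mapping. The sketch below assumes the 'x-y:a-b' form scales values linearly from the old range onto the new one, which the ticket implies with examples like 0-max:0-255 but never states outright; reclass_linear is a made-up name, and rounding or clamping to the target pixel type is left out.

```sql
-- Hypothetical helper: map oldval from the old range [x, y] to the new range [a, b].
-- 'x-y:a' is the case b = a, and 'x:a' is the case y = x with b = a.
CREATE FUNCTION reclass_linear(oldval float8, x float8, y float8,
                               a float8, b float8)
RETURNS float8 AS $$
  SELECT CASE
           WHEN y = x THEN a  -- single value: avoid dividing by zero
           ELSE a + (oldval - x) * (b - a) / (y - x)
         END;
$$ LANGUAGE sql IMMUTABLE;

SELECT reclass_linear(50, 0, 100, 1, 10);  -- 'x-y:a-b' form: 1 + 50 * 9 / 100 = 5.5
SELECT reclass_linear(50, 0, 100, 7, 7);   -- 'x-y:a' form: always 7
SELECT reclass_linear(42, 42, 42, 3, 3);   -- 'x:a' form: 3
```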

Changed 2 years ago by dustymugs

Adds ST_Reclass

  Changed 2 years ago by dustymugs

Attached incremental patch for adding ST_Reclass support. Patch can be applied with the following in the base postgis source directory.

patch -p1 < st_reclass.patch

Patches for ST_Band and ST_MinMax should be applied first in the order listed.

Changed 2 years ago by dustymugs

updated patch to fix logic bug

Changed 2 years ago by dustymugs

st_reclass.2.patch has a malformed hunk. use this one instead

Changed 2 years ago by dustymugs

final minor tweak of C comments

Changed 2 years ago by dustymugs

No changes made. Update diff due to incremental changes in patches that this one depends on

Changed 2 years ago by dustymugs

No changes made. Update diff due to incremental changes in patches that this one depends on

Changed 2 years ago by dustymugs

No changes made. Update diff due to incremental changes in ST_MinMax.

Changed 2 years ago by dustymugs

Code cleanup and refactored for memory handling changes

  Changed 2 years ago by dustymugs

Adds ST_Reclass function. Merges cleanly against r7145.

The following patches must be merged first for this patch to merge cleanly:

1. ST_Band

2. ST_SummaryStats

3. ST_Mean

4. ST_StdDev

5. ST_MinMax

6. ST_Histogram

7. ST_Quantile

  Changed 2 years ago by dustymugs

  • keywords history added
  • status changed from assigned to closed
  • resolution set to fixed

Added in r7154

  Changed 2 years ago by robe

  • milestone changed from PostGIS Raster Future to PostGIS 2.0.0

  Changed 2 years ago by robe

  • type changed from task to enhancement