Opened 7 years ago

Closed 5 years ago

#593 closed task (fixed)

[raster] Add a ST_BandIsNoData function

Reported by: pracine
Owned by: jorgearevalo
Priority: medium
Milestone: PostGIS 2.0.2
Component: raster
Version: 2.0.x
Keywords:
Cc:

Description

Returns true if and only if the band is filled only with nodata values.

This function is useful for optimizing many other functions.

This function requires a new flag (similar to the hasnodatavalue flag) to be set at the band level.

Changes to the loader are also required. Any editing function that changes a pixel value from nodata to a data value should also reset the flag.

We might want to add a ST_ResetBandISNoData function later.

Change History (25)

comment:1 Changed 7 years ago by pracine

Status: changed from new to assigned

comment:2 Changed 7 years ago by pracine

Owner: changed from pracine to jorgearevalo
Status: changed from assigned to new

comment:3 Changed 7 years ago by jorgearevalo

Status: changed from new to assigned

comment:4 Changed 7 years ago by jorgearevalo

Should the function check all the pixel values? If so, the flag is not useful. If not, the flag and the pixel data could get out of sync.

comment:5 Changed 7 years ago by jorgearevalo

Ok, approach used:

1.- Check if the band is a NODATA band inside the band creation functions (rt_band_new_inline, rt_band_from_wkb), and set the flag.

2.- In rt_band_set_pixel and rt_band_set_nodata, reset the flag if needed.

So we avoid the out-of-sync situation, and the ST_BandIsNoData function only needs to check the flag value.
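
A minimal Python model of this approach, for illustration only. The Band class and its fields are hypothetical stand-ins for the C band structure; only the names rt_band_set_pixel and rt_band_set_nodata come from the ticket, and the real code may differ.

```python
class Band:
    """Toy in-memory band; not the rt_core structure."""

    def __init__(self, pixels, nodata=None):
        self.pixels = list(pixels)
        self.nodata = nodata
        # Step 1: compute the flag once, at creation time.
        self.isnodata = self._scan()

    def _scan(self):
        """Full check: true only if a nodata value is set and every pixel equals it."""
        return self.nodata is not None and all(p == self.nodata for p in self.pixels)

    def set_pixel(self, index, value):
        """Models rt_band_set_pixel."""
        self.pixels[index] = value
        # Step 2: writing a data value can only clear the flag, which is cheap.
        if value != self.nodata:
            self.isnodata = False

    def set_nodata(self, value):
        """Models rt_band_set_nodata."""
        self.nodata = value
        # A new nodata value can flip the flag either way, so this sketch rescans.
        self.isnodata = self._scan()


band = Band([0, 0, 0, 0], nodata=0)
flag_after_creation = band.isnodata   # flag computed at creation
band.set_pixel(2, 7)
flag_after_edit = band.isnodata       # cleared by the data write
```

Note what the sketch leaves open: writing nodata over the last data pixel does not turn the flag back on, because detecting that case would need a full scan on every write.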

Only 2 questions:

  • What should we do in case of offline bands? We could check the values using GDAL.
  • Correct me if I'm wrong, but we don't need the isnodata flag to be stored in WKB string

I'll commit the code after testing it (and updating doc)

comment:6 in reply to: 5 Changed 7 years ago by pracine

>   • What should we do in case of offline bands? We could check the values using GDAL.

This job should be done by raster2pgsql.py

>   • Correct me if I'm wrong, but we don't need the isnodata flag to be stored in WKB string

Yes, we do need it: it must be persistent. The code should look exactly like the code for hasnodata.

Most of the work has to be done in raster2pgsql.py (and in every pixel-setting function).

comment:7 Changed 7 years ago by pracine

As you might guess, there is no easy way to set the flag when ST_SetValue overwrites the last remaining data pixel of a raster with nodata. There should be a ST_BandIsNoData(rast, TRUE) that scans every pixel of the raster and resets the flag if necessary. This function would be slow, since it needs to scan every pixel. It would not be called every time by the pixel-setting functions, only occasionally by the user (like a VACUUM ANALYZE SQL command).

comment:8 in reply to: 7 Changed 7 years ago by jorgearevalo

Replying to pracine:

> As you might guess, there is no easy way to set the flag when ST_SetValue overwrites the last remaining data pixel of a raster with nodata. There should be a ST_BandIsNoData(rast, TRUE) that scans every pixel of the raster and resets the flag if necessary. This function would be slow, since it needs to scan every pixel. It would not be called every time by the pixel-setting functions, only occasionally by the user (like a VACUUM ANALYZE SQL command).

Actually, I have just coded the function with that option. The flag forces a new check.

comment:9 in reply to: 6 Changed 7 years ago by jorgearevalo

Replying to pracine:

>>   • What should we do in case of offline bands? We could check the values using GDAL.

> This job should be done by raster2pgsql.py

OK

>>   • Correct me if I'm wrong, but we don't need the isnodata flag to be stored in WKB string

> Yes, we do need it: it must be persistent. The code should look exactly like the code for hasnodata.

Ok. We still have 2 unused bits per band.

> Most of the work has to be done in raster2pgsql.py (and in every pixel-setting function).

In rt_band_set_pixel the flag is reset if the new pixel value differs from nodata. And in rt_band_set_nodata, we must check whether the new nodata value matches the rest of the pixels.

These are the 2 entry points for storing new raster data. In raster2pgsql.py we must set the flag when dealing with offdb bands. The rest of the cases are handled by the 2 core functions mentioned (rt_band_set_pixel, rt_band_set_nodata).

Am I missing something?

comment:10 Changed 7 years ago by jorgearevalo

Ok, summarizing:

  • The isnodata flag must be persistent, as the hasnodata flag is (it must be included in the WKB string)
  • Keeping the isnodata flag synchronized is too slow if we try to check it every time ST_SetValue / ST_SetBandNodataValue are called. We need a way to force synchronization. For example, a special version of the ST_BandIsNoData function, with a boolean flag, to force this check (doubt: if no band is specified, I guess band 1 is used by default, not all the raster bands). Other suggestions accepted.
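
To make the first point concrete, here is a small Python sketch of storing isnodata as one more bit next to hasnodata in a one-byte band header. The bit positions, the constant names and both helper functions are assumptions made for illustration; the ticket only says that spare bits exist per band, not where they are.

```python
# Assumed layout of a one-byte band header: the high bits are flags and
# the low 4 bits hold the pixel type. Bit positions are illustrative.
FLAG_OFFDB = 1 << 7
FLAG_HASNODATA = 1 << 6
FLAG_ISNODATA = 1 << 5   # hypothetical position for the new flag
PIXTYPE_MASK = 0x0F


def pack_band_header(pixtype, offdb=False, hasnodata=False, isnodata=False):
    """Pack the pixel type and the three flags into a single byte."""
    if not 0 <= pixtype <= PIXTYPE_MASK:
        raise ValueError("pixtype must fit in 4 bits")
    byte = pixtype
    if offdb:
        byte |= FLAG_OFFDB
    if hasnodata:
        byte |= FLAG_HASNODATA
    if isnodata:
        byte |= FLAG_ISNODATA
    return byte


def unpack_band_header(byte):
    """Inverse of pack_band_header; returns the decoded fields as a dict."""
    return {
        "pixtype": byte & PIXTYPE_MASK,
        "offdb": bool(byte & FLAG_OFFDB),
        "hasnodata": bool(byte & FLAG_HASNODATA),
        "isnodata": bool(byte & FLAG_ISNODATA),
    }


header = pack_band_header(pixtype=4, hasnodata=True, isnodata=True)
decoded = unpack_band_header(header)
```

Because the flag travels inside the serialized band, it survives a round trip through the database the same way hasnodata does.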

comment:11 in reply to: 8 Changed 7 years ago by pracine

Replying to jorgearevalo:

> Replying to pracine:

>> As you might guess, there is no easy way to set the flag when ST_SetValue overwrites the last remaining data pixel of a raster with nodata. There should be a ST_BandIsNoData(rast, TRUE) that scans every pixel of the raster and resets the flag if necessary. This function would be slow, since it needs to scan every pixel. It would not be called every time by the pixel-setting functions, only occasionally by the user (like a VACUUM ANALYZE SQL command).

> Actually, I have just coded the function with that option. The flag forces a new check.

By default the function should NOT check every pixel. It should do so only when the boolean parameter is set to TRUE, which means "reverify every pixel". (I wrote the opposite above. But I think it is better if we do it when TRUE).
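
A rough Python sketch of the behaviour described here, under the assumption that a band can be modelled as a plain dict. The function name, the force_check parameter and the dict keys are hypothetical and are not the PostGIS signature.

```python
def band_is_nodata(band, force_check=False):
    """Return the stored flag, or rescan the pixels when force_check is True.

    `band` is a dict with keys "pixels", "nodata" and "isnodata"; it stands
    in for a raster band and is not the PostGIS API.
    """
    if not force_check:
        # Default path: constant time, trusts whatever flag was stored.
        return band["isnodata"]
    # Forced path: linear in the number of pixels, stops at the first data pixel.
    nodata = band["nodata"]
    return nodata is not None and all(p == nodata for p in band["pixels"])


# A band whose last data pixel was overwritten with nodata: the stored
# flag is stale, and only the forced check notices.
stale = {"pixels": [255, 255, 255], "nodata": 255, "isnodata": False}
cheap_answer = band_is_nodata(stale)
forced_answer = band_is_nodata(stale, force_check=True)
```

In this sketch the function only returns a boolean and leaves the stored flag untouched, so making the correction permanent would need a separate write.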

comment:12 in reply to:  9 Changed 7 years ago by pracine

> These are the 2 entry points for storing new raster data. In raster2pgsql.py we must set the flag when dealing with offdb bands. The rest of the cases are handled by the 2 core functions mentioned (rt_band_set_pixel, rt_band_set_nodata).

> Am I missing something?

raster2pgsql.py should not do this only for off-db rasters. It does it for every raster. And it must do it while reading the file, without affecting performance.

There should also be a boolean option to rt_band_set_nodata (and ST_SetBandNodata) to recheck every pixel. The default behavior is always to avoid this, unless the user explicitly asks for a recheck (with the option). We could add this option to ST_SetValue also.

Really, by default we try to avoid rechecking, as this will decrease performance and is required only on edited coverages. In most situations the flag set by the loader will never have to change (unless there is editing). Even when editing, the recheck must happen only on request.
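
One way to read "while reading the file without affecting performance" is to fold the check into the pass the loader already makes over the pixels. A minimal Python sketch under that assumption follows; copy_and_check and its arguments are hypothetical names, not code from raster2pgsql.py.

```python
def copy_and_check(pixel_stream, nodata, sink):
    """Forward every pixel to `sink` and report whether all were nodata.

    The check piggybacks on the single pass over the data, so the file
    is not read a second time.
    """
    all_nodata = nodata is not None
    for value in pixel_stream:
        sink.append(value)
        # Once a data pixel has been seen, the comparison is skipped.
        if all_nodata and value != nodata:
            all_nodata = False
    return all_nodata


written = []
flag = copy_and_check(iter([0, 0, 3, 0]), nodata=0, sink=written)
```

The extra cost is at most one comparison per pixel, and it drops to a single boolean test after the first data pixel.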

comment:13 Changed 7 years ago by pracine

This boolean option, for editing functions, should be considered as a set of future optimizations (not a priority), since tiles containing only nodata values are rare and so is editing.

Only the flag set by raster2pgsql.py should be considered as a priority since otherwise the flag is never set.

This is just to make sure that we put the priority on the one-raster version of MapAlgebra.

comment:14 in reply to:  11 Changed 7 years ago by jorgearevalo

Replying to pracine:

> Replying to jorgearevalo:

>> Replying to pracine:

>>> As you might guess, there is no easy way to set the flag when ST_SetValue overwrites the last remaining data pixel of a raster with nodata. There should be a ST_BandIsNoData(rast, TRUE) that scans every pixel of the raster and resets the flag if necessary. This function would be slow, since it needs to scan every pixel. It would not be called every time by the pixel-setting functions, only occasionally by the user (like a VACUUM ANALYZE SQL command).

>> Actually, I have just coded the function with that option. The flag forces a new check.

> By default the function should NOT check every pixel. It should do so only when the boolean parameter is set to TRUE, which means "reverify every pixel". (I wrote the opposite above. But I think it is better if we do it when TRUE).

Yes, I did it using FALSE as the default. That makes much more sense.

comment:15 Changed 7 years ago by jorgearevalo

Almost finished in r6678. Todo tasks:

  • Change the loader
  • Add documentation of new function

comment:16 Changed 6 years ago by jorgearevalo

Resolution: fixed
Status: changed from assigned to closed

Completed in r6716. Please check it and reopen the ticket if needed.

comment:17 Changed 6 years ago by pracine

Would it not be wiser/faster to check for nodata values in the stream of values imported by raster2pgsql.py instead of just calling ST_BandIsNoData?

comment:18 Changed 6 years ago by pracine

Actually, tell me if I'm wrong, but your change in raster2pgsql.py does not change anything in the database. To actually change the raster permanently you must do an UPDATE.

Even so, it is not wise to load the raster and then UPDATE it. raster2pgsql.py should check the pixel values while loading them and set the flag properly BEFORE writing the INSERT statement. (I don't know if this is possible right now because the flag has to be set so early).

Doing a full check of one table to reset the flag if needed is also problematic, since ST_BandIsNoData just returns TRUE or FALSE, not the updated raster with the flag correctly set. I guess we will need a ST_SetBandIsNoData returning a raster so we can do:

UPDATE rastertable SET rast = ST_SetBandIsNodata(rast, 2) WHERE ST_BandIsNodata(rast, 2) != ST_BandIsNodata(rast, 2, TRUE);

to update a complete table. (Any better way?)

Anyway, this ST_BandIsNoData is not a big optimization and should be low priority compared with the other things needed for MapAlgebra. I would maybe just deactivate it until we have all the functions to sync it better with all the other edit functions. Sorry if this wasn't clear enough.

comment:19 Changed 6 years ago by jorgearevalo

Fully agree, Pierre. The modifications made in the loader scripts don't really work. I confused my plan to implement a function that sets the flag and modifies the raster in the database with the existing st_bandisnodata function, which doesn't modify the raster at all. Stupid mistake, sorry. Too many things on my mind. I'm going to delete it from the loader (and change the documentation).

We have a basic ST_BandIsNodata function that checks the flag. That is enough, for now, for the MapAlgebra implementation. I'll finish that before optimizing this function, just as you've said.

comment:20 Changed 6 years ago by jorgearevalo

Resolution: fixed
Status: changed from closed to reopened

Anyway, I think the isnodata flag must be set to true in these 2 cases:

  • Calling ST_SetBandNoDataValue with TRUE as last argument
  • Calling ST_SetPixelValue with TRUE as last argument

Optionally, to simply "clean" the flag without changing any other band value, we may want, as you said, a ST_SetBandIsNodata function.

So we can update the raster in the database by calling one of those 3 functions on each raster object.

Agree?

comment:21 Changed 6 years ago by jorgearevalo

Committed some changes in r6736:

  • Created function ST_SetBandIsNodata
  • Deleted the isnodata check from the loader (it did nothing). Checking the pixel values while loading them is not a trivial task. The WKB for the band header is generated before the WKB for the band data. So we would have to modify the band header WKB after checking the pixels, when needed. I think that is not a priority right now.
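
For reference, the header-after-data ordering problem can be sketched in Python as "write a placeholder header, stream the data, then go back and patch the header". This assumes the output is buffered in a seekable object before being emitted, uses a simplified one-byte header and 8-bit pixels, and the flag bit position is an assumption; write_band is a hypothetical helper, not loader code.

```python
import io
import struct

FLAG_ISNODATA = 1 << 5  # assumed bit position, for illustration only


def write_band(out, pixtype, pixels, nodata):
    """Write a one-byte header, then the pixel bytes, then fix up the header."""
    header_pos = out.tell()
    out.write(struct.pack("B", pixtype))      # flag not known yet
    all_nodata = True
    for value in pixels:
        out.write(struct.pack("B", value))    # 8-bit pixels for brevity
        if value != nodata:
            all_nodata = False
    if all_nodata:
        end_pos = out.tell()
        out.seek(header_pos)                  # go back to the header byte
        out.write(struct.pack("B", pixtype | FLAG_ISNODATA))
        out.seek(end_pos)                     # resume appending at the end


buf = io.BytesIO()
write_band(buf, pixtype=4, pixels=[0, 0, 0], nodata=0)
encoded = buf.getvalue()
```

If the loader writes straight to a non-seekable stream, this trick does not apply and the band would have to be buffered first, which is part of why the task is not trivial.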

comment:22 Changed 6 years ago by jorgearevalo

Small bug in the previous revision. Fixed in r6737.

comment:23 Changed 6 years ago by pracine

Milestone: changed from PostGIS 2.0.0 to PostGIS Raster Future

comment:24 Changed 5 years ago by pracine

Milestone: changed from PostGIS Raster Future to PostGIS Future

comment:25 Changed 5 years ago by dustymugs

Milestone: changed from PostGIS Future to PostGIS 2.0.2
Resolution: set to fixed
Status: changed from reopened to closed
Version: changed from trunk to 2.0.x

The function described here exists for 2.0. Closing ticket.
