Opened 12 years ago

Closed 12 years ago

Last modified 11 years ago

#1319 closed task (fixed)

[raster] Make raster_columns a constraint-based view, replace AddRasterColumn with AddRasterConstraints and expand # of constraints on raster columns

Reported by: robe Owned by: Bborie Park
Priority: critical Milestone: PostGIS 2.0.0
Component: raster Version: master
Keywords: Cc: mateusz@…

Description

All are we in agreement on this -ditch the raster_columns table and replace with a view that reads from table constraints. We will make raster_columns a view that reads the properties from the constraints on the tables.

constraints in mind so far SRID, band types, scale x, scale y.

The other change to make this useful which I think we discussed and including in this ticket is to have

AddRasterColumns enforce band types which will provide the necessary info and enforcement of band types, num bands for the raster_columns table)

Please add to this other columns we will be keeping.

Just like geometry, people have the choice of relaxing constraints and those will just appear as NULL in the view or in case of SRID our unknown.

Attachments (1)

patch.diff (59.3 KB ) - added by Bborie Park 12 years ago.
patch for raster_columns and raster_overviews as views

Download all attachments as: .zip

Change History (50)

comment:1 by pracine, 12 years ago

So if I understand well some properties are defined when defining the type of the geometry column (this is the typmod) and more properties can be added to the view by defining constraints on the table? Can we list/differenciate what should be defined using the typmod and what should be defined using the constraints?

comment:2 by robe, 12 years ago

Nothing would be constrained using typmod for the raster case, it would all be constraint based and still use AddRasterColumn. If you can't fit everything you need to constrain in typmod, there is no point in using typmod at all.

typmod is only useful if people can build a raster like:

ALTER TABLE someraster_table ADD COLUMN rast raster(ARRAY['8BUI','8BUI','8BUI'], 4326,0.5,-0.5, …);

Since there is too much to define for raster, it can't use typmod or at least not until PostgreSQL expands on the space allowed.

Alternatively users would be able to do

ALTER TABLE someraster_table ADD COLUMN rast raster;

populate their table.

Then run something like a

SELECT populate_raster_columns();

Which would inspect the table data and apply constraints to the table based on the bandmetadata and metadata of the table.

That is pretty much what populate_geometry_columns() does (well it defaults to typmod, and override for constraint).

We might want to call it something different though since it is anow a misnomer since it really isn't populating any table. It's adding contraints/typmodifiying the data.

Hope I haven't confused you more.

comment:3 by robe, 12 years ago

Pierre, let me clarify the benefits I see:

Constraining/view has 3 key benefits 1) Its fast to inspect a constraint rather than the raw data so the view querying is fairly fast 2) The view is always in synch with the actual table, because no one can add data that violates the constraints unless they remove a constraint 3)If they were to remove a constraint, then the view would register NULL for that piece of info.

where it lacks elegance that typmod has is that its still at least 2 step process requiring maintenance functions.

e.g.

-- Option 1 --
CREATE TABLE myrasttable(rid serial primary key, rast raster);
INSERT DATA ...

SELECT populate_raster_columns('myrasttable'::regclass);
--- Option 2 --
CREAT TABLE myrasttable(rid serial primary key);
SELECT AddRasterColumn(.....);
INSERT DATA ... (though you can insert after since the constraints have already been established by the AddrasterColumn call)

comment:4 by Bborie Park, 12 years ago

I'm fine with the lack of typmod support (not just because of the size issue). The two step process is what people are used so no harm there.

I'll need to poke at how the geometry_columns view is getting its information as I've never had time to look at it.

I'd like us to decide on and list which raster attributes should have constraints and thus appear in the raster_columns view.

comment:5 by pracine, 12 years ago

I guess we will have to make raster_overviews a view as well…

comment:6 by robe, 12 years ago

I suppose we could. There is no property we can read from a raster to determine it is an overview raster though so we would therefore have to go by pure naming convention.

e.g. a table is a raster over view if it's got a name like

o_<anumber>_an_existing_raster_table (where a number is the pyramid level)

and it has a raster column

I guess the benefit of that is if someone drops an over view table, it would automatically disappear from the raster_overview catalog view.

How do people feel about going by naming conventions. Right now its just raster2pgsql.py making overviews and it follows the saming naming convention each time so easy to parse out.

comment:7 by Bborie Park, 12 years ago

Basic raster_columns view. Will need to add constraint checking functions to check for scalex, scaley and bandtype.

CREATE OR REPLACE VIEW v_raster_columns AS
	SELECT
		current_database()::character varying(256) AS r_table_catalog,
		n.nspname::character varying(256) AS r_table_schema,
		c.relname::character varying(256) AS r_table_name,
		a.attname::character varying(256) AS r_raster_column,
		COALESCE(postgis_constraint_srid(n.nspname::text, c.relname::text, a.attname::text), (SELECT ST_SRID('POINT(0 0)'::geometry))) AS srid,
		-- scalex double precision
		-- scaley double precision
		-- bandtype[] text[]
	FROM
		pg_class c,
		pg_attribute a,
		pg_type t,
		pg_namespace n
	WHERE t.typname = 'raster'::name
		AND a.attisdropped = false
		AND a.atttypid = t.oid
		AND a.attrelid = c.oid
		AND c.relnamespace = n.oid
		AND (c.relkind = 'r'::"char" OR c.relkind = 'v'::"char")
		AND NOT pg_is_other_temp_schema(c.relnamespace);

comment:8 by Bborie Park, 12 years ago

Added scalex and scaley. Will need to write a lower-level function to get the band pixtypes of a raster quickly.

CREATE OR REPLACE VIEW v_raster_columns AS
	SELECT
		current_database()::character varying(256) AS r_table_catalog,
		n.nspname::character varying(256) AS r_table_schema,
		c.relname::character varying(256) AS r_table_name,
		a.attname::character varying(256) AS r_raster_column,
		COALESCE(raster_constraint_srid(n.nspname::text, c.relname::text, a.attname::text), (SELECT ST_SRID('POINT(0 0)'::geometry))) AS srid,
		COALESCE(raster_constraint_scalex(n.nspname::text, c.relname::text, a.attname::text), 0) AS scalex,
		COALESCE(raster_constraint_scaley(n.nspname::text, c.relname::text, a.attname::text), 0) AS scaley
		-- bandtype[] text[]
	FROM
		pg_class c,
		pg_attribute a,
		pg_type t,
		pg_namespace n
	WHERE t.typname = 'raster'::name
		AND a.attisdropped = false
		AND a.atttypid = t.oid
		AND a.attrelid = c.oid
		AND c.relnamespace = n.oid
		AND (c.relkind = 'r'::"char" OR c.relkind = 'v'::"char")
		AND NOT pg_is_other_temp_schema(c.relnamespace);

CREATE OR REPLACE FUNCTION raster_constraint_srid(rastschema text, rasttable text, rastcolumn text)
  RETURNS integer AS
$BODY$
SELECT replace(replace(split_part(s.consrc, ' = ', 2), ')', ''), '(', '')::integer
		 FROM pg_class c, pg_namespace n, pg_attribute a, pg_constraint s
		 WHERE n.nspname = $1
		 AND c.relname = $2
		 AND a.attname = $3
		 AND a.attrelid = c.oid
		 AND s.connamespace = n.oid
		 AND s.conrelid = c.oid
		 AND a.attnum = ANY (s.conkey)
		 AND s.consrc LIKE '%srid(% = %';
$BODY$
  LANGUAGE sql STABLE STRICT
  COST 100;

CREATE OR REPLACE FUNCTION raster_constraint_scalex(rastschema text, rasttable text, rastcolumn text)
  RETURNS double precision AS
$BODY$
	SELECT
		replace(replace(split_part(split_part(s.consrc, ' = ', 2), '::', 1), ')', ''), '(', '')::double precision
	FROM pg_class c, pg_namespace n, pg_attribute a, pg_constraint s
	WHERE n.nspname = $1
		AND c.relname = $2
		AND a.attname = $3
		AND a.attrelid = c.oid
		AND s.connamespace = n.oid
		AND s.conrelid = c.oid
		AND a.attnum = ANY (s.conkey)
		AND s.consrc LIKE '%scalex(% = %';
$BODY$
  LANGUAGE sql STABLE STRICT
  COST 100;

CREATE OR REPLACE FUNCTION raster_constraint_scaley(rastschema text, rasttable text, rastcolumn text)
  RETURNS double precision AS
$BODY$
	SELECT
		replace(replace(split_part(split_part(s.consrc, ' = ', 2), '::', 1), ')', ''), '(', '')::double precision
	FROM pg_class c, pg_namespace n, pg_attribute a, pg_constraint s
	WHERE n.nspname = $1
		AND c.relname = $2
		AND a.attname = $3
		AND a.attrelid = c.oid
		AND s.connamespace = n.oid
		AND s.conrelid = c.oid
		AND a.attnum = ANY (s.conkey)
		AND s.consrc LIKE '%scaley(% = %';
$BODY$
  LANGUAGE sql STABLE STRICT
  COST 100;

comment:9 by pracine, 12 years ago

So the importer will call something like populate_raster_columns() which will automatically determine which constraint (srid, scalex, scaley, bandtype) is applicable to a table? If one and only one raster do not satisfy the constraint the constraint is not applyed right?

Maybe adding nodataValue and numberOfBand?

What about the extent? Can not be a constraint but it can be determined by the view? Too slow?

in reply to:  8 ; comment:10 by pracine, 12 years ago

Replying to dustymugs:

Added scalex and scaley. Will need to write a lower-level function to get the band pixtypes of a raster quickly.

I'm afraid there is no way to do that quick. Maybe for the first band but not for the other bands. Remember when you wrote deserialize_header… Would that mean that everything has to be deserialized to determine all the pixeltypes and the nodata values of all the bands of all the tables? Scary…

comment:11 by Bborie Park, 12 years ago

The importer will call an update AddRasterColumn which only needs to work on the table having the raster column added.

I'm planning on having the importer run by default in a strict mode (where all rasters passed in when the importer is called have the same scale x/y, srid and band types. There will be a set of flags to disable specific constraints though.

I don't know how useful nodatavalue is because there could be different nodatavalues between two rasters' bands of the same index because the values might be in different ranges. Say raster 1 band 1 has values between 0 - 125 while raster 2 band 1 has values between 126 - 255. Though the two rasters are aligned, they have different nodata values.

Number of bands will be added as the pixtypes array might not be correct but the number of bands might be.

Though extent might be valuable as it would be done using…

SELECT ST_ConvexHull(ST_Collect(tile)) FROM tmax_2010

It isn't fast though. On my tmax_2010 table with 36712 rows with tile size of 100x100, it takes 1 second. So for every year between 1965 to 2011, the view would take 47 seconds (assuming 1 second for each table). Add my tmin and precip datasets for the same years, we're talking 141 seconds.

in reply to:  10 comment:12 by Bborie Park, 12 years ago

Replying to pracine:

Replying to dustymugs:

Added scalex and scaley. Will need to write a lower-level function to get the band pixtypes of a raster quickly.

I'm afraid there is no way to do that quick. Maybe for the first band but not for the other bands. Remember when you wrote deserialize_header… Would that mean that everything has to be deserialized to determine all the pixeltypes and the nodata values of all the bands of all the tables? Scary…

Actually, I'm playing with a new function to see if I can selectively slice a band's metadata (and data if desired) without deserializing the entire raster. I'll see what the performance is like when I'm done.

in reply to:  11 ; comment:13 by pracine, 12 years ago

Replying to dustymugs:

The importer will call an update AddRasterColumn which only needs to work on the table having the raster column added.

I guess this is useless if we 'CREATE TABLE foo AS', right? This is why we also need an equivalent of populate_raster_columns()?

I'm planning on having the importer run by default in a strict mode (where all rasters passed in when the importer is called have the same scale x/y, srid and band types. There will be a set of flags to disable specific constraints though.

Why not automatically maintaining a set of 'samescalex', 'samescaley', 'samesrid' and 'samebandtype' flags as it loads many rasters so you can determine at the end which constraint you can set. Actually this is exactly what populate_raster_columns() would do… Again it is not wise to do outside the database (like creating the overviews) what is useful to be done inside (updating the constraints). I think AddRasterColumn() become obsolete with 2.0 and should be replaced with the equivalent of Populate_Raster_Columns() (or Apply_Raster_Constraints()?)

I don't know how useful nodatavalue is because there could be different nodatavalues between two rasters' bands of the same index because the values might be in different ranges. Say raster 1 band 1 has values between 0 - 125 while raster 2 band 1 has values between 126 - 255. Though the two rasters are aligned, they have different nodata values.

I don't follow you. Why would two raster storing a different range of values would have different nodata values?

Though extent might be valuable as it would be done using…

SELECT ST_ConvexHull(ST_Collect(tile)) FROM tmax_2010

It isn't fast though. On my tmax_2010 table with 36712 rows with tile size of 100x100, it takes 1 second. So for every year between 1965 to 2011, the view would take 47 seconds (assuming 1 second for each table). Add my tmin and precip datasets for the same years, we're talking 141 seconds.

So we will tell people to query the specific table with ST_Extent(rast::geometry) to get the extent…

in reply to:  13 ; comment:14 by Bborie Park, 12 years ago

Replying to pracine:

Replying to dustymugs:

The importer will call an update AddRasterColumn which only needs to work on the table having the raster column added.

I guess this is useless if we 'CREATE TABLE foo AS', right? This is why we also need an equivalent of populate_raster_columns()?

Yes. AddRasterColumn adds a new empty raster column and constraints to an existing table while populate_raster_column() only adds appropriate constraints to an existing and populated raster column.

I'm planning on having the importer run by default in a strict mode (where all rasters passed in when the importer is called have the same scale x/y, srid and band types. There will be a set of flags to disable specific constraints though.

Why not automatically maintaining a set of 'samescalex', 'samescaley', 'samesrid' and 'samebandtype' flags as it loads many rasters so you can determine at the end which constraint you can set. Actually this is exactly what populate_raster_columns() would do… Again it is not wise to do outside the database (like creating the overviews) what is useful to be done inside (updating the constraints). I think AddRasterColumn() become obsolete with 2.0 and should be replaced with the equivalent of Populate_Raster_Columns() (or Apply_Raster_Constraints()?)

Sounds like a plan to me. The loader won't use apply_raster_constraints (sounds more appropriate for the actions to be done) as that requires another pass over the able while the loader has already looked at the data. So the loader will set constraints as long as the import type isn't append.

I don't know how useful nodatavalue is because there could be different nodatavalues between two rasters' bands of the same index because the values might be in different ranges. Say raster 1 band 1 has values between 0 - 125 while raster 2 band 1 has values between 126 - 255. Though the two rasters are aligned, they have different nodata values.

I don't follow you. Why would two raster storing a different range of values would have different nodata values?

Both rasters have the a band of same index of type PT_8BUI. All data in raster 1 band 1 has values between 0 - 125. All data in raster 2 band 1 has values between 126 - 255. And both bands have nodata pixels. So, raster 1 band 1 would need a nodata value greater than 125 while raster 2 band 1 would need a nodata value less than 126. Hence, same pixeltype for both bands at same index but different nodata values.

Though extent might be valuable as it would be done using…

SELECT ST_ConvexHull(ST_Collect(tile)) FROM tmax_2010

It isn't fast though. On my tmax_2010 table with 36712 rows with tile size of 100x100, it takes 1 second. So for every year between 1965 to 2011, the view would take 47 seconds (assuming 1 second for each table). Add my tmin and precip datasets for the same years, we're talking 141 seconds.

So we will tell people to query the specific table with ST_Extent(rast::geometry) to get the extent…

Yes. There's no way around this limitation.

in reply to:  14 ; comment:15 by pracine, 12 years ago

Replying to dustymugs:

Replying to pracine:

Replying to dustymugs:

Why not automatically maintaining a set of 'samescalex', 'samescaley', 'samesrid' and 'samebandtype' flags as it loads many rasters so you can determine at the end which constraint you can set. Actually this is exactly what populate_raster_columns() would do… Again it is not wise to do outside the database (like creating the overviews) what is useful to be done inside (updating the constraints). I think AddRasterColumn() become obsolete with 2.0 and should be replaced with the equivalent of Populate_Raster_Columns() (or Apply_Raster_Constraints()?)

Sounds like a plan to me. The loader won't use apply_raster_constraints (sounds more appropriate for the actions to be done) as that requires another pass over the able while the loader has already looked at the data. So the loader will set constraints as long as the import type isn't append.

apply_raster_constraints() can be called on a single table so this extra pass should take about 1 second right? Not a big deal. The advantage of getting the job done by apply_raster_constraints as a post loading query is that we can get rid of AddRasterColumn()… and write a simpler loader (no metadata check, just load everything blindly and the post process will apply the right constraints). I'm just trying to save you work and KISS…

I don't know how useful nodatavalue is because there could be different nodatavalues between two rasters' bands of the same index because the values might be in different ranges. Say raster 1 band 1 has values between 0 - 125 while raster 2 band 1 has values between 126 - 255. Though the two rasters are aligned, they have different nodata values.

I don't follow you. Why would two raster storing a different range of values would have different nodata values?

Both rasters have the a band of same index of type PT_8BUI. All data in raster 1 band 1 has values between 0 - 125. All data in raster 2 band 1 has values between 126 - 255. And both bands have nodata pixels. So, raster 1 band 1 would need a nodata value greater than 125 while raster 2 band 1 would need a nodata value less than 126. Hence, same pixeltype for both bands at same index but different nodata values.

The same logic applies to pixeltypes. There is no more reason a nodata value should be different from one raster to the other in a consistent dataset composed of many rasters than there are reasons the pixeltypes should be different from one raster to another. Anyway the constraint is just not set if this happen, right? In most case it should not happen and some people already fetch the nodata values from raster_column.

in reply to:  15 ; comment:16 by Bborie Park, 12 years ago

Replying to pracine:

Replying to dustymugs:

Replying to pracine:

Replying to dustymugs:

Why not automatically maintaining a set of 'samescalex', 'samescaley', 'samesrid' and 'samebandtype' flags as it loads many rasters so you can determine at the end which constraint you can set. Actually this is exactly what populate_raster_columns() would do… Again it is not wise to do outside the database (like creating the overviews) what is useful to be done inside (updating the constraints). I think AddRasterColumn() become obsolete with 2.0 and should be replaced with the equivalent of Populate_Raster_Columns() (or Apply_Raster_Constraints()?)

Sounds like a plan to me. The loader won't use apply_raster_constraints (sounds more appropriate for the actions to be done) as that requires another pass over the able while the loader has already looked at the data. So the loader will set constraints as long as the import type isn't append.

apply_raster_constraints() can be called on a single table so this extra pass should take about 1 second right? Not a big deal. The advantage of getting the job done by apply_raster_constraints as a post loading query is that we can get rid of AddRasterColumn()… and write a simpler loader (no metadata check, just load everything blindly and the post process will apply the right constraints). I'm just trying to save you work and KISS…

I think it'll take longer because it needs to check each row to see what the table's constraints should be. So no way around that issue. The only way that this can be fast is to blindly attempt to apply all the constraints.

I don't know how useful nodatavalue is because there could be different nodatavalues between two rasters' bands of the same index because the values might be in different ranges. Say raster 1 band 1 has values between 0 - 125 while raster 2 band 1 has values between 126 - 255. Though the two rasters are aligned, they have different nodata values.

I don't follow you. Why would two raster storing a different range of values would have different nodata values?

Both rasters have the a band of same index of type PT_8BUI. All data in raster 1 band 1 has values between 0 - 125. All data in raster 2 band 1 has values between 126 - 255. And both bands have nodata pixels. So, raster 1 band 1 would need a nodata value greater than 125 while raster 2 band 1 would need a nodata value less than 126. Hence, same pixeltype for both bands at same index but different nodata values.

The same logic applies to pixeltypes. There is no more reason a nodata value should be different from one raster to the other in a consistent dataset composed of many rasters than there are reasons the pixeltypes should be different from one raster to another. Anyway the constraint is just not set if this happen, right? In most case it should not happen and some people already fetch the nodata values from raster_column.

Alright. Works for me. From the responses we got from raster plugin developers, none of them use raster_columns. I want to know why anyone would be using the static raster_columns table.

in reply to:  16 comment:17 by pracine, 12 years ago

Replying to dustymugs:

Replying to pracine:

Replying to dustymugs:

Replying to pracine:

Replying to dustymugs:

Alright. Works for me. From the responses we got from raster plugin developers, none of them use raster_columns. I want to know why anyone would be using the static raster_columns table.

Forwarded you the messages you were not CC on…

comment:18 by Bborie Park, 12 years ago

So far, it looks like the raster_columns view will have the following columns:

r_table_catalog r_table_schema r_table_name r_raster_column srid scale_x scale_y num_bands pixel_types (assuming the check constraint can be fast) nodata_values (assuming the check constraint can be fast)

The columns currently undecided are:

out_db regular_blocking blocksize_x blocksize_y

The columns not being carried over are:

extent (performance)

comment:19 by mloskot, 12 years ago

Cc: mateusz@… added

comment:20 by Bborie Park, 12 years ago

What purpose does out_db fulfill? Since it is a simple boolean field (single TRUE or FALSE), what does it mean when out_db = TRUE?

Does that mean every raster's bands are out-of-db? One or more raster's bands are out-of-db? Personally, I find this parameter non-specific and hence useless. If this parameter was a boolean array, that'd make more sense as that'd provide the ability to indicate that the second band of every raster in the table is out-of-db.

comment:22 by Bborie Park, 12 years ago

I believe the column "regular_blocking" should be renamed to "same_alignment" as it is an analog to ST_SameAlignment.

comment:23 by Bborie Park, 12 years ago

Priority: highcritical

Instead of having the column "out_db", the column should be renamed to "all_in_db" as that is more descriptive of the purpose of the column. So instead of the current definition for "out_db" as described in #152, the new definition for "all_in_db" is:

All bands of all rasters in the column of the table is in-db. No bands are out-of-db.

comment:24 by Bborie Park, 12 years ago

Owner: changed from robe to Bborie Park
Summary: [raster] Make raster_columns a view and AddRasterColumn enforce more[raster] Make raster_columns a constraint-based view, replace AddRasterColumn with ApplyRasterConstraints and expand # of constraints on raster columns

comment:25 by Bborie Park, 12 years ago

Status: newassigned

in reply to:  22 ; comment:26 by mloskot, 12 years ago

Replying to dustymugs:

I believe the column "regular_blocking" should be renamed to "same_alignment" as it is an analog to ST_SameAlignment.

AFAICT from the current docs on ST_SameAlignment, the regular_blocking is not the same concept.

The regular_blocking flag indicates if tiles in a table/coverage form a regular rectangular grid, or not.

in reply to:  23 ; comment:27 by mloskot, 12 years ago

Replying to dustymugs:

Instead of having the column "out_db", the column should be renamed to "all_in_db"

Is it possible to have a raster partially "in_db" and partially "out_db". If it is not possible, "all_" prefix is superfluous.

in reply to:  26 ; comment:28 by pracine, 12 years ago

Replying to mloskot:

Replying to dustymugs:

I believe the column "regular_blocking" should be renamed to "same_alignment" as it is an analog to ST_SameAlignment.

AFAICT from the current docs on ST_SameAlignment, the regular_blocking is not the same concept.

The regular_blocking flag indicates if tiles in a table/coverage form a regular rectangular grid, or not.

samealignment is necessary but not sufficient. Regularlyblocked also needs samewidth, sameheight, hasnogaps and hasnooverlaps. I am pretty confident that those 5 condition makes regular blocking. See Objective FV.20 in the specs.

in reply to:  27 ; comment:29 by pracine, 12 years ago

Replying to mloskot:

Replying to dustymugs:

Instead of having the column "out_db", the column should be renamed to "all_in_db"

Is it possible to have a raster partially "in_db" and partially "out_db". If it is not possible, "all_" prefix is superfluous.

out-db is a band property so theoretically it is possible that one band is in-db and another band is out-db… This is why we need an array for out-db. But I would not set it as a constraint now as out-db is still inconsistently implemented.

in reply to:  28 ; comment:30 by mloskot, 12 years ago

Replying to pracine:

Replying to mloskot:

Replying to dustymugs:

I believe the column "regular_blocking" should be renamed to "same_alignment" as it is an analog to ST_SameAlignment.

AFAICT from the current docs on ST_SameAlignment, the regular_blocking is not the same concept.

The regular_blocking flag indicates if tiles in a table/coverage form a regular rectangular grid, or not.

samealignment is necessary but not sufficient. Regularlyblocked also needs samewidth, sameheight, hasnogaps and hasnooverlaps.

Indeed, ST_SameAlignment can not be used as replacement for the regular_blocking flag.

in reply to:  30 ; comment:31 by pracine, 12 years ago

Replying to mloskot:

Replying to pracine:

Replying to mloskot:

Replying to dustymugs:

I believe the column "regular_blocking" should be renamed to "same_alignment" as it is an analog to ST_SameAlignment.

AFAICT from the current docs on ST_SameAlignment, the regular_blocking is not the same concept.

The regular_blocking flag indicates if tiles in a table/coverage form a regular rectangular grid, or not.

samealignment is necessary but not sufficient. Regularlyblocked also needs samewidth, sameheight, hasnogaps and hasnooverlaps.

Indeed, ST_SameAlignment can not be used as replacement for the regular_blocking flag.

Agreed. But you don't need regular_blocking anymore right? We are not planning to set this constraint, right Bborie?

in reply to:  31 comment:32 by Bborie Park, 12 years ago

Agreed. But you don't need regular_blocking anymore right? We are not planning to set this constraint, right Bborie?

I don't really see an overt need for regular_blocking unless you're interested in the "perfect" condition of a table with no overlaps and no gaps. I know all my raster tables would fail this constraint as I put a year's worth or daily rasters into one table so for the whole table, there definitely would be overlap.

Having same alignment be a constraint is desirable because users don't need to be concerned of resampling a raster before conducting a 2-raster operation.

in reply to:  29 comment:33 by Bborie Park, 12 years ago

Replying to pracine:

Replying to mloskot:

Replying to dustymugs:

Instead of having the column "out_db", the column should be renamed to "all_in_db"

Is it possible to have a raster partially "in_db" and partially "out_db". If it is not possible, "all_" prefix is superfluous.

out-db is a band property so theoretically it is possible that one band is in-db and another band is out-db… This is why we need an array for out-db. But I would not set it as a constraint now as out-db is still inconsistently implemented.

So, does this mean we should remove it from the view for now? Maybe come back in the future when the out-of-db implementation is fully developed?

in reply to:  27 ; comment:34 by Bborie Park, 12 years ago

Replying to mloskot:

Replying to dustymugs:

Instead of having the column "out_db", the column should be renamed to "all_in_db"

Is it possible to have a raster partially "in_db" and partially "out_db". If it is not possible, "all_" prefix is superfluous.

Yes. Part of the problem is that out-of-db support is insufficient for actual use and needs a lot of work to make it usable. But that won't happen until some time after 2.0. Hence, I'm for getting rid of the out_db column.

in reply to:  34 comment:35 by pracine, 12 years ago

Replying to dustymugs:

Replying to mloskot:

Replying to dustymugs:

Instead of having the column "out_db", the column should be renamed to "all_in_db"

Is it possible to have a raster partially "in_db" and partially "out_db". If it is not possible, "all_" prefix is superfluous.

Yes. Part of the problem is that out-of-db support is insufficient for actual use and needs a lot of work to make it usable. But that won't happen until some time after 2.0. Hence, I'm for getting rid of the out_db column.

+1

in reply to:  34 comment:36 by mloskot, 12 years ago

Replying to dustymugs:

Hence, I'm for getting rid of the out_db column.

+1

in reply to:  31 ; comment:37 by mloskot, 12 years ago

Replying to pracine:

Replying to mloskot:

Indeed, ST_SameAlignment can not be used as replacement for the regular_blocking flag.

Agreed. But you don't need regular_blocking anymore right?

This constraint is supposed to serve certain use-cases where a table stores regular grid of tiles, so a client application can recognize this kind of configuration quickly and easily, and treat it in such specific case. The tiles storage/cache is one of important use cases for PostGIS Raster. Please, with all my respect, don't be scientific-ignorant here guys and don't make PostGIS Raster just another GRASS, please.

If you decide to get rid of this flag, then the only way to determine if a table is tiles storage or not is to perform slow (!) query.

So, for the sake of best possible performance, I do use it myself.

Per analogia, where vector folks here store SRID information?

in reply to:  37 comment:38 by pracine, 12 years ago

Replying to mloskot:

Replying to pracine:

Replying to mloskot:

Indeed, ST_SameAlignment can not be used as replacement for the regular_blocking flag.

Agreed. But you don't need regular_blocking anymore right?

This constraint is supposed to serve certain use-cases where a table stores regular grid of tiles, so a client application can recognize this kind of configuration quickly and easily, and treat it in such specific case. The tiles storage/cache is one of important use cases for PostGIS Raster. Please, with all my respect, don't be scientific-ignorant here guys and don't make PostGIS Raster just another GRASS, please.

You will have to explain me what is "scientific-ignorant" here before I do anything… Is there a "proprietary-ignorant" or "company-stupid" equivalent? Why do people at Cadcord always have to do personnal attack when defending their cases? Is this British humor?

in reply to:  37 ; comment:39 by Bborie Park, 12 years ago

Replying to mloskot:

Replying to pracine:

Replying to mloskot:

Indeed, ST_SameAlignment can not be used as replacement for the regular_blocking flag.

Agreed. But you don't need regular_blocking anymore right?

This constraint is supposed to serve certain use-cases where a table stores regular grid of tiles, so a client application can recognize this kind of configuration quickly and easily, and treat it in such specific case. The tiles storage/cache is one of important use cases for PostGIS Raster. Please, with all my respect, don't be scientific-ignorant here guys and don't make PostGIS Raster just another GRASS, please.

scientific-ignorant? You'll have to elaborate on that. Evidently, the research I'm involved with isn't scientific according to you.

If you decide to get rid of this flag, then the only way to determine if a table is tiles storage or not is to perform slow (!) query.

From the looks of it in the AddRasterColumn function, regular_blocking is just a boolean value passed with no true validation of making sure that no tiles overlap and there are no gaps. As the appropriate constraints won't be added until after PostGIS 2.0 is released, we could just add a garbage constraint to indicate that a column is regularly blocked.

Per analogia, where vector folks here store SRID information?

You're going to have to be more descriptive here. Typically, SRID is found in the serialized data of geometry or raster.

comment:40 by Bborie Park, 12 years ago

Summary: [raster] Make raster_columns a constraint-based view, replace AddRasterColumn with ApplyRasterConstraints and expand # of constraints on raster columns[raster] Make raster_columns a constraint-based view, replace AddRasterColumn with AddRasterConstraints and expand # of constraints on raster columns

Instead of ApplyRasterConstraints, use AddRasterConstraints as it is phonetically similar to the true SQL of "ALTER TABLE ADD CONSTRAINT". For dropping constraints, use DropRasterConstraints (ALTER TABLE DROP CONSTRAINT).

comment:41 by robe, 12 years ago

+1 Add is shorter to type too.

in reply to:  39 ; comment:42 by mloskot, 12 years ago

Replying to dustymugs:

Replying to mloskot:

Please, with all my respect, don't be scientific-ignorant here guys and don't make PostGIS Raster just another GRASS, please.

scientific-ignorant? You'll have to elaborate on that.

I have a constant impression that the PostGIS Raster is becoming more fully-featured analytical GIS system where storage use cases are forgotten. I apologies if I've touched you or anyone.

If you decide to get rid of this flag, then the only way to determine if a table is tiles storage or not is to perform slow (!) query.

From the looks of it in the AddRasterColumn function, regular_blocking is just a boolean value passed with no true validation of making sure that no tiles overlap and there are no gaps. As the appropriate constraints won't be added until after PostGIS 2.0 is released, we could just add a garbage constraint to indicate that a column is regularly blocked.

AFAIR, constraint on regular blocking has never been planned. It would be slow anyway. The uploader is supposed to play the key role here and take care of assuring tiles in a table are regularly blocked if such configuration is requested. IMHO, the fact that there is no constraint attached does not make the flag useless or invalid concept.

Per analogia, where vector folks here store SRID information?

You're going to have to be more descriptive here.

Nevermind, this is wrong analogy.

in reply to:  42 ; comment:43 by Bborie Park, 12 years ago

Replying to mloskot:

Replying to dustymugs:

Replying to mloskot:

Please, with all my respect, don't be scientific-ignorant here guys and don't make PostGIS Raster just another GRASS, please.

scientific-ignorant? You'll have to elaborate on that.

I have a constant impression that the PostGIS Raster is becoming more fully-featured analytical GIS system where storage use cases are forgotten.

Now that make sense. Yes, I agree with that as my needs are a mix of analytical and storage/management. What I consider the biggest problem with PostGIS Raster at the moment is the enormous number of items that people want done in the project (particularly the imminent feature freeze for PostGIS 2.0) versus the limited amount of time for new development and maintenance.

I apologies if I've touched you or anyone.

I have a thick skin. I always tell people to stop with trying to use "with all respect", "with no disrespect", "no offense" or any variant as they're almost always the opposite. Be honest and speak your mind. I plan to or hold my tongue if whatever I have to say could be REALLY offensive.

If you decide to get rid of this flag, then the only way to determine if a table is tiles storage or not is to perform slow (!) query.

From the looks of it in the AddRasterColumn function, regular_blocking is just a boolean value passed with no true validation of making sure that no tiles overlap and there are no gaps. As the appropriate constraints won't be added until after PostGIS 2.0 is released, we could just add a garbage constraint to indicate that a column is regularly blocked.

AFAIR, constraint on regular blocking has never been planned. It would be slow anyway. The uploader is supposed to play the key role here and take care of assuring tiles in a table are regularly blocked if such configuration is requested. IMHO, the fact that there is no constraint attached does not make the flag useless or invalid concept.

But what happens if the end user appends more rasters to that column flagged as regular_blocking? Ideally, regular_blocking should be a constraint of some sort as if you want a column to be regular_blocking, you should be willing to enforce a rule to that effect.

To get regular_blocking though, there are two items that need to happen:

  1. ST_Overlaps(raster, raster) to make sure no two rasters overlap as isn't planned for 2.0.
  1. something to test that a table has no gaps.

For #1, we can do a table EXCLUDE constraint to make sure that no rasters overlap.

As for #2, that'll require some thinking.

in reply to:  43 ; comment:44 by mloskot, 12 years ago

Replying to dustymugs:

Replying to mloskot:

I apologies if I've touched you or anyone.

I have a thick skin. I always tell people to stop with trying to use "with all respect", (…) I plan to or hold my tongue if whatever I have to say could be REALLY offensive.

I criticise ideas, not people. So, make it clear, I use such phrases to indicate respect to people which I always have.

IMHO, the fact that there is no constraint attached does not make the flag useless or invalid concept.

But what happens if the end user appends more rasters to that column flagged as regular_blocking?

  1. I assume, in regular blocking use case, it is very rare case, unlikely to happen.
  2. I assume lack of constraint is documented and user is warned about what happens when new tiles are added.
  3. There can be a function user can execute explicitly to validate table and answer if regular blocking still holds
  4. Given 1, 2 and 3, I see no problem.

Ideally, regular_blocking should be a constraint of some sort as if you want a column to be regular_blocking, you should be willing to enforce a rule to that effect.

It will be expansive operation. However, Constraint is removable, fortunately, so performance hit can be removed as well.

To get regular_blocking though, there are two items that need to happen:

  1. ST_Overlaps(raster, raster) to make sure no two rasters overlap as isn't planned for 2.0.

OK

  1. something to test that a table has no gaps.

There is a concept I have briefly discussed with Pierre when we met last time in Denver which is linked tiles. If there are images with repeated content (e.g. all pixels are NODATA), such tiles could be linked instead of storing copies. This could solve gaps: if there is a gap, link NODATA tile.

Finally, clients rendering tiles should be able to handle gaps easily.

For #1, we can do a table EXCLUDE constraint to make sure that no rasters overlap.

Yes.

As for #2, that'll require some thinking.

Indeed.

in reply to:  44 ; comment:45 by Bborie Park, 12 years ago

Replying to mloskot:

Replying to dustymugs:

But what happens if the end user appends more rasters to that column flagged as regular_blocking?

  1. I assume, in regular blocking use case, it is very rare case, unlikely to happen.

Appending more rasters should work if a table constraint on regular blocking (see below) is delayed until the end of transaction.

  1. I assume lack of constraint is documented and user is warned about what happens when new tiles are added.

At the present, the documentation on the raster_columns and raster_overviews tables is far and few in-between.

  1. There can be a function user can execute explicitly to validate table and answer if regular blocking still holds

That would be needed to validate that a column of rasters is regularly blocked.

  1. Given 1, 2 and 3, I see no problem.

Ideally, regular_blocking should be a constraint of some sort as if you want a column to be regular_blocking, you should be willing to enforce a rule to that effect.

It will be expansive operation. However, Constraint is removable, fortunately, so performance hit can be removed as well.

Yes. Hence why maximum extent is a constraint. Remove it if you can't afford the performance hit in bulk inserts or updates.

To get regular_blocking though, there are two items that need to happen:

  1. ST_Overlaps(raster, raster) to make sure no two rasters overlap as isn't planned for 2.0.

OK

  1. something to test that a table has no gaps.

There is a concept I have briefly discussed with Pierre when we met last time in Denver which is linked tiles. If there are images with repeated content (e.g. all pixels are NODATA), such tiles could be linked instead of storing copies. This could solve gaps: if there is a gap, link NODATA tile.

Finally, clients rendering tiles should be able to handle gaps easily.

I don't think that is a problem at the moment. qgis with the postgis raster plugin seems to have no problems with gaps in a coverage.

For #1, we can do a table EXCLUDE constraint to make sure that no rasters overlap.

Yes.

As for #2, that'll require some thinking.

Indeed.

With additional thinking and diagramming, a table EXCLUDE constraint could work assuming the right function is written.

I think a possible interim solution (at least for 2.0 code freeze) is to add a regular_blocking column to the raster_columns view with a garbage constraint that allows the end-user (who should know their own data) to specify that a column is regularly blocked. A constraint is best as that makes no assumptions and can be done within the database but won't be available until 2.1.

in reply to:  45 comment:46 by bnordgren, 12 years ago

Replying to dustymugs:

I think a possible interim solution (at least for 2.0 code freeze) is to add a regular_blocking column to the raster_columns view with a garbage constraint that allows the end-user (who should know their own data) to specify that a column is regularly blocked. A constraint is best as that makes no assumptions and can be done within the database but won't be available until 2.1.

I'd agree that the column should be "informative" until the objectives can be hashed out a little more. I very much like the idea of "regularly blocked" raster groups from the perspective of accelerated access (e.g., access by "original" pixel values, easily computed by the original grid metadata). However, I do not want to be forced to create a new table for each raster.

For instance, the GFED3 database has monthly burned area and emissions data on a 0.5x0.5 degree global grid. While I'd love to take advantage of whatever acceleration is offered in the future for regularly blocked rasters, I don't want to add a new table for every month. Injecting a computed table name into an SQL query is always awkward.

by Bborie Park, 12 years ago

Attachment: patch.diff added

patch for raster_columns and raster_overviews as views

comment:47 by Bborie Park, 12 years ago

Patch attached for implementing raster_columns and raster_overviews as views. Existing raster_columns and raster_overviews are renamed by prefixing "deprecated_" to the table name to give users time to run the functions AddRasterConstraints and AddOverviewContraints.

The function sets for AddRasterColumn, DropRasterColumn and DropRasterTable still exist though in a limited capability as there are no raster_columns and raster_overviews tables. This is so that the current python raster2pgsql.py loader can still be used until the C raster2pgsql.c is merged into trunk.

Once the C raster2pgsql.c is merged into trunk, the functions for AddRasterColumn, DropRasterColumn and DropRasterTable will be deleted. The deprecated raster_columns and raster_overviews tables (prefixed with deprecated_) will not be touched as it is up to the user to drop those tables.

With this change from tables to views, 3rd party plugins should be able to confidently use the information in the raster_columns and raster_overviews views as the data underlying the views are derived from existing table constraints. If a value isn't present in the view, the corresponding constraint does not exist for that specific raster column.

I plan on making an announcement on the postgis-devel list after I've made the commit so that people can understand the change, see how to establish constraints for their raster columns and apply the overview constraint to their overviews. The announcement will be made on Monday 2011-12-05 as it's Friday evening (for me at least) and I don't expect many people to be active.

comment:48 by Bborie Park, 12 years ago

Resolution: fixed
Status: assignedclosed

Added in r8305

comment:49 by mloskot, 11 years ago

(archive janitor)

Related/dup ticket #1216

Note: See TracTickets for help on using tickets.