Opened 11 years ago
Closed 9 years ago
#2485 closed defect (fixed)
[raster]: raster constraints prevent raster data from restoring
Reported by: | rdunklau | Owned by: | dustymugs |
---|---|---|---|
Priority: | critical | Milestone: | PostGIS 2.2.2 |
Component: | raster | Version: | 2.1.x |
Keywords: | Cc: |
Description
The sql functions installed by PostGIS should probably set their own search path, or qualify the underlying function calls namespace.
Take the following function definition from raster/rt_pg/rtpostgis.sql.in:
CREATE OR REPLACE FUNCTION _raster_constraint_pixel_types(rast raster) RETURNS text[] AS $$ SELECT array_agg(pixeltype)::text[] FROM st_bandmetadata($1, ARRAY[]::int[]); $$ LANGUAGE 'sql' STABLE STRICT;
The call to st_bandmetadata is not qualified. Therefore, when using this function in a constraint definition in another schema than public (let's say ns1), the table data can not be dumped and restored, since pg_restore sets the search_path to ns1, pg_catalog. This skips entirely the schema in which the st_bandmetadata resides (public, by default).
This has another implication: this means that those function calls should probably be qualified by the @extschema@ namespace, which makes the extension non-relocatable.
Please find attached a simple test case.
How to reproduce:
- create a database
- run the supplied script against this database
- use pg_dump to create a dump of the given database
- use pg_restore to restore this dump
What is expected:
- the dump is restored successfully
What happens:
- the data from the ns1.t1 table is not restored, due to search_path issues when calling the check constraint function.
Attachments (1)
Change History (47)
by , 11 years ago
Attachment: | testcase.sql added |
---|
comment:1 by , 11 years ago
Component: | postgis → raster |
---|---|
Milestone: | → PostGIS 2.0.5 |
Owner: | changed from | to
Summary: | SQL functions namespace qualification → [raster]: SQL functions namespace qualification |
this will be difficult to do since people install postgis in different schemas and enforcing a particular schema will be hard without causing more damage.
This issue you describe is only an issue for functions that call other functions. The backup schema qualifies the constraint functions, so its the fact that this constraint function calls another function that is the problem. I've had similar issue when using ST_Transform on indexes that my indexes don't come back unless I add a search path to the ST_Transform function.
Bborie and Pierre, Is there any way to make this constraint function not call ST_BandMetaData? That is ultimately the issue. I've flagged this as a raster issue since I don't think we have any functions in PostGIS geometry/geography that are used in constraints and call other functions.
comment:2 by , 11 years ago
Using @extschema@ to refer to the extension schema may accomodate both use.
The extension will be installable in another schema. However, the extension will not be fully-relocatable: the ALTER EXTENSION POSTGIS set schema
statement will not work.
The PostgreSQL documentation about extensions explains this better than I can:
http://www.postgresql.org/docs/current/static/extend-extensions.html#AEN54999
comment:3 by , 11 years ago
Other functions at issue at a cursory glance I think these are the only problematic ones.
_raster_constraint_info_regular_blocking
_raster_constraint_nodata_values
_raster_constraint_out_db
As noted by rdunklau, I think with extension install, there is a way to grab the chosen extension and add that as a search_path of the function, but not with old way of installing. Anyway it sounds kinda messy even with extension. The ideal solution is avoiding calling other none pg_catalog functions from the constraints.
comment:4 by , 11 years ago
choosen schema I meant but then it won't be relocatable afterward which is yucky solution.
comment:5 by , 11 years ago
Priority: | medium → critical |
---|
comment:6 by , 11 years ago
Summary: | [raster]: SQL functions namespace qualification → [raster]: raster constraints prevent raster data from restoring |
---|
comment:7 by , 11 years ago
Milestone: | PostGIS 2.0.5 → PostGIS Future |
---|---|
Priority: | critical → medium |
Version: | 2.0.x → trunk |
This really is no different than pre-typemod geometry where constraints were defined with functions.
ST_SRID(geom) = 4326 ST_NDims(geom) = 2 GeometryType = 'POLYGON'
So, there is no way to check the attributes of a raster without making use of the raster functions.
comment:8 by , 11 years ago
Milestone: | PostGIS Future → PostGIS 2.0.5 |
---|---|
Priority: | medium → critical |
comment:9 by , 11 years ago
dustymugs,
It's different. The difference is that those constraint functions don't call other functions within the definition. The raster constraint functions that don't call other postgis functions in their body are fine. It's the ones I mentioned and the example one.
This is actually a pretty serious problem that rdunklau has pointed out. It means you can't restore your raster data that has constraint without first restoring the table structure and then doing another restore of the data.
comment:10 by , 11 years ago
thinking about this some more it seems we have 2 not so ugly options (well in my mind anyway)
1) embed the whole sql call of these functions in the constraint itself. This is somewhat undesirable because it requires people dropping and reapplying constraints and may break your info calls so those may need to be redone.
2) My most desirable make these C functions, but I have no idea how complicated it is to do that. I can only assume since it takes just one raster, it shouldn't be that complicated - the trickest seems to me calling that array_agg from c.
comment:11 by , 11 years ago
I should add
_raster_constraint_info_regular_blocking
I was wrong. That one is none-issue I think since I think its only used in the view.
comment:12 by , 11 years ago
Ah… so not the actual function used in the constraint but rather functions within the constraint functions.
I don't understand your second paragraph. The data has no constraints, the table does. If you're restoring data to a table with no constraints, there is nothing to go wrong.
If you're doing a standard pg_dump/pg_restore cycle, I've never seen where the data would be restored before the table structure…
comment:13 by , 11 years ago
None of the _raster_constraint_info_XYZ functions should have any problems unless your database layout is all sorts of crazy.
comment:14 by , 11 years ago
What I meant to say is what happens is the constraint gets created but since it can't find its mate functions, the constraint check fails and data load fails because the search_path is set to pg_catalog,schema_of_table. It's actually a non-issue if schema_of_table is same schema as what postgis is installed in.
In retrospect, it isn't quite as serious as I thought since most people are lazy and throw postgis and their data in public. It's only an issue if you keep your data in a schema different from your postgis install. So I suspect Bborie you'll be bitten by this. Leo would be bitten too. It's serious but not quite as serious and an issue for anal retentive folks
As far as the view goes, the view is a non-issue the raster_columns view is always in same schema as postgis and I think views are treated differently anyway. So you are right info functions are not affected.
comment:15 by , 11 years ago
If this is what I think it is, I've always been bitten by this since I always install PostGIS into its own schema (same goes with PG contrib modules into a contrib schema). I always emit the SQL from pg_restore (pg_dump in custom format) and make my necessary changes (usually with a shell script).
comment:16 by , 11 years ago
The scenario described here would affect the following functions:
ST_IsCoverageTile _raster_constraint_pixel_types _raster_constraint_nodata_values _raster_constraint_out_db
comment:17 by , 11 years ago
In retrospect, it isn't quite as serious as I thought since most people are lazy and throw postgis and their data in public.
From my experience, many people would be bitten by this. Not because they throw all postgis data in its own schema, but because there can be postgis data spread among many of them.
I always emit the SQL from pg_restore (pg_dump in custom format) and make my necessary changes (usually with a shell script).
As a temporary workaround, could you provide an example of such a script ? The only thing I could think of was to "sed" the SET search_path command to include public as well, but I'm sure there could be many unforeseen side-effects (for user-defined functions, for exemple).
comment:18 by , 11 years ago
You can safely use the "sed" approach if the database (dumped) is well organized, even with custom user functions. Basically, as long as PostGIS is installed in its own schema and you keep it that way (not add stuff not part of PostGIS to that schema), you can safely make changes to the SET search_path.
If you've got more complicated situations (messy), you'll need something more fancy. I unfortunately don't have those scripts anymore (changed jobs) but I used the utils/create_undef.pl script as a reference point.
comment:19 by , 11 years ago
So, did you choose an approach to fix it ? Would embedding the complete call in the constraint definition (as proposed by robe) be an appropriate patch for this issue ?
Thank you.
comment:20 by , 11 years ago
rdunklau,
I think embedding the sql call will unfortunately cause other problems, so best bet is probably making these completely C calls. That will take some time so probably can't be done for 2.1.1 (probably 2.1.2)
If you are using PostgreSQL 9.2+ you can use the new pg_dump that allows you to separate the data structure, data load calls. As I recall I think it also allows you do restore separate sections. I think the search path is only overwritten in the table structure create steps so if you restore the structure first, set your database search paths and then restore the data, it will work fine.
I'll try to document this, but sadly is only a fix for 9.2+ users
comment:21 by , 11 years ago
nevermind scratch that thought the section data also sets the search path to current and pg_catalog so that doesn't work.
comment:22 by , 11 years ago
The only (not a lot of steps solution) I can think of is to install postgis in pg_catalog.
I've often toyed with that idea but never checked to see if it poses other problems. I'm beginning to like that solution.
1) Tucks those 1000s of functions away from sight 2) Gets around this issue of functions not accessible during restore since pg_catalog is ALWAYs part of search path 3) Not having to set search_path if you choose to have a schema for postgis functions
comment:24 by , 11 years ago
Why? Because its not part of core — but plv8 installs in pg_catalog. We can have our cake and eat it too and with extensions it's a simple
ALTER EXTENSION postgis SET SCHEMA pg_catalog;
It's like being part of core without being part of core. I did discover one downside which might be a bug in our extension upgrade or extension model. I can't go back. I'm trapped in core.
If I do
ALTER EXTENSION postgis set SCHEMA public;
Gives notice — cannot remove dependency on schema pg_catalog because it is a system object.
comment:25 by , 11 years ago
one other thing that was shocking about this momentary core adventure — the postgis functions make up 30% of the functions if they are in pg_catalog. That's really scary.
comment:27 by , 11 years ago
Sorry for the noise, but what is the status given to this bug ?
Will it be corrected ?
If not, I think the documentation should be updated to reflect the fact that raster constraints used in a schema other than public render a database unrestorable.
comment:28 by , 11 years ago
rdunklau,
I agree the documentation should be updated regardless. I have on my todo to do that much (since even if we fix people running lower versions should know about it so they can upgrade). I was hoping I wouldn't have something really complicated that people need to do to work around the issue (or worse yet say tough luck). Installing in pg_catalog seems like the least painful of options (but as mentioned with the downside you can't go back).
If we fix, I doubt we'll be pushing the fix to 2.0 unless its an easy fix. So more like 2.½.2.
Bborie — any more thoughts?
Thanks, Regina
comment:29 by , 11 years ago
I just thought of another possibly brilliant or really stupid work around for this issue.
We explicitly add search_path to these functions something like
ALTER FUNCTION _raster_constraint_pixel_types(raster) SET search_path=pg_catalog,public,postgis;
I think we might even be able to do this in the code base too with the assumption that most people install postgis in public or postgis schema. I recalled a while back that PostgreSQL would complain if you added a non-existent search path, but lately I don't seem to be noticing that problem.
rdunklau - if you are listing, the trick is to install postgis as usual with
CREATE EXTENSION .. and then add these search paths to these select functions, and then restore your data. This is the trick I've used in the past to allow be to get away with using ST_Transform in an index and have my indexes come back during restore. Which had similar issues (by setting search_path of ST_Transform function)
comment:30 by , 11 years ago
Milestone: | PostGIS 2.0.5 → PostGIS 2.0.6 |
---|
comment:31 by , 10 years ago
Hi guys
With postgres 9.4 and a mandatory pg_dump/pg_restore coming soon, I decided to write a quick workaround for this bug until there's a real fix. Hope this little bit of code is useful for others too.
The basic idea is that since pg_dump is generating a search path that can't find the raster functions (because they're in public), we can move every raster to a temporary name, stuff the rasters into the public schema, pg_dump, pg_restore, then put everything back where it belongs afterwards.
I've put it here: https://gist.github.com/gbb/d551c92edf2cd8371c09
If you find any errors, please let me know.
G
comment:32 by , 10 years ago
gbell,
Thanks for your contribution haven't had a chance to look at it.
FWIW a mandatory pg_dump/pg_restore will not be needed for 9.4 if you are going from PostgreSQL 9.3 / 2.1. Using pg_upgrade to go from PostgreSQL 9.3 / 2.1 to PostgreSQL 9.4 / 2.1 should work just fine. We won't be releasing 2.2 before 9.4 comes out, so I expect all distributions to be shipping 9.4 with PostGIS 2.1.
comment:34 by , 10 years ago
I have yet another hackish solution to this I came across by accident. If you create a constraint as NOT VALID, it will check the constraint for future inserts/updates to the table, but will not check existing data and will also not check the constraint during load.
I still need to verify, but I think doing a
ALTER TABLE sometable
ADD CONSTRAINT enforce_num_bands_rast CHECK (st_numbands(rast) = 3) NOT VALID;
might do the trick and maybe we can include that as a load option. it would makeloading much faster too since it wouldn't have to validate on load, but puts a bit of trust on user's understanding of their data.
comment:35 by , 10 years ago
Okay that seemed to work. I recreated the check constraints enforce_nodata_values_rast,
enforce_num_bands_rast ,
enforce_out_db_rast,
enforce_pixel_types_rast
as not valid in my original database, did backup and restore and things came back. This was with having postgis in public, but my rasters in a separate schema. Should work the same for if people install postgis in postgis schema. Thinking about it more, it would speed up restore too and we really only need these for raster_columns and ensure forward.
comment:36 by , 10 years ago
Interesting. Would you see this being part of the raster2pgsql loader? or some other mechanism?
comment:37 by , 10 years ago
I was thinking it would be part of raster2pgsql as an extra switch, but then I was thinking for these ones affected I'm not sure it's a bad idea to always have it not validate or at very least validate, but then drop and readd the constraint as not valid.
comment:38 by , 10 years ago
forget about all the above, I've concluded the best solution is as I proposed here: http://lists.osgeo.org/pipermail/postgis-devel/2015-March/024796.html 9and much earlier in this thread)
, just set the search_path starting with functions affected. Since we have a script that generates the functions anyway, we could have it eventually genenerate the ALTER FUNCTION … needed to put in the search_paths and just do it for all postgis functions.
So we'd just have a separate script file generated that does something like:
ALTER FUNCTION _raster_constraint_pixel_types(raster) SET search_path=pg_catalog,postgis,contrib,extensions,public;
starting with affected functions and if it looks good have a script that generates it for all functions. This will solve both the raster specific issue as well as the postgis proper related ones like #3076
comment:39 by , 9 years ago
Milestone: | PostGIS 2.0.7 → PostGIS 2.0.8 |
---|
comment:40 by , 9 years ago
comment:42 by , 9 years ago
Milestone: | PostGIS 2.0.8 → PostGIS 2.2.2 |
---|---|
Version: | trunk → 2.1.x |
See #3490
comment:43 by , 9 years ago
comment:44 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:46 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Simple test case to reproduce the issue