Opened 9 years ago
Closed 8 years ago
#3501 closed defect (fixed)
[raster] add raster constraint max extent exceeds array size limit
Reported by: | hypostase | Owned by: | robe |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 2.2.4 |
Component: | raster | Version: | 2.2.x |
Keywords: | Cc: |
Description
When adding raster constraints the maximum extent constrain appears to require an array exceeding the limit ERROR: array size exceeds the maximum allowed (1073741823) CONTEXT: SQL statement "SELECT st_ashexewkb(st_envelope(st_collect(st_envelope(rast)))) FROM cjsmith.dtm2015_50x50"
The max extent is calculated successfully for the same raster when using 64x64 tiles (in a 8M row table), but fails are 50x50 (13M rows) and at 32x32 (32M)
Postgres Version select version();
version
PostgreSQL 9.5.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (
Red Hat 4.8.5-4), 64-bit (1 row)
PostGIS version select postgis_full_version();
post
gis_full_version
POSTGIS="2.2.1 r14555" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.8.0, 6 March
2012" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11" TOPO LOGY RASTER (1 row)
hmdb⇒ select count(*) from dtm2015_50x50;
count
(1 row)
hmdb⇒ select count(*) from dtm2015;
count
(1 row)
Driver: GTiff/GeoTIFF Files: dtm2015.27700.tif Size is 135168, 247040 Coordinate System is: PROJCS["OSGB 1936 / British National Grid",
GEOGCS["OSGB 1936",
DATUM["OSGB_1936",
SPHEROID["Airy 1830",6377563.396,299.3249646000044,
AUTHORITY["EPSG","7001"]],
AUTHORITY["EPSG","6277"]],
PRIMEM["Greenwich",0], UNIT["degree",0.0174532925199433], AUTHORITY["EPSG","4277"]],
PROJECTIONTransverse_Mercator, PARAMETER["latitude_of_origin",49], PARAMETER["central_meridian",-2], PARAMETER["scale_factor",0.9996012717], PARAMETER["false_easting",400000], PARAMETER["false_northing",-100000], UNIT["metre",1,
AUTHORITY["EPSG","9001"]],
AUTHORITY["EPSG","27700"]]
Origin = (-4000.000000000000000,1235005.000000000000000) Pixel Size = (5.000000000000000,-5.000000000000000) Metadata:
AREA_OR_POINT=Area
Image Structure Metadata:
INTERLEAVE=BAND
Corner Coordinates: Upper Left ( -4000.000, 1235005.000) ( 9d25'58.40"W, 60d47'32.38"N) Lower Left ( -4000.000, -195.000) ( 7d36'41.51"W, 49d45'42.36"N) Upper Right ( 671840.000, 1235005.000) ( 3d 0'53.82"E, 60d54'16.94"N) Lower Right ( 671840.000, -195.000) ( 1d46'51.89"E, 49d50'10.01"N) Center ( 333920.000, 617405.000) ( 3d 2'41.10"W, 55d26'45.71"N) Band 1 Block=135168x1 Type=Float32, ColorInterp=Gray
NoData Value=-3.40282306073709653e+38
hmdb⇒ select addrasterconstraints('dtm2015_50x50'::name, 'rast'::name);
NOTICE: Adding maximum extent constraint CONTEXT: PL/pgSQL function addrasterconstraints(name,name,name,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean) line 53 at RETURN ERROR: array size exceeds the maximum allowed (1073741823) CONTEXT: SQL statement "SELECT st_ashexewkb(st_envelope(st_collect(st_envelope(rast)))) FROM cjsmith.dtm2015_50x50" PL/pgSQL function _add_raster_constraint_extent(name,name,name) line 20 at EXECUTE PL/pgSQL function addrasterconstraints(name,name,name,text[]) line 104 at assignment PL/pgSQL function addrasterconstraints(name,name,name,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean) line 53 at RETURN
Change History (21)
comment:1 by , 8 years ago
Milestone: | PostGIS 2.2.2 → PostGIS 2.2.3 |
---|
comment:2 by , 8 years ago
comment:3 by , 8 years ago
Well union in this case would be much more efficient. I'm going to switch to that, but not sure it solves your problems. Did you try doing st_union? I think both st_collect and st_union use transfn so might be governed by same array limit. Though since st_union can collapse, maybe not.
comment:6 by , 8 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
I'm going to close this out without testing. If it doesn't work feel free to reopen.
comment:7 by , 8 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
As noted in https://lists.osgeo.org/pipermail/postgis-users/2016-November/041720.html
ST_Union did not fix and ST_MemUnion is too slow. So will replace with ST_Extent after I do some benchmarks to make sure it doesn't adversely affect smaller tables.
comment:8 by , 8 years ago
We probably also need an ST_Transform in there for when the raster contains multiple SRIDs.
See https://lists.osgeo.org/pipermail/postgis-users/2016-November/041721.html
comment:9 by , 8 years ago
I did some tests on my database. I have 3 raster tables and ran each method on them.
Rows st_extent st_collect st_union st_memunion 19,575,700 30:15 failed failed 480:00+ 3,137,060 5:05 5:10 18:35 not tested 379,320 0:15.9 0:15.9 0:43.9 15:00+
All times in minutes:seconds and are just for the first select to get the ewkt. So in my case st_extent and st_collect are roughly the same, st_union is around 3x slower and st_memunion is ridiculously slow and were aborted at the times listed.
comment:10 by , 8 years ago
avernar,
Thanks for the thorough testing. Saves me work of having to benchmark.
I'm surprised ST_Collect is better than ST_Union.
Anyrate this all looks good except I think we should avoid the ST_Transform. The reason is that the intersection checks require the geometries to be in the same spatial ref so would fail when acutally checked.
The other reason to avoid ST_Transform is that some spatial refs transforming to others doesn't always make sense. For example if your data uses UTM spatial refs. Transforming from one UTM to another especially one that is far away doesn't make much sense since the bounds you'd get for the srid are invalid.
In the case of multiple SRIDS, I think the Extent constraint should just be chucked as it's bound to do more damage than good.
comment:11 by , 8 years ago
Ah, makes sense not using ST_Transform then if it would just fail the validation later anyways.
As for ST_Collect being faster I just noticed the note in http://postgis.refractions.net/documentation/manual-svn/ST_Collect.html. It's still surprising there's that much of a difference when dealing with simple bounding boxes between the two.
comment:12 by , 8 years ago
avernar,
Given you said it takes 30 minutes to create the check constraint, I'm tempted to just CREATE it with NOT VALID. Since the new constraint was created from the extent, it should be valid for existing, so no need to check it on existing data and it will still work for future inserts and most importantly will still show in raster_columns.
But it would shave off 30 minutes from restore and creation for your big case.
I did a test and a not valid constraint comes back not valid so should save on restore as well. Only issue is with user perception. For 2.4 I'll definitely do that. Not sure if I should for 2.3 though.
That useless check of existing data has always annoyed me since we are creating the constraints from the data.
There are some like srid I'll keep since we count on them failing if more than one srid to not create the constraint.
So the add constraint will then end up looking like:
sql := 'ALTER TABLE ' || fqtn || ' ADD CONSTRAINT ' || quote_ident(cn) || ' CHECK (st_envelope(' || quote_ident($3) || ') @ ''' || attr || '''::geometry) NOT VALID'; RETURN _add_raster_constraint(cn, sql);
I also did a check on @. I guess it really only checks box intersects so ignores the SRID. That means that mixed srids compared against a single srid will work fine as long as we don't attempt an ST_Transform (the transform would screw up the box and make it not work since PostGIS can't auto transform).
I'm torn between doing a more intensive srid check (takes 9 secs for 379,232 sample I did), or just as you have take the first srid and tough if they mixed a bunch of srids, their extent will show the first one - their penalty for being dirty mixing srids in same table. I feel a little dirty making that short-cut for the sake of performance though but I feel it's a rare enough case and a case that if people hit it there is no harm, except my conscience (and pramsey or strk or dustymugs screaming at me. Especially strk cause he's such a purist ).
comment:13 by , 8 years ago
Owner: | changed from | to
---|---|
Status: | reopened → new |
comment:15 by , 8 years ago
robe,
Just to add some perspective to the 30 minutes, the entire import took somewhere between 16 to 20 hours. It's not the fastest database server in the world. I wasn't timing it exactly but the 30 minutes for applying the constraint is not that large compared to the whole thing. On the other hand, saving 30 minutes on a restore is still a nice thing to have.
Do you want me to benchmark the more intensive srid check on my three raster tables? If so just let me know the sql to run.
comment:16 by , 8 years ago
Sorry but I'm unable to comment on this as I hadn't understood the issue. The title of this ticket is about "max extent exceeds array size limit", was that fixed ?
comment:17 by , 8 years ago
So, my suggestion here is to not mix performance vs. correctness talk. If ST_Extent fixes the issue, please close this ticket accordingly, and open a new one if you want to address the speed.
comment:18 by , 8 years ago
I was afraid you'd say that. Okay I'll do it the slow painful way for 2.3 (i'll keep my code in 2.4 but put another ticket in for that.).
comment:20 by , 8 years ago
Milestone: | PostGIS 2.2.3 → PostGIS 2.2.4 |
---|
Taking the envelope of the union of the raster tiles rather than creating a geometry collection (which is where the error comes in) might be a possible workaround