Opened 4 years ago

Closed 3 years ago

#3501 closed defect (fixed)

[raster] add raster constraint max extent exceeds array size limit

Reported by: hypostase Owned by: robe
Priority: medium Milestone: PostGIS 2.2.4
Component: raster Version: 2.2.x
Keywords: Cc:

Description

When adding raster constraints the maximum extent constrain appears to require an array exceeding the limit ERROR: array size exceeds the maximum allowed (1073741823) CONTEXT: SQL statement "SELECT st_ashexewkb(st_envelope(st_collect(st_envelope(rast)))) FROM cjsmith.dtm2015_50x50"

The max extent is calculated successfully for the same raster when using 64x64 tiles (in a 8M row table), but fails are 50x50 (13M rows) and at 32x32 (32M)

Postgres Version select version();

version



PostgreSQL 9.5.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (

Red Hat 4.8.5-4), 64-bit (1 row)

PostGIS version select postgis_full_version();

post

gis_full_version




POSTGIS="2.2.1 r14555" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.8.0, 6 March

2012" GDAL="GDAL 1.11.4, released 2016/01/25" LIBXML="2.9.1" LIBJSON="0.11" TOPO LOGY RASTER (1 row)

hmdb=> select count(*) from dtm2015_50x50;

count


13360464

(1 row)

hmdb=> select count(*) from dtm2015;

count


8152320

(1 row)

Driver: GTiff/GeoTIFF Files: dtm2015.27700.tif Size is 135168, 247040 Coordinate System is: PROJCS["OSGB 1936 / British National Grid",

GEOGCS["OSGB 1936",

DATUM["OSGB_1936",

SPHEROID["Airy 1830",6377563.396,299.3249646000044,

AUTHORITY["EPSG","7001"]],

AUTHORITY["EPSG","6277"]],

PRIMEM["Greenwich",0], UNIT["degree",0.0174532925199433], AUTHORITY["EPSG","4277"]],

PROJECTIONTransverse_Mercator?, PARAMETER["latitude_of_origin",49], PARAMETER["central_meridian",-2], PARAMETER["scale_factor",0.9996012717], PARAMETER["false_easting",400000], PARAMETER["false_northing",-100000], UNIT["metre",1,

AUTHORITY["EPSG","9001"]],

AUTHORITY["EPSG","27700"]]

Origin = (-4000.000000000000000,1235005.000000000000000) Pixel Size = (5.000000000000000,-5.000000000000000) Metadata:

AREA_OR_POINT=Area

Image Structure Metadata:

INTERLEAVE=BAND

Corner Coordinates: Upper Left ( -4000.000, 1235005.000) ( 9d25'58.40"W, 60d47'32.38"N) Lower Left ( -4000.000, -195.000) ( 7d36'41.51"W, 49d45'42.36"N) Upper Right ( 671840.000, 1235005.000) ( 3d 0'53.82"E, 60d54'16.94"N) Lower Right ( 671840.000, -195.000) ( 1d46'51.89"E, 49d50'10.01"N) Center ( 333920.000, 617405.000) ( 3d 2'41.10"W, 55d26'45.71"N) Band 1 Block=135168x1 Type=Float32, ColorInterp?=Gray

NoData? Value=-3.40282306073709653e+38

hmdb=> select addrasterconstraints('dtm2015_50x50'::name, 'rast'::name);

NOTICE: Adding maximum extent constraint CONTEXT: PL/pgSQL function addrasterconstraints(name,name,name,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean) line 53 at RETURN ERROR: array size exceeds the maximum allowed (1073741823) CONTEXT: SQL statement "SELECT st_ashexewkb(st_envelope(st_collect(st_envelope(rast)))) FROM cjsmith.dtm2015_50x50" PL/pgSQL function _add_raster_constraint_extent(name,name,name) line 20 at EXECUTE PL/pgSQL function addrasterconstraints(name,name,name,text[]) line 104 at assignment PL/pgSQL function addrasterconstraints(name,name,name,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean,boolean) line 53 at RETURN

Change History (21)

comment:1 Changed 4 years ago by pramsey

Milestone: PostGIS 2.2.2PostGIS 2.2.3

comment:2 Changed 4 years ago by jamesramm

Taking the envelope of the union of the raster tiles rather than creating a geometry collection (which is where the error comes in) might be a possible workaround

comment:3 Changed 3 years ago by robe

Well union in this case would be much more efficient. I'm going to switch to that, but not sure it solves your problems. Did you try doing st_union? I think both st_collect and st_union use transfn so might be governed by same array limit. Though since st_union can collapse, maybe not.

comment:4 Changed 3 years ago by robe

In 15114:

use ST_Union instead of ST_Collect to compute max extent
References #3501 for PostGIS 2.3.0 trunk

comment:5 Changed 3 years ago by robe

In 15115:

use ST_Union instead of ST_Collect to compute raster max extent constraint
References #3501 for PostGIS 2.2.3

comment:6 Changed 3 years ago by robe

Resolution: fixed
Status: newclosed

I'm going to close this out without testing. If it doesn't work feel free to reopen.

comment:7 Changed 3 years ago by robe

Resolution: fixed
Status: closedreopened

As noted in https://lists.osgeo.org/pipermail/postgis-users/2016-November/041720.html

ST_Union did not fix and ST_MemUnion is too slow. So will replace with ST_Extent after I do some benchmarks to make sure it doesn't adversely affect smaller tables.

comment:8 Changed 3 years ago by avernar

We probably also need an ST_Transform in there for when the raster contains multiple SRIDs.

See https://lists.osgeo.org/pipermail/postgis-users/2016-November/041721.html

comment:9 Changed 3 years ago by avernar

I did some tests on my database. I have 3 raster tables and ran each method on them.

Rows	     st_extent	st_collect  st_union  st_memunion
19,575,700   30:15      failed      failed    480:00+
 3,137,060    5:05      5:10        18:35     not tested
   379,320    0:15.9    0:15.9       0:43.9    15:00+

All times in minutes:seconds and are just for the first select to get the ewkt. So in my case st_extent and st_collect are roughly the same, st_union is around 3x slower and st_memunion is ridiculously slow and were aborted at the times listed.

Last edited 3 years ago by avernar (previous) (diff)

comment:10 Changed 3 years ago by robe

avernar,

Thanks for the thorough testing. Saves me work of having to benchmark.

I'm surprised ST_Collect is better than ST_Union.

Anyrate this all looks good except I think we should avoid the ST_Transform. The reason is that the intersection checks require the geometries to be in the same spatial ref so would fail when acutally checked.

The other reason to avoid ST_Transform is that some spatial refs transforming to others doesn't always make sense. For example if your data uses UTM spatial refs. Transforming from one UTM to another especially one that is far away doesn't make much sense since the bounds you'd get for the srid are invalid.

In the case of multiple SRIDS, I think the Extent constraint should just be chucked as it's bound to do more damage than good.

Last edited 3 years ago by robe (previous) (diff)

comment:11 Changed 3 years ago by avernar

Ah, makes sense not using ST_Transform then if it would just fail the validation later anyways.

As for ST_Collect being faster I just noticed the note in http://postgis.refractions.net/documentation/manual-svn/ST_Collect.html. It's still surprising there's that much of a difference when dealing with simple bounding boxes between the two.

comment:12 Changed 3 years ago by robe

avernar,

Given you said it takes 30 minutes to create the check constraint, I'm tempted to just CREATE it with NOT VALID. Since the new constraint was created from the extent, it should be valid for existing, so no need to check it on existing data and it will still work for future inserts and most importantly will still show in raster_columns.

But it would shave off 30 minutes from restore and creation for your big case.

I did a test and a not valid constraint comes back not valid so should save on restore as well. Only issue is with user perception. For 2.4 I'll definitely do that. Not sure if I should for 2.3 though.

That useless check of existing data has always annoyed me since we are creating the constraints from the data.

There are some like srid I'll keep since we count on them failing if more than one srid to not create the constraint.

So the add constraint will then end up looking like:

       sql := 'ALTER TABLE ' || fqtn
            || ' ADD CONSTRAINT ' || quote_ident(cn)
            || ' CHECK (st_envelope('
            || quote_ident($3)
            || ') @ ''' || attr || '''::geometry) NOT VALID';
        RETURN _add_raster_constraint(cn, sql);

I also did a check on @. I guess it really only checks box intersects so ignores the SRID. That means that mixed srids compared against a single srid will work fine as long as we don't attempt an ST_Transform (the transform would screw up the box and make it not work since PostGIS can't auto transform).

I'm torn between doing a more intensive srid check (takes 9 secs for 379,232 sample I did), or just as you have take the first srid and tough if they mixed a bunch of srids, their extent will show the first one - their penalty for being dirty mixing srids in same table. I feel a little dirty making that short-cut for the sake of performance though but I feel it's a rare enough case and a case that if people hit it there is no harm, except my conscience (and pramsey or strk or dustymugs screaming at me. Especially strk cause he's such a purist :) ).

Last edited 3 years ago by robe (previous) (diff)

comment:13 Changed 3 years ago by robe

Owner: changed from Bborie Park to robe
Status: reopenednew

comment:14 Changed 3 years ago by robe

In 15243:

add raster constraint max extent exceeds array size limit
references #3501 for trunk (PostGIS 2.4)

comment:15 Changed 3 years ago by avernar

robe,

Just to add some perspective to the 30 minutes, the entire import took somewhere between 16 to 20 hours. It's not the fastest database server in the world. I wasn't timing it exactly but the 30 minutes for applying the constraint is not that large compared to the whole thing. On the other hand, saving 30 minutes on a restore is still a nice thing to have.

Do you want me to benchmark the more intensive srid check on my three raster tables? If so just let me know the sql to run.

comment:16 Changed 3 years ago by strk

Sorry but I'm unable to comment on this as I hadn't understood the issue. The title of this ticket is about "max extent exceeds array size limit", was that fixed ?

comment:17 Changed 3 years ago by strk

So, my suggestion here is to not mix performance vs. correctness talk. If ST_Extent fixes the issue, please close this ticket accordingly, and open a new one if you want to address the speed.

comment:18 Changed 3 years ago by robe

I was afraid you'd say that. Okay I'll do it the slow painful way for 2.3 (i'll keep my code in 2.4 but put another ticket in for that.).

comment:19 Changed 3 years ago by robe

In 15245:

add raster constraint max extent exceeds array size limit
references #3501 for 2.3 (PostGIS 2.3.1)

comment:20 Changed 3 years ago by robe

Milestone: PostGIS 2.2.3PostGIS 2.2.4

comment:21 Changed 3 years ago by robe

Resolution: fixed
Status: newclosed

In 15254:

add raster constraint max extent exceeds array size limit
closes #3501 for 2.2.4

Note: See TracTickets for help on using tickets.