Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#2024 closed defect (fixed)

raster: ST_Union is buggy

Reported by: robe Owned by: Bborie Park
Priority: blocker Milestone: PostGIS 2.1.0
Component: raster Version: master
Keywords: Cc:

Description

I think there is something wrong with the new implementation of ST_Union. I did a sample test on both my 2.0.1 and latest 2.1.0 and the 2.1.0 seems to be leaving out tiles.

SELECT ST_AddBand(NULL,ARRAY[ST_Union(rast,1), ST_Union(rast,2), ST_Union(rast,3) ]), count(*)
FROM test_100_100
WHERE ST_Intersects(rast,  ST_GeomFromText('LINESTRING(230486.436 886271, 230486.437 887771)',26986) );

42 tiles should be glued.

See attached files:

The image I was testing with you can download from:

http://www.bostongis.com/downloads/23128870.jpg

And I ran this load to load it in:

raster2pgsql -s 26986 -t 100x100 -F -t 100x100 -I -Y 23128870.jpg test_100_100 | psql -U postgres -d testpostgis21 -h localhost -p 5442

I though it might be the ST_AddBand, but if I do just a union of a single band, I get the same issue.

Attachments (3)

201.png (819.9 KB ) - added by robe 12 years ago.
from 2.0.1
210svn_r10359.png (15.8 KB ) - added by robe 12 years ago.
from 2.1.0
23128870.jpg.aux.xml (4.8 KB ) - added by robe 12 years ago.
xml meta data when converted from MrSid

Download all attachments as: .zip

Change History (31)

by robe, 12 years ago

Attachment: 201.png added

from 2.0.1

by robe, 12 years ago

Attachment: 210svn_r10359.png added

from 2.1.0

comment:1 by robe, 12 years ago

Owner: changed from pracine to Bborie Park

comment:2 by robe, 12 years ago

Summary: ST_Union is buggyraster: ST_Union is buggy

comment:3 by Bborie Park, 12 years ago

I've made some major changes under the hood while adding multi-band support so I'll test your raster out after I finish with the addition. I did notice there were bugs (wrong variable used) that caused some odd output…

by robe, 12 years ago

Attachment: 23128870.jpg.aux.xml added

xml meta data when converted from MrSid

comment:4 by Bborie Park, 12 years ago

Can you try r10360

comment:5 by robe, 12 years ago

Much better and speed improvement was not lost.

So timing wise:

-- on 2.0.1 --
-- 42 tiles at 42,685-43,852 on 2.0.1 Windows 7 x64-bit PostgreSQL 9.2.1 64-bit
SELECT ST_AddBand(NULL,ARRAY[ST_Union(rast,1), ST_Union(rast,2), ST_Union(rast,3) ]), count(*)
FROM test_100_100
WHERE ST_Intersects(rast,  ST_GeomFromText('LINESTRING(230486.436 886271, 230486.437 887771)',26986) );

on 2.1.0SVN r10361:

-- 42 tiles at 7473ms on 2.1.0 Windows 7 x64-bit PostgreSQL 9.2.1 64-bit
SELECT ST_AddBand(NULL,ARRAY[ST_Union(rast,1), ST_Union(rast,2), ST_Union(rast,3) ]) --, count(*)
FROM test_100_100
WHERE ST_Intersects(rast,  ST_GeomFromText('LINESTRING(230486.436 886271, 230486.437 887771)',26986) );

However there is still a problem. If I do this:

SELECT ST_Union(rast)
FROM test_100_100
WHERE ST_Intersects(rast,  ST_GeomFromText('LINESTRING(230486.436 886271, 230486.437 887771)',26986) );

It crashes the database service. Logs show this before service crashes:

Assertion failed: NULL != raster, file rt_api.c, line 7647

comment:6 by Bborie Park, 12 years ago

ST_Union(raster) doesn't exist yet in -trunk. Is that crash on 2.0?

comment:7 by Bborie Park, 12 years ago

Oh. Can you try…

SELECT ST_Union(rast,ARRAY[ROW(1, 'LAST'), ROW(2, 'LAST'), ROW(3, 'LAST')]::unionarg[])
FROM test_100_100
WHERE ST_Intersects(rast,  ST_GeomFromText('LINESTRING(230486.436 886271, 230486.437 887771)',26986) );

I noticed that the above is faster than using the ST_AddBand() array method.

comment:8 by Bborie Park, 12 years ago

Damn it. My fault for allowing the ST_Union(raster) signature to slip through into -trunk. That explains the breakage…

comment:9 by robe, 12 years ago

It's a little faster:

{{{ — 6983ms - 70150ms — POSTGIS="2.1.0SVN r10361" GEOS="3.4.0dev-CAPI-1.8.0 r0" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.1, released 2012/05/15" LIBXML="2.7.8" LIBJSON="UNKNOWN" RASTER

— SELECT ST_Union(rast,ARRAY[ROW(1, 'LAST'), ROW(2, 'LAST'), ROW(3, 'LAST')]::unionarg[]) FROM test_100_100 WHERE ST_Intersects(rast, ST_GeomFromText('LINESTRING(230486.436 886271, 230486.437 887771)',26986) ); }}}

comment:10 by robe, 12 years ago

 -- 6983ms - 70150ms -- POSTGIS="2.1.0SVN r10361" GEOS="3.4.0dev-CAPI-1.8.0 r0" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.1, released 2012/05/15" LIBXML="2.7.8" LIBJSON="UNKNOWN" RASTER

-- SELECT ST_Union(rast,ARRAY[ROW(1, 'LAST'), ROW(2, 'LAST'), ROW(3, 'LAST')]::unionarg[]) FROM test_100_100 WHERE ST_Intersects(rast, ST_GeomFromText('LINESTRING(230486.436 886271, 230486.437 887771)',26986) ); 

Got messed up last post.

So does this mean we'll no longer have an

ST_Union(rast) ?

We'll have to document since that's a breaking change.

comment:11 by robe, 12 years ago

I don't mind if we remove it. I never liked the fact we had it anyway since always thought it was misleading and would make people think it unions all bands

comment:12 by robe, 12 years ago

dustymugs,

I found a performance issue. I thought I was imaging things in my test here when I was noticing the phenomenon: http://www.bostongis.com/blog/index.php?/archives/198-Waiting-for-PostGIS-2.1-Faster-Raster-Union.html

that 2.1 speed wasn't changing with clipping and yet 2.0 was which is the hack I use to make 2.0 work faster.

It seems to me that 2.1.0 is computing over no data values. So I have concocted a more ridiculous test to highlight the issue. One in which PostGIS 2.0 trumps 2.1.0. the result looks same in both so not sure if its the union or the clipping that is at issue.

— clip union 2.0 wins — On 2.1.0SVN, 11,415 ms (on PostGIS 2.0 this takes 6,130ms)

SELECT ST_AddBand(NULL,ARRAY[ST_Union(rast,1), ST_Union(rast,2), ST_Union(rast,3) ]) , count(*)
FROM (SELECT  ST_Clip(rast,geom) As rast 
	FROM test_100_100 
           CROSS JOIN ST_Buffer(ST_GeomFromText('LINESTRING(231200 887025,231247 888240)',26986),0.1) As geom
WHERE ST_Intersects(rast,  geom ) ) As foo;

— no clip, 2.1.0 doesn't exhibit much speed difference between clip and non-clip — On 2.1.0SVN, 12,555ms (on PostGIS 2.0 takes 25,814ms)

SELECT ST_AddBand(NULL,ARRAY[ST_Union(rast,1), ST_Union(rast,2), ST_Union(rast,3) ]) , count(*)
FROM (SELECT rast 
	FROM test_100_100 
           CROSS JOIN ST_Buffer(ST_GeomFromText('LINESTRING(231200 887025,231247 888240)',26986),0.1) As geom
WHERE ST_Intersects(rast,  geom ) ) As foo;

comment:13 by robe, 12 years ago

Slight update, new syntax is much faster for this case, but 2.1.0 still loses the race. Not sure what is going on. Perhaps my nodata value theory goes out since the clip/unclipped in this case don't have the same timing. Perhaps its something with clip and not union

-- with clip On 2.1.0SVN new syntax, 8,309ms (on comparable query PostGIS 2.0 this takes 6,130ms)
SELECT ST_Union(rast,ARRAY[ROW(1, 'LAST'), ROW(2, 'LAST'), ROW(3, 'LAST')]::unionarg[])
FROM (SELECT ST_Clip(rast,geom) As rast  
	FROM test_100_100 
           CROSS JOIN ST_Buffer(ST_GeomFromText('LINESTRING(231200 887025,231247 888240)',26986),0.1) As geom
WHERE ST_Intersects(rast,  geom ) ) As foo;


-- On 2.1.0SVN no clip, 12,460ms (on comparable query PostGIS 2.0 this takes 25,760ms)
SELECT ST_Union(rast,ARRAY[ROW(1, 'LAST'), ROW(2, 'LAST'), ROW(3, 'LAST')]::unionarg[])
FROM (SELECT  rast  
	FROM test_100_100 
           CROSS JOIN ST_Buffer(ST_GeomFromText('LINESTRING(231200 887025,231247 888240)',26986),0.1) As geom
WHERE ST_Intersects(rast,  geom ) ) As foo;

in reply to:  11 comment:14 by Bborie Park, 12 years ago

Replying to robe:

I don't mind if we remove it. I never liked the fact we had it anyway since always thought it was misleading and would make people think it unions all bands

I'm planning on adding back ST_Union(raster) later today. I'm not a fan of it either but I know I'd use it when I'm lazy. And yes, it would union all bands (and create bands as it goes).

All these aggregate variants require tweaks to the underlying code due to changed assumptions…

comment:15 by Bborie Park, 12 years ago

I'm not surprised that there are performance issues, especially in tuned cases. And yes, ST_Union goes over every pixel including NODATA. Still have to decide where to make that particular optimization. In the pg C-layer or one step lower in the generic iterator.

comment:16 by Bborie Park, 12 years ago

Everything should be good now regarding ST_Union bugs (I'm sure we'll find more over time). Feel free to add a performance ticket. Something like "Make ST_Union faster!!!"

comment:17 by robe, 12 years ago

Resolution: fixed
Status: newclosed

Yah everything seems fine except that crasher which will be fixed once you commit the multi ST_Union(raster) support. So I'll close this out.

comment:18 by robe, 12 years ago

Ah I see you did fix the crasher and it now unions everything. GREAT. I think we lost some speed somewhere or could be just fickleness of my computer. I'm still not seeing muc of a differrence between my clipped on unclipped. I'll test a bit more later do another 2.0/2.1 compare. could also be that I did upgrade from 2.0.1 to 2.1.0SVN (which seemed to work this time around — maybe you fixed that too), instead of doing a clean install.

comment:19 by Bborie Park, 12 years ago

Let me know if we lost speed. I'll have to run some tests on physical hardware to see what performance is like. I doubt you'll see any significant difference between clipped and unclipped as I didn't do anything for performance.

I'm wondering how ticket #1659 is doing though with the current implementation.

comment:20 by robe, 12 years ago

I'll have to retest #1659 (I'll do that in next day or so). False alarm on the speed. In the prior test I was testing my 9.2.1 EDB install and since my 9.2.0 mingw 64 uses the same port, I launched that one by accident. That one is evidentally a bit slower (about 500 ms slower for test I was running) than the VC++ build. I should have know since I didn't have the test database I had created on it. Not sure if it's because I haven't upgraded it to 9.2.1 or it's mingw or I have different settings on it. Anyway 9.2.1 EDB shows about the same speed as before.

comment:21 by pracine, 12 years ago

I think you should test with images bigger than the one you used in your blog (i.e. a SRTM tile?). Note also that the size of the tiles affect the performance. The same image tiled smaller should take longer to be unioned. And the most important is: Is the time required still progress as the square of the number of pixel involved? If yes then you should soon reach astronomic times as the size of your image grow.

comment:22 by robe, 12 years ago

Pierre,

Not following you. You mean instead of 100x100 chunked tiles, I don't chunk and see how it handles the original 5000x5000 tile I have. I do have about 12GB of aerial data loaded (well that's the size of my schema backup). Those are chunked 50x50 though.

I think our proposition of suitable tile sizes will change a bit with C-based functions, it will be less sensitive to size of tile and I expect it to be more of a linear process. To me the issue with the plpgsql is not so much the number of pixels it had to go thru, but the fact it had to create a copy for each raster for each iteration.

I'm going to test next just chunking 1000x1000 (using the same tile) and do the same test compare. I have some DEMs, but I always choose aerials for testing since I can spot check an issue by eye much easier.

comment:23 by robe, 12 years ago

FWIW: I really got to rethink my tile sizes. It seems we are more sensitive to number of rows that need to be traversed than we are to number of pixels that need to be traversed. So I think the memory copy between the C and PostgreSQL layer is still eating up most of our time.

My 1000x1000 does so much better:

{{{ — both of these should traverse the same number of pixels — 16,958 ms (1 band) 5 tiles clipping (1000x1000 tile sizes) SELECT ST_Union(rast,1) , count(*) FROM (SELECT ST_Clip(rast,geom) As rast

FROM test_1000_1000

CROSS JOIN ST_Buffer(ST_GeomFromText('POINT(231237.436173996 887025.024778253)',26986)

,200, 'quad_segs=2') As geom

WHERE ST_Intersects(rast, geom ) ) As foo;

— 51,714ms (1 band), 152 tiles clipping (100x100 tile sizes) SELECT ST_Union(rast,1) ,count(*) FROM (SELECT ST_Clip(rast,geom) As rast

FROM test_100_100

CROSS JOIN ST_Buffer(ST_GeomFromText('POINT(231237.436173996 887025.024778253)',26986)

,200, 'quad_segs=2') As geom

WHERE ST_Intersects(rast, geom ) ) As foo; }}}

— more interestingly, clipping makes things worse in 2.1.0 case even though it — should in theory reduce the number of pixels that need to be traversed — here is the same run without clipping

-- 14,196 ms  (1 band) 5 tiles no clipping
SELECT  ST_Union(rast,1) , count(*)
FROM (SELECT rast 
	FROM test_1000_1000 
           CROSS JOIN ST_Buffer(ST_GeomFromText('POINT(231237.436173996 887025.024778253)',26986)
    ,200, 'quad_segs=2') As geom
WHERE ST_Intersects(rast,  geom ) ) As foo;

-- 39,374 ms  (1 band), 152 tiles no clipping
SELECT  ST_Union(rast,1) ,count(*)
FROM (SELECT rast 
	FROM test_100_100 
           CROSS JOIN ST_Buffer(ST_GeomFromText('POINT(231237.436173996 887025.024778253)',26986)
    ,200, 'quad_segs=2') As geom
WHERE ST_Intersects(rast,  geom ) ) As foo;

I'm beginning to think that the reason 2.0.1 seemed to improve with clipping while 2.1 doesn't is simply because in 2.0.1 the clipping time was dwarfed significantly by the union (copy pain), so the benefit it gave to union providing smaller tiles much outweighed the additional processing.

In 2.1.0 this is not the case — clipping speed and union speed are more in par so the extra cycles (e.g. 5 or 152 really adds weight since union has much less of issue traversing)

comment:24 by pracine, 12 years ago

Robe, not following you as well. Why adding ST_Clip in the mix if you want to test just ST_Union? Quite confusing.

What I mean is that ST_Union (the C or the plpgsql version) on raster, please Bborie tell me if I'm wrong, suffers from two problems:

1) You can not append a raster to a raster without creating a new raster (unlike the geometry type). So the state function will always have to recopy the previous raster, creating a new one, before appending the passed raster. So your assumption about no more need to copy the raster is false. ST_Union for raster will always suffer from that because it is inherent to the raster data structure. This is why I think more and more that aggregating raster with a PostgreSQL aggregate is not the way to go to merge tiles. It would be easier if PostgreSQL AGGREGATE would allow to pass an initial function, executed before the beginning the state cycle, that would do a first pass on the the raster to aggregate. A kind of two passes aggregate. This would allows us to precompute the extent of the final raster and then we would just have to copy tiles into it as they come and your assumption would be right. This problem worsen when the size of the tiles get smaller. That was my point above. And this why the time needed will always grow like the square of the number of pixels for a constant tile size. At some point, with really big tiled images, that make unioning unpracticable.

2) At every state, when we copy the previous raster to the new state one, we could just copy it with a fast line by line memcpy, but no, we copy it trying to apply the more complex ST_Union function (like 'MEAN') to the whole new raster. We could improve this by applying the 'MEAN' only to the changing part of the extent. This is the optimization Bborie was speaking about. But this does not broom the first problem.

So a better test, for me, is to test only ST_Union with bigger images, keeping the tile size constant. You should still see a polynomial progression. Not a linear one. I have no doubt that the C implementation does better than the plpgsql one and that you will have to work with even bigger images to see the line curving straight to the sky…

I wish I would be wrong.

comment:25 by Bborie Park, 12 years ago

For each part…

  1. Mostly correct. What isn't correct is that you're not just copying your old working raster to your new working raster. The new working raster will more than likely have a different extent. A two pass aggregate would be nice to have but I'm not expecting it from PostgreSQL.
  1. You could do a line by line memcpy of the intersecting parts from the old working raster to the new working raster but I don't know how much time that would save. The MEAN uniontype is the most entertaining of the union types as it requires two parallel actions, a COUNT and a separate SUM.

Due to the size of raster tiles, there is no way to use the same method used for ST_Union(geometry). There, the geometries are collected into an array and then processed at the end with the final aggregate function. This means that all the geometries are stored in memory to the end. If we tried that with rasters, you'll probably run out of memory before getting to the end.

Realistically, we should only be expecting the following:

  1. Is the ST_Union in 2.1 faster than 2.0?
  1. Is the output from ST_Union in 2.1 correct?

Exploring places to optimize ST_Union would be great… but there are other problems to resolve for 2.1.0. Maybe 2.1.1.

comment:26 by robe, 12 years ago

dustymugs:

I'll have to dig up the example I saw. In most cases I would say at least 98% of cases 2.1 is faster, however there are cases I have run into where the 2.0 (with clipping) is faster than 2.1 with clipping. Sorry I threw it out. I have to verify the answer is correct in 2.0 in the ones I saw.

So might be an issue that 2.1 is more sensitive to difference in extents than 2.0 is or 2.0 was just wrong.

pracine: The reason I use clipping is because in 2.0, Clipping before unioning was almost ALWAYS faster than trying to do union alone and in 2.1 this does not seem to be the case.

I also see it as the most common use case. Why would I want to just union tiles and be relegated to the size of my tiles that happen to have my area of interest. In fact if you were going to fix an extent — I would suggest you allow the user to pass the extent in as an argument to Union allowing the union to create the size and throw out anything that doesn't fit. That would save having to compute in most cases and probably satisfy a lot of user needs.

I see the common case — as doing small analysis of a region say a 500x500 area and clipping to that region, which is what the browser would expect. The browser only wants to see the region that was asked for. But my perspective might be different since I'm coming from a web-developer POV where I see the user specifying the area and I give them just that region that fits the area they requested. The unioning tons and tons of huge tiles seems useful if you are outputting large coverages as a single file, which quite frankly puts me to sleep.

in reply to:  26 comment:27 by pracine, 12 years ago

pracine:

I also see it as the most common use case. Why would I want to just union tiles and be relegated to the size of my tiles that happen to have my area of interest. In fact if you were going to fix an extent — I would suggest you allow the user to pass the extent in as an argument to Union allowing the union to create the size and throw out anything that doesn't fit. That would save having to compute in most cases and probably satisfy a lot of user needs.

Good idea. Clip and union in the same step…

I see the common case — as doing small analysis of a region say a 500x500 area and clipping to that region, which is what the browser would expect. The browser only wants to see the region that was asked for. But my perspective might be different since I'm coming from a web-developer POV where I see the user specifying the area and I give them just that region that fits the area they requested. The unioning tons and tons of huge tiles seems useful if you are outputting large coverages as a single file, which quite frankly puts me to sleep.

I agree with you but I'm wondering what is the biggest extent we can still union in reasonable time. If I remember well, with 2.0.1 unioning one srtm tile was taking ages.

comment:28 by Bborie Park, 12 years ago

Folks, this discussion really shouldn't be here in a ticket… particularly a ticket indicating that ST_Union has bugs. Now that the bugs are gone, the ticket should be quiet (especially since it is closed). Please take this discussion to the mailing list.

Note: See TracTickets for help on using tickets.