Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#1364 closed task (fixed)

[raster] Make ST_Union a c-implementation

Reported by: robe Owned by: Bborie Park
Priority: medium Milestone: PostGIS 2.1.0
Component: raster Version: master
Keywords: history Cc:

Description

Bborie,

Just putting this on the list before I forget to. I'm still revising Pierre's code to get rid or the rasterexp and the infinite number of state functions. I'm almost done but doing some spot checking to make sure my revised looks right.

Hopefully that will make it easier to rewrite when the time comes since there won't be quite so many moving parts.

I was tempted to make it multiband supportable, as now I have to write stupid code like this to reconstitute my chopped pictures and which was even more painful without the array ST_AddBand construct.

SELECT ST_AddBand(NULL, 
ARRAY[ST_Union(ST_Band(rast,1))
    , ST_Union(ST_Band(rast,2))
    , ST_Union(ST_Band(rast,3)) ] )
FROM samples.downtown_chunked 

Change History (18)

comment:1 by robe, 12 years ago

Actually we should probably think about how the ST_Union interface will change when we introduce multiband support since if a lot of people start using a particular construct, we can't just pull it out.

I would have liked ST_Union(rast) to just union bands in parallel like I have above or an argument to denote that is the behavior you want with an array of numbers like ST_Band. Perhabs something like a

ST_Union(rast raster,band_nums integer[])

But Pierre has 7 ST_Union aggregate permutations already none of which take a band number.

comment:2 by Bborie Park, 12 years ago

Multiband ST_Union would be nice but we're not going to have significant performance gains until I rewrite the C 2-raster MapAlgebra.

7 permutations without a band parameter? Does all versions work on all bands?

comment:3 by robe, 12 years ago

No all permutations as far as I can tell assumes the first band is what you want to work with. Actually I haven't tested his other permutations. i was mostly testing with ST_Union(rast) and ST_Union(rast, p_expression) — the first just defaults to 'LAST' I think.

I'm almost tempted to cut the rest out and just leave maybe two since I think the most common use case for ST_Union is going to be stitching back your chunked rasters based on a spatial filter or grouping which is pretty much satisfied by the first two. Then leave the other permutations up for discussion. I don't think Pierre would like me stripping out 5 of his functions (which of course is more like 20 functions). If I keep all 7, I think I can do it still without introducing the rastexpr and cutting out most of his various permutations of state functions. So it will still be a savings of about 10 functions.

comment:4 by robe, 12 years ago

By grouping I meant something like a file_name that raster2pgsql currently offers. That way you can stitch back a chunked dtem tile or something. That I see is the number 2 case next to spatial aggregation. I think all these extra permutations are going to confuse the hell out of people. With that said would be nice to see one version that takes a band number so you don't have to do ST_Band and alter expand that to take an array of bands.

comment:5 by Bborie Park, 12 years ago

I'm assuming that some of the permutations are duplicates of others as I don't believe Pierre uses default parameter values.

comment:6 by robe, 12 years ago

Yah those I took care of to some extent, but AGGREGATES don't seem to support default args fully (you can't define a default arg withint a CREATE AGGREGATE nad the state functions it doesn't seem to want to use one with a default arg that would otherwise match what it needs), so you would end up with more than you normally would with a regular function.

comment:7 by pracine, 12 years ago

Milestone: PostGIS Raster FuturePostGIS Future

comment:8 by Bborie Park, 12 years ago

Summary: [raster] Make ST_Union a c-implementation and also handle multiple bands[raster] Make ST_Union a c-implementation

I've broken this ticket into two due to the two distinct goals. Refer to #2021 for multi-band support in ST_Union.

comment:9 by Bborie Park, 12 years ago

Milestone: PostGIS FuturePostGIS 2.1.0
Owner: changed from pracine to Bborie Park
Status: newassigned

comment:10 by Bborie Park, 12 years ago

Keywords: history added
Resolution: fixed
Status: assignedclosed

Added in r10344. This C-implementation won't be backported to PostGIS 2.0 as it isn't a bug fix AND the significant amount of underlying code added/removed.

comment:11 by robe, 12 years ago

cool can't wait to try this out to compare with my previous workloads

comment:12 by pracine, 12 years ago

Great work Bborie!

Did you test performance?

I think the key question is "Is the state function still iterating on all the pixels forming the union of the raster or just on the changing extent?"

comment:13 by Bborie Park, 12 years ago

It runs on whatever the union's evolving extent needs to be.

comment:14 by pracine, 12 years ago

So we should expect O(n2) for a coverage of disjoint tiles? As demonstrated in my blog post? Not much change. Just that C is (supposed to be) faster.

comment:15 by Bborie Park, 12 years ago

In addition to non-stop deserialization and hexwkb encoding/decoding…

comment:16 by pracine, 12 years ago

Do you mean performance is bad?

comment:17 by Bborie Park, 12 years ago

No. Performance is better because of all the deserialization and hexwkb encoding/decoding that happens to the working union raster with every call to the function doesn't happen anymore with the C variant.

comment:18 by pracine, 12 years ago

Ok. But still, time processing should grow by a second degree order with the number of pixel involved?

Note: See TracTickets for help on using tickets.