Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#3864 closed defect (fixed)

Sorting by geometry is slower than sorting by geohash

Reported by: komzpa Owned by: pramsey
Priority: medium Milestone: PostGIS 2.4.1
Component: postgis Version: 2.3.x
Keywords: Cc:

Description (last modified by pramsey)

[local] gis@gis=# create table juno_osm_point_unclustered as (select * from juno_osm_point ORDER BY osm_id);
SELECT 159446335
Time: 601714,217 ms (10:01,714)

[local] gis@gis=# create table juno_osm_point_pt1 as (select * from juno_osm_point_unclustered ORDER BY ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C");
SELECT 159446335
Time: 1248105,490 ms (20:48,105)
[local] gis@gis=# create table juno_osm_point_pt2 as (select * from juno_osm_point_unclustered ORDER BY way);
SELECT 159446335
Time: 1287415,365 ms (21:27,415)

This may happen as there are some other infrastructure optimizations in string sorting code, like abbreviated key comparsions.

Change History (29)

comment:1 by robe, 7 years ago

Which version of PostGIS is this?

SELECT postgis_full_version();

You have it marked as 2.3, but just want to make sure it really is 2.3 and not 2.4. Since pramsey changed the sort behavior in 2.4

comment:2 by pramsey, 7 years ago

Description: modified (diff)

comment:3 by pramsey, 7 years ago

I guess. You're sorting 150M things and the difference is quite small. I admit I'm a little surprised that geohash isn't worse since I'm pretty sure it has to do a lot more work to generate a geohash that is needed to generate one of our pretend morton keys, which makes me wonder all the more about the result, and whether it's actually something to do with the sort itself.

comment:4 by komzpa, 7 years ago

In GeoHash scenario, it's all just being reprojected once, and later being sorted as integers due to abbreviated keys.

Exposing the morton key as abbreviated key can help: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=4ea51cdfe85ceef8afabceb03c446574daa0ac23

comment:5 by komzpa, 7 years ago

Please also note that fisrt time for the reference there is a osm_id sort, that is plain bigint sort. It finished twice as fast. There is no index on any sort in question.

comment:6 by robe, 7 years ago

Milestone: PostGIS 2.4.0PostGIS 2.4.1

comment:7 by pnorman, 7 years ago

Just to note, ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C") is tweaked based on OSM data, but the selection of 10 was done prior to the commit Komzpa linked, the C collation.

For geohash sorting, the C collation is vital and nearly cuts the time in half.

https://github.com/openstreetmap/osm2pgsql/pull/242 for GeoHash details https://github.com/openstreetmap/osm2pgsql/issues/581 for collation details.

The osm_id sort puts a lower bound on the theoretical speed.

Last edited 7 years ago by pnorman (previous) (diff)

comment:8 by komzpa, 7 years ago

It looks like ticket contains pre-2.4.0 build.

I've tried the same on 2.4.0, and it allocates more than 140Gb of RAM on order by way. On GeoHash sort it allocates up to 10Gb then lowers to 3Gb. I didn't try to wait until query finish.

comment:9 by komzpa, 7 years ago

[local] gis@gis=# create table juno_osm_point_unclustered as (select *
[more] ( >  from juno_osm_point ORDER BY osm_id);
SELECT 159843185
Time: 768781,686 ms (12:48,782)
[local] gis@gis=#  create table juno_osm_point_pt1 as (select * from juno_osm_point_unclustered ORDER BY ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C");
SELECT 159843185
Time: 1223812,285 ms (20:23,812)
[local] gis@gis=# create table juno_osm_point_pt2 as (select * from juno_osm_point_unclustered ORDER BY way);
ERROR:  57014: canceling statement due to user request
LOCATION:  ProcessInterrupts, postgres.c:2999
Time: 932325,460 ms (15:32,325)
-- here postgres ate 140GB of my 128 GB machine so I cancelled it.


[local] gis@gis=# create table juno_osm_point_pt3 as (select * from juno_osm_point_unclustered ORDER BY way::bytea);
SELECT 159843185
Time: 1055460,975 ms (17:35,461)
-- this can be used as reasonable performance estimator

comment:10 by gsmol, 7 years ago

About 128GB been mercilessly eated in komzpa`s case: via gdb I can detect that there is a lot of palloc() coming from lwgeom_cmp, but no pfree():

#0  palloc (size=size@entry=32) at ./build/../src/backend/utils/mmgr/mcxt.c:850                                             
#1  0x00005575da620a0a in heap_tuple_untoast_attr (attr=0x5577c344d1c0) at ./build/../src/backend/access/heap/tuptoaster.c:240
#2  0x00007f6c77386af2 in lwgeom_cmp () from /usr/lib/postgresql/10/lib/postgis-2.4.so                                                           
#3  0x00005575da9e76b1 in comparison_shim (x=<optimized out>, y=<optimized out>, ssup=<optimized out>)                 
    at ./build/../src/backend/utils/sort/sortsupport.c:53                                                            
#4  0x00005575da9ec64e in ApplySortComparator (ssup=0x5575dbbb7b98, isNull2=<optimized out>, datum2=<optimized out>, isNull1=<optimized out>,     
    datum1=<optimized out>) at ./build/../src/include/utils/sortsupport.h:225                                                                  
#5  qsort_ssup (a=0x7f6c37b25078, n=<optimized out>, n@entry=14, ssup=ssup@entry=0x5575dbbb7b98) at ./qsort_tuple.c:244                        
#6  0x00005575da9ec930 in qsort_ssup (a=0x7f6c37b24b20, a@entry=0x7f6c37b24340, n=<optimized out>, ssup=ssup@entry=0x5575dbbb7b98)          
    at ./qsort_tuple.c:323                                                                                       
#7  0x00005575da9ec55d in qsort_ssup (a=0x7f6c37b24340, n=<optimized out>, n@entry=778, ssup=ssup@entry=0x5575dbbb7b98) at ./qsort_tuple.c:309
#8  0x00005575da9ec930 in qsort_ssup (a=0x7f6c37b1d470, n=<optimized out>, n@entry=40979, ssup=ssup@entry=0x5575dbbb7b98) at ./qsort_tuple.c:323   
#9  0x00005575da9ec930 in qsort_ssup (a=0x7f6c37924d20, n=<optimized out>, n@entry=140380, ssup=ssup@entry=0x5575dbbb7b98) at ./qsort_tuple.c:323
#10 0x00005575da9ec930 in qsort_ssup (a=0x7f6c3755cf20, a@entry=0x7f6c3710f6c8, n=<optimized out>, ssup=ssup@entry=0x5575dbbb7b98)
    at ./qsort_tuple.c:323                                                                                                        
#11 0x00005575da9ec55d in qsort_ssup (a=a@entry=0x7f6c3710f6c8, n=<optimized out>, ssup=ssup@entry=0x5575dbbb7b98) at ./qsort_tuple.c:309     
#12 0x00005575da9ec55d in qsort_ssup (a=a@entry=0x7f6c3710f6c8, n=<optimized out>, ssup=ssup@entry=0x5575dbbb7b98) at ./qsort_tuple.c:309       
#13 0x00005575da9ec55d in qsort_ssup (a=a@entry=0x7f6c3710f6c8, n=<optimized out>, ssup=ssup@entry=0x5575dbbb7b98) at ./qsort_tuple.c:309          
#14 0x00005575da9ec55d in qsort_ssup (a=0x7f6c3710f6c8, n=<optimized out>, n@entry=20343532, ssup=ssup@entry=0x5575dbbb7b98) at ./qsort_tuple.c:309
#15 0x00005575da9ec930 in qsort_ssup (a=0x7f6c161fb048, n=<optimized out>, ssup=<optimized out>) at ./qsort_tuple.c:323
#16 0x00005575da9eca89 in tuplesort_sort_memtuples (state=state@entry=0x5575dbbb7788) at ./build/../src/backend/utils/sort/tuplesort.c:3514
Last edited 7 years ago by gsmol (previous) (diff)

comment:11 by strk, 7 years ago

Missing PG_FREE_IF_COPY in here:

PG_FUNCTION_INFO_V1(lwgeom_cmp);
Datum lwgeom_cmp(PG_FUNCTION_ARGS)
{
  GSERIALIZED *g1 = PG_GETARG_GSERIALIZED_P(0);
  GSERIALIZED *g2 = PG_GETARG_GSERIALIZED_P(1);
    PG_RETURN_INT32(gserialized_cmp(g1, g2));
}

comment:12 by strk, 7 years ago

In 15863:

Fix memory leaks in BTREE operators

References #3864

comment:13 by strk, 7 years ago

Can you test r15863 = 97fd8b3b0f40b47a87d6d793fe3bf6590855ebc6 (refs/remotes/svn/trunk) If confirmed to fix the excessive memory usage it should be backported to 2.4 branch

comment:14 by komzpa, 7 years ago

Thanks, memory issue went away. Do the backport.

As per speed, on pre-warmed table it becomes even slower now:

[local] gis@gis=# create table juno_osm_point_pt2 as (select * from juno_osm_point_unclustered ORDER BY way);
SELECT 159843185
Time: 1510500,993 ms (25:10,501)

comment:15 by strk, 7 years ago

In 15865:

Fix memory leaks in BTREE operators

References #3864 for 2.4 branch

comment:16 by strk, 7 years ago

I think it is expected that cleaning up while working keeps your table clean but makes you work somewhat slower …

comment:17 by pramsey, 7 years ago

I dunno, I decided to profile it and couldn't see any big cpu-eating monsters there. I also ran my own timing on my own data (only 200k records) and found the direct geom ORDER BY to be slightly faster…

# select gid from (
    select * from roads 
    order by geom
    ) a limit 1;

Time: 1193.123 ms (00:01.119)

# select gid from (
    select * from roads 
    order by ST_GeoHash(ST_Transform(ST_Envelope(geom),4326),10) COLLATE "C"
    ) a limit 1;

Time: 1373.566 ms (00:01.374)
Version 1, edited 7 years ago by pramsey (previous) (next) (diff)

comment:18 by komzpa, 7 years ago

The difference I see is that my dataset is made of points. Can you check on points dataset? It should perform more work by getting box from point first.

It feels that hash can be made of a point directly without building box first. Also for a point you don't need any comparsions after.

comment:19 by komzpa, 7 years ago

3.5% of improvement via memcpy removal in #3871:

[local] gis@gis=#  create table juno_osm_point_pt2 as (select * from juno_osm_point_unclustered ORDER BY way);
SELECT 159843185
Time: 1459625,971 ms (24:19,626)

comment:20 by komzpa, 7 years ago

I cannot confirm that polygon data sort is faster, too, although the difference is 7%, not 20 as in point case:

[local] gis@gis=# create table juno_osm_polygon_pt1 as (select * from juno_osm_polygon_unclustered order by ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C");
SELECT 9936051
Time: 211660,739 ms (03:31,661)
[local] gis@gis=# create table juno_osm_polygon_pt2 as (select * from juno_osm_polygon_unclustered order by way);
SELECT 9936051
Time: 226570,568 ms (03:46,571)
[local] gis@gis=# create table juno_osm_polygon_pt3 as (select * from juno_osm_polygon_unclustered order by way::bytea);
SELECT 9936051
Time: 187295,132 ms (03:07,295)

This SELECT form is used in osm2pgsql workflow.

I've tried to have a look at what can be done for points in https://github.com/postgis/postgis/pull/146

comment:21 by pramsey, 7 years ago

In 15923:

Performance improvements for b-tree geometry sorts
References #3864

comment:22 by pramsey, 7 years ago

Resolution: fixed
Status: newclosed

In 15924:

Performance improvements for b-tree geometry sorts
Closes #3864

comment:23 by pramsey, 7 years ago

Move further discussion of sortsupport improvements for geometry sorting into a fresh ticket.

comment:24 by komzpa, 7 years ago

After recompiling Postgres for server's target architecture and enabling -Ofast, on latest trunk - we're 12% slower than GeoHash instead of initial 20%:

[local] gis@gis=# create table juno_osm_point_pt1_fastmath as (select * from juno_osm_point_unclustered ORDER BY ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C");
SELECT 159843185
Time: 928645,823 ms (15:28,646)
[local] gis@gis=# create table juno_osm_point_pt2_fastmath as (select * from juno_osm_point_unclustered ORDER BY way);
SELECT 159843185
Time: 1052055,649 ms (17:32,056)

comment:25 by pramsey, 7 years ago

I still find this very hard to wrap my head around. Look at how much work there is in the geohash calculation function — before any sorting can happen, all those expensive keys have to be calculated. So all we have left is the idea that somehow, in your very large sort, the cost of the sort itself washes out the cost of key calculation. So the sortsupport stuff better be the answer or I'm out of ideas; makes no sense.

comment:26 by strk, 7 years ago

Idea: cluster your data so it's already sorted and compare again ?

comment:27 by komzpa, 7 years ago

It indeed helps a lot:

[local] gis@gis=# create table juno_osm_point_pt2_fastmath_ordered as  (select * from juno_osm_point_pt2_fastmath order by way);
SELECT 159843185
Time: 596587,644 ms (09:56,588)

comment:28 by pnorman, 7 years ago

The pre-sort ordering is very significant - if it's already ordered, much less work needs to be done.

What's the performance like for ordering for a 2nd time with a geohash order, and with an id order?

comment:29 by komzpa, 7 years ago

Bigint sorted sort is the same time as sorted geom and is almost same as unsorted bigint. Geohash time is same as unsorted Geohash.

[local] gis@gis=# create table juno_osm_point_pt1_fastmath_ordered as (select * from juno_osm_point_pt1_fastmath ORDER BY ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C");
SELECT 159843185
Time: 935372,463 ms (15:35,372)

[local] gis@gis=# create table juno_osm_point_unclustered_ordered as (select * from juno_osm_point_unclustered order by osm_id);
SELECT 159843185
Time: 596486,915 ms (09:56,487)

Note: See TracTickets for help on using tickets.