Performance issue with computed columns
|Reported by:||robe||Owned by:||robe|
As discussed in this blog post:
Using generated columns seems to be much slower than using on the fly computation. I was able to replicate similar results on my tiger data.
In my case, I was using tiger colorado state boundary and edges.
PostgreSQL 12.4, compiled by Visual C++ build 1914, 64-bit POSTGIS="3.0.2 3.0.2" [EXTENSION] PGSQL="120" GEOS="3.8.1-CAPI-1.13.3" SFCGAL="1.3.8" PROJ="Rel. 5.2.0, September 15th, 2018" LIBXML="2.9.9" LIBJSON="0.12" LIBPROTOBUF="1.2.1" WAGYU="0.4.3 (Internal)"
A computed query such as this
— took 6 secs 314 msec
SELECT COUNT(p.*) FROM test.co_random_points p INNER JOIN test.co_road_line r ON ST_Contains(ST_Buffer(ST_Transform(way_4326, 3857), 500), r.way);
In contrast, way_buffer created using:
ALTER TABLE test.co_random_points ADD way_buffer GEOMETRY (POLYGON, 3857) GENERATED ALWAYS AS (ST_Buffer(ST_Transform(way_4326, 3857), 500)) STORED ;
— 30 secs
SELECT COUNT(p.*) FROM test.co_random_points p INNER JOIN test.co_road_line r ON ST_Contains(p.way_buffer, r.way);
— manual buffer create using
ALTER TABLE test.co_random_points ADD way_buffer_manual GEOMETRY (POLYGON, 3857); UPDATE test.co_random_points SET way_buffer_manual = way_buffer; CREATE INDEX ix_random_points_way_buffer_manual_gist ON test.co_random_points USING GIST (way_buffer_manual);
-- returns in 1 secs 227ms SELECT COUNT(p.*) FROM test.co_random_points p INNER JOIN test.co_road_line r ON ST_Contains(p.way_buffer_manual, r.way) ;
Can think of a reason why a computed column should be any worse than a manual column.
I'll try to come up with a smaller test case we can easily fit in our testing