Opened 10 years ago
Closed 8 years ago
Last modified 6 years ago
#2385 closed defect (invalid)
ST_MakeLine causing out of memory error
|Reported by:||ris||Owned by:||pramsey|
|Priority:||medium||Milestone:||PostGIS Fund Me|
|Keywords:||ST_MakeLine memory aggregate||Cc:|
The following snippet:
SELECT ST_MakeLine ( ST_Point ( a , b ) ) FROM generate_series ( 1 , 1000000 ) AS a , generate_series ( 1 , 10 ) AS b GROUP BY a;
uses a huge amount of memory.
This has been tested on Debian postgres 9.1.9-1/postgis 1.5.3-2 and also MacOS with postgres 9.2/postgis 2.0.
If you don't have massive amounts of memory this will continue until it fails with
ERROR: out of memory DETAIL: Failed on request of size 28.
but not before filling the log with thousands of lines the likes of
accumArrayResult: 8192 total in 1 blocks; 7400 free (0 chunks); 792 used
along with a lot more diagnostic output I won't include here unless you want it as I'm 100% sure you'll be able to reproduce this.
On the surface this looks a lot like http://www.postgresql.org/message-id/E1Sgmoz-0002gU-No@wrigleys.postgresql.org
Change History (13)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
|Status:||new → closed|
Okay my windows 64-bit 2008 R2 PostGIS 2.1 production instance finished and IT DID NOT FAIL.
POSTGIS="2.1.0rc2 r11663" GEOS="3.4.0dev-CAPI-1.8.0 r0" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.10.0, released 2013/04/24" LIBXML="2.7.8" LIBJSON="UNKNOWN" TOPOLOGY RASTER PostgreSQL 9.2.4, compiled by Visual C++ build 1600, 64-bit
So given this, I'm dismissing this as a non-issue and propose you make sure you are running the latest PostgreSQL micros.
comment:3 by , 10 years ago
|Status:||closed → reopened|
I respectfully disagree.
Whether your machine actually runs out of memory or not, this is still a memory leak of some sort. One million 10-point linestrings is not that big a deal, and shouldn't be using gigs and gigs of memory.
Alternatively, I could re-submit the bug with an extra zero or two on the 1 million figure and see if peoples machines cope with it then.
comment:4 by , 10 years ago
Well I wasn't monitoring my memory utilization. So I can retest with that and check.
What micro versions of PostgreSQL are you running though? You only gave a minor, but a lot of these issues are fixed in micro releases?
comment:5 by , 10 years ago
Looking further into postgres bug 6698, it doesn't seem to have been fixed at all. (This investigation is not aided by postgres not having a real bug tracker)
So you may well be right and this could just be the same bug. I was assuming that a bug reported in 9.1.2 over a year ago would have had a fix make its way into 9.1.9 and so this would be a similar-but-not-the-same bug caused by ST_MakeLine doing things based on the way array() was doing things and thus picking up some of its bugs. But apparently not.
Not really sure what to do here - I might prod the postgres mailing list over 6698 but as this bug is surfacing in postgis it may not be welcome. But I feel at least the existence of this bug should be documented somewhere for now hence its existence here.
comment:6 by , 10 years ago
|Milestone:||→ PostGIS Future|
comment:7 by , 10 years ago
Does abusing ST_Collect in a similar way lead to similar results? Question is whether the problem is in the accumulation stage of the aggregate or in the final function of the aggregate, and collect/makeline share accumulation code.
comment:8 by , 10 years ago
Also, to isolate whether it's a PgSQL subquery problem or a makeline problem turn your generate_series calls into a simple table and makeline on the table instead of the subquery.
comment:9 by , 10 years ago
My computer is really big and it doesn't crash on your example, but it does show a pretty big resident size on the process when the query in run. The thing it, this query show basically the same resident size, and doesn't have any spatial stuff in it at all:
create table foo as select count(*) FROM generate_series ( 1 , 1000000 ) AS a, generate_series ( 1 , 10 ) AS b GROUP BY a;
comment:10 by , 10 years ago
Hmm. That example doesn't fail for me though on my Debian/9.1.9/1.5.3 machine, even if I raise the 1m to 10m.
However, doing a
CREATE TABLE bar AS SELECT * FROM generate_series ( 1 , 1000000 ) AS a , generate_series ( 1 , 10 ) AS b; SELECT ST_MakeLine ( ST_Point ( a , b ) ) FROM bar GROUP BY a;
does indeed raise the same out of memory error.
As does exchanging ST_MakeLine for ST_Collect in both cases, though in the generate_series example the error is "…size 16" and, interestingly, in the materialized table example I have to up it to 10m for it to fail and it instead gives me the slightly different
out of memory for query result
comment:11 by , 10 years ago
Well, forming a 10M point line from 10M points can be reasonably expected to take about 500Mb (about 250M for each set of 10M points, and you will have both the line and points in memory at the time you finish). And since you're crossing a relation of 10m with a relation of 10 in your increased example, you're looking at 5Gb, which seems legitimately out of memory, no leaks or bad practice required.
comment:12 by , 8 years ago
|Status:||reopened → closed|
closing as invalid (not really a bug)
comment:13 by , 6 years ago
|Milestone:||PostGIS Future → PostGIS Fund Me|
I tested on my 9.3beta2 PostGIS 2.2 windows 7 32-bit instance fails with:
and my 64-bit instance is still chugging after 30 minutes.
given your link I suspect it is nothing we can do about and needs to be solved upstream. I'll leave it to pramsey to make that judgement call.