Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#2269 closed defect (fixed)

Unreasonable memory usage when analyzing table with geometry column with many toasted datums

Reported by: pgeoghegan Owned by: strk
Priority: blocker Milestone: PostGIS 2.0.4
Component: postgis Version: 2.0.x
Keywords: toast, leak, history Cc:

Description (last modified by strk)

I appear to have a problem running ANALYZE on a table with the Postgis geometry type. The table in question is 1099 MB, but only has 10631 tuples.

The call stack during the ANALYZE usually looks like this:

(gdb) bt
#0 pglz_decompress (source=0x43f6bd98, dest=0x43fb5b2c "") at
pg_lzcompress.c:711
#1 0x0000000000493c17 in heap_tuple_untoast_attr_slice (attr=0x7f8f49b8fd90,
sliceoffset=0, slicelength=40) at tuptoaster.c:217
#2 0x0000000000880010 in pg_detoast_datum_slice (datum=0x7f8f49b8fd90,
first=0, count=40)
at fmgr.c:2266
#3 0x00007f8f4aa6e085 in gserialized_datum_get_gidx_p (gsdatum=140253393911184,
gidx=gidx@entry=0x7fffbbac5d40) at gserialized_gist.c:248
#4 0x00007f8f4aa6e170 in gserialized_datum_get_gbox_p (gsdatum=<optimized out>,
gbox=0x7fffbbac5ef0) at gserialized_gist.c:65
#5 0x00007f8f4aa379ec in compute_geometry_stats (stats=0x1310678,
fetchfunc=0x5a0ae7 <std_fetch_func>, samplerows=10631, totalrows=10631)
at geometry_estimate.c:865
#6 0x000000000059e02f in do_analyze_rel (onerel=0x7f9054bd1000,
vacstmt=0x12ca5a0,
acquirefunc=0x59f0f7 <acquire_sample_rows>, relpages=726, inh=0
'\000', elevel=17)
at analyze.c:525
#7 0x000000000059d5a9 in analyze_rel (relid=17173, vacstmt=0x12ca5a0,
bstrategy=0x130d9e8) at analyze.c:265
#8 0x0000000000612aab in vacuum (vacstmt=0x12ca5a0, relid=0, do_toast=1 '\001',
bstrategy=0x130d9e8, for_wraparound=0 '\000', isTopLevel=1 '\001') at
vacuum.c:247
#9 0x0000000000764c7a in standard_ProcessUtility (parsetree=0x12ca5a0,
queryString=0x12c9af8 "analyze verbose onesite_overview ;", params=0x0,
isTopLevel=1 '\001', dest=0x12ca928, completionTag=0x7fffbbac6910 "")
---Type <return> to continue, or q <return> to quit---
at utility.c:1027
#10 0x00007f9052a97e15 in pgss_ProcessUtility ()
from /home/pg/pgsql/lib/pg_stat_statements.so
#11 0x000000000076395b in ProcessUtility (parsetree=0x12ca5a0,
queryString=0x12c9af8 "analyze verbose onesite_overview ;", params=0x0,
isTopLevel=1 '\001', dest=0x12ca928, completionTag=0x7fffbbac6910 "")
at utility.c:332
#12 0x00000000007628e0 in PortalRunUtility (portal=0x1248398,
utilityStmt=0x12ca5a0,
isTopLevel=1 '\001', dest=0x12ca928, completionTag=0x7fffbbac6910 "")
at pquery.c:1185
#13 0x0000000000762ab8 in PortalRunMulti (portal=0x1248398,
isTopLevel=1 '\001',
dest=0x12ca928, altdest=0x12ca928, completionTag=0x7fffbbac6910 "") at
pquery.c:1316
#14 0x0000000000761fe6 in PortalRun (portal=0x1248398,
count=9223372036854775807,
isTopLevel=1 '\001', dest=0x12ca928, altdest=0x12ca928,
completionTag=0x7fffbbac6910 "") at pquery.c:814
#15 0x000000000075be41 in exec_simple_query (
query_string=0x12c9af8 "analyze verbose onesite_overview ;") at postgres.c:1046
#16 0x0000000000760190 in PostgresMain (argc=2, argv=0x1229380,
username=0x12291c8 "pg")
at postgres.c:3958
#17 0x0000000000700321 in BackendRun (port=0x124c370) at postmaster.c:3619
#18 0x00000000006ff991 in BackendStartup (port=0x124c370) at postmaster.c:3304
#19 0x00000000006fc7b3 in ServerLoop () at postmaster.c:1367
#20 0x00000000006fc0aa in PostmasterMain (argc=2, argv=0x1227230) at
postmaster.c:1127
#21 0x0000000000667225 in main (argc=2, argv=0x1227230) at main.c:199

PostGIS appears to leak memory for each tuple ANALYZED (or the memory context that allocation occurs within is not destroyed until the end of ANALYZE, which results in ballooning of memory consumption in practice).

This test case was produced with PostGIS 2.0.3.

Change History (16)

comment:1 Changed 6 years ago by strk

Description: modified (diff)
Milestone: PostGIS 2.0.4
Owner: changed from pramsey to strk
Status: newassigned

comment:2 Changed 6 years ago by strk

Resolution: fixed
Status: assignedclosed

Fixed by r11273 in 2.0 branch and r11274 in trunk

comment:3 Changed 6 years ago by robe

Resolution: fixed
Status: closedreopened

comment:4 Changed 6 years ago by robe

You change to 2.0 branch appears to be causing debbie to crash on 8.4 test.

PostgreSQL 8.4.17 on x86_64-unknown-linux-gnu, compiled by GCC gcc (Debian 4.4.5-8) 4.4.5, 64-bit
 Postgis 2.0.4SVN - r11273 - 2013-04-09 20:27:06
   GEOS: 3.3.9dev-CAPI-1.7.9
   PROJ: Rel. 4.7.1, 23 September 2009

Running tests

 loader/Point ....................... ok 
 loader/PointM ....................... ok 
 loader/PointZ ....................... ok 
 loader/MultiPoint ....................... ok 
 loader/MultiPointM ....................... ok 
 loader/MultiPointZ ....................... ok 
 loader/Arc ....................... ok 
 loader/ArcM ....................... ok 
 loader/ArcZ ....................... ok 
 loader/Polygon ....................... ok 
 loader/PolygonM ....................... ok 
 loader/PolygonZ ....................... ok 
 loader/TSTPolygon ......... ok 
 loader/TSIPolygon ......... ok 
 loader/TSTIPolygon ......... ok 
 loader/PointWithSchema ..... ok 
 loader/NoTransPoint ......... ok 
 loader/NotReallyMultiPoint ............... ok 
 loader/MultiToSinglePoint ......... ok 
 loader/ReprojectPts ........ ok 
 loader/ReprojectPtsGeog ........ ok 
 loader/Latin1 .... ok 
 binary .. ok 
 regress .. ok 
 regress_index .. ok 
 regress_index_nulls .. ok 
 lwgeom_regress .. ok 
 regress_lrs .. ok 
 removepoint .. ok 
 setpoint .. ok 
 simplify .. ok 
 snaptogrid .. ok 
 summary .. ok 
 affine .. ok 
 empty .. ok 
 measures .. ok 
 legacy .. ok 
 long_xact .. ok 
 ctors .. ok 
 sql-mm-serialize .. ok 
 sql-mm-circularstring .. ok 
 sql-mm-compoundcurve .. ok 
 sql-mm-curvepoly .. ok 
 sql-mm-general .. ok 
 sql-mm-multicurve .. ok 
 sql-mm-multisurface .. ok 
 polyhedralsurface .. ok 
 polygonize .. ok 
 postgis_type_name .. ok 
 out_geometry .. ok 
 out_geography .. ok 
 in_gml .. ok 
 in_kml .. ok 
 iscollection .. ok 
 regress_ogc .. ok 
 regress_ogc_cover .. ok 
 regress_ogc_prep .. ok 
 regress_bdpoly .. ok 
 regress_proj .. ok 
 regress_management .. ok 
 dump .. ok 
 dumppoints .. ok 
 wmsservers .. ok 
 wkt .. ok 
 wkb .. ok 
 tickets .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_66_diff)
 typmod .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_67_diff)
 remove_repeated_points .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_68_diff)
 split .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_69_diff)
 relate .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_70_diff)
 bestsrid .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_71_diff)
 concave_hull .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_72_diff)
 hausdorff .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_73_diff)
 regress_buffer_params .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_74_diff)
 offsetcurve .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_75_diff)
 relatematch .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_76_diff)
 isvaliddetail .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_77_diff)
 sharedpaths .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_78_diff)
 snap .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_79_diff)
 node .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_80_diff)
 unaryunion .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_81_diff)
 clean .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_82_diff)
 relate_bnr .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_83_diff)
 in_geojson .. failed (diff expected obtained: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/test_84_diff)
 uninstall . failed (uninstall_postgis.sql failed: /var/lib/jenkins/workspace/postgis/tmp/2_0_pg8.4w64/uninstall.err)
.psql: FATAL:  the database system is in recovery mode
dropdb: could not connect to database postgres: FATAL:  the database system is in recovery mode
test: 1052: Something: unexpected operator
 ok (3699)

Run tests: 85
Failed: 20
make[1]: *** [check] Error 20
make[1]: Leaving directory `/var/lib/jenkins/workspace/postgis/branches/2.0/regress'
make: *** [check] Error 1
Build step 'Execute shell' marked build as failure
An attempt to send an e-mail to empty list of recipients, ignored.
Finished: FAILURE

I retested to make sure it wasn't a fluke and crsashed the second time in same spot

comment:5 Changed 6 years ago by robe

tail end of crash tickets crash:

:
NOTICE:  "<current>"."t"."g" is empty or not analyzed
#877.1|
NOTICE:  "public"."t"."g" is empty or not analyzed
#877.2|
NOTICE:  "<current>"."t"."g" is empty or not analyzed
#877.3|
connection to server was lost

comment:6 Changed 6 years ago by pramsey

I'd expect trunk to blow chunks too,

http://trac.osgeo.org/postgis/browser/trunk/postgis/geometry_estimate.c?rev=11274#L938

the edit means that VARSIZE is being called on... something? Just pfreeing the geom at the end of the loop might be the preferred approach. Or, find out what we're supposed to be saving into stawidth, since that's black magic.

comment:7 Changed 6 years ago by strk

Priority: highblocker

Oops. I guess we could theoretically read the size from a toast slice as well, will look into it tomorrow (midnight here). Turning into a blocker as it's worst than it was before.

comment:8 Changed 6 years ago by strk

"stawidth" is "the average storage width of non-null entries": http://www.postgresql.org/docs/8.4/static/catalog-pg-statistic.html

comment:9 Changed 6 years ago by strk

@pramsey, while I sleep maybe you're able to figure out if we need the TOASTED or DETOASTED value size in that field ?

comment:10 Changed 6 years ago by strk

tsearch uses compressed width:

http://pgci.eisentraut.org/jenkins/job/postgresql_master_coverage/Coverage/src/backend/tsearch/ts_typanalyze.c.gcov.html

I wonder which users have we be feeding with wrong info for all this time

comment:11 Changed 6 years ago by strk

r11275 in 2.0 and r11276 in trunk

It's still to be researched what effects will the stawidth change have on the planner. Maybe it'll consider sequential scans cheaper. Did you ever work on that testing support for planner pramsey ?

comment:12 Changed 6 years ago by pramsey

Change looks reasonable, test suite willing. I don't know how I'd test the planner, but there's testing hooks now for selectivity calculations and histogram values, in trunk at any rate, see regress_selectivity.sql

comment:13 Changed 6 years ago by robe

trunk wasn't blowing just because we don't support 8.4 on PostGIS 2.1 and the crash only seemed to happen on the 8.4 regress. I haven't tested locally on my 8.4 32-bit. Winnie doesn't test 8.4 because there is no 64-bit support for 8.4 and I can't compile 8.4 with mingw64 chain.

comment:14 Changed 6 years ago by robe

I'll keep this open for now because winnie is totally blowing crashes on install of postgis after this change. Might be unrelated since I was upgrading pg micros on her too and I'll test on my local instances as well.

At least your last change seems to make debbie 8.4 regress pass now.

comment:15 Changed 6 years ago by robe

Resolution: fixed
Status: reopenedclosed

Works fine on my pc 9.2w64 (haven't tried on 8.4 32-bit). I think winnie's issue is because I copied the wrong version of libxml during PostgresQL upgrade install so postgis dll doesn't even load since the 9.0 seems ot be fine which I didn't get around to ugprading. I'll rerun after I fixed this, but seems unrelated.

I'll close and reopen if I still see problems.

comment:16 Changed 6 years ago by robe

Keywords: history added
Note: See TracTickets for help on using tickets.