#2267 closed defect (fixed)
Server crash from analyze table
Reported by: | peters | Owned by: | pramsey |
---|---|---|---|
Priority: | blocker | Milestone: | PostGIS 2.0.4 |
Component: | postgis | Version: | 2.0.x |
Keywords: | centos rhel5 geography history point analyze crash | Cc: |
Description
Below is simple test case which replicates problem on (2) different PostGIS systems running same version. It seems have to have something to do with a table containing a geographical point in which x values are different but y values are same. May be specific to certain value ranges.
' ' | postgis_full_version(); |
PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54), 64-bit POSTGIS="2.0.3 r11128" GEOS="3.3.8-CAPI-1.7.8" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.2, released 2012/10/08" LIBXML="2.6.26" LIBJSON="UNKNOWN" RASTER
(1 row)
create table foo(g geography(point,4326));
insert into foo values(st_setsrid(st_makepoint(1.2, 19.8), 4326));
insert into foo values(st_setsrid(st_makepoint(1.3, 19.8), 4326));
analyze verbose foo;
INFO: analyzing "public.foo" INFO: "foo": scanned 1 of 1 pages, containing 2 live rows and 0 dead rows; 2 rows in sample, 2 estimated total rows
server closed the connection unexpectedly
This probably means the server terminated abnormally before or while processing the request.
The connection to the server was lost. Attempting reset: Failed. !> \q
Attachments (2)
Change History (26)
comment:1 by , 11 years ago
Priority: | medium → blocker |
---|
comment:2 by , 11 years ago
Milestone: | → PostGIS 2.0.4 |
---|
follow-up: 4 comment:3 by , 11 years ago
comment:4 by , 11 years ago
I reproduced on my dev system, running 9.2.4, 2.0.3 under CENTOS-5.
Only really special thing about this build is use of larger block size (./configure —with-blocksize=32 —with-wal-blocksize=32 —enable-debug).
Rebuilt postgres and postgis for debugging, ran posted test case using psql client:
Psql client output:
analyze verbose foo;
INFO: analyzing "public.foo"
NOTICE: [geography_estimate.c:geography_analyze:1490] geography_analyze called
NOTICE: [geography_estimate.c:geography_analyze:1497] attribute stat target: 500
NOTICE: [geography_estimate.c:geography_analyze:1503] minrows: 150000
INFO: "foo": scanned 1 of 1 pages, containing 2 live rows and 0 dead rows; 2 rows in sample, 2 estimated total rows
NOTICE: [geography_estimate.c:compute_geography_stats:826] compute_geography_stats called
NOTICE: [geography_estimate.c:compute_geography_stats:932] End of 1st scan:
NOTICE: [geography_estimate.c:compute_geography_stats:934] Sample extent (min, max): (0.940639 0.0197043 0.338738), (0.940674 0.0213461 0.338738)
NOTICE: [geography_estimate.c:compute_geography_stats:935] No. of geometries sampled: 2
NOTICE: [geography_estimate.c:compute_geography_stats:936] No. of non-null geometries sampled: 2
NOTICE: [geography_estimate.c:compute_geography_stats:947] Standard deviation filter enabled
NOTICE: [geography_estimate.c:compute_geography_stats:979] standard deviations:
NOTICE: [geography_estimate.c:compute_geography_stats:980] LOWx - avg:0.940656 sd:0.000018
NOTICE: [geography_estimate.c:compute_geography_stats:981] LOWy - avg:0.020525 sd:0.000821
NOTICE: [geography_estimate.c:compute_geography_stats:982] LOWz - avg:0.338738 sd:0.000000
NOTICE: [geography_estimate.c:compute_geography_stats:983] HIGx - avg:0.940657 sd:0.000018
NOTICE: [geography_estimate.c:compute_geography_stats:984] HIGy - avg:0.020525 sd:0.000821
NOTICE: [geography_estimate.c:compute_geography_stats:985] HIGz - avg:0.338738 sd:0.000000
NOTICE: [geography_estimate.c:compute_geography_stats:995] sd_extent: xmin, ymin, zmin: 0.940639, 0.019704, 0.338738
NOTICE: [geography_estimate.c:compute_geography_stats:997] sd_extent: xmax, ymax, zmax: 0.940674, 0.021346, 0.338738
NOTICE: [geography_estimate.c:compute_geography_stats:1083] histogram_extent: xmin, ymin, zmin: 0.940639, 0.019704, 0.338738
NOTICE: [geography_estimate.c:compute_geography_stats:1085] histogram_extent: xmax, ymax, zmax: 0.940674, 0.021346, 0.338738
NOTICE: [geography_estimate.c:compute_geography_stats:1129] Number of dimensions in sample set: 3
NOTICE: [geography_estimate.c:compute_geography_stats:1247] computed histogram grid size (X,Y,Z): 128 x 5863 x 0 (0 out of 80000 cells)
NOTICE: [geography_estimate.c:compute_geography_stats:1285] Beginning histogram intersection calculations
NOTICE: [geography_estimate.c:compute_geography_stats:1304] feat 0 box is 0.940674 0.019704 0.338738, 0.940674 0.019704 0.338738
NOTICE: [geography_estimate.c:compute_geography_stats:1337] feat 0 overlaps unitsx 127-127, unitsy 0-0, unitsz -1—1
Gdb server side:
Program received signal SIGSEGV, Segmentation fault.
0x00002b803a870407 in compute_geography_stats (stats=0x1321db98, fetchfunc=0x569d3f <std_fetch_func>, samplerows=2, totalrows=2) at geography_estimate.c:1367 1367 geogstats→value[x + y * unitsx + z * unitsx * unitsy] += 1;
(gdb) where
#0 0x00002b803a870407 in compute_geography_stats (stats=0x1321db98, fetchfunc=0x569d3f <std_fetch_func>, samplerows=2, totalrows=2) at geography_estimate.c:1367
#1 0x00000000005677df in do_analyze_rel (onerel=0x2b803bfbab00, vacstmt=0x131d51f0, acquirefunc=0x5687e2 <acquire_sample_rows>, relpages=1, inh=0 '\000', elevel=17) at analyze.c:525
#2 0x0000000000566e13 in analyze_rel (relid=57344, vacstmt=0x131d51f0, bstrategy=0x1321b300) at analyze.c:265
#3 0x00000000005d2c86 in vacuum (vacstmt=0x131d51f0, relid=0, do_toast=1 '\001', bstrategy=0x1321b300, for_wraparound=0 '\000', isTopLevel=1 '\001') at vacuum.c:247
#4 0x000000000070004e in standard_ProcessUtility (parsetree=0x131d51f0, queryString=0x131d47c0 "analyze verbose foo;", params=0x0, isTopLevel=1 '\001', dest=0x131d5530,
completionTag=0x7fff713d3370 "") at utility.c:1027
#5 0x00002b80194c917c in pgss_ProcessUtility () from /usr/local/pgsql/lib/pg_stat_statements.so
#6 0x00000000006fef11 in ProcessUtility (parsetree=0x131d51f0, queryString=0x131d47c0 "analyze verbose foo;", params=0x0, isTopLevel=1 '\001', dest=0x131d5530, completionTag=0x7fff713d3370 "")
at utility.c:332
#7 0x00000000006fdff8 in PortalRunUtility (portal=0x13146360, utilityStmt=0x131d51f0, isTopLevel=1 '\001', dest=0x131d5530, completionTag=0x7fff713d3370 "") at pquery.c:1185
#8 0x00000000006fe18f in PortalRunMulti (portal=0x13146360, isTopLevel=1 '\001', dest=0x131d5530, altdest=0x131d5530, completionTag=0x7fff713d3370 "") at pquery.c:1316
#9 0x00000000006fd7ff in PortalRun (portal=0x13146360, count=9223372036854775807, isTopLevel=1 '\001', dest=0x131d5530, altdest=0x131d5530, completionTag=0x7fff713d3370 "") at pquery.c:814
#10 0x00000000006f7d16 in exec_simple_query (query_string=0x131d47c0 "analyze verbose foo;") at postgres.c:1046
#11 0x00000000006fbc3a in PostgresMain (argc=1, argv=0x131266b0, dbname=0x13126698 "gsw", username=0x13126680 "gsw") at postgres.c:3959
#12 0x00000000006a9cfb in BackendRun (port=0x1314b450) at postmaster.c:3614
#13 0x00000000006a9412 in BackendStartup (port=0x1314b450) at postmaster.c:3304
#14 0x00000000006a632a in ServerLoop () at postmaster.c:1367
#15 0x00000000006a5cb8 in PostmasterMain (argc=1, argv=0x131247e0) at postmaster.c:1127
#16 0x000000000061d98c in main (argc=1, argv=0x131247e0) at main.c:199
So problem appears to be in postgis compute_geography_stats ().
comment:5 by , 11 years ago
Here is some additional gdb info at the crash point:
(gdb) print x $1 = 127
(gdb) print y $2 = 0
(gdb) print z $3 = -1
(gdb) print unitsx $4 = 128
(gdb) print x + y * unitsx + z * unitsx * unitsy $5 = -750337
(gdb) print ndims $6 = 3
(gdb) print z_idx_max $7 = -1
(gdb) print total_cells_coverage $8 = 0
(gdb) print z_idx_min $9 = -1
The negative numbers entering into the calculation for an index into geogstats→value looks suspicious to me, but this is the first time I've seen this code.
—Peter
comment:6 by , 11 years ago
Also suspicious is that the code seems to be attempting to process z values when I originally entered 2D geographical points into the table. I am not sure where the z values are coming from.
—Peter
comment:7 by , 11 years ago
Geography coordinates are indexed in geocentric (x/y/z) space, so that's not an issue. I'll see if this test case exercises my own environment.
comment:10 by , 11 years ago
Ive tried a few things, and I consistently get crash at the same point. Note that I am running CENTOS-5 (RHEL-5 look-alike).
Database 9.2.3 with postgis 2.0.3 crashes.
Database 9.2.3 with postgis 2.0.2 crashes.
I went back to database 9.2.4 and postgis 2.0.3 and created brand new database using default block sizing and still crashes exactly the same.
Can you try this on a CENTOS, RHEL, or Fedora Os?
thanks, —Peter
comment:11 by , 11 years ago
Keywords: | centos rhel5 added |
---|
FWIW:
I have PostGIS compiled under mingw64 using GCC 4.5.4 but deployed on VC++ compiled PostgreSQL.
I tried on my 9.2.2 Windows 7
SELECT version() || ' ' || postgis_full_version(); PostgreSQL 9.2.2, compiled by Visual C++ build 1600, 64-bit POSTGIS="2.0.3 r11132" GEOS="3.4.0dev-CAPI-1.8.0 r0" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.2, released 2012/10/08" LIBXML="2.7.8" LIBJSON="UNKNOWN" RASTER
and my windows 2008 64-bit
PostgreSQL 9.2.4, compiled by Visual C++ build 1600, 64-bit POSTGIS="2.0.2SVN r10643" GEOS="3.4.0dev-CAPI-1.8.0 r0" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.2, released 2012/10/08" LIBXML="2.7.8" LIBJSON="UNKNOWN" (core procs from "2.0.1 r9979" need upgrade) RASTER (raster procs from "2.0.1 r9979" need upgrade)
and doesn't seem to crash on either of them even after running multiple times. So might be specific to Centos or Redhat I guess.
Tried my 2.1 build on windows 7 64-bit and didn't see any crashes.
Bborie, You've got every VM under the sun You wouldn't happen to have a CentOS 5 64-bit one too would you?
comment:12 by , 11 years ago
Another data point. Tried on my Windows 7 machine, using prebuilt versions:
PostgreSQL 9.2.2, compiled by Visual C++ build 1600, 64-bit POSTGIS="2.0.2 r10789" GEOS="3.3.6-CAPI-1.7.6" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.2, released 2012/10/08 GDAL_DATA not found" LIBXML="2.7.8" LIBJSON="UNKNOWN" RASTER
Can't tell you the line number, as I don't have windows dev tools, but it crashed on analyze. Took out the whole service.
2013-04-10 15:54:47 EDT:::@: [11048]: LOG: server process (PID 6388) was terminated by exception 0xC0000005 2013-04-10 15:54:47 EDT:::@: [11048]: DETAIL: Failed process was running: analyze verbose foo; 2013-04-10 15:54:47 EDT:::@: [11048]: HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
Note that you should just do those 2 insert statements and analyze. You can drop the table and recreate as needed and it crashes every time.
comment:13 by , 11 years ago
Can you try the 2.0.3 one? Do you run all the steps at once. I tried individually and then together and couldn't get it to crash.
comment:14 by , 11 years ago
I'll try swapping out my GEOS to see if that is the culprit — it shouldn't be involved at all, but you never know. I'm running with a dev version of GEOS.
comment:15 by , 11 years ago
dustymugs,
What is your postgis full, version, and OS you are running on?
comment:16 by , 11 years ago
comment:19 by , 11 years ago
I updated my Win7 install using the prebuilt release:
PostgreSQL 9.2.2, compiled by Visual C++ build 1600, 64-bit POSTGIS="2.0.3 r11132" GEOS="3.3.8-CAPI-1.7.8" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.2, released 2012/10/08 GDAL_DATA not found" LIBXML="2.7.8" LIBJSON="UNKNOWN" (core procs from "2.0.2 r10789" need upgrade) TOPOLOGY (topology procs from "2.0.2 r10789" need upgrade) RASTER (raster procs from "2.0.2 r10789" need upgrade)
Note that there is bug in makepostgisdb_using_extensions.bat:
xcopy /I plugins.d "%PGADMIN%\plugins.d"
should be changed to read:
xcopy /I "pgAdmin III\plugins.d" "%PGADMIN%\plugins.d"
My test case still crashes on this version too! I am just copy/paste into psql, nothing special here.
—Peter
comment:20 by , 11 years ago
Peter, Thanks I'll fix the packaging. Regarding crashing, I still can't get to crash on my windows 7 64-bit even replacing all my binaries and cutting and pasting the lines in psql.
PostgreSQL 9.2.2, compiled by Visual C++ build 1600, 64-bit POSTGIS="2.0.3 r11132" GEOS="3.3.8-CAPI-1.7.8" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.2, released 2012/10/08" LIBXML="2.7.8" LIBJSON="UNKNOWN" RASTER
Bborie was able to replicate on his slackware 32-bit but not his 64-bit.
Are you running in a VM? or real equipment? I have real windows servers and workstation. Anyrate they are busy dissecting this issue on IRC if you want to take part in the conversation.
http://irclogs.geoapt.com/postgis/%23postgis.2013-04-10.log
comment:21 by , 11 years ago
I was able to patch my version (9.2.4 + 2.0.3) and was unable to crash using my test case. I also ran analyze on a number of tables with geography and it ran fine.
I looked at pg_stats and it seems that for point columns you always have n_distinct = -1, avg_width = 56, everything else looks to be null. For polygons, it looks like you get more useful avg_width values. Hopefully that is ok.
Thanks for all of your efforts on this and the quick turnaround time. The real test for me will be next week when I attempt to load large amounts of data (over 1TB).
Thanks again!
—Peter
comment:23 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Committed to 2.0 at r11287. The 2d index in 2.0 has some ceil() wrappings that likely prevent this problem. The histogram allocator in trunk was completely re-written, so this code line is gone there.
comment:24 by , 11 years ago
Keywords: | history added |
---|
I can't reproduce on PostgreSQL 9.1.8, it could be a 9.2 issue only. Peters, did you have any warning at build time ?