Opened 10 years ago

Closed 6 years ago

#2985 closed defect (fixed)

server crash when analyzing small tables on FreeBSD Ports

Reported by: nkiraly Owned by: pramsey
Priority: blocker Milestone: PostGIS 2.5.0
Component: postgis Version: 2.2.x
Keywords: analyze server crash freebsd Cc:

Description

If analyze or vacuum analyze runs on a table with 1 Geography type row, there is shared memory corruption reported and postmaster restarts all aspects of the server.

This ONLY happens when a table has a column with a PostGIS Geography type with just 1 row is analyzed by autovacuum process or manually.

Same failures on FreeBSD 10 amd64 with

PostgreSQL 9.3.4 + PostGIS 2.1.0 and PostgreSQL 9.3.5 + PostGIS 2.1.5

built from ports.

Success and failure scenarios attached as SQL statements.

Attachments (6)

pg93_postgis21_sig10_test.sql (4.8 KB ) - added by nkiraly 10 years ago.
SQL statements to cause sig 10 and postmaster panic
trac_2985_tests.sql (502 bytes ) - added by nkiraly 10 years ago.
tests to test regression of #2267
trac_2985_tests.txt (2.5 KB ) - added by nkiraly 10 years ago.
results of #2267 regression test
build_with_cassert.txt (1.4 KB ) - added by nkiraly 10 years ago.
cassert build terminal output
trac_2985_postgres_gdb.txt (7.0 KB ) - added by nkiraly 10 years ago.
gdb trace during backend crash
trac_2985_postgres_postgis_gdb_trace.txt (34.5 KB ) - added by nkiraly 10 years ago.
postgis with debug symbols trace

Download all attachments as: .zip

Change History (61)

by nkiraly, 10 years ago

SQL statements to cause sig 10 and postmaster panic

comment:1 by nkiraly, 10 years ago

I updated sample statements to include a scenario where if there are 2 rows with the same lat, this still happens.

comment:2 by pramsey, 10 years ago

Using your example, or constructing my own, I'm not able to reproduce, on 2.1.5 and PgSQL 9.3. I'm under OSX.

comment:3 by robe, 10 years ago

nkiraly,

When you say same issue on FreeBSD 10 amd64, I presume that means you tested on an additional platform. What other platform did you test on besides FreeBSD? I know we've had some reports of FreeBSD and amd64 issues with regress failures so wondering if it is platform specific or an underlying issue that only exhibits itself on certain platforms.

I tried too on my 9.3.3 postgis 2.1.3 windows 7 64-bit and 9.3.5 postgis 2.1.3 windows 2012 r2 64-bit and couldn't get it to crash with your above tests. Does it consistently crash or only sometimes?

{{{ Windows 7 64-bit PostgreSQL 9.3.3, compiled by Visual C++ build 1600, 64-bit POSTGIS="2.1.3 r12547" GEOS="3.5.0dev-CAPI-1.9.0 r3989" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.11.0, released 2014/04/16" LIBXML="2.7.8" LIBJSON="UNKNOWN" RASTER

Windows 2012 R2 64-bit PostgreSQL 9.3.5, compiled by Visual C++ build 1600, 64-bit POSTGIS="2.1.3 r12547" GEOS="3.4.2-CAPI-1.8.2 r3924" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.10.0, released 2013/04/24" LIBXML="2.7.8" LIBJSON="UNKNOWN" RASTER }}}

comment:4 by robe, 10 years ago

nkiraly,

This issue sounds very similar to the one noted here: #2267

Can you do me a favor and run the tests in that ticket to confirm we are looking at a different issue. Also can you let us know how you compiled postgres — e.g. with cassert enabled. I had only tested on production postgres and cassert enabled sometimes shows issues you don't see in standard production compiles.

Thanks, Regina

comment:5 by nkiraly, 10 years ago

The place where I tested this and always see failures is a FreeBSD 10 virtual machine. I am getting these errors on QA servers also running FreeBSD 10 when QA databases are redeployed clean.

I ran tests from #2267 and I get postmaster terminate notification,

pg93_postgis21_trac2985=# analyze verbose foo;
INFO:  analyzing "public.foo"
INFO:  "foo": scanned 1 of 1 pages, containing 2 live rows and 0 dead rows; 2 rows in sample, 2 estimated total rows
The connection to the server was lost. Attempting reset: WARNING:  terminating connection because of crash of another server process

and sig 10 in server logs.

Nov  5 11:17:34 bludgeon kernel: pid 37842 (postgres), uid 70: exited on signal 10 (core dumped)
Nov  5 11:17:34 bludgeon postgres[15724]: [27-1] WARNING:  terminating connection because of crash of another server process
Nov  5 11:17:34 bludgeon postgres[15724]: [27-2] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
Nov  5 11:17:34 bludgeon postgres[15724]: [27-3] HINT:  In a moment you should be able to reconnect to the database and repeat your command.

VM build is

[nicholas.kiraly@bludgeon ~]$ uname -a
FreeBSD bludgeon 10.0-RELEASE-p7 FreeBSD 10.0-RELEASE-p7 #0: Tue Jul  8 06:37:44 UTC 2014     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

by nkiraly, 10 years ago

Attachment: trac_2985_tests.sql added

tests to test regression of #2267

by nkiraly, 10 years ago

Attachment: trac_2985_tests.txt added

results of #2267 regression test

by nkiraly, 10 years ago

Attachment: build_with_cassert.txt added

cassert build terminal output

comment:6 by nkiraly, 10 years ago

No I had not built with cassert enabled. I rebuilt postgres and postgis with cassert, no change in output or behavior. Terminal output attached.

comment:7 by pramsey, 10 years ago

Just on the off-chance that some useful info is revealed, could you attach gdb to the process and get the stack trace from the crash?

  • connect to your database
  • find the backend process you're connected to (ps -ax) and then attached gdb to it (gdb -p)
  • resume the backend in gdb (type 'c')
  • go to your connection and crash the backend
  • return to gdb and get the backtrace (type 'bt")

Thanks!

comment:8 by nkiraly, 10 years ago

I built with debug symbols and crashed the backend with 1 row inserted scenario, here is the trace, gdb console log attached

(gdb) bt
#0  0x000000000060b1d2 in std_fetch_func (stats=0x807846852, rownum=0, isNull=0x7fffffffca57 "?") at analyze.c:1755
#1  0x0000000807c3fea7 in geometry_estimated_extent () from /usr/local/lib/postgresql/postgis-2.1.so
#2  0x000000000060777f in do_analyze_rel (onerel=0x802e2ee10, vacstmt=0x802c5dae0, acquirefunc=0x606590 <acquire_sample_rows>, relpages=1, inh=0 '\0', elevel=17) at analyze.c:527
#3  0x0000000000606515 in analyze_rel (relid=58705, vacstmt=0x802c5dae0, bstrategy=0x80781a038) at analyze.c:267
#4  0x0000000000690f32 in vacuum (vacstmt=0x802c5dae0, relid=0, do_toast=1 '\001', bstrategy=0x80781a038, for_wraparound=0 '\0', isTopLevel=1 '\001') at vacuum.c:254
#5  0x0000000000839cc3 in standard_ProcessUtility (parsetree=0x802c5dae0, queryString=0x802c5d038 "ANALYZE VERBOSE dat_location_tho;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x802c5de88,
    completionTag=0x7fffffffd480 "") at utility.c:639
#6  0x00000000008393eb in ProcessUtility (parsetree=0x802c5dae0, queryString=0x802c5d038 "ANALYZE VERBOSE dat_location_tho;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x802c5de88,
    completionTag=0x7fffffffd480 "") at utility.c:309
#7  0x0000000000838f78 in PortalRunUtility (portal=0x802dce038, utilityStmt=0x802c5dae0, isTopLevel=1 '\001', dest=0x802c5de88, completionTag=0x7fffffffd480 "") at pquery.c:1187
#8  0x0000000000837f68 in PortalRunMulti (portal=0x802dce038, isTopLevel=1 '\001', dest=0x802c5de88, altdest=0x802c5de88, completionTag=0x7fffffffd480 "") at pquery.c:1318
#9  0x00000000008375bd in PortalRun (portal=0x802dce038, count=9223372036854775807, isTopLevel=1 '\001', dest=0x802c5de88, altdest=0x802c5de88, completionTag=0x7fffffffd480 "") at pquery.c:816
#10 0x0000000000832e21 in exec_simple_query (query_string=0x802c5d038 "ANALYZE VERBOSE dat_location_tho;") at postgres.c:1048
#11 0x0000000000832150 in PostgresMain (argc=1, argv=0x802c0c210, dbname=0x802c0c058 "pg93_postgis21_sig10", username=0x802c0c030 "deployment") at postgres.c:4007
#12 0x00000000007b8842 in BackendRun (port=0x802c71200) at postmaster.c:4011
#13 0x00000000007b7e17 in BackendStartup (port=0x802c71200) at postmaster.c:3685
#14 0x00000000007b4090 in ServerLoop () at postmaster.c:1586
#15 0x00000000007b1984 in PostmasterMain (argc=3, argv=0x7fffffffdbe8) at postmaster.c:1253
#16 0x00000000006f912c in main (argc=3, argv=0x7fffffffdbe8) at main.c:226

by nkiraly, 10 years ago

Attachment: trac_2985_postgres_gdb.txt added

gdb trace during backend crash

comment:9 by pramsey, 10 years ago

Looks like the postgis part isn't compiled w/ debug symbols, can you rebuilt it too (it should pick up the flags you used in pgsql)

comment:10 by robe, 10 years ago

pramsey just a thought and a possibly stupid one.

since nkiraly is also crashing on #2267, could it be that whatever fix we did there might apply to this 2.1 (but of course in a different way since the code has changed significantly). It looks like whatever happened in #2267 was never ported to 2.1, but then again the 2.1 doesn't have geography_esimate anymore and just has gserialized_gist_nd.c (which I presume replaced the geography_estimate.c)

one of the things I noticed — which may mean nothing is that in http://trac.osgeo.org/postgis/changeset/11287 — you added #include "liblwgeom_internal.h", and while I see that include in gserialized_gist_2d.c, I don't see it in gserialized_gist_nd.c. Is there a reason for that?

comment:11 by pramsey, 10 years ago

The *_gist_*d.c files are about the index binding, which is entirely separate from analysis, so there's no clue there.

comment:12 by robe, 10 years ago

Keywords: freebsd added

tagged this as a freebsd issue. I verfied my 9.2 which used to crash with older 2.0 (pre #2267) test doesn't seem to crash with 2.1.4 install with this or the #2267 test.

by nkiraly, 10 years ago

postgis with debug symbols trace

comment:13 by dbaston, 9 years ago

Replicated on FreeBSD 10.2, Postgres 9.3.11 installed from packages, PostGIS 2.1.7 from packages.

Last edited 9 years ago by dbaston (previous) (diff)

comment:14 by dbaston, 9 years ago

Also 2.2.1 compiled from downloaded source.

comment:15 by robe, 9 years ago

Milestone: PostGIS PostgreSQLPostGIS 2.2.2

Why is this marked as PostgreSQL. Switching to PostGIS

comment:16 by pramsey, 9 years ago

Milestone: PostGIS 2.2.2PostGIS 2.2.3

comment:17 by pramsey, 9 years ago

Still "works for me" on:

PostgreSQL 9.3.11 on x86_64-apple-darwin15.3.0, compiled by Apple LLVM version 7.0.2 (clang-700.1.81), 64-bit POSTGIS="2.2.3dev r14829" GEOS="3.6.0dev-CAPI-1.10.0 r4125" PROJ="Rel. 4.9.2, 08 September 2015" LIBXML="2.9.2" LIBJSON="0.12" (core procs from "2.2.1dev r14503" need upgrade)

comment:18 by pramsey, 8 years ago

Resolution: worksforme
Status: newclosed

comment:19 by ofca, 8 years ago

There's still something here.

FreeBSD 10.3-RELEASE-p2 #0 r299977 amd64
postgis22-2.2.2 installed from ports
postgresql95-server-9.5.4

Server crashes when vacuuming (or autovacuuming) a table that contains one (and only one) row.

There's nothing complicated in regards to table itself, and it's fully repeatable.

  Column  |   Type   | Modifiers
----------+----------+-----------
 entityid | integer  | not null
 geom     | geometry |
Last edited 8 years ago by ofca (previous) (diff)

comment:20 by robe, 8 years ago

Resolution: worksforme
Status: closedreopened

comment:21 by robe, 8 years ago

ofca,

Can you provide a script to build table, add row, and vacuum that triggers the crash,

comment:22 by ofca, 8 years ago

After further testing I found out that the crash is not dependant on amount of rows, but more likely on amount of data.

You need at least three rows with geom of size from example to prevent crashes, and I managed to find some larger multiline geom, and vacuum processed without any problems even tough table contained one row.

template1=# create database testdb;
CREATE DATABASE
template1=# create extension postgis;
CREATE EXTENSION
template1=# create table test (a integer not null, g geometry);
CREATE TABLE
template1=# insert into test values (1, '0102000020E6100000010000006D1092A47FF33440AD4ECD9B00334A40');
INSERT 0 1
template1=# vacuum FULL ANALYZE VERBOSE test;
INFO:  vacuuming "public.test"
INFO:  "test": found 0 removable, 1 nonremovable row versions in 1 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.test"
INFO:  "test": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

comment:23 by robe, 8 years ago

I can't replicate the crash under windows any variant I tried e.g.

PostgreSQL 9.5.2, compiled by Visual C++ build 1800, 64-bit
POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4090" PROJ="Rel. 4.9.1, 04 March 2015" GDAL="GDAL 2.0.2, released 2016/01/26" LIBXML="2.7.8" LIBJSON="0.12" RASTER

But on a FreeBSD running

PostgreSQL 9.3.14
                                                                         postgis_full_version
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.2, 08 September 2015" GDAL="GDAL 2.1.0, released 2016/04/25" LIBXML="2.9.4" LIBJSON="0.12" RASTER
(1 row)

I do get a crash

So does seem like it might be a FreeBSD, PostGIS specific issue.

comment:24 by ofca, 8 years ago

Some more details for c wizards ;)

Program received signal SIGBUS, Bus error.
0x00000000005b5132 in std_fetch_func (stats=0x803271422, rownum=0, isNull=0x7fffffffd9a7 "") at analyze.c:1761
1761    analyze.c: No such file or directory.
(gdb) bt
#0  0x00000000005b5132 in std_fetch_func (stats=0x803271422, rownum=0, isNull=0x7fffffffd9a7 "") at analyze.c:1761
#1  0x0000000807034ee3 in ?? () from /usr/local/lib/postgresql/postgis-2.2.so
#2  0x00000000005b1974 in do_analyze_rel (onerel=0x80332b768, vacstmt=0x80320ba38, acquirefunc=0x5b09f0 <acquire_sample_rows>, relpages=1, inh=0 '\000',
    in_outer_xact=0 '\000', elevel=17) at analyze.c:528
#3  0x00000000005b0960 in analyze_rel (relid=623595, vacstmt=0x80320ba38, in_outer_xact=0 '\000', bstrategy=0x8030eb030) at analyze.c:266
#4  0x000000000063892b in vacuum (vacstmt=0x80320ba38, relid=0, do_toast=1 '\001', bstrategy=0x8030eb030, for_wraparound=0 '\000', isTopLevel=1 '\001')
    at vacuum.c:259
#5  0x00000000007ad798 in standard_ProcessUtility (parsetree=0x80320ba38, queryString=0x80320b030 "vacuum FULL ANALYZE VERBOSE ;", params=0x0,
    isTopLevel=1 '\001', dest=0x80320bd78, completionTag=0x7fffffffe470 "") at utility.c:1003

comment:25 by ofca, 8 years ago

OK, after trying million different things, this is what I came up with.

Good analyze run:

 DEBUG: StartTransactionCommand
 DEBUG: StartTransaction
 DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
 DEBUG: ProcessUtility
 DEBUG: CommitTransaction
 DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
 DEBUG: StartTransaction
 DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
 DEBUG: building index "pg_toast_909285_index" on table "pg_toast_909285"
 INFO: vacuuming "public.test"
 INFO: "test": found 1 removable, 1 nonremovable row versions in 1 pages
 DETAIL: 0 dead row versions cannot be removed yet.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 DEBUG: drop auto-cascades to type pg_temp_623595
 DEBUG: drop auto-cascades to type pg_temp_623595[]
 DEBUG: drop auto-cascades to toast table pg_toast.pg_toast_909285
 DEBUG: drop auto-cascades to index pg_toast.pg_toast_909285_index
 DEBUG: drop auto-cascades to type pg_toast.pg_toast_909285
 DEBUG: CommitTransaction
 DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 13757591/1/12, nestlvl: 1, children:
 DEBUG: StartTransaction
 DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
 INFO: analyzing "public.test"
 DEBUG: [gserialized_estimate.c:gserialized_analyze_nd:1770] gserialized_analyze_nd called
 DEBUG: [gserialized_estimate.c:gserialized_analyze_nd:1777] attribute stat target: 100
 DEBUG: [gserialized_estimate.c:gserialized_analyze_nd:1783] minrows: 30000
 INFO: "test": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1282] compute_gserialized_stats called
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1283] # sample_rows: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1284] estimate of total_rows: 1
 DEBUG: [gserialized_estimate.c:nd_box_from_gbox:511] GBOX((20.900011,52.199986),(20.915567,52.208111))
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1395] stats->attr->attstattarget: 100
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1396] target # of histogram cells: 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1406] sample_extent: {"min":[20.9,52.2],"max":[20.9156,52.2081]}
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:745] dimension 0, feature 0: bin 0 to bin 49
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:763] dimension 0: range = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:764] dimension 0: average = 1
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:765] dimension 0: stddev = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:766] dimension 0: stddev_ratio = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:745] dimension 1, feature 0: bin 0 to bin 49
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:763] dimension 1: range = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:764] dimension 1: average = 1
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:765] dimension 1: stddev = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:766] dimension 1: stddev_ratio = 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1498] special case: no axes have variability
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1503] histo_size[d]: 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1503] histo_size[d]: 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1506] histo_cells_new: 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1546] histo_cells: 0
 DEBUG: [gserialized_estimate.c:nd_box_overlap:599] nd_box: {"min":[20.9,52.2],"max":[20.9156,52.2081]}
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1604] feature 0: ibox (0, 0, 0, 0) (-1, -1, 0, 0)
 DEBUG: [gserialized_estimate.c:nd_stats_value_index:357] bad index at (0, 0)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1644] ratio (1) num_cells (1)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1645] at (0, 0, 0, 0)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1655] histogram_features: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1656] sample_rows: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1657] table_rows: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1694] out: slot 0: kind 0 (STATISTIC_KIND_ND)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1695] out: slot 0: op 0 (InvalidOid)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1696] out: slot 0: numnumbers 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1697] out: null fraction: 0.000000=0/1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1698] out: average width: 944 bytes
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1699] out: distinct values: all (no check done)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1700] out: {"ndims":2,"size":[0,0],"extent":{"min":[20.8999,52.1999],"max":[20.9156,52.2082]},"table_features":1,"sample_features":1,"not_null_features":1,"histogram_features":1,"histogram_cells":0,"cells_covered":1}
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1282] compute_gserialized_stats called
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1283] # sample_rows: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1284] estimate of total_rows: 1
 DEBUG: [gserialized_estimate.c:nd_box_from_gbox:511] GBOX((20.900011,52.199986),(20.915567,52.208111))
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1395] stats->attr->attstattarget: 100
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1396] target # of histogram cells: 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1406] sample_extent: {"min":[20.9,52.2],"max":[20.9156,52.2081]}
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:745] dimension 0, feature 0: bin 0 to bin 49
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:763] dimension 0: range = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:764] dimension 0: average = 1
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:765] dimension 0: stddev = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:766] dimension 0: stddev_ratio = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:745] dimension 1, feature 0: bin 0 to bin 49
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:763] dimension 1: range = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:764] dimension 1: average = 1
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:765] dimension 1: stddev = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:766] dimension 1: stddev_ratio = 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1498] special case: no axes have variability
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1503] histo_size[d]: 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1503] histo_size[d]: 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1506] histo_cells_new: 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1546] histo_cells: 0
 DEBUG: [gserialized_estimate.c:nd_box_overlap:599] nd_box: {"min":[20.9,52.2],"max":[20.9156,52.2081]}
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1604] feature 0: ibox (0, 0, 0, 0) (-1, -1, 0, 0)
 DEBUG: [gserialized_estimate.c:nd_stats_value_index:357] bad index at (0, 0)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1644] ratio (1) num_cells (1)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1645] at (0, 0, 0, 0)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1655] histogram_features: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1656] sample_rows: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1657] table_rows: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1694] out: slot 0: kind 102 (STATISTIC_KIND_ND)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1695] out: slot 0: op 0 (InvalidOid)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1696] out: slot 0: numnumbers 19
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1697] out: null fraction: 0.000000=0/1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1698] out: average width: 944 bytes
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1699] out: distinct values: all (no check done)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1700] out: {"ndims":2,"size":[0,0],"extent":{"min":[20.8999,52.1999],"max":[20.9156,52.2082]},"table_features":1,"sample_features":1,"not_null_features":1,"histogram_features":1,"histogram_cells":0,"cells_covered":1}
 DEBUG: CommitTransaction
 DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 13757592/1/0 (used), nestlvl: 1, children:
 DEBUG: StartTransaction
 DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
 DEBUG: CommitTransactionCommand
 DEBUG: CommitTransaction
 DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:

Fail analyze run:

 DEBUG: StartTransactionCommand
 DEBUG: StartTransaction
 DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
 DEBUG: ProcessUtility
 DEBUG: CommitTransaction
 DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
 DEBUG: StartTransaction
 DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
 DEBUG: building index "pg_toast_901093_index" on table "pg_toast_901093"
 INFO: vacuuming "public.test"
 INFO: "test": found 0 removable, 1 nonremovable row versions in 1 pages
 DETAIL: 0 dead row versions cannot be removed yet.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 DEBUG: drop auto-cascades to type pg_temp_623595
 DEBUG: drop auto-cascades to type pg_temp_623595[]
 DEBUG: drop auto-cascades to toast table pg_toast.pg_toast_901093
 DEBUG: drop auto-cascades to index pg_toast.pg_toast_901093_index
 DEBUG: drop auto-cascades to type pg_toast.pg_toast_901093
 DEBUG: CommitTransaction
 DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 13757589/1/12, nestlvl: 1, children:
 DEBUG: StartTransaction
 DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
 INFO: analyzing "public.test"
 DEBUG: [gserialized_estimate.c:gserialized_analyze_nd:1770] gserialized_analyze_nd called
 DEBUG: [gserialized_estimate.c:gserialized_analyze_nd:1777] attribute stat target: 100
 DEBUG: [gserialized_estimate.c:gserialized_analyze_nd:1783] minrows: 30000
 INFO: "test": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1282] compute_gserialized_stats called
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1283] # sample_rows: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1284] estimate of total_rows: 1
 DEBUG: [g_serialized.c:gserialized_get_type:60] entered
 DEBUG: [g_serialized.c:gserialized_get_type:60] entered
 DEBUG: [g_serialized.c:lwgeom_from_gserialized:1265] Got type 2 (LineString), srid=4326
 DEBUG: [g_serialized.c:lwgeom_from_gserialized_buffer:1219] Got type 2 (LineString), hasz=0 hasm=0 geodetic=0 hasbox=0
 DEBUG: [lwgeom.c:lwgeom_count_vertices:1148] lwgeom_count_vertices got type LineString
 DEBUG: [lwgeom.c:lwgeom_is_empty:1301] lwgeom_is_empty: got type LineString
 DEBUG: [lwgeom.c:lwgeom_count_vertices:1183] counted 1 vertices
 DEBUG: [lwgeom.c:lwgeom_set_srid:1543] entered with srid=4326
 DEBUG: [g_box.c:lwgeom_calculate_gbox_cartesian:651] lwgeom_calculate_gbox got type (2) - LineString
 DEBUG: [g_box.c:ptarray_calculate_gbox_cartesian:525] ptarray_calculate_gbox Z: 0 M: 0
 DEBUG: [lwgeom_api.c:getPoint4d_p:245] getPoint4d_p called.
 DEBUG: [lwgeom_api.c:getPoint4d_p:251] ptr 0x8032ae058, zmflag 0
 DEBUG: [gserialized_estimate.c:nd_box_from_gbox:511] GBOX((20.951166,52.398453),(20.951168,52.398457))
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1395] stats->attr->attstattarget: 100
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1396] target # of histogram cells: 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1406] sample_extent: {"min":[20.9512,52.3985],"max":[20.9512,52.3985]}
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:745] dimension 0, feature 0: bin 0 to bin 50
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:763] dimension 0: range = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:764] dimension 0: average = 1
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:765] dimension 0: stddev = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:766] dimension 0: stddev_ratio = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:745] dimension 1, feature 0: bin 0 to bin 50
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:763] dimension 1: range = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:764] dimension 1: average = 1
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:765] dimension 1: stddev = 0
 DEBUG: [gserialized_estimate.c:nd_box_array_distribution:766] dimension 1: stddev_ratio = 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1498] special case: no axes have variability
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1503] histo_size[d]: 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1503] histo_size[d]: 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1506] histo_cells_new: 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1546] histo_cells: 0
 DEBUG: [gserialized_estimate.c:nd_box_overlap:599] nd_box: {"min":[20.9512,52.3985],"max":[20.9512,52.3985]}
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1604] feature 0: ibox (0, 0, 0, 0) (-1, -1, 0, 0)
 DEBUG: [gserialized_estimate.c:nd_stats_value_index:357] bad index at (0, 0)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1644] ratio (1) num_cells (1)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1645] at (0, 0, 0, 0)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1655] histogram_features: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1656] sample_rows: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1657] table_rows: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1694] out: slot 0: kind 0 (STATISTIC_KIND_ND)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1695] out: slot 0: op 0 (InvalidOid)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1696] out: slot 0: numnumbers 0
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1697] out: null fraction: 0.000000=0/1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1698] out: average width: 32 bytes
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1699] out: distinct values: all (no check done)
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1700] out: {"ndims":2,"size":[0,0],"extent":{"min":[20.9512,52.3985],"max":[20.9512,52.3985]},"table_features":1,"sample_features":1,"not_null_features":1,"histogram_features":1,"histogram_cells":0,"cells_covered":1}
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1282] compute_gserialized_stats called
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1283] # sample_rows: 1
 DEBUG: [gserialized_estimate.c:compute_gserialized_stats_mode:1284] estimate of total_rows: 1

 DEBUG: forked new backend, pid=37855 socket=12
 kernel: pid 37854 (postgres), uid 70: exited on signal 10 (core dumped)

comment:26 by ofca, 8 years ago

I've setup a fresh FreeBSD 10.3 stable with psql 9.5.4, postgis 2.2.2, and three test databases, two crashing, one with one row not crashing — if someone would like remote access to take a stab at this, msg on #postgis

comment:27 by ofca, 8 years ago

Summary: server crash when analyzing table with 1 rowserver crash when analyzing small tables
Version: 2.1.x2.2.x

comment:28 by robe, 8 years ago

All right this is pretty annoying. I have a FreeBSD vm that I installed 9.5 from ports.

Compiled PostGIS 2.3 (bah for some reason didn't give me raster though I installed GDAL) anyrate ran the:

create table test (a integer not null, g geometry);
insert into test values (1, '0102000020E6100000010000006D1092A47FF33440AD4ECD9B00334A40');

postgis=# insert into test values (1, '0102000020E6100000010000006D1092A47FF33440AD4ECD9B00334A40');
INSERT 0 1
postgis=# vacuum FULL ANALYZE VERBOSE test;
INFO:  vacuuming "public.test"
INFO:  "test": found 0 removable, 1 nonremovable row versions in 1 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.test"
INFO:  "test": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
VACUUM

The darn thing won't crash for me. My

 SELECT version() || ' ' || postgis_full_version();

-- returns
                                                                                                                                ?column?
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.5.4 on amd64-portbld-freebsd10.1, compiled by FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512, 64-bit POSTGIS="2.3.0dev r15129" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.2, 08 September 2015" LIBXML="2.9.4" LIBJSON="0.12"
(1 row)

ofca, did you ever try this on 2.3.0 trunk (and on 9.5, I recall you said you tried on 9.5), wondering if maybe the issue doesn't exist in 2.3, or it has something to do with the fact I didn't get raster enabled, so I ran from scripts. I doubt that is it since the regular postgis-2.3.so shouldn't have any gdal in it anyway.

On production server I have, it has 2.2.2 and PostgreSQL 9.3 installed.

comment:29 by robe, 8 years ago

— I figured out my issue here

Fixed by r15130

ofca, how did you ever get PostGIS 2.2.2 installed? Did you have to hack it.

I just tried compiling from PostGIS 2.2.2 tarball and get this:

wget http://download.osgeo.org/postgis/source/postgis-2.2.2.tar.gz
tar -xvzf postgis-2.2.2.tar.gz
cd postgis-2.2.2
./configure
make
rtpg_geometry.c:105:42: error: too many arguments to function call, expected 2, have 3
        gser = gserialized_from_lwgeom(geom, 0, &gser_size);
               ~~~~~~~~~~~~~~~~~~~~~~~          ^~~~~~~~~~
/usr/local/include/liblwgeom.h:1957:1: note: 'gserialized_from_lwgeom' declared here
extern GSERIALIZED* gserialized_from_lwgeom(LWGEOM *geom, size_t *size);
^
rtpg_geometry.c:184:42: error: too many arguments to function call, expected 2, have 3
        gser = gserialized_from_lwgeom(geom, 0, &gser_size);
               ~~~~~~~~~~~~~~~~~~~~~~~          ^~~~~~~~~~
/usr/local/include/liblwgeom.h:1957:1: note: 'gserialized_from_lwgeom' declared here
extern GSERIALIZED* gserialized_from_lwgeom(LWGEOM *geom, size_t *size);
^
rtpg_geometry.c:326:81: error: too many arguments to function call, expected 2, have 3
                gser = gserialized_from_lwgeom(lwpoly_as_lwgeom(geomval2[call_cntr].geom), 0, &gser_size);
                       ~~~~~~~~~~~~~~~~~~~~~~~                                                ^~~~~~~~~~
/usr/local/include/liblwgeom.h:1957:1: note: 'gserialized_from_lwgeom' declared here
extern GSERIALIZED* gserialized_from_lwgeom(LWGEOM *geom, size_t *size);
^
rtpg_geometry.c:628:59: error: too many arguments to function call, expected 2, have 3
                gser = gserialized_from_lwgeom(pix2[call_cntr].geom, 0, &gser_size);
                       ~~~~~~~~~~~~~~~~~~~~~~~                          ^~~~~~~~~~
/usr/local/include/liblwgeom.h:1957:1: note: 'gserialized_from_lwgeom' declared here
extern GSERIALIZED* gserialized_from_lwgeom(LWGEOM *geom, size_t *size);
^
4 errors generated.
gmake[3]: *** [<builtin>: rtpg_geometry.o] Error 1
gmake[3]: Leaving directory '/usr/home/robe/sources/postgis-2.2.2/raster/rt_pg'
gmake[2]: *** [Makefile:28: pglib] Error 2
gmake[2]: Leaving directory '/usr/home/robe/sources/postgis-2.2.2/raster'
gmake[1]: *** [GNUmakefile:16: all] Error 1
gmake[1]: Leaving directory '/usr/home/robe/sources/postgis-2.2.2'
*** Error code 2

Stop.
Last edited 8 years ago by robe (previous) (diff)

comment:30 by robe, 8 years ago

Is it possible that we have fixed this issue?

I just tried with PostGIS 2.2 branch:

                                                                                                                                 ?column?
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 POSTGIS="2.2.3dev r15130" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.2, 08 September 2015" LIBXML="2.9.4" LIBJSON="0.12" PostgreSQL 9.5.4 on amd64-portbld-freebsd10.1, compiled by FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512, 64-bit
(1 row)

Using ofca's simple:

create table test (a integer not null, g geometry);
 insert into test values (1, '0102000020E6100000010000006D1092A47FF33440AD4ECD9B00334A40');
vacuum FULL ANALYZE VERBOSE test;

As well as the more extensive pg93_postgis21_sig10_test.sql​ provided by nkiraly

and I can't get my FreeBSD to crash with either.

Last edited 8 years ago by robe (previous) (diff)

comment:31 by ofca, 8 years ago

I installed 2.2.2 from ports. As to the trunk version, maybe it would be worthwhile to ask rhurlin@… to create some postgis-devel port?

I'll try to build 2.2.3dev and test on it, and I can give you access to mentioned VM with premade crash databases, so you can take a look at it yourself if time permits.

comment:32 by robe, 8 years ago

I downgraded to 9.3 and installed PostGIS 2.1.7 from ports and it crashes. I also tried by 9.5 compiling my own PostGIS 2.2.2 (after I applied Greg's patch) and it did not crash.

If I compiled PostGIS 2.2 from svn branch, it doesn't crash (even on the 9.3 I got from ports).

So I'm not sure if its something about the way FreeBSD ports is compiling PostGIS or patched it so it would work or perhaps Greg's liblwgeom patched fixed things and they were compiling against an incompatible liblwgeom.h header.

You said you got 2.2.2 from ports — what's the pkg install for that.

If I do:

pkg install postgis

I just get the 2.1.7 which only works with 9.3. I thought I had done this before, but perhaps not. Not sure where my production got their 2.2.2 since I didn't install it.

comment:33 by robe, 8 years ago

Resolution: fixed
Status: reopenedclosed

Now that I got all my compile issues squared aware, I tried this exercise a couple of times on the same PostgresQL 9.3 freebsd server.

create table test (a integer not null, g geometry);
 insert into test values (1, '0102000020E6100000010000006D1092A47FF33440AD4ECD9B00334A40');
vacuum FULL ANALYZE VERBOSE test;

Conclusion - PostGIS 2.1.7 consistently crashes, it's impressive how faithful it is at crashing.

SELECT postgis_full_version() || ' ' || version();
                                                                                                                                ?column?
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 POSTGIS="2.1.7 r13414" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.2, 08 September 2015" LIBXML="2.9.4" LIBJSON="UNKNOWN" PostgreSQL 9.3.14 on amd64-portbld-freebsd10.1, compiled by FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512, 64-bit
(1 row)


postgis21=# create table test (a integer not null, g geometry);
nsert iERROR:  relation "test" already exists
postgis21=#  insert into test values (1, '0102000020E6100000010000006D1092A47FF33440AD4ECD9B00334A40');
ALYZE VERBOSE test;INSERT 0 1
postgis21=# vacuum FULL ANALYZE VERBOSE test;
INFO:  vacuuming "public.test"
INFO:  "test": found 0 removable, 2 nonremovable row versions in 1 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.test"
INFO:  "test": scanned 1 of 1 pages, containing 2 live rows and 0 dead rows; 2 rows in sample, 2 estimated total rows
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Went and compiled my own PostGIS 2.3.0rc1 (appyling Greg's patch to the tar ball to do so and patch from #3639) I can do CREATE EXTENSION now.

SELECT postgis_full_version() || ' ' || version();
                                                                                                                                                       ?column?
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 POSTGIS="2.3.0rc1 r15121" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.2, 08 September 2015" GDAL="GDAL 2.1.0, released 2016/04/25" LIBXML="2.9.4" LIBJSON="0.12" RASTER PostgreSQL 9.3.14 on amd64-portbld-freebsd10.1, compiled by FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512, 64-bit
(1 row)

postgis23rc1=#  insert into test values (1, '0102000020E6100000010000006D1092A47FF33440AD4ECD9B00334A40');
INSERT 0 1
postgis23rc1=# vacuum FULL ANALYZE VERBOSE test;
INFO:  vacuuming "public.test"
INFO:  "test": found 0 removable, 1 nonremovable row versions in 1 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.test"
INFO:  "test": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
VACUUM

— compiled by own PostGIS 2.2.2, again applying Greg's patch from #3604 (cause I couldn't compile without it) — no crash

postgis222=# create extension postgis version "2.2.2";
CREATE EXTENSION
postgis222=# select postgis_full_version() || ' ' || version();
                                                                                                                                                      ?column?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.2, 08 September 2015" GDAL="GDAL 2.1.0, released 2016/04/25" LIBXML="2.9.4" LIBJSON="0.12" RASTER PostgreSQL 9.3.14 on amd64-portbld-freebsd10.1, compiled by FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512, 64-bit
(1 row)

postgis222=# vacuum FULL ANALYZE VERBOSE test;
INFO:  vacuuming "public.test"
INFO:  "test": found 0 removable, 1 nonremovable row versions in 1 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.test"
INFO:  "test": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 1 rows in sample, 1 estimated total rows
VACUUM
postgis222=#

-- YEH NO CRASH

So given that I can't get even the 2.2.2 to crash (which others were very successful with and I was successful with on another freebsd server (and only difference is the liblwgeo.h patch from #3604 or something else that FreeBSD package people are hacking that I'm not doing when I compile, to the best of my knowledge, if it was an issue in PostGIS code, the issue is resolved and probably caused by a rogue liblwgeom.h

People if you are still having issues feel free to reopen this.

comment:34 by ofca, 8 years ago

Well, I can reliably replicate this on fresh FreeBSD while building both postgresql and postgis from ports — even with all postgis port options disabled.

Did you try building via cd /usr/ports/databases/postgis22 && make install

And who is rhurlin@… ? - is he the part of the team? I'll e-mail him (or her — learning my leason with you ;p) about this; maybe he can shine some light here.

comment:35 by robe, 8 years ago

No I wasn't using the ports source files. What I did was this more or less (my instructions as I was writing down are a bit of a mess because I was installing and uninstalling PostgreSQL 9.3/9.5 to compare from ports

sudo bash
freebsd-update fetch
freebsd-update install
#to get PostGIS and PostgreSQL for binary ports
pkg install postgresql93-server postgresql93-contrib
sysrc postgresql_enable=yes
service postgresql initdb
service postgresql start
pkg install postgis (this gave me PostGIS 2.1.7 which consistently crashed)

## to install PostGIS from source
pkg install subversion
pkg install autoconf automake libtool git gmake wget libxslt
pkg install gdal geos #this got already installed if I did postgis install first from ports

To compile PostGIS 2.2.2

wget http://download.osgeo.org/postgis/source/postgis-2.2.2.tar.gz
tar -xvzf postgis-2.2.2.tar.gz 
cd postgis-2.2.2

#apply patch from r15130 #apply patch from r15135

./configure
make
make install

This gave me a perfectly nice PostGIS 2.2.2 in my PostgreSQL 9.3 which did not crash like the ports ones Which I installed this way

psql -U pgsql -d postgres
CREATE DATABASE postgis222;
\connect postgis222
CREATE EXTENSION postgis VERSION "2.2.2";
Last edited 8 years ago by robe (previous) (diff)

comment:36 by robe, 8 years ago

The /usr/ports/databases/postgis22 didn't seem to work for me (I don't seem to have that folder). However that gave me a clue to do this:

pkg install postgis22

Which gave me this message:

Proceed with this action? [y/N]: y
Fetching postgis22-2.2.2.txz: 100%    2 MiB   1.7MB/s    00:01
Checking integrity... done (1 conflicting)
  - postgis22-2.2.2 conflicts with postgis-2.1.7_2 on /usr/local/bin/pgsql2shp
Checking integrity... done (0 conflicting)
Conflicts with the existing packages have been found.
One more solver iteration is needed to resolve them.
The following 2 package(s) will be affected (of 0 checked):

Installed packages to be REMOVED:
        postgis-2.1.7_2

New packages to be INSTALLED:
        postgis22: 2.2.2

Number of packages to be removed: 1
Number of packages to be installed: 1

The process will require 55 MiB more space.

Proceed with this action? [y/N]: y
[1/2] Deinstalling postgis-2.1.7_2...
[1/2] Deleting files for postgis-2.1.7_2:   0%
pkg: /usr/local/bin/pgsql2shp different from original checksum, not removing
[1/2] Deleting files for postgis-2.1.7_2:   1%
pkg: /usr/local/bin/shp2pgsql different from original checksum, not removing
[1/2] Deleting files for postgis-2.1.7_2:   2%
pkg: /usr/local/include/liblwgeom.h different from original checksum, not removing
[1/2] Deleting files for postgis-2.1.7_2:   4%
pkg: /usr/local/lib/liblwgeom.a different from original checksum, not removing
[1/2] Deleting files for postgis-2.1.7_2:   5%
pkg: /usr/local/lib/liblwgeom.la different from original checksum, not removing
[1/2] Deleting files for postgis-2.1.7_2:   6%
pkg: /usr/local/lib/liblwgeom.so different from original checksum, not removing
[1/2] Deleting files for postgis-2.1.7_2: 100%
[2/2] Installing postgis22-2.2.2...

  ======================= GEOS Support Notice ========================
  In order to use the GEOS support, you may need to specially compile
  your version of PostgreSQL to link the C++ runtime library.
  To do this, invoke the PostgreSQL Makefile script this way:

  on csh shell:

  setenv LDFLAGS -lstdc++
  make

  on sh or bash shell:

  export LDFLAGS=-lstdc++
  make

  The initial LDFLAGS variable is passed through to the Makefile and
  adds the C++ library to the linking stage.

  ====================================================================

[2/2] Extracting postgis22-2.2.2: 100%

I didn't even bother recreating my postgis222 database, but went to check on it

psql -U pgsql -d postgis222

postgis222=# SELECT postgis_full_version() || ' ' || version();
                                                                                                                                                      ?column?
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.9.2, 08 September 2015" GDAL="GDAL 2.1.0, released 2016/04/25" LIBXML="2.9.4" LIBJSON="0.12" RASTER PostgreSQL 9.3.14 on amd64-portbld-freebsd10.1, compiled by FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512, 64-bit
(1 row)

Run this:

create table test (a integer not null, g geometry);
 insert into test values (1, '0102000020E6100000010000006D1092A47FF33440AD4ECD9B00334A40');
vacuum FULL ANALYZE VERBOSE test;

— voila CRASHED. So it seems the culprit is whatever is being pushed by ports — perhaps they changed the code so it would compile and that change is causing this problem — Like I said without patching the code, I couldn't even get it to compile and install under FreeBSD

ERROR:  relation "test" already exists
postgis222=#  insert into test values (1, '0102000020E6100000010000006D1092A47FF33440AD4ECD9B00334A40');
cuum FUINSERT 0 1
postgis222=# vacuum FULL ANALYZE VERBOSE test;
INFO:  vacuuming "public.test"
INFO:  "test": found 0 removable, 2 nonremovable row versions in 1 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.test"
INFO:  "test": scanned 1 of 1 pages, containing 2 live rows and 0 dead rows; 2 rows in sample, 2 estimated total rows
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

comment:37 by ofca, 8 years ago

try:

rm -rf /usr/ports
portsnap fetch
portsnap extract

This should give you a current ports tree and you can look at what's going on, what kind of patches are used and what happens during build process.

Last edited 8 years ago by robe (previous) (diff)

comment:38 by robe, 8 years ago

I think I'm missing something. I get this:

Looking up portsnap.FreeBSD.org mirrors... 7 mirrors found.
Fetching snapshot tag from your-org.portsnap.freebsd.org... done.
Fetching snapshot metadata... done.
Fetching snapshot generated at Thu Sep 22 20:08:34 EDT 2016:
fetch: http://your-org.portsnap.freebsd.org/s/0831842c5695b8c81d35df948c917c1e482e189fdb4463687703fc567e5ece22.tgz: Not Found

Any rate I'm not going to have much time to debug this, but I will add this regress test so that it crashes for anyone packaging who has this issue when they go to do the make check.

comment:39 by robe, 8 years ago

In 15138:

Test to trap FreeBSD crash on small tables.
References #2985 for PostGIS 2.3.0

comment:40 by robe, 8 years ago

In 15141:

Test to trap FreeBSD crash on small tables.
References #2985 for PostGIS 2.2.3

in reply to:  38 comment:41 by ofca, 8 years ago

Replying to robe:

I think I'm missing something. I get this:

Looking up portsnap.FreeBSD.org mirrors... 7 mirrors found.
Fetching snapshot tag from your-org.portsnap.freebsd.org... done.
Fetching snapshot metadata... done.
Fetching snapshot generated at Thu Sep 22 20:08:34 EDT 2016:
fetch: http://your-org.portsnap.freebsd.org/s/0831842c5695b8c81d35df948c917c1e482e189fdb4463687703fc567e5ece22.tgz: Not Found

There's something wrong with your /etc/portsnap.conf seems there's SERVERNAME=your-org.portsnap.FreeBSD.org instead of normal SERVERNAME=portsnap.FreeBSD.org

Any rate I'm not going to have much time to debug this, but I will add this regress test so that it crashes for anyone packaging who has this issue when they go to do the make check.

Should your time permit, I'll be happy to help with FreeBSD-specific questions you may have. I also informed the port maintainer about this ticket, so lets see how it goes now. Thank you :)

comment:42 by robe, 8 years ago

Summary: server crash when analyzing small tablesserver crash when analyzing small tables on FreeBSD Ports

Okay this seems to still be an issue per Rainer and FreeBSD ports is working on a fix.

Will close this out once they have resolved.

Last edited 8 years ago by robe (previous) (diff)

comment:43 by robe, 8 years ago

Milestone: PostGIS 2.2.3PostGIS Packaging
Resolution: fixed
Status: closedreopened

comment:44 by komzpa, 7 years ago

Now reproduces on my dev environment, Ubuntu 17.10

POSTGIS="2.5.0dev r16390" [EXTENSION] PGSQL="100" GEOS="3.7.0dev-CAPI-1.11.0 0" SFCGAL="1.3.1" PROJ="Rel. 4.9.3, 15 August 2016" GDAL="GDAL 2.2.1, released 2017/06/23" LIBXML="2.9.4" LIBJSON="0.12.1" LIBPROTOBUF="1.2.1" RASTER

Thread 1 (Thread 0x7f9975154780 (LWP 12434)):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f9972708f5d in __GI_abort () at abort.c:90
#2  0x00007f997275128d in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f9972876a26 "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007f99727f7811 in __GI___fortify_fail_abort (need_backtrace=need_backtrace@entry=false, msg=msg@entry=0x7f9972876a0e "stack smashing detected") at fortify_fail.c:33
#4  0x00007f99727f77d2 in __stack_chk_fail () at stack_chk_fail.c:29
#5  0x00007f99629fd29f in nd_box_array_distribution (nd_boxes=nd_boxes@entry=0x5581ff8f57b8, num_boxes=num_boxes@entry=1, extent=extent@entry=0x7ffff9781ee0, ndims=ndims@entry=2, distribution=distribution@entry=0x7ffff9781f20)
    at gserialized_estimate.c:843
#6  0x00007f99629fe3bd in compute_gserialized_stats_mode (stats=stats@entry=0x5581ff932098, fetchfunc=fetchfunc@entry=0x5581fd5367d0, sample_rows=sample_rows@entry=1, total_rows=total_rows@entry=1, mode=mode@entry=2) at gserialized_estimate.c:1579
#7  0x00007f99629fecb0 in compute_gserialized_stats (stats=0x5581ff932098, fetchfunc=0x5581fd5367d0, sample_rows=1, total_rows=1) at gserialized_estimate.c:1839
#8  0x00005581fd538b81 in ?? ()
#9  0x00005581fd53a094 in analyze_rel ()
#10 0x00005581fd5a496b in vacuum ()
#11 0x00005581fd5a4d9d in ExecVacuum ()
#12 0x00005581fd6f4248 in standard_ProcessUtility ()
#13 0x00005581fd6f0dd5 in ?? ()
#14 0x00005581fd6f18d7 in ?? ()
#15 0x00005581fd6f25d8 in PortalRun ()
#16 0x00005581fd6ee193 in ?? ()
#17 0x00005581fd6f01c8 in PostgresMain ()
#18 0x00005581fd429589 in ?? ()
#19 0x00005581fd677c04 in PostmasterMain ()
#20 0x00005581fd42ae8f in main ()

comment:45 by komzpa, 7 years ago

In 16391:

Fix errorneous ticket number.

References #2985

comment:46 by komzpa, 6 years ago

This exploded on PostGIS 2.5beta2 build on debian buildfarm on MIPS:

https://buildd.debian.org/status/fetch.php?pkg=postgis&arch=mips&ver=2.5.0~beta2%2Bdfsg-1~exp1&stamp=1533996382&raw=0

@@ -269,69 +269,7 @@
 #2788|f|Self-intersection|POINT(1 1)
 #2870|Point[GS]
 #2956|t
-#2985|LINESTRING(20.9511664253809 52.3984560730436)
-#2996|247.44|247.44
-#3119a|121
-#3119b|291
-#3119c|615

....


-#4081|f|t
+server closed the connection unexpectedly
+	This probably means the server terminated abnormally
+	before or while processing the request.
+connection to server was lost

comment:47 by komzpa, 6 years ago

Reproduced on Travis after #4177 patch applied (3 of 4 jobs failed, one survived somehow):

https://travis-ci.org/postgis/postgis/builds/428202858

[logbt] saw 'make' exit with code:2 (INT)
[logbt] Found corefile (non-tracked) at /tmp/logbt-coredumps/core.1192.!usr!lib!postgresql!11!bin!postgres
[logbt] Processing cores...
[New LWP 1192]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: 11/main: travis postgis_reg [local] ANALYZE                         '.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.

Thread 1 (Thread 0x7f4e9c8a8b00 (LWP 1192)):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f4ea42a3535 in __GI_abort () at abort.c:79
#2  0x00007f4ea430a516 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f4ea442e9ee "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007f4ea43ae431 in __GI___fortify_fail_abort (need_backtrace=need_backtrace@entry=false, msg=msg@entry=0x7f4ea442e9cc "stack smashing detected") at fortify_fail.c:33
#4  0x00007f4ea43ae3f2 in __stack_chk_fail () at stack_chk_fail.c:29
#5  0x00007f4e9291d8d8 in compute_gserialized_stats_mode (stats=stats@entry=0x562e82128780, fetchfunc=fetchfunc@entry=0x562e808c6970 <std_fetch_func>, sample_rows=sample_rows@entry=1, total_rows=total_rows@entry=1, mode=mode@entry=2) at gserialized_estimate.c:1755
#6  0x00007f4e9291d8fd in compute_gserialized_stats (stats=0x562e82128780, fetchfunc=0x562e808c6970 <std_fetch_func>, sample_rows=1, total_rows=1) at gserialized_estimate.c:1841
#7  0x0000562e808c9071 in do_analyze_rel (onerel=onerel@entry=0x562e8235d400, options=options@entry=2, params=params@entry=0x7ffd6d3e6fa0, va_cols=va_cols@entry=0x0, acquirefunc=<optimized out>, relpages=1, inh=false, in_outer_xact=false, elevel=13) at ./build/../src/backend/commands/analyze.c:579
#8  0x0000562e808ca6fc in analyze_rel (relid=18637, relation=0x562e81524738, options=options@entry=2, params=params@entry=0x7ffd6d3e6fa0, va_cols=0x0, in_outer_xact=<optimized out>, bstrategy=0x562e817ba610) at ./build/../src/backend/commands/analyze.c:310
#9  0x0000562e80938f25 in vacuum (options=2, relations=0x562e817ba770, params=params@entry=0x7ffd6d3e6fa0, bstrategy=<optimized out>, bstrategy@entry=0x0, isTopLevel=isTopLevel@entry=true) at ./build/../src/backend/commands/vacuum.c:357
#10 0x0000562e80939290 in ExecVacuum (vacstmt=vacstmt@entry=0x562e81524808, isTopLevel=isTopLevel@entry=true) at ./build/../src/backend/commands/vacuum.c:141
#11 0x0000562e80a9ea60 in standard_ProcessUtility (pstmt=0x562e815248a8, queryString=0x562e81523d40 "ANALYZE test_analyze_crash;", context=<optimized out>, params=0x0, queryEnv=0x0, dest=0x562e81524b68, completionTag=0x7ffd6d3e72e0 "") at ./build/../src/backend/tcop/utility.c:667
#12 0x0000562e80a9b6e9 in PortalRunUtility (portal=0x562e815a9880, pstmt=0x562e815248a8, isTopLevel=<optimized out>, setHoldSnapshot=<optimized out>, dest=<optimized out>, completionTag=0x7ffd6d3e72e0 "") at ./build/../src/backend/tcop/pquery.c:1178
#13 0x0000562e80a9c1e7 in PortalRunMulti (portal=portal@entry=0x562e815a9880, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x562e81524b68, altdest=altdest@entry=0x562e81524b68, completionTag=completionTag@entry=0x7ffd6d3e72e0 "") at ./build/../src/backend/tcop/pquery.c:1331
#14 0x0000562e80a9ce8e in PortalRun (portal=portal@entry=0x562e815a9880, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x562e81524b68, altdest=altdest@entry=0x562e81524b68, completionTag=0x7ffd6d3e72e0 "") at ./build/../src/backend/tcop/pquery.c:799
#15 0x0000562e80a98b27 in exec_simple_query (query_string=0x562e81523d40 "ANALYZE test_analyze_crash;") at ./build/../src/backend/tcop/postgres.c:1122
#16 0x0000562e80a9ab36 in PostgresMain (argc=<optimized out>, argv=argv@entry=0x562e8156de78, dbname=<optimized out>, username=<optimized out>) at ./build/../src/backend/tcop/postgres.c:4153
#17 0x0000562e80a25bed in BackendRun (port=0x562e81567250) at ./build/../src/backend/postmaster/postmaster.c:4361
#18 BackendStartup (port=0x562e81567250) at ./build/../src/backend/postmaster/postmaster.c:4033
#19 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1706
#20 0x0000562e80a26c2f in PostmasterMain (argc=5, argv=0x562e8151e870) at ./build/../src/backend/postmaster/postmaster.c:1379
#21 0x0000562e807b3cdb in main (argc=5, argv=0x562e8151e870) at ./build/../src/backend/main/main.c:228

travis_time:end:1c2705c6:start=1536850966762588593,finish=1536851027772766991,duration=61010178398
[0K
[31;1mThe command "logbt -- make check "RUNTESTFLAGS=--verbose"" exited with 2.[0m

comment:48 by pramsey, 6 years ago

As promised, this is now highly replicable, I can exercise it on OSX against 2.5 branch head. It also works against geometry.

CREATE TABLE analyze_crash (
  location geometry(Point,4326) 
  );
INSERT INTO analyze_crash ( location ) 
  VALUES ( 
    ST_GeomFromText('POINT(-79.935308 40.458163)', 4326) 
  );
ANALYZE analyze_crash;
Last edited 6 years ago by pramsey (previous) (diff)

comment:49 by pramsey, 6 years ago

Milestone: PostGIS PackagingPostGIS 2.5.0
Priority: criticalblocker

comment:50 by robe, 6 years ago

Is this related to #4144, if manifestations of the same issue, can we close this one back and just focus on the #4114

comment:51 by pramsey, 6 years ago

In 16781:

Avoid array overflow in ANALYZE (References #2985)

comment:52 by pramsey, 6 years ago

In 16782:

Avoid array overflow in ANALYZE (References #2985)

comment:53 by pramsey, 6 years ago

In 16783:

Avoid array overflow in ANALYZE (References #2985)

comment:54 by pramsey, 6 years ago

In 16784:

Avoid array overflow in ANALYZE (References #2985)

comment:55 by pramsey, 6 years ago

Resolution: fixed
Status: reopenedclosed

In 16785:

Avoid array overflow in ANALYZE (Closes #2985)

Note: See TracTickets for help on using tickets.