Opened 3 years ago
Closed 3 years ago
#5014 closed defect (fixed)
flatgeobuf test failure
Reported by: | sebastic | Owned by: | Björn Harrtell |
---|---|---|---|
Priority: | critical | Milestone: | PostGIS 3.2.0 |
Component: | postgis | Version: | master |
Keywords: | Cc: |
Description
The Debian package build for 3.2.0-beta1 fails due to a test failure:
./regress/core/flatgeobuf ..Died at /build/postgis-3.2.0~beta1+dfsg/regress/run_test.pl line 744.
 failed (psql exited with an error: /tmp/pgis_reg/test_18_out)
-----------------------------------------------------------------------------
--- Null geometry ---
T1|0|
--- Geometry roundtrips ---
P1|0|POINT(1.1 2.1)
P2|0|POINT Z (1.1 2.11 3.2)
P3|0|POINT ZM (1.1 2.12 3.2 4.3)
P4|0|SRID=4326;POINT(-71.1043443253471 42.3150676015829)
psql:flatgeobuf.sql:38: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql:flatgeobuf.sql:38: fatal: connection to server was lost
-----------------------------------------------------------------------------
make[2]: *** [regress/runtest.mk:11: check-regress] Error 2
make[2]: Leaving directory '/build/postgis-3.2.0~beta1+dfsg'
*** /tmp/pg_virtualenv.n9Corf/log/postgresql-14-regress.log (last 100 lines) ***
2021-10-24 05:03:35.932 UTC [1532552] LOG: starting PostgreSQL 14.0 (Debian 14.0-1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.3.0-11) 10.3.0, 64-bit
2021-10-24 05:03:35.932 UTC [1532552] LOG: listening on IPv4 address "127.0.0.1", port 5432
2021-10-24 05:03:35.933 UTC [1532552] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-10-24 05:03:35.934 UTC [1532553] LOG: database system was shut down at 2021-10-24 05:03:35 UTC
2021-10-24 05:03:35.938 UTC [1532552] LOG: database system is ready to accept connections
2021-10-24 05:04:15.781 UTC [1533687] pbuilder@postgis_reg ERROR: Number of iterations must be between 1 and 5 : LWGEOM_ChaikinSmoothing
2021-10-24 05:04:15.781 UTC [1533687] pbuilder@postgis_reg STATEMENT: SELECT '2', ST_astext(ST_ChaikinSmoothing('LINESTRING(0 0, 8 8, 0 16)',10));
2021-10-24 05:04:15.781 UTC [1533687] pbuilder@postgis_reg ERROR: Number of iterations must be between 1 and 5 : LWGEOM_ChaikinSmoothing
2021-10-24 05:04:15.781 UTC [1533687] pbuilder@postgis_reg STATEMENT: SELECT '3', ST_astext(ST_ChaikinSmoothing('LINESTRING(0 0, 8 8, 0 16)',0));
2021-10-24 05:04:16.898 UTC [1533736] pbuilder@postgis_reg ERROR: LWGEOM_collect: Operation on mixed SRID geometries (Point, 32749) != (Point, 32740)
2021-10-24 05:04:16.898 UTC [1533736] pbuilder@postgis_reg STATEMENT: SELECT ST_Collect('SRID=32749;POINT(0 0)', 'SRID=32740;POINT(1 1)');
2021-10-24 05:04:16.898 UTC [1533736] pbuilder@postgis_reg ERROR: LWGEOM_makeline: Operation on mixed SRID geometries (Point, 0) != (Point, 3)
2021-10-24 05:04:16.898 UTC [1533736] pbuilder@postgis_reg STATEMENT: select ST_makeline('POINT(0 0)', 'SRID=3;POINT(1 1)');
2021-10-24 05:04:16.899 UTC [1533736] pbuilder@postgis_reg ERROR: BOX2D_construct: Operation on mixed SRID geometries (Point, 0) != (Point, 3)
2021-10-24 05:04:16.899 UTC [1533736] pbuilder@postgis_reg STATEMENT: select ST_makebox2d('POINT(0 0)', 'SRID=3;POINT(1 1)');
2021-10-24 05:04:16.899 UTC [1533736] pbuilder@postgis_reg ERROR: BOX3D_construct: Operation on mixed SRID geometries (Point, 0) != (Point, 3)
2021-10-24 05:04:16.899 UTC [1533736] pbuilder@postgis_reg STATEMENT: select ST_3DMakeBox('POINT(0 0)', 'SRID=3;POINT(1 1)');
2021-10-24 05:04:18.376 UTC [1533808] pbuilder@postgis_reg WARNING: stats for "t.g" do not exist
2021-10-24 05:04:18.381 UTC [1533808] pbuilder@postgis_reg WARNING: stats for "t.g" do not exist
2021-10-24 05:04:18.381 UTC [1533808] pbuilder@postgis_reg WARNING: stats for "t.g" do not exist
2021-10-24 05:04:18.390 UTC [1533808] pbuilder@postgis_reg WARNING: stats for "c1.g" do not exist
2021-10-24 05:04:18.390 UTC [1533808] pbuilder@postgis_reg WARNING: stats for "c2.g" do not exist
2021-10-24 05:04:18.390 UTC [1533808] pbuilder@postgis_reg WARNING: stats for "p.g" do not exist
2021-10-24 05:04:18.391 UTC [1533808] pbuilder@postgis_reg WARNING: stats for "c2.g" do not exist
2021-10-24 05:04:18.396 UTC [1533808] pbuilder@postgis_reg WARNING: stats for "p.g" do not exist
2021-10-24 05:04:18.396 UTC [1533808] pbuilder@postgis_reg WARNING: stats for "p.g" do not exist
2021-10-24 05:04:18.396 UTC [1533808] pbuilder@postgis_reg WARNING: stats for "c1.g" do not exist
2021-10-24 05:04:18.397 UTC [1533808] pbuilder@postgis_reg WARNING: stats for "c1.g" do not exist
2021-10-24 05:04:18.398 UTC [1533808] pbuilder@postgis_reg WARNING: stats for "p.g" do not exist
Dropping cluster 14/regress ...
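When triaging output like this, the most useful detail is the SQL file and line at which psql lost the connection. The following is a minimal sketch, assuming the regression output has been saved to a file; the function name `crash_location` is hypothetical and not part of the PostGIS test harness.

```shell
#!/bin/sh
# Hypothetical helper: print "<file>:<line>" for the first place where psql
# reported that the server closed the connection.
# $1 is a file holding the saved regression test output.
crash_location() {
    # psql prefixes script errors with "psql:<file>:<line>:"
    sed -n 's/.*psql:\([^: ]*\):\([0-9][0-9]*\): server closed the connection.*/\1:\2/p' "$1" | head -n 1
}
```

It could be pointed at the file named in the failure message, for example `crash_location /tmp/pgis_reg/test_18_out`, to get the line of flatgeobuf.sql to inspect.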
Attachments (1)
Change History (23)
comment:1 by , 3 years ago
Priority: | medium → blocker |
---|
comment:2 by , 3 years ago
follow-up: 4 comment:3 by , 3 years ago
Is it possible for you to send a backtrace for this, or is there a way for me to test this via some container thing?
You can set up a cowbuilder chroot to reproduce the issue, see:
https://debian-gis-team.pages.debian.net/policy/packaging.html#git-pbuilder
After configuring sudo as documented:
sudo cowbuilder --create \
    --distribution=sid \
    --basepath=/var/cache/pbuilder/base-sid.cow
sudo cowbuilder --login --basepath /var/cache/pbuilder/base-sid.cow
# Inside the chroot
## Add apt sources
echo "deb http://deb.debian.org/debian/ experimental main contrib non-free" >> /etc/apt/sources.list
echo "deb-src http://deb.debian.org/debian/ experimental main contrib non-free" >> /etc/apt/sources.list
sed -i 's/^#deb-src/deb-src/' /etc/apt/sources.list
apt update
## Install tools you need (vim, less, gdb, etc)
## Get postgis source package
cd /tmp/buildd
apt source -t experimental postgis
cd postgis-*/
## Install build dependencies
apt build-dep postgis
## Delete patch which skips the failing test
apt install quilt
quilt delete flatgeobuff.patch
## Build the package
DEB_BUILD_OPTIONS="parallel=3" dpkg-buildpackage -uc -uc 2>&1 | tee ../postgis.build
I see you are using pg_virtualenv but not sure what that is.
It creates a throw-away PostgreSQL environment for running regression tests, see its manpage:
https://manpages.debian.org/unstable/postgresql-common/pg_virtualenv.1.en.html
You can use its -s option to launch a shell when the command fails, e.g.:
pg_virtualenv -v 14 -s make check RUNTESTFLAGS="-v"
comment:4 by , 3 years ago
Replying to Bas Couwenberg:
I think I'm missing a step somewhere. I got all the way to the last step, and after running
DEB_BUILD_OPTIONS="parallel=3" dpkg-buildpackage -uc -uc 2>&1 | tee ../postgis.build
I get:
make[1]: Leaving directory '/build/postgis-3.2.0~beta1+dfsg'
dh_clean
/build/postgis-3.2.0~beta1+dfsg/debian/clean: 2: loader/pgsql2shp: not found
/build/postgis-3.2.0~beta1+dfsg/debian/clean: 3: loader/shp2pgsql: not found
dh_clean: warning: debian/clean is marked executable but does not appear to an executable config.
dh_clean: warning:
dh_clean: warning: If debian/clean is intended to be an executable config file, please ensure it can
dh_clean: warning: be run as a stand-alone script/program (e.g. "./debian/clean")
dh_clean: warning: Otherwise, please remove the executable bit from the file (e.g. chmod -x "debian/clean")
dh_clean: warning:
dh_clean: warning: Please see "Executable debhelper config files" in debhelper(7) for more information.
dh_clean: warning:
dh_clean: error: debian/clean (executable config) returned exit code 127
make: *** [debian/rules:58: clean] Error 25
Any thoughts on what I missed?
comment:5 by , 3 years ago
Okay, the file /build/postgis-3.2.0~beta1+dfsg/debian/clean just contained two path lines:
loader/pgsql2shp
loader/shp2pgsql
I commented out those two lines and the build got further. It's compiling now. Not sure if that was the right thing to do.
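The dh_clean warning quoted in comment 4 names an alternative to commenting out the lines: clear the executable bit so that debhelper reads debian/clean as a plain list of paths instead of running it. The following is a minimal sketch. The helper name `fix_clean_mode` is hypothetical, and whether a stray executable bit was the actual cause in this chroot is an assumption based only on that warning.

```shell
#!/bin/sh
# Hypothetical helper: clear the executable bit on debian/clean inside the
# given source tree, as the dh_clean warning suggests.
# $1 is the top-level directory of the unpacked source package.
fix_clean_mode() {
    clean_file="$1/debian/clean"
    # Nothing to do if the package has no debian/clean file.
    [ -f "$clean_file" ] || return 1
    chmod -x "$clean_file"
}
```

For the tree in this ticket that would be `fix_clean_mode /build/postgis-3.2.0~beta1+dfsg`, run before dpkg-buildpackage; the file contents stay untouched.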
comment:6 by , 3 years ago
Thanks Bas. I was able to replicate with your setup and get a backtrace:
#0 0x00007f9c0c34acd1 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f9c00bc7ec7 in memcpy (__len=<optimized out>, __src=<optimized out>, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:29
#2 flatbuffers::vector_downward::push (num=<optimized out>, bytes=<optimized out>, this=0x7ffe4fe29440) at include/flatbuffers/flatbuffers.h:1031
#3 flatbuffers::vector_downward::push (num=<optimized out>, bytes=<optimized out>, this=0x7ffe4fe29440) at include/flatbuffers/flatbuffers.h:1030
#4 flatbuffers::FlatBufferBuilder::PushBytes (size=<optimized out>, bytes=<optimized out>, this=0x7ffe4fe29440) at include/flatbuffers/flatbuffers.h:1347
#5 flatbuffers::FlatBufferBuilder::CreateVector<double> (len=<optimized out>, v=<optimized out>, this=<optimized out>) at include/flatbuffers/flatbuffers.h:1712
#6 flatbuffers::FlatBufferBuilder::CreateVector<double> (v=..., v=..., this=0x7ffe4fe29440) at include/flatbuffers/flatbuffers.h:1740
#7 FlatGeobuf::CreateHeaderDirect (metadata=0x0, description=0x0, title=0x0, crs=..., index_node_size=16, features_count=1, columns=<optimized out>, has_tm=false, has_t=false, has_m=false, has_z=false, geometry_type=<optimized out>, envelope=<optimized out>, name=<optimized out>, _fbb=...) at ./deps/flatgeobuf/header_generated.h:660
#8 flatgeobuf_encode_header (ctx=0x56325fab2d38) at ./deps/flatgeobuf/flatgeobuf_c.cpp:87
#9 0x00007f9c00bc9268 in flatgeobuf_create_index (ctx=<optimized out>) at ./deps/flatgeobuf/flatgeobuf_c.cpp:195
#10 0x00007f9c00ba8591 in flatgeobuf_agg_finalfn (ctx=0x56325fab2ce8) at ./postgis/flatgeobuf.c:558
#11 0x000056325e629e70 in ?? ()
#12 0x000056325e62ae9e in ?? ()
#13 0x000056325e62bf50 in ?? ()
#14 0x000056325e649988 in ExecSetParamPlan ()
#15 0x000056325e616192 in ?? ()
#16 0x000056325e6213bd in ?? ()
#17 0x000056325e621d44 in ExecMakeTableFunctionResult ()
#18 0x000056325e6328c1 in ?? ()
#19 0x000056325e622430 in ExecScan ()
#20 0x000056325e61973d in standard_ExecutorRun ()
#21 0x000056325e79211b in ?? ()
#22 0x000056325e7935ab in PortalRun ()
#23 0x000056325e78f6ed in ?? ()
#24 0x000056325e79162c in PostgresMain ()
#25 0x000056325e70f618 in ?? ()
#26 0x000056325e710484 in PostmasterMain ()
#27 0x000056325e485e99 in main ()
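Most frames in a trace like this belong to PostgreSQL itself and carry no symbols. The following is a minimal sketch for narrowing a saved gdb backtrace to the frames that mention the FlatGeobuf or FlatBuffers code. The function name `flatgeobuf_frames` is hypothetical, and the sketch assumes the frame text and its source location sit on the same line.

```shell
#!/bin/sh
# Hypothetical helper: list only the backtrace frames that mention the
# flatgeobuf sources or the flatbuffers headers.
# $1 is a file holding a saved gdb backtrace.
flatgeobuf_frames() {
    # Frame lines start with "#<number> "; the match is case-insensitive so
    # that both "FlatGeobuf::" symbols and "deps/flatgeobuf/" paths are kept.
    grep -E '^#[0-9]+ ' "$1" | grep -Ei 'flatgeobuf|flatbuffers'
}
```

Applied to the trace in this comment, that would leave the frames from memcpy's caller down to flatgeobuf_agg_finalfn, which is the call chain examined later in the ticket.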
comment:7 by , 3 years ago
This almost has to have something to do with changes from debian stable to sid. Might want to consider adding debian sid to at least one PostGIS CI run.
I'm trying to reproduce but I'm completely unfamiliar with this cowbuilder business. For now I'm trying with a fork of postgis-build-env based on debian sid.
comment:8 by , 3 years ago
I tested with bookworm (AKA bookkie), which I guess has been promoted to testing. I couldn't replicate the issue there, so I suspect something specific to the cowbuilder setup or to a true sid. Going to try again with a true sid.
This was running with the PostgreSQL 14 distribution packaged with bookworm.
I'm going to go back to the cowbuilder setup I have and see if I can still replicate it there. If I can't replicate it when compiling from scratch, we might just need to package beta2 and have Bas see if he can still replicate.
comment:9 by , 3 years ago
Okay just spun up a sid and no crash. Will go back to cowbuilder setup to test.
comment:10 by , 3 years ago
3.2.0-beta1 still fails on Debian unstable & testing, so the upgrade of postgresql-14 to 14.1 did not resolve the issue.
comment:11 by , 3 years ago
Okay, I just ran in the cowbuilder sid setup I have, and all tests pass against git postgis/master, the postgis/3.2.0beta1 tag, and the 3.2.0beta1 tarball, even building with make -j4,
but it crashes when using the Debian package build. Bas, is there something special this does beyond standard ./configure make check?
# this crashes
## Build the package
DEB_BUILD_OPTIONS="parallel=3" dpkg-buildpackage -uc -uc 2>&1 | tee ../postgis.build
to do all above I did:
sudo cowbuilder --create \
    --distribution=sid \
    --basepath=/var/cache/pbuilder/base-sid.cow
sudo cowbuilder --login --basepath /var/cache/pbuilder/base-sid.cow
# Inside the chroot
## Add apt sources
echo "deb http://deb.debian.org/debian/ experimental main contrib non-free" >> /etc/apt/sources.list
echo "deb-src http://deb.debian.org/debian/ experimental main contrib non-free" >> /etc/apt/sources.list
sed -i 's/^#deb-src/deb-src/' /etc/apt/sources.list
apt update
apt upgrade #this I added I didn't do that before when testing
## Install tools you need (vim, less, gdb, etc)
## Get postgis source package
cd /tmp/buildd
apt source -t experimental postgis
cd postgis-*/
## Install build dependencies
apt build-dep postgis
## Delete patch which skips the failing test
apt install quilt
quilt delete flatgeobuff.patch
## Build the package
DEB_BUILD_OPTIONS="parallel=3" dpkg-buildpackage -uc -uc 2>&1 | tee ../postgis.build
##crashes
## Now test direct from git
apt install ca-certificates git #needed for git clone
cd /tmp/buildd
git clone https://git.osgeo.org/gitea/postgis/postgis.git
cd postgis
sh autogen.sh
./configure
make
# all tests passed
pg_virtualenv -v 14 -s make check RUNTESTFLAGS="-v"
# now testing 3.2.0beta1
git clean -fd
git checkout 3.2.0beta1
sh autogen.sh
./configure
make check
Try building direct from tar ball - regresses just fine
cd /tmp/build
apt install wget
wget http://download.osgeo.org/postgis/source/postgis-3.2.0beta1.tar.gz
tar -xvf postgis-3.2.0beta1.tar.gz
cd postgis-3.2.0beta1
./configure
make
pg_virtualenv -v 14 -s make check RUNTESTFLAGS="-v"
Try building with parallel build, regresses fine
cd /tmp/build
apt install wget
wget http://download.osgeo.org/postgis/source/postgis-3.2.0beta1.tar.gz
tar -xvf postgis-3.2.0beta1.tar.gz
cd postgis-3.2.0beta1
./configure
make -j4
pg_virtualenv -v 14 -s make -j4 check RUNTESTFLAGS="-v"
comment:12 by , 3 years ago
Bas, is there something special this does beyond standard ./configure make check?
The biggest difference is that dpkg-buildpackage uses fakeroot, not real root like your git builds.
The Debian package build also applies a couple of patches:
https://salsa.debian.org/debian-gis-team/postgis/-/tree/experimental/debian/patches
comment:13 by , 3 years ago
Yeah, the patches look harmless, and I tested using the same folder and it was fine if I build without dpkg-buildpackage. I'm not sure how to use fakeroot (I tried but it gave me permission issues). It must be causing a corrupt package to be built. The reason I say that is that if I test what I think is the package built by dpkg-buildpackage, using pg_virtualenv, it crashes too. So testing inside fakeroot is not the issue, unless I'm testing that wrong.
If I do this it passes
sudo cowbuilder --create \
    --distribution=sid \
    --basepath=/var/cache/pbuilder/base-sid.cow
sudo cowbuilder --login --basepath /var/cache/pbuilder/base-sid.cow
# Inside the chroot
## Add apt sources
echo "deb http://deb.debian.org/debian/ experimental main contrib non-free" >> /etc/apt/sources.list
echo "deb-src http://deb.debian.org/debian/ experimental main contrib non-free" >> /etc/apt/sources.list
sed -i 's/^#deb-src/deb-src/' /etc/apt/sources.list
apt update
apt upgrade #this I added I didn't do that before when testing
## Install tools you need (vim, less, gdb, etc)
## Get postgis source package
cd /tmp/buildd
apt source -t experimental postgis
cd postgis-*/
## Install build dependencies
apt build-dep postgis
## Delete patch which skips the failing test
apt install quilt
quilt delete flatgeobuff.patch
pg_virtualenv -v 14 -s make check RUNTESTFLAGS="-v"
If I do this it fails both during the building and running env after
sudo cowbuilder --create \
    --distribution=sid \
    --basepath=/var/cache/pbuilder/base-sid.cow
sudo cowbuilder --login --basepath /var/cache/pbuilder/base-sid.cow
# Inside the chroot
## Add apt sources
echo "deb http://deb.debian.org/debian/ experimental main contrib non-free" >> /etc/apt/sources.list
echo "deb-src http://deb.debian.org/debian/ experimental main contrib non-free" >> /etc/apt/sources.list
sed -i 's/^#deb-src/deb-src/' /etc/apt/sources.list
apt update
apt upgrade #this I added I didn't do that before when testing
## Install tools you need (vim, less, gdb, etc)
## Get postgis source package
cd /tmp/buildd
apt source -t experimental postgis
cd postgis-*/
## Install build dependencies
apt build-dep postgis
## Delete patch which skips the failing test
apt install quilt
quilt delete flatgeobuff.patch
## Build the package
#crashes on flatgeobuf
dpkg-buildpackage -uc -uc 2>&1 | tee ../postgis.build
## Also crashes on flatgeobuf
exit #assume this exits the fakeroot
pg_virtualenv -v 14 -s make check RUNTESTFLAGS="-v"
comment:14 by , 3 years ago
Priority: | blocker → critical |
---|
downgrading to critical since it requires a unique set of building steps to replicate.
comment:15 by , 3 years ago
Still fails with beta2.
The backtrace posted earlier shows:
memcpy ([...]) at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:29
This suggests hardening flags may be the reason you cannot reproduce the issue when not using dpkg-buildpackage. The flags it uses are:
# DEB_BUILD_MAINT_OPTIONS=hardening=+all dpkg-buildflags --export=sh
export CFLAGS="-g -O2 -ffile-prefix-map=/build/postgis=. -fstack-protector-strong -Wformat -Werror=format-security"
export CPPFLAGS="-Wdate-time -D_FORTIFY_SOURCE=2"
export CXXFLAGS="-g -O2 -ffile-prefix-map=/build/postgis=. -fstack-protector-strong -Wformat -Werror=format-security"
export DFLAGS="-frelease"
export FCFLAGS="-g -O2 -ffile-prefix-map=/build/postgis=. -fstack-protector-strong"
export FFLAGS="-g -O2 -ffile-prefix-map=/build/postgis=. -fstack-protector-strong"
export GCJFLAGS="-g -O2 -ffile-prefix-map=/build/postgis=. -fstack-protector-strong"
export LDFLAGS="-Wl,-z,relro -Wl,-z,now"
export OBJCFLAGS="-g -O2 -ffile-prefix-map=/build/postgis=. -fstack-protector-strong -Wformat -Werror=format-security"
export OBJCXXFLAGS="-g -O2 -ffile-prefix-map=/build/postgis=. -fstack-protector-strong -Wformat -Werror=format-security"
-fstack-protector-strong might be it.
Disabling the hardening flags with the following in debian/rules does not resolve the issue:
export DEB_BUILD_MAINT_OPTIONS=hardening=-all
by , 3 years ago
Attachment: | flatgeobuf.patch added |
---|
comment:16 by , 3 years ago
Disabling the P5 & P6 tests (e.g. with flatgeobuf.patch) makes the testsuite not fail; enabling either of those two tests causes the testsuite to fail.
Is the flatgeobuf code valgrind clean? The memcpy call from the backtrace might point to memory issues.
comment:17 by , 3 years ago
Thanks Bas, that's a good lead: the P5 and P6 tests are special because they use the spatial index feature of the function. It is a bit strange to me that it cannot yet be reproduced elsewhere, though; I'm pretty sure some PostGIS CI builds run under valgrind. I agree -fstack-protector-strong could seem to be related, but I'm not familiar with it, and as disabling it doesn't help I guess it must be something else.
comment:18 by , 3 years ago
Inspected the code and the trace above, and it is strange. It seems to crash on a memcpy that is triggered by flatgeobuf_encode_header and CreateHeaderDirect. This code is run on several other occasions, including in the non-indexed variants. But the trace also says the invocation follows flatgeobuf_create_index, which is the second time it's called in that execution of ST_AsFlatGeobuf, so I suppose it must mean it is not safe to call multiple times, but I cannot understand why yet.
comment:19 by , 3 years ago
I might have found the reason. Attempt to fix in https://git.osgeo.org/gitea/postgis/postgis/pulls/63 / https://git.osgeo.org/gitea/postgis/postgis/commit/eda70258f373c1ad88e1851f194c891ad8d9aff4.
comment:20 by , 3 years ago
Confirmed fixed with the changes in eda70258f373c1ad88e1851f194c891ad8d9aff4:
https://salsa.debian.org/debian-gis-team/postgis/-/jobs/2223711
comment:21 by , 3 years ago
Owner: | changed from | to
---|
comment:22 by , 3 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
From the line number in the ticket, it looks like this combination should crash it, but I can't get it to crash on my PostgreSQL 14 PostGIS 3.2.0beta1 mingw64 install, and it hasn't been crashing on regression testing of the other bots running Debian (debbie and our Raspberry Pis, berrie and berrie32).
Is it possible for you to send a backtrace for this, or is there a way for me to test this via some container thing? I see you are using pg_virtualenv but not sure what that is.