Opened 14 months ago
Closed 14 months ago
#5121 closed defect (fixed)
LTO enabled causes windows, freebsd and some github actions to fail
|Reported by:||robe||Owned by:||robe|
export-all-symbols -Wl,--out-implib=libpostgis-3.3.a
lto1.exe: internal compiler error: in gen_subprogram_die, at dwarf2out.c:22668
libbacktrace could not find executable to open
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://sourceforge.net/projects/mingw-w64> for instructions.
lto-wrapper.exe: fatal error: C:\ming64gcc81\mingw64\bin\gcc.exe returned 1 exit status
compilation terminated.
C:/ming64gcc81/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.1.0/../../../../x86_64-w64-mingw32/bin/ld.exe: error: lto-wrapper failed
collect2.exe: error: ld returned 1 exit status
make: *** [E:/jenkins/postgresql/rel/pg14w64gcc81/lib/pgxs/src/makefiles/../../src/Makefile.shlib:374: postgis-3.3.dll] Error 1
make: Leaving directory '/projects/postgis/branches/3.3/postgis'
make: *** [GNUmakefile:24: all] Error 1
Searching for this on the internet led me back to my old ticket from 2 years ago, #4583, which Raul kindly pointed out was caused by #4754.
@komzpa I recall an LTO commit of yours recently. I admittedly have not been paying too much attention, been ignoring the problem hoping it would go away.
bessie32 is also failing, but could be a different issue
15:52:22 libtool: link: gcc8 -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-trunc -fno-math-errno -fno-signed-zeros -Wall -flto -fPIC -DPIC -I./../rt_core -I./.. -I. -I../.. -I../../liblwgeom -I../../liblwgeom -I/usr/local/include -I/usr/local/include -I/usr/local/include -I/usr/local/include raster2pgsql.o -flto -o raster2pgsql ../rt_core/librtcore.a ../../liblwgeom/.libs/liblwgeom.a -L/usr/local/lib -lm -lproj -ljson-c -lSFCGAL -lgdal -lgeos_c -lintl -liconv
15:52:23 /usr/local/bin/ld: /tmp//cczKohy3.ltrans0.ltrans.o: undefined reference to symbol 'rtrealloc'
15:52:23 /usr/local/bin/ld: /usr/local/lib/librttopo.so.1: error adding symbols: DSO missing from command line
15:52:23 collect2: error: ld returned 1 exit status
15:52:23 gmake: *** [Makefile:86: raster2pgsql] Error 1
15:52:23 gmake: Leaving directory '/usr/home/jenkins/workspace/PostGIS_Worker_Run/label/bessie32/b0741830443c896ebbf15b51486a2b23787b7485/raster/loader'
15:52:23 gmake: *** [Makefile:35: rtloader] Error 2
15:52:23 gmake: Leaving directory '/usr/home/jenkins/workspace/PostGIS_Worker_Run/label/bessie32/b0741830443c896ebbf15b51486a2b23787b7485/raster'
15:52:23 gmake: *** [GNUmakefile:24: all] Error 1
15:52:23 gmake: Leaving directory '/usr/home/jenkins/workspace/PostGIS_Worker_Run/label/bessie32/b0741830443c896ebbf15b51486a2b23787b7485'
15:52:23 *** Error code 2
15:52:23
But bessie (64-bit FreeBSD) seems fine.
I still need to confirm I have the same issue on my dev.
Change History (20)
by , 14 months ago
comment:1 by , 14 months ago
comment:2 by , 14 months ago
Thanks for the quick response. I'll test it out on my mingw setup and commit if it works.
comment:3 by , 14 months ago
Okay, tested on my mingw setup (my setup is old BTW, gcc 8.1, but that is another story).
At any rate, the patch seems to screw up the ability to find CC.
cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -g -O2 -I../liblwgeom -I../liblwgeom -std=gnu99 -g -O2 -fno-math-errno -fno-signed-zeros -Wall -flto -I../libpgcommon -I../deps/flatgeobuf -I../deps/wagyu -I../deps/uthash/include -I/projects/geos/rel-3.11w64gcc81/include -IC:/ming64gcc81/projects/proj/rel-7.2.1w64gcc81/include -IC:/ming64gcc81/projects/protobuf/rel-3.2.0w64gcc81/include -I/projects/libxml/rel-libxml2-2.9.9w64gcc81/include/libxml2 -I/projects/CGAL/rel-sfcgal-1.4.0w64gcc81/include -IC:/ming64gcc81/projects/json-c/rel-0.12w64gcc81/include/json-c -IC:/ming64gcc81/projects/pcre/rel-8.33w64gcc81/include -DNDEBUG -I/projects/postgresql/rel/pg15w64gcc81/include -I/projects/rel-libiconv-1.16w64gcc81/include -DDLL_EXPORT -DPIC -I. -I./ -IC:/MING64~1/projects/POSTGR~1/rel/PG15W6~1/include/server -IC:/MING64~1/projects/POSTGR~1/rel/PG15W6~1/include/internal -I/projects/zlib/rel-zlib-1.2.11w64gcc81/include -I/projects/libxml/rel-libxml2-2.9.9w64gcc81/include -I./src/include/port/win32 -I/projects/libxml/rel-libxml2-2.9.9w64gcc81/include/libxml2 -IC:/ming64gcc81/projects/lz4/rel-lz4-1.9.3w64gcc81/include -IC:/MING64~1/projects/POSTGR~1/rel/PG15W6~1/include/server/port/win32 -DWIN32_STACK_RLIMIT=4194304 -c -o postgis_module.o postgis_module.c
/bin/sh: line 1: cc: command not found
Looking at the postgis/Makefile generated, it seems to have
CUSTOM_CC := $(CC)
Which I am assuming is the culprit. By comparison, the generated liblwgeom/Makefile has
CC = x86_64-w64-mingw32-gcc
Changing Makefile.in to the line below gets me back to the original error:
CUSTOM_CC := @CC@
FWIW I think @strk was saying we should get rid of PGXS as it's causing more issues than helping.
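A quick way to see why the generated postgis/Makefile ends up at "cc: command not found" is to check which of the compiler names involved actually resolve on PATH. This is only a sketch: the helper name have_cmd is made up for illustration, and the candidate list just repeats the names seen in the logs above.

```shell
# have_cmd is a hypothetical helper, not part of the PostGIS build system.
# It succeeds when the given name resolves to something runnable on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Names taken from the logs above: the default "cc" that make falls back to,
# plain gcc, and the cross compiler recorded in liblwgeom/Makefile.
for candidate in cc gcc x86_64-w64-mingw32-gcc; do
    if have_cmd "$candidate"; then
        echo "$candidate: $(command -v "$candidate")"
    else
        echo "$candidate: not found"
    fi
done
```

On a MinGW shell without a cc alias the first entry would be reported as not found, which matches the failure when $(CC) is left at make's built-in default.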
comment:4 by , 14 months ago
My conclusion was that PGXS has to be replaced, since it introduces another set of compilation options that is not fully controlled by the user. It seems that LTO cannot be reliably enabled by default before doing that, especially if we need to allow selecting CC in ./configure, as in these failing tests.
Thank you for mentioning @strk's opinion. Would be great if he could comment on this issue.
CUSTOM_CC := $(CC) was my attempt to make PGXS use the same compiler. It seemed to work in my case, but later I noticed cc being used as the compiler in the log.
comment:5 by , 14 months ago
I never liked delegating control to PGXS. The first victim of this was --prefix support, which is still an issue after over 11 years: #635
comment:6 by , 14 months ago
comment:7 by , 14 months ago
The commit above just disables LTO everywhere.
if test "MINGWBUILD" = "0"; then should be
if test "$MINGWBUILD" = "0"; then.
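The difference between the two forms can be reproduced in a plain shell. This is a minimal sketch, assuming MINGWBUILD is "0" on non-MinGW builds as the check implies; the variables buggy and fixed are only there for illustration.

```shell
MINGWBUILD=0   # assumed value for a non-MinGW build

# Without the "$", test compares the literal string "MINGWBUILD" with "0".
# That is never equal, so the LTO branch is skipped on every platform.
if test "MINGWBUILD" = "0"; then
    buggy="lto enabled"
else
    buggy="lto disabled"
fi

# With the "$", the variable is expanded and the branch is taken as intended.
if test "$MINGWBUILD" = "0"; then
    fixed="lto enabled"
else
    fixed="lto disabled"
fi

echo "without \$: $buggy"
echo "with \$:    $fixed"
```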
comment:8 by , 14 months ago
comment:9 by , 14 months ago
I just noticed that with the commit where I accidentally disabled LTO everywhere, we got all green lights on github actions.
So I guess LTO is causing the errors on github too. Do you know whether your pull request solves the github issues?
comment:10 by , 14 months ago
|Summary:||winnie is broken with this strange error lto1.exe: internal compiler error: in gen_subprogram_die, at dwarf2out.c:22668 → LTO enabled causes windows, freebsd and some github actions to fail|
changing the title of this since it seems more involved than just mingw
comment:11 by , 14 months ago
@robe, sorry, I don't understand your question about github issues. Do you mean is there a ticket requesting LTO?
by , 14 months ago
after disabling LTO for all by accident
by , 14 months ago
re-enabled LTO for all except mingw
comment:12 by , 14 months ago
About github actions (not issues). I've added the screen shots to show what I mean. The ticket thing there is a GH pull request which is fine.
When I accidentally disabled LTO for all systems, all GH actions became green. A couple have been red for a while.
# Regina accidentally disabling LTO entirely
When I changed it to disable LTO just for mingw, those went red again, though winnie was still happy.
# Regina changing to just disable for mingw windows
I was baffled by the errors on the GH actions because each one is different, so I thought they were caused by bad docker builds or a change in GDAL.
1) CI (pg14-clang-geosmain-gdal34-proj71, usan_clang) and (pg13-clang-geos39-gdal31-proj71, usan_clang)
couldn't find GDALALL checking for library containing GDALAllRegister… no
Error: Process completed with exit code 1.
2) CI (pg13-geos39-gdal31-proj71, usan_gcc)
psql:/src/postgis/regress/00-regress-install/share/contrib/postgis/sfcgal.sql:52: ERROR: could not load library "/src/postgis/regress/00-regress-install/lib/postgis_sfcgal-3.so": /src/postgis/regress/00-regress-install/lib/postgis_sfcgal-3.so: undefined symbol: ubsan_handle_mul_overflow
comment:13 by , 14 months ago
Yes, github action errors seem not related at first glance. I decided to switch the breaking PR from draft because of that.
Unfortunately I still don't have a fix. I'm going to proceed as if the plan is to get rid of PGXS and hopefully find some kind of solution in the process.
Adding LTO flags automatically should probably be disabled for now.
comment:14 by , 14 months ago
Okay, so maybe we can define a --with-lto config option to enable them?
comment:15 by , 14 months ago
PR replacing the MINGWBUILD check with an --enable-lto option: https://github.com/postgis/postgis/pull/681
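The opt-in behaviour can be sketched in plain shell. This is not the code from the PR: lto_flags is a hypothetical helper, and the sketch only assumes that LTO stays off unless --enable-lto is passed and that enabling it means adding -flto.

```shell
# Hypothetical helper: print the LTO flag only when it was explicitly requested.
lto_flags() {
    enable_lto=no                      # default: LTO stays off
    for arg in "$@"; do
        case "$arg" in
            --enable-lto)  enable_lto=yes ;;
            --disable-lto) enable_lto=no ;;
        esac
    done
    if test "$enable_lto" = "yes"; then
        echo "-flto"
    fi
}

echo "default:      '$(lto_flags)'"
echo "--enable-lto: '$(lto_flags --enable-lto)'"
```

Making the flag opt-in keeps platforms with mismatched toolchains building, while still letting packagers who control their compiler turn LTO on.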
comment:16 by , 14 months ago
I come from the opposite side of the pgxs, I feel like it cleared up a lot of alternate problems by anal retentively enforcing a "build your extension just like your server" rule which probably saved us from a lot of really obscure mixed-compiler, fun-platform bugs which we are not taking into our calculations of the "cost of pgxs" because we never ever saw them, because they didn't exist.
comment:17 by , 14 months ago
|Status:||new → closed|
I made that breaking commit, link to Github PR: https://github.com/postgis/postgis/pull/678
I managed to replicate the issue on x86 FreeBSD 12 by installing gcc8 and gcc10, postgresql13-client (and required libraries), configuring with
../configure CC=gcc8 CXX=g++8 AR=gcc-ar8 RANLIB=gcc-ranlib8 CXXFLAGS='-O2 -pipe -fstack-protector-strong -Wl,-rpath=/usr/local/lib/gcc8 -nostdinc++ -isystem /usr/include/c++/v1 -Wl,-rpath=/usr/local/lib/gcc8' CFLAGS='-Wall -Wmissing-prototypes -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-trunc' '--with-libiconv=/usr/local' --without-interrupt-tests
(The FreeBSD test explicitly sets CC and CXX, but configure fails to select the correct ar/ranlib, since they are not called gcc-ar/gcc-ranlib but gcc-ar8/gcc-ranlib8 (please see the attached log), so I set them explicitly.)
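The naming problem can be illustrated with a small shell sketch that derives the matching tool names from a versioned compiler name. The helper gcc_tool_for is hypothetical and only handles names of the form gcc or gccN, as installed by the FreeBSD packages mentioned here.

```shell
# Hypothetical helper: map a compiler name such as gcc8 to its LTO-aware tool.
# $1: compiler name (gcc or gccN), $2: tool name (ar or ranlib)
gcc_tool_for() {
    suffix="${1#gcc}"      # "gcc8" -> "8", plain "gcc" -> ""
    echo "gcc-$2$suffix"
}

CC="${CC:-gcc8}"
AR="$(gcc_tool_for "$CC" ar)"
RANLIB="$(gcc_tool_for "$CC" ranlib)"
echo "CC=$CC AR=$AR RANLIB=$RANLIB"
```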
The problem is a compiler version (LTO version) mismatch between the selected gcc8 and the gcc10 recorded in PGXS (/usr/local/lib/postgresql/pgxs/src/Makefile.global). I tried to fix this by setting CUSTOM_CC before including pgxs.mk (/usr/local/lib/postgresql/pgxs/src/makefiles/pgxs.mk), and that allowed the extensions to build. PR draft: https://github.com/postgis/postgis/pull/679
This is not quite a solution, since CFLAGS in Makefile.global in this case still contains -Wl,-rpath=/usr/local/lib/gcc10 and cannot be overwritten (though flags can be appended by setting CUSTOM_COPTS or PG_CFLAGS), and I haven't tried building with MinGW yet.
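A mismatch of this kind can be spotted before building by comparing the compiler given to configure with the one PostgreSQL recorded at its own build time, which pg_config --cc prints. The sketch below is illustrative only: BUILD_CC and same_compiler are made-up names, and a plain string comparison is assumed to be good enough.

```shell
# same_compiler is a hypothetical helper: succeeds when both names are identical.
same_compiler() {
    [ "$1" = "$2" ]
}

# BUILD_CC is a made-up variable standing in for the CC passed to ./configure.
BUILD_CC="${BUILD_CC:-gcc8}"

if command -v pg_config >/dev/null 2>&1; then
    PG_CC="$(pg_config --cc)"   # compiler PostgreSQL itself was built with
    if same_compiler "$BUILD_CC" "$PG_CC"; then
        echo "compilers match: $BUILD_CC"
    else
        echo "mismatch: configure uses '$BUILD_CC', PostgreSQL was built with '$PG_CC'"
    fi
else
    echo "pg_config not found; cannot compare compilers"
fi
```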
That's as far as I could get for now, will appreciate any help.