Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#5172 closed defect (fixed)

GHA ci main/main is failing

Reported by: robe
Owned by: robe
Priority: blocker
Milestone: PostGIS 3.3.0
Component: raster
Version: master
Keywords:
Cc:

Description

Our GitHub Actions job that tests main GEOS, master GDAL, and master PostgreSQL is failing. The last build of the Docker image was 7 hours ago, so I assume something changed in one of those projects. I suspect it's either GDAL or PostgreSQL.

https://github.com/postgis/postgis/runs/6938853618?check_suite_focus=true

PostgreSQL 15beta1 on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
  Postgis 3.3.0dev - (6cd8406) - 2022-06-17 15:39:34
  scripts 3.3.0dev 6cd8406
  raster scripts 3.3.0dev 6cd8406
  GEOS: 3.11.0beta2-CAPI-1.16.0
  PROJ: 9.1.0
  SFCGAL: 1.4.1
  GDAL: GDAL 3.6.0dev-fed5a54, released 2022/06/16

It croaks at this point:

Died at ./regress/run_test.pl line 778.
 ./raster/test/regress/check_gdal .. failed (psql exited with an error: /tmp/pgis_reg/test_223_out)
-----------------------------------------------------------------------------
invalid_path
psql:check_gdal.sql:20: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
psql:check_gdal.sql:20: error: connection to server was lost
-----------------------------------------------------------------------------
make: *** [regress/runtest.mk:24: check-regress] Error 2
[logbt] saw 'make' exit with code:2 (INT)
[logbt] Found corefile (non-tracked) at /tmp/logbt-coredumps/core.12126.!usr!local!pgsql!bin!postgres
[logbt] Processing cores...
warning: Can't open file /dev/shm/PostgreSQL.2296011912 during file-backed mapping note processing
warning: Can't open file /dev/shm/PostgreSQL.1620786782 during file-backed mapping note processing
warning: Can't open file /dev/zero (deleted) during file-backed mapping note processing
warning: Can't open file /SYSV002fa66a (deleted) during file-backed mapping note processing
[New LWP 12126]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
48	iofclose.c: No such file or directory.
Core was generated by `postgres: postgres postgis_reg-3.'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f4bcc71d217 in _IO_new_fclose (fp=0x0) at iofclose.c:48
Thread 1 (Thread 0x7f4bcc6a6740 (LWP 12126)):
#0  0x00007f4bcc71d217 in _IO_new_fclose (fp=0x0) at iofclose.c:48
        status = <optimized out>
#1  0x00007f4bc30eb29a in VSIStdinFilesystemHandler::~VSIStdinFilesystemHandler (this=<optimized out>, __in_chrg=<optimized out>) at cpl_vsil_stdin.cpp:399
No locals.
#2  0x00007f4bc30eb319 in VSIStdinFilesystemHandler::~VSIStdinFilesystemHandler (this=0x558fcf7023b0, __in_chrg=<optimized out>) at cpl_vsil_stdin.cpp:408
No locals.
#3  0x00007f4bc308f6a7 in VSIFileManager::~VSIFileManager (this=0x558fcfa7c800, __in_chrg=<optimized out>) at cpl_vsil.cpp:2795
        iter = {first = "/vsistdin/", second = 0x558fcf7023b0}
        oSetAlreadyDeleted = std::set with 16 elements = {[0] = 0x558fcf7023b0, [1] = 0x558fcf703be0, [2] = 0x558fcf70f650, [3] = 0x558fcf7255a0, [4] = 0x558fcf725830, [5] = 0x558fcf725ac0, [6] = 0x558fcf72ab10, [7] = 0x558fcf749460, [8] = 0x558fcf753120, [9] = 0x558fcf76e850, [10] = 0x558fcf76fce0, [11] = 0x558fcf8251f0, [12] = 0x558fcfa69c40, [13] = 0x558fcfa82120, [14] = 0x558fcfa9c060, [15] = 0x558fcfae85d0}
        oSetAlreadyDeleted = Python Exception <class 'gdb.error'> value has been optimized out: 
        iter = Python Exception <class 'gdb.error'> value has been optimized out: 
#4  0x00007f4bc308f775 in VSICleanupFileManager () at cpl_vsil.cpp:2928
No locals.
#5  0x00007f4bc2bf0f3e in GDALDriverManager::~GDALDriverManager (this=0x558fcf728600, __in_chrg=<optimized out>) at gdaldrivermanager.cpp:273
        bHasDroppedRef = <optimized out>
        nDSCount = 0

This is the first failure, and no code changes have been made between now and the last successful build aside from the Docker image rebuild.
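The backtrace above is long, but the key fact is in frame #0: `fclose` was called with a null `fp` from the `VSIStdinFilesystemHandler` destructor. As a rough aid for reading such logs, here is a minimal Python sketch that pulls the frame lines out of gdb output and flags arguments whose value is `0x0`. The helper names (`parse_frames`, `null_args`) are hypothetical, and the regular expression only covers the frame format seen in this log, not gdb output in general.

```python
import re

# Matches gdb frame lines such as:
#   #1  0x00007f... in Some::func (arg=...) at file.cpp:399
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+(?:0x[0-9a-f]+\s+in\s+)?"
    r"(?P<func>\S+)\s+\((?P<args>.*)\)\s+at\s+(?P<loc>\S+)$"
)
# An argument whose value is exactly 0x0, i.e. a null pointer.
NULL_ARG_RE = re.compile(r"(\w+)=0x0(?![0-9a-fA-F])")


def parse_frames(backtrace):
    """Return (number, function, location, args) for each distinct frame."""
    frames = []
    seen = set()
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # locals, warnings, thread headers, etc.
        num = int(match.group("num"))
        if num in seen:
            continue  # gdb prints frame #0 twice in this log
        seen.add(num)
        frames.append(
            (num, match.group("func"), match.group("loc"), match.group("args"))
        )
    return frames


def null_args(args):
    """Names of arguments that were passed as a null pointer."""
    return NULL_ARG_RE.findall(args)


# A few frame lines copied from the backtrace in this ticket.
SAMPLE = """\
#0  0x00007f4bcc71d217 in _IO_new_fclose (fp=0x0) at iofclose.c:48
Thread 1 (Thread 0x7f4bcc6a6740 (LWP 12126)):
#0  0x00007f4bcc71d217 in _IO_new_fclose (fp=0x0) at iofclose.c:48
#1  0x00007f4bc30eb29a in VSIStdinFilesystemHandler::~VSIStdinFilesystemHandler (this=<optimized out>, __in_chrg=<optimized out>) at cpl_vsil_stdin.cpp:399
#4  0x00007f4bc308f775 in VSICleanupFileManager () at cpl_vsil.cpp:2928
"""

for num, func, loc, args in parse_frames(SAMPLE):
    nulls = null_args(args)
    note = "  <-- null: " + ", ".join(nulls) if nulls else ""
    print(f"#{num} {func} at {loc}{note}")
```

This only summarizes the log; it does not say why the handle was null, which would need a look at the GDAL source at the listed lines.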

Change History (7)

comment:1 by pramsey, 2 years ago

I just did a local build with the latest GDAL, the latest GEOS, and the latest PostGIS, and got no crash. I'm inclined to wonder whether something is wrong with the image or with how it is being created.

in reply to:  1 comment:2 by robe, 2 years ago

Replying to pramsey:

> I just did a local build with the latest GDAL, the latest GEOS, and the latest PostGIS, and got no crash. I'm inclined to wonder whether something is wrong with the image or with how it is being created.

It's also possible something changed in the past 4 days since the image was built. I'll do a rebuild to see if it continues and if so I'll try next on debbie.

comment:3 by robe, 2 years ago

I forgot that I do build GDAL master on debbie. I tested with master on debbie and there was no issue. The GHA Docker image is still building; I'll trigger a run once it is done.

comment:4 by robe, 2 years ago

Okay, sadly, even after a rebuild of the Docker image, it's still erroring out:

 ./raster/test/regress/check_gdal .. failed (psql exited with an error: /tmp/pgis_reg/test_223_out)
-----------------------------------------------------------------------------
invalid_path
psql:check_gdal.sql:20: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
psql:check_gdal.sql:20: error: connection to server was lost
-----------------------------------------------------------------------------
make: *** [regress/runtest.mk:24: check-regress] Error 2
[logbt] saw 'make' exit with code:2 (INT)
[logbt] Found corefile (non-tracked) at /tmp/logbt-coredumps/core.12126.!usr!local!pgsql!bin!postgres
[logbt] Processing cores...

If something were wrong with the base image, I would expect to see issues with pg14-clang-geosmain-gdal34-proj71 too, since it is based on the same image; the only difference is the compiled versions of PostgreSQL, PROJ, and GDAL.

Could it be some interplay with PROJ? This job is also running bleeding-edge PROJ 9.1, and that is one thing I'm not testing on debbie, as she's running with the system-installed PROJ.

I'm going to create an image with PROJ 7.1 (like the pg14 one) to rule PROJ out as the culprit.

comment:5 by robe, 2 years ago

I think the culprit is PROJ 9.1.

I swapped out the latest (which had GDAL 3.6, PROJ 9.1, PostgreSQL 15, GEOS 3.11) and replaced it with GDAL 3.6, PROJ 9.0, PostgreSQL 15, GEOS 3.11, and there is no error.

https://github.com/postgis/postgis/runs/7058314172?check_suite_focus=true

 PostgreSQL 15beta1 on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
  Postgis 3.3.0dev - (0bc5aa5) - 2022-06-26 04:15:30
  scripts 3.3.0dev 0bc5aa5
  raster scripts 3.3.0dev 0bc5aa5
  GEOS: 3.11.0beta3-CAPI-1.16.0
  PROJ: 9.0.1
  SFCGAL: 1.4.1
  GDAL: GDAL 3.6.0dev-125fafc, released 2022/06/25

as opposed to the latest, which crashes on the gdal driver test

PostgreSQL 15beta1 on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
  Postgis 3.3.0dev - (76a92a5) - 2022-06-24 19:04:03
  scripts 3.3.0dev 76a92a5
  raster scripts 3.3.0dev 76a92a5
  GEOS: 3.11.0beta2-CAPI-1.16.0
  PROJ: 9.1.0
  SFCGAL: 1.4.1
  GDAL: GDAL 3.6.0dev-c5ffcd7, released 2022/06/19

Although I suppose it could still be GDAL, since it looks like GHA may have the Docker latest image cached: the last run is still not on the latest build of GDAL.
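To make that caveat concrete, here is a minimal Python sketch that compares the library lines of the two version banners quoted above and lists every component that differs. The function names (`parse_versions`, `diff_versions`) are hypothetical, and the parser assumes the banner format shown in this ticket, where library lines look like `NAME: value` with an upper-case name.

```python
import re

# Library lines in the banner look like "  PROJ: 9.1.0".
LIB_RE = re.compile(r"^([A-Z]+):\s*(.+)$")


def parse_versions(banner):
    """Map library name -> version string for the NAME: value lines."""
    versions = {}
    for line in banner.splitlines():
        match = LIB_RE.match(line.strip())
        if match:
            versions[match.group(1)] = match.group(2).strip()
    return versions


def diff_versions(a, b):
    """Return {name: (value_in_a, value_in_b)} for components that differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Library lines copied from the two banners in this comment.
PASSING = """
  GEOS: 3.11.0beta3-CAPI-1.16.0
  PROJ: 9.0.1
  SFCGAL: 1.4.1
  GDAL: GDAL 3.6.0dev-125fafc, released 2022/06/25
"""
FAILING = """
  GEOS: 3.11.0beta2-CAPI-1.16.0
  PROJ: 9.1.0
  SFCGAL: 1.4.1
  GDAL: GDAL 3.6.0dev-c5ffcd7, released 2022/06/19
"""

changed = diff_versions(parse_versions(PASSING), parse_versions(FAILING))
for name, (ok, bad) in changed.items():
    print(f"{name}: passing={ok} failing={bad}")
```

On these two banners the sketch reports GDAL, GEOS, and PROJ as differing, so the comparison does not isolate PROJ on its own.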

comment:6 by Regina Obe <lr@…>, 2 years ago

Resolution: fixed
Status: new → closed

In 5bcf72d/git:

Put back latest to see if picks up new image and fixes #5172

comment:7 by robe, 2 years ago

Okay, it looks like the latest is working now, so it wasn't PROJ either. It must have been a fix in GDAL between 6/19 and 6/23. The last PostGIS build with latest was on 6/24, so it must have just missed the latest publish by a hair, because this one:

https://github.com/postgis/postgis/runs/7062500186?check_suite_focus=true

PostgreSQL 15beta1 on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
  Postgis 3.3.0dev - (5bcf72d) - 2022-06-26 18:08:58
  scripts 3.3.0dev 5bcf72d
  raster scripts 3.3.0dev 5bcf72d
  GEOS: 3.11.0beta3-CAPI-1.16.0
  PROJ: 9.1.0
  SFCGAL: 1.4.1
  GDAL: GDAL 3.6.0dev-2d29343, released 2022/06/24

Ran successfully
