#5157 closed defect (fixed)
basic_test failure on i386
Reported by: | sebastic | Owned by: | strk |
---|---|---|---|
Priority: | blocker | Milestone: | PostGIS 3.3.0 |
Component: | liblwgeom | Version: | master |
Keywords: | Cc: | sebastic |
Description
The Debian package build for 3.3.0-alpha1 failed on i386 due to test failures:
Suite: minimum_bounding_circle
  Test: basic_test ...FAILED
    1. cu_minimum_bounding_circle.c:35 - CU_ASSERT_TRUE(d <= result->radius)
  Test: test_empty ...passed
Full buildlogs:
Change History (15)
comment:1 by , 2 years ago
comment:2 by , 2 years ago
Only i386 fails; amd64 and the other architectures are fine, see:
https://buildd.debian.org/status/package.php?p=postgis&suite=experimental
While the test may not have changed, dependencies have, e.g. PostgreSQL from 14.2 to 14.3.
follow-up: 5 comment:3 by , 2 years ago
Still fails in 3.3.0beta1:
  Test: basic_test ...FAILED
    1. cu_minimum_bounding_circle.c:35 - CU_ASSERT_TRUE(d <= result->radius)
Should we consider i386 no longer supported and ignore test failures there?
comment:4 by , 2 years ago
No change in 3.3.0beta2:
  Test: basic_test ...FAILED
    1. cu_minimum_bounding_circle.c:35 - CU_ASSERT_TRUE(d <= result->radius)
comment:5 by , 2 years ago
Replying to Bas Couwenberg:
Still fails in 3.3.0beta1:
  Test: basic_test ...FAILED
    1. cu_minimum_bounding_circle.c:35 - CU_ASSERT_TRUE(d <= result->radius)
Should we consider i386 no longer supported and ignore test failures there?
Maybe in PostGIS 3.4.0, but we should fix this in 3.3.0, as we still have 32-bit bots and never said we wouldn't support it.
comment:6 by , 2 years ago
Priority: | medium → blocker |
---|
comment:7 by , 2 years ago
Looking at this more, seems something else is going on besides 32-bit.
We have a 32-bit FreeBSD CI bot and a 32-bit Raspberry Pi bot, and neither is failing on this test or any of the other tests. It could also be the GEOS version, I suppose, since minimum_bounding_circle is a GEOS function. I don't think PostGIS does anything aside from expose it. There have been a number of issues with GEOS on 32-bit ARM and PPC, like this one in GEOS - https://github.com/libgeos/geos/issues/579
I see this 32-bit is running GEOS 3.10.2. Would be curious if 3.10.3 (https://libgeos.org/posts/2022-06-03-geos-3-10-3-released/) exhibits the same issue.
Which gcc version does this 32-bit build use, and is it a regular i386, or ARM, PPC, or something else? In your run I see:
Run Summary:
Type | Total | Ran | Passed | Failed | Inactive |
---|---|---|---|---|---|
suites | 47 | 47 | n/a | 0 | 0 |
tests | 333 | 333 | 332 | 1 | 0 |
asserts | 5396 | 5396 | 5395 | 1 | n/a |
For reference, our 32-bit CI bots are running the following:
- berrie (32-bit Raspberry Pi, armv7l-unknown-linux-gnueabihf, gcc (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110, 32-bit) - running GEOS 3.11.0 (well, she follows GEOS branches, so she is not running a tagged release but the GEOS 3.11 branch). Output of the last 3.3 run:
Run Summary:
Type | Total | Ran | Passed | Failed | Inactive |
---|---|---|---|---|---|
suites | 46 | 46 | n/a | 0 | 0 |
tests | 332 | 332 | 332 | 0 | 0 |
asserts | 5381 | 5381 | 5381 | 0 | n/a |
- bessie32 - i386-portbld-freebsd12.3, compiled by gcc10 (FreeBSD Ports Collection) 10.3.0, 32-bit, GEOS: 3.10.2-CAPI-1.16.0. (I still have to check why bessie32 runs one more test than berrie; I suspect it is because we have protobuf disabled on berrie.) If it were a plain GEOS 32-bit issue, though, I'd expect bessie32 to be failing as well, since she's running the same GEOS version as your Debian 32-bit build. As you can see, she appears to run all the same tests as yours (except for one additional assert):
Run Summary:
Type | Total | Ran | Passed | Failed | Inactive |
---|---|---|---|---|---|
suites | 47 | 47 | n/a | 0 | 0 |
tests | 333 | 333 | 333 | 0 | 0 |
asserts | 5397 | 5397 | 5397 | 0 | n/a |
- Cirrus CI 32-bit (FreeBSD 12.3) - this one is failing completely for other reasons, so it is out of the running at the moment.
comment:8 by , 2 years ago
Okay I think I answered some of my questions looking closer at your build.
(Debian 14.3-1+b1) on i686-pc-linux-gnu, compiled by gcc (Debian 11.3.0-1) 11.3.0, 32-bit
So it's a plain old i386 32-bit running Debian 11, so should be very similar to berrie except for a newer Debian/gcc. I'll see if I can replicate on a similar setup and see what the discrepancy is with the output.
comment:9 by , 2 years ago
You don't need an i386 system to reproduce this issue, you can also use an i386 chroot on amd64.
See the comment in #5014 where the process to create and use such a chroot is documented.
follow-up: 13 comment:11 by , 2 years ago
Replying to Bas Couwenberg:
3.2.2 does not have this issue.
Dependencies are all the same? I didn't think any of that code has changed.
At any rate, I was able to replicate the issue using your beta2 package and dpkg-buildpackage.
I put in a debug line to see the difference:
char *msg1 = "mbc_test failed (got %.12f expected %.12f) \n";
if ( d > result->radius){
  printf(msg1, d, result->radius);
}
output was:
Suite: minimum_bounding_circle
  Test: basic_test ...mbc_test failed (got 247.436045591407 expected 247.436045591404)
FAILED
    1. cu_minimum_bounding_circle.c:38 - CU_ASSERT_TRUE(d <= result->radius)
So I think maybe I should just change the test, because the difference (about 3e-12) is too insignificant to be a concern.
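For illustration only, here is a minimal sketch of what a tolerance-based version of that comparison could look like. The helper name `within_radius` and the relative tolerance of 1e-10 are assumptions made for this example; this is not the change that was actually committed, and neither name exists in liblwgeom or CUnit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not part of liblwgeom or CUnit): returns true when
 * d does not exceed radius by more than a relative tolerance. */
bool within_radius(double d, double radius, double rel_tol)
{
    /* Scale the tolerance by the magnitude of the radius so the check
     * behaves consistently for small and large circles. */
    double scale = radius < 0.0 ? -radius : radius;
    return d <= radius + rel_tol * scale;
}

/* Replays the two values printed by the debug line above. */
void check_reported_values(void)
{
    const double d = 247.436045591407;      /* "got" value from the log */
    const double radius = 247.436045591404; /* "expected" value from the log */

    assert(!(d <= radius));                  /* the strict check fails */
    assert(within_radius(d, radius, 1e-10)); /* the tolerant check passes */

    /* A point that is clearly outside the circle is still rejected. */
    assert(!within_radius(radius + 1e-3, radius, 1e-10));
}
```

In the CUnit test this would be used as `CU_ASSERT_TRUE(within_radius(d, result->radius, 1e-10));`. A relative tolerance is chosen here so that the allowed slack grows with the size of the circle; an absolute tolerance would work equally well for the values in this ticket.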
comment:13 by , 2 years ago
Replying to robe:
Replying to Bas Couwenberg:
3.2.2 does not have this issue.
Dependencies are all the same? I didn't think any of that code has changed.
gcc-12 is now default, everything else is mostly the same:
--- /tmp/postgis-3.3.0.deps 2022-07-24 07:14:46.896921001 +0200
+++ /tmp/postgis-3.2.2.deps 2022-07-24 07:14:59.852608988 +0200
-Setting up bsdextrautils (2.38-4) ...
+Setting up bsdextrautils (2.38-5) ...
-Setting up lib32gcc-s1 (12.1.0-5) ...
+Setting up lib32gcc-s1 (12.1.0-7) ...
-Setting up lib32stdc++6 (12.1.0-5) ...
+Setting up lib32stdc++6 (12.1.0-7) ...
-Setting up libblosc1:amd64 (1.21.1+ds2-2) ...
+Setting up libblosc1:amd64 (1.21.1+ds2-3) ...
-Setting up libblosc-dev (1.21.1+ds2-2) ...
+Setting up libblosc-dev:amd64 (1.21.1+ds2-3) ...
-Setting up libgfortran5:amd64 (12.1.0-5) ...
+Setting up libgfortran5:amd64 (12.1.0-7) ...
-Setting up liblcms2-2:amd64 (2.12~rc1-2) ...
+Setting up liblcms2-2:amd64 (2.13.1-1) ...
-Setting up liblz1:amd64 (1.13-3) ...
+Setting up liblz1:amd64 (1.13-4) ...
-Setting up libobjc-11-dev:amd64 (11.3.0-4) ...
+Setting up libobjc-11-dev:amd64 (11.3.0-5) ...
-Setting up libobjc4:amd64 (12.1.0-5) ...
+Setting up libobjc4:amd64 (12.1.0-7) ...
-Setting up libpython3-stdlib:amd64 (3.10.4-1+b1) ...
+Setting up libpython3-stdlib:amd64 (3.10.5-3) ...
-Setting up libsqlite3-dev:amd64 (3.39.0-2) ...
+Setting up libsqlite3-dev:amd64 (3.39.2-1) ...
+Setting up libstdc++-11-dev:amd64 (11.3.0-5) ...
-Setting up libwww-mechanize-perl (2.10-1) ...
+Setting up libwww-mechanize-perl (2.12-1) ...
-Setting up libxslt1.1:amd64 (1.1.34-4) ...
+Setting up libxslt1.1:amd64 (1.1.35-1) ...
-Setting up libyaml-0-2:amd64 (0.2.2-1) ...
+Setting up libyaml-0-2:amd64 (0.2.5-1) ...
-Setting up plzip (1.10-3) ...
+Setting up plzip (1.10-4) ...
-Setting up python3 (3.10.4-1+b1) ...
+Setting up python3 (3.10.5-3) ...
-Setting up python3-minimal (3.10.4-1+b1) ...
+Setting up python3-minimal (3.10.5-3) ...
-Setting up xsltproc (1.1.34-4) ...
+Setting up xsltproc (1.1.35-1) ...
comment:14 by , 2 years ago
I didn't check how dpkg-buildpackage is building the package. Is it using the ./configure --enable-lto flag? That's the only thing I can think of that changed between PostGIS 3.2.2 and PostGIS 3.3.0 that might have caused a difference in answers. At any rate, the difference is so minuscule that it's safe enough to ignore. It could also be the gcc difference.
comment:15 by , 2 years ago
Confirmed fixed with the changes from 8cff748.
With gcc-12 and without those changes the test still fails on i386.
Disabling LTO also fixes the issue.
Strange; I don't think that function has changed in a while. This is only on 32-bit, right, with no issue on 64-bit? I'm guessing it's just a tad outside our floating point tolerance.
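One possible contributor, which nothing in this ticket confirms, is the floating-point evaluation mode of the target. GCC on i386 without SSE math typically evaluates double expressions in x87 extended precision, so a last-digit difference like the one above can depend on where intermediate values get rounded to 64 bits, and that can shift with compiler version or inlining decisions such as those made under LTO. The sketch below is a small diagnostic, with hypothetical function names, that reports the C99 `FLT_EVAL_METHOD` of the build it is compiled in.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical diagnostic helper: describes how floating-point expressions
 * are evaluated on the current target, based on C99 FLT_EVAL_METHOD. */
const char *float_eval_description(void)
{
    switch (FLT_EVAL_METHOD) {
    case 0:  return "float and double evaluated in their own precision";
    case 1:  return "float and double evaluated as double";
    case 2:  return "all operations evaluated as long double";
    default: return "indeterminate or implementation-defined";
    }
}

/* Prints the evaluation method and the mantissa widths involved. */
void print_float_eval(void)
{
    /* The C standard requires long double to be at least as precise as double. */
    assert(DBL_MANT_DIG <= LDBL_MANT_DIG);

    printf("FLT_EVAL_METHOD=%d (%s), DBL_MANT_DIG=%d, LDBL_MANT_DIG=%d\n",
           (int)FLT_EVAL_METHOD, float_eval_description(),
           DBL_MANT_DIG, LDBL_MANT_DIG);
}
```

Calling `print_float_eval()` in both the i386 chroot and an amd64 build would show whether the two builds evaluate doubles differently. If they do, that supports treating the mismatch as a tolerance issue in the test rather than a change in the algorithm.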