Opened 6 months ago

Closed 4 months ago

Last modified 4 months ago

#5157 closed defect (fixed)

basic_test failure on i386

Reported by: Bas Couwenberg Owned by: strk
Priority: blocker Milestone: PostGIS 3.3.0
Component: liblwgeom Version: master
Keywords: Cc: Bas Couwenberg

Description

The Debian package build for 3.3.0-alpha1 failed on i386 due to test failures:

 Suite: minimum_bounding_circle
   Test: basic_test ...FAILED
     1. cu_minimum_bounding_circle.c:35  - CU_ASSERT_TRUE(d <= result->radius)
   Test: test_empty ...passed

Full buildlogs:

Change History (15)

comment:1 by robe, 6 months ago

Strange, I don't think that function has changed in a while. This is only on 32-bit, right? No issue with 64-bit? I'm guessing it's just slightly outside our floating point tolerance.

comment:2 by Bas Couwenberg, 6 months ago

Only i386; amd64 and the other architectures are fine, see:

https://buildd.debian.org/status/package.php?p=postgis&suite=experimental

While the test may not have changed, dependencies have, e.g. PostgreSQL from 14.2 to 14.3.

comment:3 by Bas Couwenberg, 5 months ago

Still fails in 3.3.0beta1:

  Test: basic_test ...FAILED
    1. cu_minimum_bounding_circle.c:35  - CU_ASSERT_TRUE(d <= result->radius)

Full buildlog

Should we consider i386 no longer supported and ignore test failures there?

comment:4 by Bas Couwenberg, 5 months ago

No change in 3.3.0beta2:

  Test: basic_test ...FAILED
    1. cu_minimum_bounding_circle.c:35  - CU_ASSERT_TRUE(d <= result->radius)

Full buildlog

in reply to:  3 comment:5 by robe, 4 months ago

Replying to Bas Couwenberg:

> Still fails in 3.3.0beta1:
>
>   Test: basic_test ...FAILED
>     1. cu_minimum_bounding_circle.c:35  - CU_ASSERT_TRUE(d <= result->radius)
>
> Full buildlog
>
> Should we consider i386 no longer supported and ignore test failures there?

Maybe in PostGIS 3.4.0, but we should fix this in 3.3.0, as we still have 32-bit bots and never said we would drop support.

comment:6 by robe, 4 months ago

Priority: medium → blocker

comment:7 by robe, 4 months ago

Looking at this more, it seems something else is going on besides 32-bit.

We have a 32-bit FreeBSD CI bot and a 32-bit Raspberry Pi bot, and neither is failing on this test or any of the other tests. It could also be the GEOS version, I suppose, since minimum_bounding_circle is a GEOS function. I don't think PostGIS does anything aside from expose it. There have been a number of issues with GEOS on ARM 32-bit and PPC, like this one in GEOS: https://github.com/libgeos/geos/issues/579

I see this 32-bit build is running GEOS 3.10.2. I would be curious whether 3.10.3 (https://libgeos.org/posts/2022-06-03-geos-3-10-3-released/) exhibits the same issue.

What is the gcc version on this 32-bit system, and is it a regular i386, or ARM, PPC, or something else? From your run I see:

Run Summary:    Type  Total    Ran Passed Failed Inactive
              suites     47     47    n/a      0        0
               tests    333    333    332      1        0
             asserts   5396   5396   5395      1      n/a

For reference, the 32-bit CI bots we have are running the following:

  1. berrie (32-bit Raspberry Pi, armv7l-unknown-linux-gnueabihf, gcc (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110, 32-bit), running GEOS 3.11.0 (she follows GEOS branches, so she is running the geos 3.11 branch rather than a tagged release). Output on the last 3.3 run:
Run Summary:    Type  Total    Ran Passed Failed Inactive
              suites     46     46    n/a      0        0
               tests    332    332    332      0        0
             asserts   5381   5381   5381      0      n/a
  2. bessie32 (i386-portbld-freebsd12.3, compiled by gcc10 (FreeBSD Ports Collection) 10.3.0, 32-bit), GEOS: 3.10.2-CAPI-1.16.0. I still need to check why bessie32 runs one more test; I suspect it is because we have protobuf disabled on berrie. If it were a plain GEOS 32-bit issue, though, I'd expect bessie32 to be failing as well, since she's running the same GEOS version as your Debian 32-bit build. As you can see, she looks to be running all the same tests (except for one additional assert):
Run Summary:    Type  Total    Ran Passed Failed Inactive
              suites     47     47    n/a      0        0
               tests    333    333    333      0        0
             asserts   5397   5397   5397      0      n/a
  3. Cirrus CI 32-bit (FreeBSD 12.3): this one is failing completely for other reasons, so it is out of the running at the moment.

comment:8 by robe, 4 months ago

Okay I think I answered some of my questions looking closer at your build.

(Debian 14.3-1+b1) on i686-pc-linux-gnu, compiled by gcc (Debian 11.3.0-1) 11.3.0, 32-bit

So it's a plain old i386 32-bit running Debian 11, so should be very similar to berrie except for a newer Debian/gcc. I'll see if I can replicate on a similar setup and see what the discrepancy is with the output.

comment:9 by Bas Couwenberg, 4 months ago

You don't need an i386 system to reproduce this issue, you can also use an i386 chroot on amd64.

See the comment in #5014 where the process to create and use such a chroot is documented.

comment:10 by Bas Couwenberg, 4 months ago

3.2.2 does not have this issue.

in reply to:  10 comment:11 by robe, 4 months ago

Replying to Bas Couwenberg:

> 3.2.2 does not have this issue.

Dependencies are all the same? I didn't think any of that code has changed.

At any rate, I was able to replicate the issue using your beta2 package and dpkg-buildpackage.

I put in a debug line to see the difference:

/* Debug aid: print both values whenever the assertion would fail. */
const char *msg1 = "mbc_test failed (got %.12f expected %.12f)\n";

if (d > result->radius) {
    printf(msg1, d, result->radius);
}

output was:

Suite: minimum_bounding_circle
  Test: basic_test ...mbc_test failed (got 247.436045591407 expected 247.436045591404)
FAILED
    1. cu_minimum_bounding_circle.c:38  - CU_ASSERT_TRUE(d <= result->radius)

So I think I should just change the test, because the difference is too insignificant to be a concern.

comment:12 by Regina Obe <lr@…>, 4 months ago

Resolution: fixed
Status: new → closed

In 8cff748/git:

Minimum Bounding Circle regress failure on i386 Debian
Revise test to ignore micro floating point difference
Closes #5157 for PostGIS 3.3.0
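
For illustration only: the diff of 8cff748 is not reproduced in this ticket, so the following is a minimal sketch of what a tolerance-based check could look like, not the committed change. The helper name within_radius and the tolerance value 1e-9 are assumptions made for this example; the numbers in the usage note are the ones printed by the debug line in comment 11.

```c
#include <stdio.h>

/* Hypothetical tolerance for this sketch; the value used in the
 * actual fix is not shown in this ticket. */
#define MBC_TEST_TOLERANCE 1e-9

/* Returns 1 when distance d lies inside the circle of the given radius,
 * allowing d to exceed the radius by at most `tolerance`. */
static int
within_radius(double d, double radius, double tolerance)
{
    return d <= radius + tolerance;
}

/* Prints the strict and the tolerant verdict side by side. */
static void
report(double d, double radius)
{
    printf("difference: %.3e\n", d - radius);
    printf("strict:     %s\n", d <= radius ? "pass" : "FAIL");
    printf("tolerant:   %s\n",
           within_radius(d, radius, MBC_TEST_TOLERANCE) ? "pass" : "FAIL");
}
```

Calling report(247.436045591407, 247.436045591404) shows the strict comparison failing while the tolerant one passes. In the cunit test the same idea would amount to asserting on within_radius(d, result->radius, MBC_TEST_TOLERANCE) instead of d <= result->radius, again assuming that helper exists.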

in reply to:  11 comment:13 by Bas Couwenberg, 4 months ago

Replying to robe:

> Replying to Bas Couwenberg:
>
> > 3.2.2 does not have this issue.
>
> Dependencies are all the same? I didn't think any of that code has changed.

gcc-12 is now default, everything else is mostly the same:

--- /tmp/postgis-3.3.0.deps     2022-07-24 07:14:46.896921001 +0200
+++ /tmp/postgis-3.2.2.deps     2022-07-24 07:14:59.852608988 +0200
-Setting up bsdextrautils (2.38-4) ...
+Setting up bsdextrautils (2.38-5) ...
-Setting up lib32gcc-s1 (12.1.0-5) ...
+Setting up lib32gcc-s1 (12.1.0-7) ...
-Setting up lib32stdc++6 (12.1.0-5) ...
+Setting up lib32stdc++6 (12.1.0-7) ...
-Setting up libblosc1:amd64 (1.21.1+ds2-2) ...
+Setting up libblosc1:amd64 (1.21.1+ds2-3) ...
-Setting up libblosc-dev (1.21.1+ds2-2) ...
+Setting up libblosc-dev:amd64 (1.21.1+ds2-3) ...
-Setting up libgfortran5:amd64 (12.1.0-5) ...
+Setting up libgfortran5:amd64 (12.1.0-7) ...
-Setting up liblcms2-2:amd64 (2.12~rc1-2) ...
+Setting up liblcms2-2:amd64 (2.13.1-1) ...
-Setting up liblz1:amd64 (1.13-3) ...
+Setting up liblz1:amd64 (1.13-4) ...
-Setting up libobjc-11-dev:amd64 (11.3.0-4) ...
+Setting up libobjc-11-dev:amd64 (11.3.0-5) ...
-Setting up libobjc4:amd64 (12.1.0-5) ...
+Setting up libobjc4:amd64 (12.1.0-7) ...
-Setting up libpython3-stdlib:amd64 (3.10.4-1+b1) ...
+Setting up libpython3-stdlib:amd64 (3.10.5-3) ...
-Setting up libsqlite3-dev:amd64 (3.39.0-2) ...
+Setting up libsqlite3-dev:amd64 (3.39.2-1) ...
+Setting up libstdc++-11-dev:amd64 (11.3.0-5) ...
-Setting up libwww-mechanize-perl (2.10-1) ...
+Setting up libwww-mechanize-perl (2.12-1) ...
-Setting up libxslt1.1:amd64 (1.1.34-4) ...
+Setting up libxslt1.1:amd64 (1.1.35-1) ...
-Setting up libyaml-0-2:amd64 (0.2.2-1) ...
+Setting up libyaml-0-2:amd64 (0.2.5-1) ...
-Setting up plzip (1.10-3) ...
+Setting up plzip (1.10-4) ...
-Setting up python3 (3.10.4-1+b1) ...
+Setting up python3 (3.10.5-3) ...
-Setting up python3-minimal (3.10.4-1+b1) ...
+Setting up python3-minimal (3.10.5-3) ...
-Setting up xsltproc (1.1.34-4) ...
+Setting up xsltproc (1.1.35-1) ...

comment:14 by robe, 4 months ago

I didn't check how dpkg-buildpackage is building the package. Is it using the ./configure --enable-lto flag? That's the only thing I can think of that changed between PostGIS 3.2.2 and PostGIS 3.3.0 that might have caused a difference in answers. At any rate, the difference is so minuscule that it's safe to ignore. It could also be the gcc difference.

comment:15 by Bas Couwenberg, 4 months ago

Confirmed fixed with the changes from 8cff748.

With gcc-12 and without those changes the test still fails on i386.

Disabling LTO also fixes the issue.
