Opened 23 months ago

Closed 20 months ago

Last modified 20 months ago

#5157 closed defect (fixed)

basic_test failure on i386

Reported by: Bas Couwenberg Owned by: strk
Priority: blocker Milestone: PostGIS 3.3.0
Component: liblwgeom Version: master
Keywords: Cc: Bas Couwenberg

Description

The Debian package build for 3.3.0-alpha1 failed on i386 due to test failures:

 Suite: minimum_bounding_circle
   Test: basic_test ...FAILED
     1. cu_minimum_bounding_circle.c:35  - CU_ASSERT_TRUE(d <= result->radius)
   Test: test_empty ...passed


Change History (15)

comment:1 by robe, 23 months ago

Strange; I don't think that function has changed in a while. This is only on 32-bit, right? No issue on 64-bit? I'm guessing the result is just slightly outside our floating-point tolerance.

comment:2 by Bas Couwenberg, 23 months ago

Only i386 fails; amd64 and the other architectures are fine, see:

https://buildd.debian.org/status/package.php?p=postgis&suite=experimental

While the test may not have changed, dependencies have, e.g. PostgreSQL from 14.2 to 14.3.

comment:3 by Bas Couwenberg, 21 months ago

Still fails in 3.3.0beta1:

  Test: basic_test ...FAILED
    1. cu_minimum_bounding_circle.c:35  - CU_ASSERT_TRUE(d <= result->radius)

Should we consider i386 no longer supported and ignore test failures there?


comment:4 by Bas Couwenberg, 21 months ago

No change in 3.3.0beta2:

  Test: basic_test ...FAILED
    1. cu_minimum_bounding_circle.c:35  - CU_ASSERT_TRUE(d <= result->radius)


in reply to:  3 comment:5 by robe, 21 months ago

Replying to Bas Couwenberg:

Still fails in 3.3.0beta1:

  Test: basic_test ...FAILED
    1. cu_minimum_bounding_circle.c:35  - CU_ASSERT_TRUE(d <= result->radius)


Should we consider i386 no longer supported and ignore test failures there?

Maybe in PostGIS 3.4.0, but we should fix this in 3.3.0, as we still have 32-bit bots and never said we would drop 32-bit support.

comment:6 by robe, 21 months ago

Priority: medium → blocker

comment:7 by robe, 21 months ago

Looking at this more, it seems something else is going on besides 32-bit.

We have a 32-bit FreeBSD CI bot and a 32-bit Raspberry Pi bot, and neither is failing on this test or any of the other tests. It could also be the GEOS version, I suppose, since minimum_bounding_circle is a GEOS function; I don't think PostGIS does anything aside from expose it. There have been a number of issues with GEOS on 32-bit ARM and PPC, like this one: https://github.com/libgeos/geos/issues/579

I see this 32-bit build is running GEOS 3.10.2. I'd be curious whether 3.10.3 (https://libgeos.org/posts/2022-06-03-geos-3-10-3-released/) exhibits the same issue.

Which gcc does this 32-bit build use, and is it a regular i386, or ARM, PPC, or something else? From your build I see:

Run Summary:    Type  Total    Ran Passed Failed Inactive
              suites     47     47    n/a      0        0
               tests    333    333    332      1        0
             asserts   5396   5396   5395      1      n/a

For reference, the 32-bit CI bots we have are running the following:

  1. berrie (32-bit Raspberry Pi, armv7l-unknown-linux-gnueabihf, gcc (Raspbian 10.2.1-6+rpi1) 10.2.1 20210110, 32-bit), running GEOS 3.11.0 (she follows GEOS branches, so she is not running a tagged release but the GEOS 3.11 branch). Output of the last 3.3 run:
Run Summary:    Type  Total    Ran Passed Failed Inactive
              suites     46     46    n/a      0        0
               tests    332    332    332      0        0
             asserts   5381   5381   5381      0      n/a
  2. bessie32 (i386-portbld-freebsd12.3, compiled by gcc10 (FreeBSD Ports Collection) 10.3.0, 32-bit), running GEOS 3.10.2-CAPI-1.16.0. I still have to check why bessie32 runs one more test than berrie; I suspect it is because protobuf is disabled on berrie. If this were a plain GEOS 32-bit issue, though, I'd expect bessie32 to fail as well, since she runs the same GEOS version as your Debian 32-bit build. As you can see, she runs the same tests as yours (except for one additional assert):
Run Summary:    Type  Total    Ran Passed Failed Inactive
              suites     47     47    n/a      0        0
               tests    333    333    333      0        0
             asserts   5397   5397   5397      0      n/a
  3. Cirrus CI 32-bit (FreeBSD 12.3): this one is failing completely for other reasons, so it is out of the running at the moment.

comment:8 by robe, 21 months ago

Okay I think I answered some of my questions looking closer at your build.

(Debian 14.3-1+b1) on i686-pc-linux-gnu, compiled by gcc (Debian 11.3.0-1) 11.3.0, 32-bit

So it's a plain old i386 32-bit running Debian 11, so should be very similar to berrie except for a newer Debian/gcc. I'll see if I can replicate on a similar setup and see what the discrepancy is with the output.

comment:9 by Bas Couwenberg, 21 months ago

You don't need an i386 system to reproduce this issue, you can also use an i386 chroot on amd64.

See the comment in #5014 where the process to create and use such a chroot is documented.

comment:10 by Bas Couwenberg, 20 months ago

3.2.2 does not have this issue.

in reply to:  10 ; comment:11 by robe, 20 months ago

Replying to Bas Couwenberg:

3.2.2 does not have this issue.

Dependencies are all the same? I didn't think any of that code has changed.

At any rate, I was able to replicate the issue using your beta2 package and dpkg-buildpackage.

I put in a debug line to see the difference:

/* Print both operands of the failing assertion: d, then result->radius */
const char *msg1 = "mbc_test failed (got %.12f expected %.12f) \n";

if (d > result->radius) {
	printf(msg1, d, result->radius);
}

output was:

Suite: minimum_bounding_circle
  Test: basic_test ...mbc_test failed (got 247.436045591407 expected 247.436045591404)
FAILED
    1. cu_minimum_bounding_circle.c:38  - CU_ASSERT_TRUE(d <= result->radius)

So I think I should just change the test, because the difference (about 3e-12 on a value of roughly 247.4) is too small to be a concern.
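For illustration, a tolerance-based comparison along these lines could look like the sketch below. This is not the change that was committed for this ticket; the helper name le_with_tolerance and the tolerance 1e-9 are assumptions made up for the example, and only the two numbers are taken from the debug output above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of PostGIS or CUnit: true when a <= b,
 * allowing a to exceed b by at most tol (an absolute tolerance). */
static bool le_with_tolerance(double a, double b, double tol)
{
	return a <= b + tol;
}

/* Checks the two values printed by the debug line in this ticket. */
static void check_ticket_values(void)
{
	const double d = 247.436045591407;      /* printed as "got" */
	const double radius = 247.436045591404; /* printed as "expected" */
	const double tol = 1e-9;                /* assumed tolerance, for illustration only */

	/* The strict comparison used by the original assertion fails... */
	assert(!(d <= radius));
	/* ...while the tolerant comparison accepts the ~3e-12 excess. */
	assert(le_with_tolerance(d, radius, tol));
}
```

With these two values the strict d <= radius check is false and the tolerant check is true, which matches the observation that the failure is only a tiny floating-point difference. What tolerance is appropriate for the real test is a judgment call not settled here.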

comment:12 by Regina Obe <lr@…>, 20 months ago

Resolution: fixed
Status: new → closed

In 8cff748/git:

Minimum Bounding Circle regress failure on i386 Debian
Revise test to ignore micro floating point difference
Closes #5157 for PostGIS 3.3.0

in reply to:  11 comment:13 by Bas Couwenberg, 20 months ago

Replying to robe:

Replying to Bas Couwenberg:

3.2.2 does not have this issue.

Dependencies are all the same? I didn't think any of that code has changed.

gcc-12 is now default, everything else is mostly the same:

--- /tmp/postgis-3.3.0.deps     2022-07-24 07:14:46.896921001 +0200
+++ /tmp/postgis-3.2.2.deps     2022-07-24 07:14:59.852608988 +0200
-Setting up bsdextrautils (2.38-4) ...
+Setting up bsdextrautils (2.38-5) ...
-Setting up lib32gcc-s1 (12.1.0-5) ...
+Setting up lib32gcc-s1 (12.1.0-7) ...
-Setting up lib32stdc++6 (12.1.0-5) ...
+Setting up lib32stdc++6 (12.1.0-7) ...
-Setting up libblosc1:amd64 (1.21.1+ds2-2) ...
+Setting up libblosc1:amd64 (1.21.1+ds2-3) ...
-Setting up libblosc-dev (1.21.1+ds2-2) ...
+Setting up libblosc-dev:amd64 (1.21.1+ds2-3) ...
-Setting up libgfortran5:amd64 (12.1.0-5) ...
+Setting up libgfortran5:amd64 (12.1.0-7) ...
-Setting up liblcms2-2:amd64 (2.12~rc1-2) ...
+Setting up liblcms2-2:amd64 (2.13.1-1) ...
-Setting up liblz1:amd64 (1.13-3) ...
+Setting up liblz1:amd64 (1.13-4) ...
-Setting up libobjc-11-dev:amd64 (11.3.0-4) ...
+Setting up libobjc-11-dev:amd64 (11.3.0-5) ...
-Setting up libobjc4:amd64 (12.1.0-5) ...
+Setting up libobjc4:amd64 (12.1.0-7) ...
-Setting up libpython3-stdlib:amd64 (3.10.4-1+b1) ...
+Setting up libpython3-stdlib:amd64 (3.10.5-3) ...
-Setting up libsqlite3-dev:amd64 (3.39.0-2) ...
+Setting up libsqlite3-dev:amd64 (3.39.2-1) ...
+Setting up libstdc++-11-dev:amd64 (11.3.0-5) ...
-Setting up libwww-mechanize-perl (2.10-1) ...
+Setting up libwww-mechanize-perl (2.12-1) ...
-Setting up libxslt1.1:amd64 (1.1.34-4) ...
+Setting up libxslt1.1:amd64 (1.1.35-1) ...
-Setting up libyaml-0-2:amd64 (0.2.2-1) ...
+Setting up libyaml-0-2:amd64 (0.2.5-1) ...
-Setting up plzip (1.10-3) ...
+Setting up plzip (1.10-4) ...
-Setting up python3 (3.10.4-1+b1) ...
+Setting up python3 (3.10.5-3) ...
-Setting up python3-minimal (3.10.4-1+b1) ...
+Setting up python3-minimal (3.10.5-3) ...
-Setting up xsltproc (1.1.34-4) ...
+Setting up xsltproc (1.1.35-1) ...

comment:14 by robe, 20 months ago

I didn't check how dpkg-buildpackage builds the package. Is it using the ./configure --enable-lto flag? That's the only thing I can think of that changed between PostGIS 3.2.2 and PostGIS 3.3.0 that might have caused a difference in answers. At any rate, the difference is so minuscule that it's safe to ignore. It could also be the gcc difference.


comment:15 by Bas Couwenberg, 20 months ago

Confirmed fixed with the changes from 8cff748.

With gcc-12 and without those changes the test still fails on i386.

Disabling LTO also fixes the issue.
