#5365 closed defect (fixed)
interrupt tests fail due to change in error message
Reported by: | strk | Owned by: | robe |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 3.4.0 |
Component: | QA/buildbots | Version: | master |
Keywords: | Cc: | pramsey |
Description
I get these failures, when not disabling interrupt tests:
----------------------------------------------------------------------------- regress/core/interrupt_buffer .. failed (diff expected obtained: /tmp/pgis_reg/test_122_diff) ----------------------------------------------------------------------------- --- /home/src/postgis/postgis/regress/core/interrupt_buffer_expected 2023-02-18 09:11:00.360153532 +0100 +++ /tmp/pgis_reg/test_122_out 2023-04-08 00:00:07.769328100 +0200 @@ -1,3 +1,3 @@ -ERROR: canceling statement due to statement timeout +ERROR: canceling statement due to user request buffer interrupted on time 1|5 -----------------------------------------------------------------------------
And many other similar subtle changes.
Also in regress/core/interrupt_relate and in regress/core/interrupt where the error message is slightly different (liblwgeom does not even try to loop for too long in that case now):
regress/core/interrupt .. failed (diff expected obtained: /tmp/pgis_reg/test_120_diff)
—- /home/src/postgis/postgis/regress/core/interrupt_expected 2023-02-18 09:11:00.360153532 +0100 +++ /tmp/pgis_reg/test_120_out 2023-04-08 00:00:06.593328067 +0200 @@ -1,4 +1,3 @@ -NOTICE: liblwgeom code interrupted -ERROR: canceling statement due to statement timeout +NOTICE: .././../liblwgeom/ptarray.c:449 - ptarray_segmentize2d: Too many segments required (3.605551e+100)
segmentize interrupted on time 1|LINESTRING(0 0,2 0,4 0)
I'm using the "buildbots" component becuase we don't seem to have a "testsuite" component (we probably should). I don't think there's any buildbot currently running these interrupt tests.
Change History (5)
comment:1 by , 20 months ago
comment:2 by , 20 months ago
Cc: | added |
---|
comment:3 by , 20 months ago
I think the key here is "let PostgreSQL deal with the error", meaning IFF the geos error message is an interruption, we should do that, as a geos interruption is supposed to ONLY happen upon request from PostgreSQL, be it a timeout or a user request.
So the ! strstr
condition was correct, in that it ONLY reported the GEOS error message when the GEOS exception was NOT an interrupt request.
What caused #4070 isn't clear to me, maybe pg_cancel_backend() triggers events in some unexpected order.
The statement timeout error comes from PostgreSQL itself:
I think the error changed with commit [17708aa3c2cd5079b2b26f1befad43ec30036aa6/git] when a
! strstr
was changed to astrstr
conditional in this block:The commit referenced #4070
Honestly I don't know if we can distinguish "user request" from "timeout", but I suspect the PostgreSQL error code would know it. For sure the testcase uses a timeout, which may not be the same thing as a "user request" (C).
In any case, that commit changed the error message without updating the tests, which ended up under the carpet when CI failed to consistently succeed with them (due to real-time clock not being predictable with CI).