Opened 14 months ago

Closed 13 months ago

Last modified 12 months ago

#5365 closed defect (fixed)

interrupt tests fail due to change in error message

Reported by: strk Owned by: robe
Priority: medium Milestone: PostGIS 3.4.0
Component: QA/buildbots Version: master
Keywords: Cc: pramsey

Description

I get these failures, when not disabling interrupt tests:

-----------------------------------------------------------------------------
 regress/core/interrupt_buffer .. failed (diff expected obtained: /tmp/pgis_reg/test_122_diff)
-----------------------------------------------------------------------------
--- /home/src/postgis/postgis/regress/core/interrupt_buffer_expected    2023-02-18 09:11:00.360153532 +0100
+++ /tmp/pgis_reg/test_122_out  2023-04-08 00:00:07.769328100 +0200
@@ -1,3 +1,3 @@
-ERROR:  canceling statement due to statement timeout
+ERROR:  canceling statement due to user request
 buffer interrupted on time
 1|5
-----------------------------------------------------------------------------

And many other similar subtle changes.

Also in regress/core/interrupt_relate and in regress/core/interrupt where the error message is slightly different (liblwgeom does not even try to loop for too long in that case now):

regress/core/interrupt .. failed (diff expected obtained: /tmp/pgis_reg/test_120_diff)


—- /home/src/postgis/postgis/regress/core/interrupt_expected 2023-02-18 09:11:00.360153532 +0100 +++ /tmp/pgis_reg/test_120_out 2023-04-08 00:00:06.593328067 +0200 @@ -1,4 +1,3 @@ -NOTICE: liblwgeom code interrupted -ERROR: canceling statement due to statement timeout +NOTICE: .././../liblwgeom/ptarray.c:449 - ptarray_segmentize2d: Too many segments required (3.605551e+100)

segmentize interrupted on time 1|LINESTRING(0 0,2 0,4 0)


I'm using the "buildbots" component becuase we don't seem to have a "testsuite" component (we probably should). I don't think there's any buildbot currently running these interrupt tests.

Change History (5)

comment:1 by strk, 13 months ago

The statement timeout error comes from PostgreSQL itself:

src/backend/tcop/postgres.c:3368: errmsg("canceling statement due to statement timeout")));

I think the error changed with commit [17708aa3c2cd5079b2b26f1befad43ec30036aa6/git] when a ! strstr was changed to a strstr conditional in this block:


/* Return NULL on GEOS error
 *
 * Prints error message only if it was not for interruption, in which
 * case we let PostgreSQL deal with the error.
 */
#define HANDLE_GEOS_ERROR(label) \
  { \
    if (strstr(lwgeom_geos_errmsg, "InterruptedException")) \
      ereport(ERROR, \
        (errcode(ERRCODE_QUERY_CANCELED), errmsg("canceling statement due to user request"))); \
    else \
      lwpgerror("%s: %s", (label), lwgeom_geos_errmsg); \
    PG_RETURN_NULL(); \
  }

The commit referenced #4070

Honestly I don't know if we can distinguish "user request" from "timeout", but I suspect the PostgreSQL error code would know it. For sure the testcase uses a timeout, which may not be the same thing as a "user request" (C).

In any case, that commit changed the error message without updating the tests, which ended up under the carpet when CI failed to consistently succeed with them (due to real-time clock not being predictable with CI).

comment:2 by strk, 13 months ago

Cc: pramsey added

comment:3 by strk, 13 months ago

I think the key here is "let PostgreSQL deal with the error", meaning IFF the geos error message is an interruption, we should do that, as a geos interruption is supposed to ONLY happen upon request from PostgreSQL, be it a timeout or a user request.

So the ! strstr condition was correct, in that it ONLY reported the GEOS error message when the GEOS exception was NOT an interrupt request.

What caused #4070 isn't clear to me, maybe pg_cancel_backend() triggers events in some unexpected order.

comment:4 by Sandro Santilli <strk@…>, 13 months ago

Resolution: fixed
Status: newclosed

In 1d263dc2/git:

Let PostgreSQL write the interruption error message

Closes #5365
Reverts 17708aa3c2

comment:5 by Sandro Santilli <strk@…>, 12 months ago

In 4d851fb/git:

Let PostgreSQL write the interruption error message

References #5365 in stable-3.3 branch (3.3.4dev)

Tweak segmentize interrupt test

Note: See TracTickets for help on using tickets.