Opened 15 years ago

Closed 15 years ago

#210 closed defect (fixed)

segmentation faults in lwgeom_geos.c:pgis_union_geometry_array

Reported by: dfuhriman Owned by: robe
Priority: high Milestone: PostGIS 1.4.1
Component: postgis Version: master
Keywords: Cc:

Description

Using SVN at revision r4190 with GEOS 3.1.0 and PROJ 4.6.0

In a couple different cases, I've come across a bug that causes a segmentation fault in the server. It seems to happen when a NULL geometry is passed into the ST_Union(function), but I could be wrong about that.

The following query should trip the bug when using the attached data. Replacing the LEFT OUTER JOIN with a simple JOIN works fine (most likely because there is no intersection) between the geometries. FWIW, this query does work with 1.3.6 (built, I believe, against the 3.0.x GEOS line, but also using PROJ 4.6.0).

SELECT T1.gid as taxlot_id,
       coalesce(sqft_of(ST_Union(B2.the_geom)),
                sqft_of(ST_Union(ST_Intersection(T1.the_geom,B1.the_geom)))
                ) as roof_area_sqft
  INTO TABLE example
  FROM test.T1 as T1
      --  JOIN test.B2 as B2 on (st_within(B2.the_geom_centroids,T1.the_geom))
      -- JOIN test.B1 as B1 on (st_intersects(T1.the_geom, B1.the_geom))
        LEFT OUTER JOIN test.B2 as B2 on (st_within(B2.the_geom_centroids,T1.the_geom))
        LEFT OUTER JOIN test.B1  as B1 on (st_intersects(T1.the_geom, B1.the_geom))
        where T1.gid = 1
       GROUP by T1.gid;

the sqft_of() function is just a convenience function defined as:

create or replace function sqft_of(geometry) returns integer as $$
  select
     CASE 
       WHEN  cast(round(ST_Area(ST_Transform($1,102008))*10.76391) as integer) > 0 
         THEN cast(round(ST_Area(ST_Transform($1,102008))*10.76391) as integer)
         ELSE NULL 
       END
  $$ language 'sql' STRICT IMMUTABLE;

(I'm a bit confused about why it's even getting called at all since it's defined as STRICT, and it does seem to get a NULL geometry, but that's a different question.)

Debugging output follows:

NOTICE:  [lwgeom_estimate.c:LWGEOM_gist_joinsel:198] LWGEOM_gist_joinsel called with jointype 1
NOTICE:  LWGEOM_gist_joinsel called with incorrect join type
NOTICE:  [lwgeom_estimate.c:LWGEOM_gist_joinsel:198] LWGEOM_gist_joinsel called with jointype 1
NOTICE:  LWGEOM_gist_joinsel called with incorrect join type
NOTICE:  [lwgeom_gist.c:LWGEOM_overlap:98] GIST: LWGEOM_overlap --entry
NOTICE:  [lwgeom_api.c:getbox2d_p:423] getbox2d_p call
NOTICE:  [lwgeom_api.c:getbox2d_p:430] getbox2d_p: has box
NOTICE:  [lwgeom_api.c:getbox2d_p:423] getbox2d_p call
NOTICE:  [lwgeom_api.c:getbox2d_p:430] getbox2d_p: has box
NOTICE:  [lwgeom_gist.c:LWGEOM_overlap:126] GIST: lwgeom_overlap:
(-122.667847 45.497169, -122.661789 45.497169) (-122.642586 45.457573 -122.640732 45.457573) = 0
NOTICE:  [lwgeom_gist.c:LWGEOM_overlap:98] GIST: LWGEOM_overlap --entry
NOTICE:  [lwgeom_api.c:getbox2d_p:423] getbox2d_p call
NOTICE:  [lwgeom_api.c:getbox2d_p:435] getbox2d_p: has no box - computing
NOTICE:  [lwgeom_api.c:lwgeom_getType:853] lwgeom_getType 65
NOTICE:  [lwgeom_api.c:compute_serialized_box3d:1669] compute_serialized_box3d called on type 1
NOTICE:  [lwpoint.c:lwpoint_deserialize:253] lwpoint_deserialize called
NOTICE:  [lwgeom_api.c:lwgeom_getType:853] lwgeom_getType 65
NOTICE:  [lwpoint.c:lwpoint_deserialize:284] lwpoint_deserialize: input has SRID
NOTICE:  [lwgeom_api.c:pointArray_construct:788] pointArray_construct called.
NOTICE:  [lwgeom_api.c:pointArray_construct:798] pointArray_construct returning 0x33c4e10
NOTICE:  [lwgeom_api.c:compute_serialized_box3d:1687] compute_serialized_box3d: lwpoint deserialized
NOTICE:  [lwpoint.c:lwpoint_compute_box3d:98] lwpoint_compute_box3d called with point 0x33c4de0
NOTICE:  [lwpoint.c:lwpoint_compute_box3d:107] lwpoint_compute_box3d returning ptarray_compute_box3d return
NOTICE:  [ptarray.c:ptarray_compute_box3d_p:408] ptarray_compute_box3d call (array has 1 points)
NOTICE:  [lwgeom_api.c:getPoint3dz_p:582] getPoint3dz_p called on array of 2-dimensions / 1 pts
NOTICE:  [lwgeom_api.c:pointArray_ptsize:811] pointArray_ptsize: TYPE_NDIMS(pa->dims)=2
NOTICE:  [ptarray.c:ptarray_compute_box3d_p:414] got point 0
NOTICE:  [ptarray.c:ptarray_compute_box3d_p:432] scanning other 1 points
NOTICE:  [ptarray.c:ptarray_compute_box3d_p:449] returning box
NOTICE:  [lwgeom_api.c:compute_serialized_box3d:1691] compute_serialized_box3d: bbox found
NOTICE:  [lwgeom_api.c:getbox2d_p:440] getbox2d_p: compute_serialized_box3d returned 0x33816d0
NOTICE:  [lwgeom_api.c:getbox2d_p:444] getbox2d_p: box3d converted to box2d
NOTICE:  [lwgeom_api.c:getbox2d_p:423] getbox2d_p call
NOTICE:  [lwgeom_api.c:getbox2d_p:430] getbox2d_p: has box
NOTICE:  [lwgeom_gist.c:LWGEOM_overlap:126] GIST: lwgeom_overlap:
(-122.687775 45.531441, -122.687767 45.531441) (-122.667847 45.497169 -122.661789 45.497169) = 0
NOTICE:  [lwgeom_geos.c:pgis_union_geometry_array:119] GEOS incremental union (call 2)
NOTICE:  [lwgeom_geos.c:pgis_union_geometry_array:131] unite_garray: number of elements: 1
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

Attachments (6)

example.sql.gz (13.9 KB ) - added by dfuhriman 15 years ago.
postgis-nullunion.patch (10.6 KB ) - added by mcayland 15 years ago.
postgis-nullunion-v2.patch (10.5 KB ) - added by mcayland 15 years ago.
postgis-nullunion-v3.patch (10.8 KB ) - added by mcayland 15 years ago.
badapples.zip (160.3 KB ) - added by robe 15 years ago.
example with some invalid geometries
postgis-nullfix.patch (19.8 KB ) - added by mcayland 15 years ago.

Download all attachments as: .zip

Change History (47)

by dfuhriman, 15 years ago

Attachment: example.sql.gz added

comment:1 by dfuhriman, 15 years ago

Version: trunk

comment:2 by robe, 15 years ago

Milestone: postgis 1.4.0postgis 1.3.7

I think this is the same as this bug #179

Sadly I haven't had any success replicating the behavior consistently. Its a long standing issue.

Which version of PostgreSQL are you running by the way?

in reply to:  2 comment:3 by dfuhriman, 15 years ago

Replying to robe:

I think this is the same as this bug #179

Sadly I haven't had any success replicating the behavior consistently. Its a long standing issue.

Hopefully the attached data will help, though like I said, this particular case seems to be fine when using 1.3.6, it was only the update to 1.4.0SVN that caused the problem (possibly because of the new union available in GEOS 3.1?)

Which version of PostgreSQL are you running by the way?

8.3.7 on a CentOS 5 box, x86_64 architecture.

comment:4 by robe, 15 years ago

Testing this on "POSTGIS="1.4.0b1" GEOS="3.1.1-CAPI-1.6.0" PROJ="Rel. 4.6.1, 21 August 2008" USE_STATS" "PostgreSQL 8.3.0, compiled by Visual C++ build 1400"

I got an error because I don't have that SRID in my table you are using. I also got another error that I couldn't replicate something about SRID -1 (can't used mixed SRIDs), but haven't been able repeat it.

So I changed your statement to

SELECT T1.gid as taxlot_id,
       coalesce(st_numgeometries(ST_Union(B2.the_geom)),
                st_numgeometries(ST_Union(ST_Intersection(T1.the_geom,B1.the_geom)))
                ) as roof_area_sqft
  INTO TABLE example2
  FROM test.T1 as T1
      --  JOIN test.B2 as B2 on (st_within(B2.the_geom_centroids,T1.the_geom))
      -- JOIN test.B1 as B1 on (st_intersects(T1.the_geom, B1.the_geom))
        LEFT OUTER JOIN test.B2 as B2 on (st_within(B2.the_geom_centroids,T1.the_geom))
        LEFT OUTER JOIN test.B1  as B1 on (st_intersects(T1.the_geom, B1.the_geom))
        where T1.gid = 1
       GROUP by T1.gid;

Didn't give me an error. Just gives the expected notice NOTICE: LWGEOM_gist_joinsel called with incorrect join type NOTICE: LWGEOM_gist_joinsel called with incorrect join type Query returned successfully with no result in 15 ms.

So like I said I think this is the same issue with ST_MakeLine some strange business going on with how we deal with arrays rather than a GEOS bug. It might be the ordering of the union in new geos makes it hit on your machine for this and maybe if you tried it again it may actually work.

comment:5 by dfuhriman, 15 years ago

Hmm… the SRID -1 is something I saw, too, but couldn't quite figure out, and like you said, doesn't seem to happen consistently. What really strange to me is that sqft_of is being called at all, since it is defined as STRICT and the Intersection on B1 or the Union on B2 do not intersect.

I *think* it's because tranform() was operating on a NULL geometry, unless for some reason union or intersect were returning something with the wrong SRID, which seems unlikely. Unless there's something magic about being called from within a coalesce.

I started simplifying things a bit, and now it seems to only happen on about every third call. When it *does* succeed, the result of st_numgeometries() is NULL (which is what I would expect).

Here's the call I'm using:

          SELECT T1.gid as gid,
                 st_numgeometries(ST_Union(B2.the_geom)) as sqft
            INTO TABLE example
            FROM test.T1 as T1
                  LEFT OUTER JOIN test.B2 as B2 on (st_within(B2.the_geom_centroids,T1.the_geom))
                  where T1.gid = 1
                 GROUP by T1.gid;

Successful calls show this in the log:

NOTICE:  [lwgeom_gist.c:LWGEOM_overlap:126] GIST: lwgeom_overlap:
(-122.687775 45.531441, -122.687767 45.531441) (-122.667847 45.497169 -122.661789 45.497169) = 0
NOTICE:  [lwgeom_geos.c:pgis_union_geometry_array:119] GEOS incremental union (call 2)
NOTICE:  [lwgeom_geos.c:pgis_union_geometry_array:131] unite_garray: number of elements: 1
NOTICE:  [lwgeom_api.c:lwgeom_getType:853] lwgeom_getType 0
SELECT

Unsuccessful attempts show the same error as the previous log.

I maybe able to attach a debugger to the server process and see what I can see, though my profound lack of knowledge on the codebase may make that somewhat less useful. :)

comment:6 by robe, 15 years ago

Yap this is definitely a heisenbug of some sort. I think it is the NULL that causes the -1 SRID. What was especially strange is I got it without changing your function and was on the first test. Then the second call it complained about the SRID 102… not being registered in spatial_ref_sys. and each subsequent time complained about the SRID until I just got rid of your sqft from the equation. So seems each call is taking a slightly different path.

I have been able to crash on 8.3.7 but never on 8.3.0.

comment:7 by dfuhriman, 15 years ago

OK, after playing with it yet more, I can reliably get a segfault every third time I run just this select:

create table foo (id serial primary key);
select addGeometryColumn('', 'foo', 'the_geom', 4326,'MULTIPOLYGON', 2);
insert into foo (the_geom) VALUES (NULL);
select  st_numgeometries(ST_Union(the_geom)) from foo;

I played with the debugger a bit, although I can't say I really know how to interpret what I'm seeing, not know the guts of postgres and postgis.

But for instance, I ran the above select three times.

pgis_union_geometry_array returns to finalize_aggregate (in postgres code), which calls datumCopy on the returned value. dataCopy in turn calls datumGetSize, which on the first two tries always returns 0, which seems right (the geom being null and all), but on the third go through, datumGetSize returns a wildly wrong number of 90135632, where upon the SEGV happens on the immediately following palloc().

The strange thing is, it doesn't just *seem* to happen once every three times, the SEGV happens *every* third time I run it.

Anyway, I hope this helps you guys track it down…

comment:8 by pramsey, 15 years ago

Thanks for stripping this down into such a small package. Unfortunately I can't reproduce on 32-bit OS/X. I'll try at home on a 32-bit Linux shortly.

comment:9 by robe, 15 years ago

I'm somewhat convinced that this and #179 if not manifestations of the same disease are in the same family.

Just to test — can everyone try the following two examples. Very watered down. Of course don't do this on a production system unless you don't mind it going down.

SELECT ST_Union(ARRAY[NULL,NULL,NULL,NULL]) ;
SELECT ST_MakeLine(ARRAY[NULL,NULL,NULL,NULL]) ;


In the above cases on my 8.4RC1 windows 32-bit install running PostGIS 1.4b1, GEOS 3.1.1

I get these messages: Union says: ERROR: Unknown geometry type: 0

Make Line says: ERROR: Unknown geometry type: 0 and promptly kills my backend

Note that they both give the same errors and MakeLine does not use GEOS at all. Though the makeline seems to be a more consistent killing machine when it kills but not consistently on all installs.

I ran the same a couple of times on my PostgreSQL 8.3.7 windows 32-bit, PostGIS 1.4b1, GEOS 3.1.1 Union says: ERROR: Unknown geometry type: 0

Make Line says: Returns NULL as expected and gives no error.

comment:10 by robe, 15 years ago

BTW Dfuhriman's example always works fine for me under 8.4RC1, windows 32-bit, PostGIS 1.4b, GEOS 3.1.1

in reply to:  9 comment:11 by dfuhriman, 15 years ago

Replying to robe:

The first always works, the latter always fails.

=# SELECT ST_Union(ARRAY[NULL,NULL,NULL,NULL]) ;
NOTICE:  [lwgeom_geos.c:pgis_union_geometry_array:119] GEOS incremental union (call 2)
NOTICE:  [lwgeom_geos.c:pgis_union_geometry_array:131] unite_garray: number of elements: 4
NOTICE:  [lwgeom_geos.c:pgis_union_geometry_array:247] geom 0 @ 0x157ddab0
NOTICE:  [lwgeom_api.c:lwgeom_getType:853] lwgeom_getType 0
NOTICE:  [lwgeom.c:lwgeom_deserialize:26] lwgeom_deserialize got 0 - Unknown
ERROR:  Unknown geometry type: 0

and

# SELECT ST_MakeLine(ARRAY[NULL,NULL,NULL,NULL]) ;
NOTICE:  [lwgeom_functions_basic.c:LWGEOM_makeline_garray:2133] LWGEOM_makeline_garray called.
NOTICE:  [lwgeom_functions_basic.c:LWGEOM_makeline_garray:2148] LWGEOM_makeline_garray: array detoasted
NOTICE:  [lwgeom_functions_basic.c:LWGEOM_makeline_garray:2153] LWGEOM_makeline_garray: array has 4 elements
server closed the connection unexpectedly

8.3.7, 64-bit, Centos 5, 1.4.0SVN

comment:12 by mcayland, 15 years ago

FWIW I can reproduce this here on 8.3.5/Debian Lenny/x86_64 1.4.0SVN, but haven't had a chance to dig into this yet. Will try and dig a bit deeper over the next few days…

ATB,

Mark.

comment:13 by robe, 15 years ago

Owner: changed from pramsey to robe
Status: newassigned

Mark when you say reproduce, you mean just the orginal example or also the

ST_MakeLine one. For the ST_MakeLine I can make it consistently crash on my 8.3.7 EL 5 64-bit install. That was actually how we noticed the issue originally — when we switched from 32-bit EL 5 to 64-bit EL 5 was when the ST_MakeLine started crashing.

But since I can produce that on 32-bit windows, I assume that is not strictly a 64-bit issue though 64-bit may be more sensitive.

SELECT ST_Union(ARRAY[NULL,NULL,NULL,NULL]) ;

That seems to behave consistently across of just giving the ERROR: Unknown geometry type: 0

To me this is a bug. We should never be getting an unknown type error. That should just return NULL like the ST_MakeLine does when its not crashing the server.

comment:14 by mcayland, 15 years ago

Okay - I've worked out what's going on here. What is happening is that the code used to implement the aggregates does not handle NULLs correctly :(

The issue is that NULLs in PostgreSQL arrays are stored in an out-of-array bitmap and so unless you check the bitmap to find out which elements to skip, you go outside of the allocated data area since NULLs *are* included in the element count.

Please find attached a work-in-progress patch against 1.4 that makes ST_Union() NULL safe - can everyone test and make sure I didn't break anything? If it works, I'll do the same to the other aggregates, including ST_MakeLine() which was the cause of one of Regina's failures.

With the attached patch applied I can now do this:

postgis=# select st_numgeometries(ST_Union(the_geom)) from foo;

st_numgeometries


(1 row) postgis=# SELECT ST_Union(ARRAY[NULL,NULL,NULL,NULL]);

st_union


(1 row)

HTH,

Mark.

by mcayland, 15 years ago

Attachment: postgis-nullunion.patch added

comment:15 by robe, 15 years ago

Mark,

Still not quite right.

I can do this on PostgreSQL 8.4 now

SELECt ST_Union(ARRAY[null,null,null,null]) and I correctly get null back.

If I do select ST_Union(Array[ST_GeomFromText('POINT(1 2)'),null,null,null]);

or SELECT ST_Union(the_geom) FROM (VALUES (ST_GeomFromText('POINT(1 2)') ), (null), (null) ) As foo(the_geom)

I still get the same error of:

ERROR: Unknown geometry type: 0

comment:16 by robe, 15 years ago

Mark just to verify I guess you didn't touch the make line at all since you are waiting to make sure ST_Union is perfect before copying the same logic?

SELECt ST_MakeLine(Array[null,null, null, null]);

still crashes on my 8.4 install.

comment:17 by mcayland, 15 years ago

Yeah, I had a feeling I may have missed a few combinations of NULL/non-NULLs since the code has quite few different paths. Thanks for the test case - I'll use these to work out what's going wrong, and work up a new version of the patch.

And yes - I haven't touched any of the other aggregates yet, but the principle should be broadly the same.

ATB,

Mark.

by mcayland, 15 years ago

Attachment: postgis-nullunion-v2.patch added

comment:18 by mcayland, 15 years ago

Okay - looks like the continue from the old code was causing the bitmask to not be updated. Can you try the new v2 patch please? Also, can you confirm that you get the same results from ST_Union() on a normal dataset without NULLs both pre and post patch? Just in case I did manage to break something else at the same time… ;)

Many thanks,

Mark.

comment:19 by pramsey, 15 years ago

Patch v2 works for me (not that OS/X exposed many of these bugs).

comment:20 by pramsey, 15 years ago

Oh, and things still seem to work OK.

postgis14=# select sum(area(the_geom)) from counties;
       sum        
------------------
 1096.08192142873
(1 row)

postgis14=# select area(st_union(the_geom)) from counties;
       area       
------------------
 1096.08192142873
(1 row)

postgis14=# insert into counties (the_geom) values (NULL);
INSERT 0 1
postgis14=# select area(st_union(the_geom)) from counties;
       area       
------------------
 1096.08192142873
(1 row)

comment:21 by mcayland, 15 years ago

Thanks Paul - I'll work up the patch to solve the problem for the remaining aggregates.

ATB,

Mark.

comment:22 by robe, 15 years ago

Mark,

Its better but, I think its probably still flawed. At this point I'm willing to accept this though since its better than what we had before and you have fixed the most offensive of corner cases that causes the crash — though I'm testing on OpenSUSE and OpenSUSE only crashes on the MakeLine one which you have yet to fix. That one I find more serious because it even crashes my 1.3.7 installs and I have had to work around that one.

But since we are talking about the flawedness of the ST_Union patch. To demonstrate:

SELECT ST_Union(the_geom) FROM (SELECT CASE WHEN i%4 = 0 THEN ST_SetSRID(ST_Point(i,j),4326) ELSE NULL END As the_geom FROM generate_series(1,20) As i CROSS JOIN generate_series(-30, 40) As j) As foo;

I get

—-ERROR: Operation on mixed SRID geometries (but this has always been the case since the beginning of time but is still related to this bug I beleive). This is a long standing bug.

Vs.

SELECT ST_Union(ARRAY[ST_SetSRID(ST_Point(10,40), 4326), NULL])

This one works fine.

vs.

SELECT ST_Union(ARRAY[NULL,NULL,ST_SetSRID(ST_Point(10,40), 4326), NULL])

Does not work fine: ERROR: Operation on mixed SRID geometries

Why should one work and the other not?

comment:23 by robe, 15 years ago

More update. This does not return same result as my old install.

I tested on my windows 1.3.6 build against 8.3.7 and this

SELECT ST_AsText(ST_Union(the_geom)) FROM (SELECT CASE WHEN i%4 = 0 THEN ST_SetSRID(ST_Point(i,j),4326) ELSE NULL END As the_geom FROM generate_series(1,20) As i CROSS JOIN generate_series(-30, 40) As j) As foo;

Returns: As I would expect — a MULTIPOINT

— On my PostgreSQL 8.4 Windows 1.4b install strangely it also returns that mixed SRID funky error. But to make sure I was testing a newer version I did retest this on my 1.4b

SELECT ST_Union(the_geom) FROM (VALUES (ST_GeomFromText('POINT(1 2)') ), (null), (null) ) As foo(the_geom)

and it returns the geometry type 0 not defined thing but on your new patch (my OpenSUSE patched - it works)

So the fact that the below is inconsistent between 1.3 and 1.4 means sadly :( more work to be done here.

SELECT ST_AsText(ST_Union(the_geom)) FROM (SELECT CASE WHEN i%4 = 0 THEN ST_SetSRID(ST_Point(i,j),4326) ELSE NULL END As the_geom FROM generate_series(1,20) As i CROSS JOIN generate_series(-30, 40) As j) As foo;

I'm sure I have made this behave irradically on a 1.3 beofre. Perhaps it was a 64-bit and I don't have a non-production one to test with.

by mcayland, 15 years ago

Attachment: postgis-nullunion-v3.patch added

in reply to:  22 comment:24 by mcayland, 15 years ago

Replying to robe:

Regina,

Thanks for the test cases - they were all down to a small change in the SRID detection logic which should now be resolved in v3 of the patch.

Incidentally, the patch also solves the issue whereby the SRID check was not being done in the new CascadedUnion() code - this should now be consistent for both the cascaded/non-casaded versions of the code.

ATB,

Mark.

in reply to:  23 comment:25 by mcayland, 15 years ago

Replying to robe:

More update. This does not return same result as my old install.

Yes, this is correct. In the old code, PostGIS would run off the end of the array for each NULL row encountered by the aggregate. Now depending on what was previously in the array, this would either point to a random geometry from elsewhere in the query, or otherwise to a random part of memory causing a segfault.

So in short, I would expect any query containing NULLs that doesn't crash to return a wrong answer in 1.3 - however what is important is that queries that don't contain NULLs still return the same answer for both 1.3 and 1.4 with this patch.

HTH,

Mark.

comment:26 by robe, 15 years ago

Well I compared my torture test before and after the patch and got the same exact answer so I assume your desire to get the same answer for non-nulls as Paul affirmed is correct.

Okay good I thought I was imagining things for a moment that I had sometimes seen this behavior in 1.3. So I guess we are just seeing it more now because Paul exposed the carpet a little bit and the bugs started flying out. Sorry for the gross graphic, but couldn't resist it. :)

comment:27 by robe, 15 years ago

Mark,

Sorry to do this to you again. Just when I thought I was out of ideas on how to trip you up. The above test works fine now.

I have this dataset I know is bad — its got invalid geometries. Never been able to union it — because it always throws topological errors.

In the 1.4b1 build — for this SELECT ST_NumGeometries(ST_Union(the_geom)) from neighborhoods;

I get this thoughtful message NOTICE: TopologyException: found non-noded intersection between 779062 2.92623e+06, 779069 2.92622e+06 and 779072 2.92622e+06, 778970 2.92625e+06 779068 2.92622e+06

ERROR: Union returned a NULL geometry.

Error

ERROR: Union returned a NULL geometry. SQL state: XX000

In your current incarnation — I just get a null back - no explanation what so ever. It does work of course in both cases if I exclude the invalid geometries.

comment:28 by robe, 15 years ago

Mark,

Sorry I misspoke a little bit. I think it may be fine. I do get a notice but no error. So I get a notice that says

NOTICE: TopologyException: found non-noded intersection between 779062 2.92623e+06, 779069 2.92622e+06 and 779072 2.92622e+06, 778970 2.92625e+06 779068 2.92622e+06 (but forgot to switch to the notice tab of PgAdmin to see)

In the old model on 1.3 I would get a notice saying GEOS topological exception accompanied with a ERROR: GEOS union() threw an error!

In the PostGIS 1.4b1 I would get an error saying returned null geometry in addition to the topological notice. In this patch I just get the topological notice — no error which hmm in some cases I like better.

So I guess I'm not sure what to consider this. It is slightly different behavior, but perhaps acceptable.

I'll attach the file in question.

by robe, 15 years ago

Attachment: badapples.zip added

example with some invalid geometries

in reply to:  28 comment:29 by mcayland, 15 years ago

Replying to robe:

Mark,

Sorry I misspoke a little bit. I think it may be fine. I do get a notice but no error. So I get a notice that says

NOTICE: TopologyException: found non-noded intersection between 779062 2.92623e+06, 779069 2.92622e+06 and 779072 2.92622e+06, 778970 2.92625e+06 779068 2.92622e+06 (but forgot to switch to the notice tab of PgAdmin to see)

In the old model on 1.3 I would get a notice saying GEOS topological exception accompanied with a ERROR: GEOS union() threw an error!

Hmmm. Well this bit I didn't change in the patch attached here…

In the PostGIS 1.4b1 I would get an error saying returned null geometry in addition to the topological notice. In this patch I just get the topological notice — no error which hmm in some cases I like better.

So I guess I'm not sure what to consider this. It is slightly different behavior, but perhaps acceptable.

Yeah I had to change this. The old code assumed that ST_Union() returning NULL was an error, but of course NULL is actually valid return value when you perform f(NULL) on a set of NULLs…

ATB,

Mark.

comment:30 by robe, 15 years ago

Okay so the Geos threw an Error being missing is probably an artifact of the new cascading union feature —- should we consider that a bug? Its different behavior from 1.3

The 1.3 I was testing on is also using GEOS 3.1.1 so I think we can rule GEOS out as the culprit. I'm okay to let this slide if everyone else is.

comment:31 by pramsey, 15 years ago

Sounds Good Enough to me.

by mcayland, 15 years ago

Attachment: postgis-nullfix.patch added

comment:32 by mcayland, 15 years ago

Okay - here is a reasonably complete patch which should fix the following aggregates/functions in terms of NULLs:

ST_Union() / ST_Union( array ) ST_Collect() / ST_Collect ( array ) ST_Polygonize() / ST_Polygonize ( array ) ST_MakeLine() / ST_MakeLine ( array )

Regina/Paul, can you test at your end? I've done some quick tests here, however I'm not 100% confident enough to commit yet. Basically inputs with no nulls should match that from 1.3, while inputs with varying combinations of NULLs (and on a table with a column containing NULLs as this is a different code path) should just now return NULL.

ATB,

Mark.

comment:33 by robe, 15 years ago

Mr,

I assume I misread you here. So we are saying something like ST_Agg(allnulls) should return NULL now rather than crash, but

ST_Agg(somenulls) may or may not return NULL? e.g. in case of makeline a NULL, point would return null, but in the case of union Null, Point would return the point?

Well I assume that is the behavior people expect. Just want to verify since your comment suggested you would return NULL if any combination of NULLS is input.

comment:34 by pramsey, 15 years ago

All tests checkout on my end (OS/X), Mark. But I'd wait until the Marquesse de Sade of testing finishes working over your patch. :)

comment:35 by robe, 15 years ago

Mark,

I'm happy with this one for at least an RC. I guess I misread your message thank goodness. I get same results for old as I did before and my all null make line no longer crashes.

There is one example that is different from the old - and the error is not because of this patch, but of course because of the error checking we are doing. I personally think both the 1.3 and the 1.4 answers are wrong in this case, but as to whether the 1.4 answer is wrong is debatable.

If you call ST_MakeLine on a single point which could happen if you have a 1 row table.

In 1.3 — you get a magical one pointed line: SELECT ST_AsText(ST_MakeLine(ST_GeomFromTExt('POINT(1 1)'))) ;

LINESTRING(1 1)

yes I thought it was very cool too. Many budding geometry Frankensteins will be unhappy they can no longer pull this off.

In 1.4 you get ERROR: geometry requires more points

I personally think if you have one point in a make line — it should just return NULL, but I'm not strongly feeling about this except for the fact it will break code in cases where only one record has been put in.

Cool I can still create a line with identical points which is same behavior as 1.3.

SELECT ST_IsValid(ST_MakeLine(the_geom)) FROM (VALUES (ST_GeomFromTExt('POINT(1 1)')), (ST_GeomFromTExt('POINT(1 1)') ) ) As foo(the_geom) ;

I think this patch may be a winner so go ahead and commit it if Paul is okay.

comment:36 by pramsey, 15 years ago

Resolution: fixed
Status: assignedclosed

Applied v4 patch to trunk (r4238) and 1.4 (r4237).

comment:37 by robe, 15 years ago

Milestone: postgis 1.3.7postgis 1.4.1
Priority: mediumhigh
Resolution: fixed
Status: closedreopened

I have to retest this on 1.4, but I had changed the garden test to test for nulls, and it managed to crash server with this. The annoying thing is then I couldn't replicate it without performing the same execise 5 times.

SELECT ST_AsEWKT(ST_Collect(foo1.the_geom))

FROM ((SELECT CAST(Null As geometry) As the_geom)) As foo1 LIMIT 3;

Anyrate it seems that this is still an issue

comment:38 by mcayland, 15 years ago

Yuck. There was another non-NULL safe array iteration within LWGEOM_collect_garray which should now be fixed as of r4603 for 1.4 branch and r4604 for trunk. Please test and confirm that the bug disappears for you.

ATB,

Mark.

comment:39 by robe, 15 years ago

Okay that seems to work for me on trunk. Still have to test 1.4

comment:40 by mcayland, 15 years ago

I think this should be okay, as it's exactly the same patch applied to both. Feel free to close when you're happy with it.

ATB,

Mark.

comment:41 by robe, 15 years ago

Resolution: fixed
Status: reopenedclosed

Okay this one seems to work fine on my 1.4.1svn install too. Something is bugging me though. Wasn't Nicklas still able to make this break with ST_MakeLine (I mean before this patch, and I can't recreate that) in my tests. Perhaps I was mistaken. will have to check the other bug ticket.

Note: See TracTickets for help on using tickets.