Opened 6 years ago

Closed 7 months ago

#2529 closed defect (wontfix)

LWGEOM API for read-only PointArrays is not enforced

Reported by: marcelosoaressouza Owned by: pramsey
Priority: blocker Milestone: PostGIS 3.0.0
Component: postgis Version: master
Keywords: ST_FlipCoordinates, PostGIS, PostgreSQL Cc:

Description

Using Debian GNU/Linux 7.2.0 (amd64), PostgreSQL 9.3.1 and PostGIS 2.2.0dev r12082 (GEOS 3.3.3 and GDAL 1.10.1).

I've created a test Database with PostGIS extension and inserted two shapes (Polygon). Then i call the ST_FlipCoordinates function.

"SELECT id, ST_AsEWKT(ST_FlipCoordinates(shape)) AS ST_FlipCoordinates, ST_AsEWKT(shape) AS Shape FROM pol;"

And i got strange result, ST_FlipCoordinates, a IMMUTABLE function, changed some data in database.

* In the First execution i got

ID 1 (It's Okey)

ST_FlipCoordinates(shape): "SRID=4674;POLYGON((-50.09765625 -2.63578857416661,-59.4140625 -15.4536802243458,-28.30078125 -15.2841851140764,-29.53125 3.68885514314705,-50.09765625 -2.63578857416661))"

shape: "SRID=4674;POLYGON((-2.63578857416661 -50.09765625,-15.4536802243458 -59.4140625,-15.2841851140764 -28.30078125,3.68885514314705 -29.53125,-2.63578857416661 -50.09765625))"

ID 2 (Wrong Result)

ST_FlipCoordinates(shape): "SRID=4674;POLYGON((-8.75479470243561 -24.78515625,-23.2413461023861 -26.015625,-17.4764321971955 -17.578125,-11.5230875068685 -17.2265625,-8.40716816360108 -23.73046875,-8.75479470243561 -24.78515625))"

shape: "SRID=4674;POLYGON((-8.75479470243561 -24.78515625,-23.2413461023861 -26.015625,-17.4764321971955 -17.578125,-11.5230875068685 -17.2265625,-8.40716816360108 -23.73046875,-8.75479470243561 -24.78515625))"

* Second execution i got

ID 1 (Continues Normal)

ST_FlipCoordinates(shape): "SRID=4674;POLYGON((-50.09765625 -2.63578857416661,-59.4140625 -15.4536802243458,-28.30078125 -15.2841851140764,-29.53125 3.68885514314705,-50.09765625 -2.63578857416661))"

shape: "SRID=4674;POLYGON((-2.63578857416661 -50.09765625,-15.4536802243458 -59.4140625,-15.2841851140764 -28.30078125,3.68885514314705 -29.53125,-2.63578857416661 -50.09765625))"

ID 2 (Data Changed permanently and Flipped all data)

ST_FlipCoordinates(shape):"SRID=4674;POLYGON((-24.78515625 -8.75479470243561,-26.015625 -23.2413461023861,-17.578125 -17.4764321971955,-17.2265625 -11.5230875068685,-23.73046875 -8.40716816360108,-24.78515625 -8.75479470243561))"

shape: "SRID=4674;POLYGON((-24.78515625 -8.75479470243561,-26.015625 -23.2413461023861,-17.578125 -17.4764321971955,-17.2265625 -11.5230875068685,-23.73046875 -8.40716816360108,-24.78515625 -8.75479470243561))"

I've tested in PostgreSQL 9.1 and 9.2, with PostGIS 2.0.4 and 2.1.0 too.

Attachments (3)

SQL_Script_ST_FlipCoordinates_Data_Problem.txt (717 bytes) - added by marcelosoaressouza 6 years ago.
SQL Script to Create Database and Insert Shapes
PostgreSQL_Log_ST_FlipCoordinates_Problema.txt (8.3 KB) - added by marcelosoaressouza 6 years ago.
PostgreSQL Log from Execution of ST_FlipCoordinates
ptarray_rw.patch (605 bytes) - added by pramsey 6 years ago.
Guard against writing into core memory

Download all attachments as: .zip

Change History (24)

Changed 6 years ago by marcelosoaressouza

SQL Script to Create Database and Insert Shapes

Changed 6 years ago by marcelosoaressouza

PostgreSQL Log from Execution of ST_FlipCoordinates

comment:1 Changed 6 years ago by robe

Milestone: PostGIS 2.0.5

should be patched in 2.0.5 and hopefully also 2.1.1 (or 2.1.2)

Changed 6 years ago by pramsey

Attachment: ptarray_rw.patch added

Guard against writing into core memory

comment:2 Changed 6 years ago by pramsey

The actual problem is relatively shallow: the flip points code write into the database core memory, using the ptarray_set_point4d to directly manipulate the underlying coordinate storage. Point arrays since 2.0 are outfitted with a flag that provides their read/write status: point arrays that sit on top of core storage are supposed to be read only. The flip coordinates code line ignores that, and we get this bug.

However, if you add in a guard, as in the ptarray_rw_patch and run regression, you find that lots of other functions, important core functions that have been around forever and a day, also do the same thing! They write into core memory. The difference seems to be that they use PG_DETOAST_DATUM_COPY instead of PG_DETOAST_DATUM when they read their inputs. So they get a copy that they can merrily screw around with.

However, this is where we get into an inconsistency. The point arrays read/write flag indicates not only that its OK to alter values in the underlying storage memory, it also indicates then when freed, the point array should free the underlying storage too. This is incompatible with point arrays built on top of database allocated segments. As *both* the PG_DETOAST_DATUM_COPY and PG_DETOAST_DATUM geometries are.

The "correct" fix, which should be applied to 2.2+, would be to update all the memory-altering functions to use PG_DETOAST_DATUM and do an explicit lwgeom_clone_deep call before they start altering their geometries. Or, that functions like lwgeom_transform and lwgeom_flip_coordinates be changed to create new geometries as outputs, not to alter their inputs. (This would also allow us to declare their inputs const, and avoid side-effects, always(?) a good thing)

The interim fix is just to make flip_coordinates use PG_DETOAST_DATUM_COPY instead of PG_DETOAST_DATUM.

comment:3 Changed 6 years ago by pramsey

Milestone: PostGIS 2.0.5PostGIS 2.2.0

Patched in 2.0 at r12089, 2.1 at r12090, trunk at r12091.

API fix needs to be done in trunk still.

comment:4 Changed 6 years ago by pramsey

Summary: Weird behavior in ST_FlipCoordinatesLWGEOM API for read-only PointArrays is not enforced

comment:5 Changed 6 years ago by marcelosoaressouza

Resolution: fixed
Status: newclosed

It worked perfectly! The IBAMA and HexGis? team thanks you immensely

comment:6 Changed 6 years ago by pramsey

Resolution: fixed
Status: closedreopened

Am keeping this open for the API change req'd in 2.2+

comment:7 Changed 6 years ago by robe

We should probably open a separate ticket for that so its easier to log and mark this as a 2.0.5, 2.1.1 change

comment:8 Changed 6 years ago by strk

Branches 2.0 and 2.1 lack a NEWS entry for what's been committed in there

comment:9 Changed 6 years ago by strk

About always copying I made an effort in the past to avoid that, for performance reasons. Probably not a big deal today to make more copies, but is that really needed ?

The lilwgeom header should document which versions write to underlying pointarray and which not, and the caller should pay attention to that. The liblwgeom API includes functions to clone geometries when that's the desired behavior. For example a caller from PostGIS may clone only if PG_DETOAST_DATUM didn't give it a copy already (like PG_FREE_IF_COPY).

If we want to add more "security triggers" we should make the flag capable of expressing more states, like 2 different bits for read/write and owned/to-be-freed

comment:10 Changed 6 years ago by pramsey

Well, we're making copies regardless. We're doing it at the PgSQL level (PG_DETOAST_DATUM_COPY) and then letting liblwgeom mutate that. I don't see why making liblwgeom strict and doing the copies using liblwgeom would be vastly worse than having them done by PgSQL first.

comment:11 Changed 6 years ago by strk

Doing the copy in liblwgeom would make it impossible to avoid a copy, if a caller knows it already has a copy. A PG_DETOAST_DATUM might give you a copy anyway, so if liblwgeom made another you'd end up with two copies for no reason.

comment:12 Changed 6 years ago by pramsey

For practical purposes, though, we have a situation where we are *usually* taking a copy at the start anyways. Clarifying the liblwgeom API and making the read/write guards actually have effect would be beneficial in general. Do we have actual performance use cases where having, for example, lwgeom_transform() always return a fresh geometry, be an insupportable performance penalty? worth not having a clearer API (enforced by const!)

comment:13 Changed 4 years ago by pramsey

Milestone: PostGIS 2.2.0PostGIS 2.3.0

comment:14 Changed 3 years ago by robe

Milestone: PostGIS 2.3.0PostGIS 2.4.0

comment:15 Changed 2 years ago by dbaston

Priority: mediumblocker

comment:16 Changed 2 years ago by dbaston

Let's have these checks be driven by PARANOIA_LEVEL, and then set PARANOIA_LEVEL to zero for production builds.

comment:17 Changed 2 years ago by robe

guys is one of you going to get this done? I really want a beta out this coming week.

comment:18 Changed 2 years ago by robe

Milestone: PostGIS 2.4.0PostGIS 2.5.0

comment:19 Changed 17 months ago by robe

Milestone: PostGIS 2.5.0PostGIS next

comment:20 Changed 17 months ago by robe

Milestone: PostGIS nextPostGIS 3.0.0

Milestone renamed

comment:21 Changed 7 months ago by pramsey

Resolution: wontfix
Status: reopenedclosed

This has been around forever, and I don't feel any impetus to change it... in particular, in-place alteration functions now exist, and detoasting a copy is minimally more efficient than detoasting a non-copy and then cloning it. I'm not sure there's enough guards around to avoid people building on top of other memory and then mucking with it, so it would be sanity theatre rather than effective.

Note: See TracTickets for help on using tickets.