Opened 8 months ago

Closed 3 months ago

#5697 closed defect (fixed)

fix or document id escaping for ST_EstimatedExtent params

Reported by: nyurik Owned by: pramsey
Priority: medium Milestone: PostGIS 3.4.3
Component: postgis Version: 3.4.x
Keywords: Cc:

Description (last modified by nyurik)

Function ST_EstimatedExtent - https://postgis.net/docs/ST_EstimatedExtent.html has an usual param pattern - it requires schema, table, and geometry columns to be strings in some uncertain escaping - later it gets used directly without quotes in an internal schema/table identity lookup. (uncertain)

As the result, it is not clear what escaping is needed in "complex" cases. Moreover, I suspect there might be a security bug here too, but not certain.

I could not get this function to run for this (convoluted) case.

CREATE SCHEMA """Quotes' and Space.Dot.";

CREATE TABLE """Quotes' and Space.Dot.".". Points"" 'quote"
(
    "' id '"      SERIAL PRIMARY KEY,
    ".namE "      TEXT,
    ". '""Geom""" GEOMETRY(POINT, 4326)
);

INSERT INTO """Quotes' and Space.Dot.".". Points"" 'quote"
values (1, '02daedc70702ec68753fde38351f5d9d', '0101000020E610000050C4D38CE9DA61401EFC0EC7C3DA2740'),
       (2, '7418427ba8a960c3661235f47cc13d46', '0101000020E6100000CC2F4170E9DA6140DEDB02B581DA2740');

CREATE INDEX ON """Quotes' and Space.Dot.".". Points"" 'quote" USING GIST (". '""Geom""");

SELECT ST_EstimatedExtent(?, ?, ?) as bounds;

An even more convoluted case (might be better as it covers a few more edge cases):

CREATE SCHEMA """Quotes' \ \' \"" and Space.Dot.";

CREATE TABLE """Quotes' \ \' \"" and Space.Dot.".". Points"" \ \' \"" 'quote"
(
    "' id '"                SERIAL PRIMARY KEY,
    ".namE "                TEXT,
    ". ' \ \' \"" ""Geom""" GEOMETRY(POINT, 4326)
);

INSERT INTO """Quotes' \ \' \"" and Space.Dot.".". Points"" \ \' \"" 'quote"
values (1, '02daedc70702ec68753fde38351f5d9d', '0101000020E610000050C4D38CE9DA61401EFC0EC7C3DA2740'),
       (2, '7418427ba8a960c3661235f47cc13d46', '0101000020E6100000CC2F4170E9DA6140DEDB02B581DA2740');

CREATE INDEX ON """Quotes' \ \' \"" and Space.Dot.".". Points"" \ \' \"" 'quote" USING GIST (". ' \ \' \"" ""Geom""");

Change History (7)

comment:1 by nyurik, 8 months ago

Testing the above, this query works:

SELECT "' id '", ".namE ", ". '""Geom"""
FROM """Quotes' and Space.Dot.".". Points"" 'quote";

This one does not:

SELECT ST_EstimatedExtent(
               '"""Quotes'' and Space.Dot."',
               '". Points"" ''quote"',
               '". ''""Geom"""'
       );

comment:2 by nyurik, 8 months ago

Description: modified (diff)

comment:3 by nyurik, 8 months ago

After lots of back and forth, it seems this may work:

  • for schema, use identifier escaping rules to produce a double-quoted string, and afterwards remove the first and last double-quote character.
  • do the same for the table
  • pass geometry column as is

Which means that when calling, schema and table name have to first be encoded as identifiers, remove first/last ", and then encode using literal string encoding. The geometry must only be literally encoded… Uncertain of this, but seems to work in some testing…

comment:4 by pramsey, 7 months ago

If you have a good improvement to the doco to submit, I'd be happy to look it over. The namespace and table get parsed using the regclassin function, like this:

snprintf(nsp_tbl, sz, "\"%s\".\"%s\"", nsp, tbl);
tbl_oid = DatumGetObjectId(DirectFunctionCall1(regclassin, CStringGetDatum(nsp_tbl)));

The attribute name gets searched for a little more literally, like this

HeapTuple att_tup = SearchSysCache2(
   ATTNAME,
   ObjectIdGetDatum(idx_oid),
   PointerGetDatum(colname));

Hence the slight differences in handling. I think regclassin seemed like a cute trick at the time.

comment:5 by pramsey, 4 months ago

@robe, I dunno what you think about this one, I don't know what to put in doco, and in a perfect world I would go back and just change the function so the arguments are directly regclass arguments, but doing so now *might* break existing users (of which there area a lot of important ones, as every postgis client calls estimated extent) and would also involve explicitly dropping the current signature (or it would interpose itself on the case where a user calls using text parameters, instead of allowing autocasting to work its magic and convert to regclass).

in reply to:  5 comment:6 by robe, 3 months ago

Replying to pramsey:

@robe, I dunno what you think about this one, I don't know what to put in doco, and in a perfect world I would go back and just change the function so the arguments are directly regclass arguments, but doing so now *might* break existing users (of which there area a lot of important ones, as every postgis client calls estimated extent) and would also involve explicitly dropping the current signature (or it would interpose itself on the case where a user calls using text parameters, instead of allowing autocasting to work its magic and convert to regclass).

Beats me. Do what you think is the best and we'll blame @strk for it after.

comment:7 by Paul Ramsey <pramsey@…>, 3 months ago

Resolution: fixed
Status: newclosed

In a331a8e/git:

Add documentation note about escaping for inputs to ST_EstimatedExtent
when inputs include quotes and spatial characters, closes #5697

Note: See TracTickets for help on using tickets.