#2131 closed enhancement (fixed)
Replace regress_lots_of_points.sql data with generated query
| Reported by: | strk | Owned by: | pramsey |
|---|---|---|---|
| Priority: | low | Milestone: | PostGIS Fund Me |
| Component: | postgis | Version: | master |
| Keywords: | Cc: |
Description
The regress/regress_lots_of_points.sql file is 1.7MB. It defines a table with 50000 records (int, point geometry). Could very well be replaced by a one-line query with a similar effect. Would cut at least 700KB from the tarball…
Change History (11)
comment:1 by , 12 years ago
| Milestone: | PostGIS 2.0.2 → PostGIS 2.1.0 |
|---|---|
| Priority: | medium → low |
| Type: | defect → enhancement |
| Version: | 2.0.x → trunk |
comment:2 by , 11 years ago
| Milestone: | PostGIS 2.1.0 → PostGIS Future |
|---|
comment:4 by , 7 years ago
| Resolution: | → wontfix |
|---|---|
| Status: | new → closed |
OK, having looked this over, I don't see any compelling reason to re-write the index regression checks. They are pretty good, and to save a few K we have to muck with them a fair amount.
comment:5 by , 7 years ago
How about we make a deterministic pseudo-random point sequence, based on a linear-feedback shift register. This one should generate a uniform distribution of points between [0, 1000] in both x and y directions. And because it's based on a bit(16) postgres type, I'd expect it to be portable.
CREATE OR REPLACE FUNCTION
lfsr_point(len integer, start_state integer DEFAULT 6342)
RETURNS SETOF geometry AS
$$
DECLARE
lfsr bit(16);
bt bit(16);
x float8;
y float8;
BEGIN
lfsr := start_state::bit(16);
FOR i IN 1..$1 LOOP
bt := ((lfsr >> 0) # (lfsr >> 2) # (lfsr >> 3) # (lfsr >> 5)) & (1::bit(16));
lfsr := (lfsr >> 1) | (bt << 15);
x := round(lfsr::integer::numeric / 65536.0 * 1000, 4);
bt := ((lfsr >> 0) # (lfsr >> 2) # (lfsr >> 3) # (lfsr >> 5)) & (1::bit(16));
lfsr := (lfsr >> 1) | (bt << 15);
y := round(lfsr::integer::numeric / 65536.0 * 1000, 4);
RETURN NEXT ST_MakePoint(x, y);
END LOOP;
RETURN;
END;$$ LANGUAGE plpgsql;
And then generate the "random" points:
DROP TABLE IF EXISTS test;
CREATE TABLE test (
"num" serial PRIMARY KEY,
"the_geom" geometry
);
INSERT INTO test(the_geom)
SELECT lfsr_point(50000);
This is all fine an dandy, until I looked at the points: https://imgur.com/FgMjyc4 so it just needs a bit of fine tuning…
comment:6 by , 7 years ago
New idea: improve an existing function by adding a new (3rd) parameter:
ST_GeneratePoints( g geometry , npoints numeric , seed integer DEFAULT NULL)
where seed (if provided) could be used to generate a predictable pseudo-random number sequence. I've investigated that the Mersenne Twister (e.g. MT19937) provides the same pseudo-random sequence on different platforms. There are a few small C examples floating around that can be tailored into liblwgeom/lwgeom_geos.c.
comment:8 by , 7 years ago
I like it in principle… in practice it sure better be 100% stable on different platforms, otherwise it could be quite hard to use effectively. And that's a hard guarantee to provide. I was fairly surprised how much variability PgSQL's on random() had when run with the same seed on different OS's, but all on Intel chips.

punt