Opened 6 years ago
Closed 6 years ago
#4368 closed enhancement (fixed)
Reorder struct layouts to minimize padding
Reported by: | Algunenano | Owned by: | Algunenano |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 3.0.0 |
Component: | postgis | Version: | master |
Keywords: | Cc: |
Description
The current LWGEOM structs contain more padding than necessary, which makes them bigger than they need to be.
For example, currently they look like this:
typedef struct
{
    uint8_t type;
    uint8_t flags;
    GBOX *bbox;
    int32_t srid;
    void *data;
} LWGEOM;
When you build them on a 64-bit machine, which I'm treating as the default case, the struct is laid out like this:
{
    uint8_t type;   /* 1 byte */
    uint8_t flags;  /* 1 byte */
                    /* 6 bytes padding */
    GBOX *bbox;     /* 8 bytes */
    int32_t srid;   /* 4 bytes */
                    /* 4 bytes padding */
    void *data;     /* 8 bytes */
}
For a total of 32 bytes.
I'm proposing to use this instead:
{
    GBOX *bbox;     /* 8 bytes */
    void *data;     /* 8 bytes */
    int32_t srid;   /* 4 bytes */
    uint8_t type;   /* 1 byte */
    uint8_t flags;  /* 1 byte */
    char pad[2];    /* 2 bytes of padding */
}
So the new layout uses 24 bytes instead of 32 (25% less). I expect this to have an impact on big multigeometries, since it reduces their memory footprint, which should improve performance. OTOH, I want to test it first, as changing the position of certain members might be harmful in some scenarios.
Change History (4)
comment:1 by , 6 years ago
comment:2 by , 6 years ago
With Komzpa's idea, I decided to hack both ST_Subdivide and ST_Collect to print memory stats and measure the impact using the big_polygon table:
diff --git a/postgis/lwgeom_accum.c b/postgis/lwgeom_accum.c
index db7ccd3fe..1d9c5844e 100644
--- a/postgis/lwgeom_accum.c
+++ b/postgis/lwgeom_accum.c
@@ -240,6 +240,7 @@ pgis_geometry_collect_finalfn(PG_FUNCTION_ARGS)
 	geometry_array = pgis_accum_finalfn(p, CurrentMemoryContext, fcinfo);
 	result = PGISDirectFunctionCall1( LWGEOM_collect_garray, geometry_array );
+	MemoryContextStats(CurrentMemoryContext);
 	if (!result)
 		PG_RETURN_NULL();
diff --git a/postgis/lwgeom_dump.c b/postgis/lwgeom_dump.c
index 133a2d7d0..92789c675 100644
--- a/postgis/lwgeom_dump.c
+++ b/postgis/lwgeom_dump.c
@@ -400,6 +400,7 @@ Datum ST_Subdivide(PG_FUNCTION_ARGS)
 	else
 	{
 		/* do when there is no more left */
+		MemoryContextStats(funcctx->multi_call_memory_ctx);
 		SRF_RETURN_DONE(funcctx);
 	}
 }
SQL:
SELECT ST_Collect(geom) FROM ( SELECT ST_Subdivide(geom) AS geom FROM big_polygon ) _a;
- Before changes:
ST_Subdivide:
multi-call context: 11795280 total in 1722 blocks; 3172560 free (9560 chunks); 8622720 used
total: 11795280 bytes in 1722 blocks; 3172560 free (9560 chunks); 8622720 used
ST_Collect:
8069592 total in 11 blocks; 494824 free (2 chunks); 7574768 used
total: 8069592 bytes in 11 blocks; 494824 free (2 chunks); 7574768 used
- After changes:
ST_Subdivide uses 12 fewer blocks (1710 vs 1722). ~1% fewer total bytes allocated:
multi-call context: 11696976 total in 1710 blocks; 3134384 free (8883 chunks); 8562592 used
total: 11696976 bytes in 1710 blocks; 3134384 free (8883 chunks); 8562592 used
ST_Collect uses 1 fewer block (10 vs 11). ~6.5% fewer total bytes allocated:
7545304 total in 10 blocks; 30032 free (2 chunks); 7515272 used
total: 7545304 bytes in 10 blocks; 30032 free (2 chunks); 7515272 used
I don't see any measurable impact on performance on my machine.
Initial WIP in https://github.com/postgis/postgis/pull/390
If I see a clear benefit I might investigate further and check other structs too.