#5666 closed defect (fixed)
Build reproducibility: timestamps in extension upgrade SQL scripts
Reported by: | jayaddison | Owned by: | strk |
---|---|---|---|
Priority: | medium | Milestone: | PostGIS 3.4.3 |
Component: | build | Version: | 3.4.x |
Keywords: | Cc: | jayaddison |
Description
Problem
When PostGIS builds extensions, the upgrade-paths-rules.mk makefile emits a few comments into templated SQL files, including a 'Built on' line that includes a timestamp.
-- Just tag extension testing version as "ANY"
-- Installed by testing
-- Built on 2024-02-12 12:47:27
The timestamp is generated by invoking the date command in a subshell (code ref), and this currently causes the output to differ on each build of the extensions (even from the same source code), making the build non-reproducible.
The expected/recommended behaviour is either to remove the timestamp completely (removing the source of divergence) or to derive a timestamp deterministically - typically by using the last-modified time of the source code or a timestamp of the release.
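A minimal sketch of the second option, assuming the last-modified time of a source file is an acceptable stand-in for the build time. The helper name mtime_epoch and the choice of configure.ac as the reference file are illustrative assumptions, not part of the attached patch; stat -c is the GNU form and stat -f the BSD form.

```shell
#!/bin/sh
# Hypothetical helper: print a file's modification time as epoch seconds.
mtime_epoch() {
    # GNU stat uses -c %Y; BSD stat uses -f %m.
    stat -c %Y "$1" 2>/dev/null || stat -f %m "$1"
}

# Illustrative use: pin SOURCE_DATE_EPOCH to a source file's mtime,
# unless the caller (for example a packaging tool) has already set it.
if [ -z "${SOURCE_DATE_EPOCH:-}" ] && [ -f configure.ac ]; then
    SOURCE_DATE_EPOCH=$(mtime_epoch configure.ac)
    export SOURCE_DATE_EPOCH
fi
```

Packaging tools commonly set SOURCE_DATE_EPOCH themselves, so the sketch only fills in a value when none is present.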
As far as I am currently aware, this is the only source of non-determinism when building the PostGIS extensions from the v3.4.2 code as Debian-packaged. I can't guarantee that there aren't others, but this is the only case I've encountered.
Additional Context
Related to ticket: #4148 (introduction of reproducible build time handling in PostGIS)
Originally reported downstream at: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1063724
More information about timestamps in relation to reproducible builds: https://reproducible-builds.org/docs/timestamps/
Notes about a similar solution
There is an existing pattern in the configure.ac file that extracts a POSTGIS_BUILD_DATE string variable from a SOURCE_DATE_EPOCH environment variable (the recommended way to read a reproducible time value for use during builds).
To handle the possibility that it could be confusing or disruptive to remove the 'Built on' line completely, I've attached a patch that applies this same pattern to the upgrade-paths-rules.mk file. I've tested this using a checkout from v3.4.2 using:
postgis $ cd extensions
postgis/extensions $ rm -rf sql;
postgis/extensions $ mkdir -p sql;
postgis/extensions $ SOURCE_DATE_EPOCH=0 EXTENSION=testing make -f upgrade-paths-rules.mk sql/testing--TEMPLATED--TO--ANY.sql
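The SOURCE_DATE_EPOCH pattern can be sketched in plain shell as follows. This is an illustration under stated assumptions, not the attached patch: the function name build_date is hypothetical, and the fallback from the GNU form (-d @N) to the BSD form (-r N) is one common way to cope with both date implementations.

```shell
#!/bin/sh
# Hypothetical helper: print a build date, honouring SOURCE_DATE_EPOCH if set.
build_date() {
    fmt='+%Y-%m-%d %H:%M:%S'
    if [ -n "${SOURCE_DATE_EPOCH:-}" ]; then
        # GNU date accepts -d @EPOCH; BSD date accepts -r EPOCH.
        date -u -d "@${SOURCE_DATE_EPOCH}" "$fmt" 2>/dev/null ||
            date -u -r "${SOURCE_DATE_EPOCH}" "$fmt"
    else
        # No reproducible time supplied: fall back to the current time.
        date -u "$fmt"
    fi
}

# With a fixed epoch, repeated runs emit the same comment line.
first=$(SOURCE_DATE_EPOCH=0; echo "-- Built on $(build_date)")
second=$(SOURCE_DATE_EPOCH=0; echo "-- Built on $(build_date)")
[ "$first" = "$second" ] && echo "$first"
# prints: -- Built on 1970-01-01 00:00:00
```

Using -u keeps the output independent of the build machine's time zone, which would otherwise be a second source of variation.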
Attachments (1)
Change History (13)
by , 11 months ago
Attachment: | extension-upgrade-sql-reproducible.patch added |
---|
comment:1 by , 11 months ago
Please note: the patch as-written specifies an explicit format-string for the output date (+%Y-%m-%d %H:%M:%S), copied from the POSTGIS_BUILD_DATE format string (code ref).
The example SQL snippet in my description uses that format too - but in practice existing strings will have been formatted according to the build system's date program (-- Built on Mon Nov 20 20:03:08 UTC 2023, for example).
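To make the difference concrete, here is a small sketch that prints one instant in both styles. It assumes GNU date (for the -d @N syntax); the epoch value was chosen to correspond to the example timestamp quoted above and is not taken from any real build.

```shell
#!/bin/sh
epoch=1700510588   # 2023-11-20 20:03:08 UTC

# Default format: layout depends on the locale and the date implementation.
LC_ALL=C date -u -d "@${epoch}"

# Explicit format string: the same layout on every build system.
date -u -d "@${epoch}" '+%Y-%m-%d %H:%M:%S'
# prints: 2023-11-20 20:03:08
```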
comment:2 by , 10 months ago
Component: | postgis → build |
---|---|
Owner: | changed from | to
Apologies for the edit noise; this appears to be a build-related issue, so I'm reassigning the component (and that auto-adjusts the owner, by the looks of it).
I'll also see whether I can propose a fix for this using gitea.
comment:3 by , 10 months ago
I'll also see whether I can propose a fix for this using gitea.
Pull request opened with the patch applied here: https://git.osgeo.org/gitea/postgis/postgis/pulls/186
(changeset created using the gitea web editor)
comment:4 by , 10 months ago
I think I'd prefer the date removed over trying to pull what configure gives us and relying on any format string.
comment:5 by , 10 months ago
Thanks robe - removing the timestamp would be a good (probably the best) solution from my point of view too, but I could understand if there are reasons/requirements to retain a timestamp within the file (either in a specific format or generally).
comment:6 by , 10 months ago
Seems strk thinks he needs it, so I'll commit the patch in a bit. Thanks for the patch. I tested on my Windows system and it seems to be okay. The failure on gitea is caused by a different issue not related to your patch (about it treating warnings as errors).
I think this patch is safe enough to backport to at least 3.4, so I'll put it in that version as well. Not sure if it's worthwhile to backport further than that. I'd be concerned because I know our autoconf requirements changed somewhere between 3.2 and 3.4 and I can't remember which.
comment:9 by , 10 months ago
Milestone: | → PostGIS 3.4.3 |
---|
comment:10 by , 10 months ago
Sorry, forgot to ask how you wanted to be credited. I had put Jay Addison, but realize now you have James Addison too. Let me know if I should change to James Addison.
comment:11 by , 10 months ago
Ah, thanks for checking! Yep, a small preference for James Addison instead of Jay Addison - either is fine though.
Patch to read SOURCE_DATE_EPOCH during extension upgrade script templating. Should have the value f1ca910dfcf382c01e0024a0419c9c49224b169bcff32ca09692f7de27c54f76 as sha256sum.
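A quick way to check a downloaded copy of the attachment against that checksum, sketched under the assumption that GNU coreutils sha256sum is available and that the file was saved under its attachment name in the current directory. The helper name verify_sha256 is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if FILE's SHA-256 digest equals EXPECTED.
verify_sha256() {
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}

patch=extension-upgrade-sql-reproducible.patch
expected=f1ca910dfcf382c01e0024a0419c9c49224b169bcff32ca09692f7de27c54f76

# Only attempt the check when the patch file is actually present.
if [ -f "$patch" ]; then
    if verify_sha256 "$patch" "$expected"; then
        echo "checksum OK"
    else
        echo "checksum mismatch" >&2
    fi
fi
```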