Opened 7 months ago

Last modified 7 months ago

#5736 reopened defect

latest releases md5 mismatch

Reported by: strk
Owned by: robe
Priority: medium
Milestone: Website Management, Bots
Component: management
Version: 3.4.x
Keywords:
Cc:

Description (last modified by strk)

The current https://download.osgeo.org/postgis/source/postgis-3.4.2.tar.gz file has an MD5 sum of 632abda8b4267af437db6cde1bc9d9dc, while the file https://postgis.net/stuff/postgis-3.4.2.tar.gz.md5 expects it to be 9298e0a81013b44ac39cfbabf2f95ae9.

I've computed the sum with md5sum (GNU coreutils) 9.1

A user reported the problem using openssl dgst -md5 ... in https://lists.osgeo.org/pipermail/postgis-users/2024-May/046467.html
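
The comparison described above can be reproduced without md5sum or openssl. A minimal sketch in Python, assuming the .md5 file uses the usual md5sum layout (hex digest, whitespace, file name); the helper names are illustrative and not part of any PostGIS tooling:

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_md5(md5_text):
    """Extract the digest from md5sum-style text: '<hex>  <filename>'."""
    return md5_text.split()[0].lower()


def matches(path, md5_text):
    """True when the file's MD5 equals the digest recorded in `md5_text`."""
    return md5_of_file(path) == expected_md5(md5_text)
```

Calling matches("postgis-3.4.2.tar.gz", open("postgis-3.4.2.tar.gz.md5").read()) should return False whenever the tarball and the .md5 file come from different builds.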

Change History (28)

comment:1 by strk, 7 months ago

Description: modified (diff)

comment:2 by robe, 7 months ago

I think the issue here is that Paul manually generates his tarballs, while I just use the ones that Debbie generates and upload them to download.osgeo.org.

Debbie always generates tarballs regardless.

I guess the easiest fix is to download the md5 and tarballs from download.osgeo.org and re-upload them to postgis.net.

comment:3 by darkblueb, 7 months ago

No, the MD5 hash stored in postgis-3.4.2.tar.gz.md5 under /osgeo/download/postgis/source does appear to match the output of openssl dgst -md5 postgis-3.4.2.tar.gz


comment:5 by strk, 7 months ago

Yes, sorry, the one on download.osgeo.org is indeed correct, so we'd only need to upload that one to postgis.net, assuming nobody messed with it. It would be a good idea to also include the MD5 in the release announcements, so we'd have more ways to verify it.

comment:6 by strk, 7 months ago

Description: modified (diff)

comment:7 by strk, 7 months ago

And I confirm the tarball from postgis.net matches the md5 of the tarball on postgis.net: https://postgis.net/stuff/postgis-3.4.2.tar.gz

So, if I understand correctly, the instructions in the HOWTO_RELEASE file were not followed, as they talk about letting Debbie generate the tarball.

comment:8 by strk, 7 months ago

Resolution: fixed
Status: new → closed

I've copied the correct MD5 file from osgeo.org to postgis.net and did the same for the tarball.

comment:9 by strk, 7 months ago

Resolution: fixed
Status: closed → reopened
Summary: 3.4.2 release md5 mismatch → latest releases md5 mismatch

I'm reopening because we have the same problem with release 3.3.6. In this case things are worse, in that the .md5 file on download.osgeo.org doesn't match any tarball (neither the one on osgeo.org nor the one on postgis.net). 3.2.7 is fine.

comment:10 by robe, 7 months ago

I think this was mostly my fault. When Paul was releasing, Debbie failed to generate the tarballs, because I had recently upgraded her, which broke the webhook that instructs her to build the tarballs. After Paul alerted me to the issue, I fixed it, but I think he was already done generating the tarballs. I'm not absolutely sure what happened in the 3.3.6 case, though.

At any rate, assuming we have determined there were no rogue actors, we should just assume the one on download.osgeo.org is right and fix the md5 etc. on postgis.net.

I was debating whether we should just automate this whole process entirely. I think the only reason we don't is that we want to test the tarballs before they hit download.osgeo.org, because at that point it's technically too late: packagers (I know Debian, possibly others) have bots watching that location, waiting to pounce on a release. Once it's up there, we can't take it down, since that would violate the release and require a new release.

comment:11 by strk, 7 months ago

Automake has a "distcheck" rule which is meant specifically to test the tarballs. The (automatically generated) rule creates the tarball, extracts it into a temporary source dir, marks the source dir read-only, configures for install into a temporary directory, builds, runs tests, installs, and runs install tests. Maybe we can craft something like that.

Better to discuss this in a mailing list thread or a separate ticket.

For this ticket I've proceeded as you suggest and updated the postgis.net MD5 to match that of the download.osgeo.org tarball for 3.3.6.

comment:12 by strk, 7 months ago

Resolution: fixed
Status: reopened → closed

comment:13 by strk, 7 months ago

Reopening. I've scripted an MD5 checker which found other mismatches among the 3.0+ series:

[strk@c23:/usr/local/src/postgis/postgis(main)] utils/check_releases_md5.sh
Fetching list of supported releases
Checking postgis-3.0.11.tar.gz ... MD5 mismatch
Checking postgis-3.0.10.tar.gz ... OK
Checking postgis-3.0.9.tar.gz ... OK
Checking postgis-3.0.8.tar.gz ... OK
Checking postgis-3.0.7.tar.gz ... OK
Checking postgis-3.0.6.tar.gz ... OK
Checking postgis-3.0.5.tar.gz ... OK
Checking postgis-3.0.4.tar.gz ... MD5 mismatch
Checking postgis-3.0.3.tar.gz ... MD5 mismatch
Checking postgis-3.0.2.tar.gz ... MD5 mismatch
Checking postgis-3.0.1.tar.gz ... MD5 mismatch
Checking postgis-3.0.0.tar.gz ... MD5 mismatch
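
A rough Python sketch of such a check, for illustration only. It assumes the tarball is taken from download.osgeo.org and the .md5 from postgis.net (the two URLs quoted in the description); the actual utils/check_releases_md5.sh may compare different locations, and all function names here are hypothetical:

```python
import hashlib
from urllib.request import urlopen

# Assumed locations, taken from the URLs quoted in this ticket.
TARBALL_BASE = "https://download.osgeo.org/postgis/source/"
MD5_BASE = "https://postgis.net/stuff/"


def http_fetch(url):
    """Download `url` and return its body as bytes."""
    with urlopen(url) as resp:
        return resp.read()


def check_release(version, fetch=http_fetch):
    """Compare a release tarball's MD5 with its published .md5 file.

    `fetch` is any callable mapping a URL to bytes, which keeps the
    comparison itself independent of the network.
    """
    name = f"postgis-{version}.tar.gz"
    tarball = fetch(TARBALL_BASE + name)
    published = fetch(MD5_BASE + name + ".md5").decode().split()[0].lower()
    return name, hashlib.md5(tarball).hexdigest() == published


def report(versions, fetch=http_fetch):
    """Yield one status line per version, in the format shown above."""
    for version in versions:
        name, ok = check_release(version, fetch)
        yield f"Checking {name} ... {'OK' if ok else 'MD5 mismatch'}"
```

For example, `for line in report(["3.0.11", "3.0.10"]): print(line)` would print one line per release.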

comment:14 by strk, 7 months ago

Resolution: fixed
Status: closed → reopened

comment:15 by Sandro Santilli <strk@…>, 7 months ago

In d0d3823/git:

Add script to check releases MD5

References #5736

comment:16 by strk, 7 months ago

The rest of the checks:

Checking postgis-3.4.2.tar.gz ... OK
Checking postgis-3.3.6.tar.gz ... OK
Checking postgis-3.2.7.tar.gz ... OK
Checking postgis-3.1.11.tar.gz ... OK
Checking postgis-3.4.1.tar.gz ... OK
Checking postgis-3.3.5.tar.gz ... OK
Checking postgis-3.2.6.tar.gz ... OK
Checking postgis-3.1.10.tar.gz ... OK
Checking postgis-3.4.0.tar.gz ... OK
Checking postgis-3.3.4.tar.gz ... OK
Checking postgis-3.3.3.tar.gz ... OK
Checking postgis-3.2.5.tar.gz ... OK
Checking postgis-3.1.9.tar.gz ... OK
Checking postgis-3.3.2.tar.gz ... OK
Checking postgis-3.2.4.tar.gz ... OK
Checking postgis-3.1.8.tar.gz ... OK
Checking postgis-3.3.1.tar.gz ... OK
Checking postgis-3.3.0.tar.gz ... OK
Checking postgis-3.2.3.tar.gz ... OK
Checking postgis-3.1.7.tar.gz ... OK
Checking postgis-3.2.2.tar.gz ... OK
Checking postgis-3.1.6.tar.gz ... OK
Checking postgis-3.2.1.tar.gz ... OK
Checking postgis-3.1.5.tar.gz ... OK
Checking postgis-3.2.0.tar.gz ... OK
Checking postgis-3.1.4.tar.gz ... OK
Checking postgis-3.1.3.tar.gz ... MD5 mismatch
Checking postgis-3.1.2.tar.gz ... MD5 mismatch
Checking postgis-3.1.1.tar.gz ... MD5 mismatch
Checking postgis-3.1.0.tar.gz ... MD5 mismatch

comment:17 by strk, 7 months ago

So to recap, the still-mismatching versions are:

postgis-3.0.0.tar.gz
postgis-3.0.11.tar.gz
postgis-3.0.1.tar.gz
postgis-3.0.2.tar.gz
postgis-3.0.3.tar.gz
postgis-3.0.4.tar.gz
postgis-3.1.0.tar.gz
postgis-3.1.1.tar.gz
postgis-3.1.2.tar.gz
postgis-3.1.3.tar.gz

comment:18 by strk, 7 months ago

I'm thinking the script could be made to run automatically in the postgis.net repository which is where the download links are found. The script could be passed only the versions that are linked from the website, parsing the hugo config (which could be converted to yml from toml to make it easier)

comment:19 by strk, 7 months ago

I've automated the checking for the non-dev versions linked from the website. It fails as expected due to the 3.0.11 mismatch: https://woodie.osgeo.org/repos/127/pipeline/17/4

comment:20 by strk, 7 months ago

I've fixed the MD5 of 3.0.11 so that all published releases match, and restarted the build, so it is now green: https://woodie.osgeo.org/repos/127/pipeline/18/4

What should we do about the leftovers?

postgis-3.0.0.tar.gz
postgis-3.0.1.tar.gz
postgis-3.0.2.tar.gz
postgis-3.0.3.tar.gz
postgis-3.0.4.tar.gz
postgis-3.1.0.tar.gz
postgis-3.1.1.tar.gz
postgis-3.1.2.tar.gz
postgis-3.1.3.tar.gz

comment:21 by robe, 7 months ago

I think the other way that this issue happens is that Debbie builds a release tarball twice.

Once for the branch and once for the tag.

So if the releaser is in a rush, they might download the one that is generated by the branch build, which on occasion might not be tagged yet, and then that release gets rebuilt on tag.

Can you think of an easy way to prevent tagged builds from being built from branches? I'm thinking that since Debbie has a dedicated script, she could just verify whether she is on a tag or on a regular branch, and not copy the artifacts to the website in the branch case.

comment:22 by strk, 7 months ago

> I think the other way that this issue happens is that Debbie builds a release tarball twice.

It would be interesting to understand WHY doing so would result in different tarballs. Is it two different commits you're talking about?

> Can you think of an easy way to prevent tagged builds from being built from branches?

It could use git describe --exact-match --tags HEAD: if that exits with a success code, we've checked out a tag and can decide what to do about that.

But couldn't we have only tags trigger that Debbie build?

Adding a log of Jenkins operations could be useful to tell more about what's going on, btw.
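
The git describe --exact-match --tags HEAD check suggested above could be wrapped like this. It is only a sketch: the function name is made up, and how Debbie's build script would actually use the result is an assumption.

```python
import subprocess


def exact_tag(repo_dir="."):
    """Return the tag pointing exactly at HEAD, or None if there is none.

    Relies on `git describe --exact-match --tags HEAD` exiting non-zero
    when HEAD is not tagged (or `repo_dir` is not a git repository).
    """
    result = subprocess.run(
        ["git", "describe", "--exact-match", "--tags", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()
```

A build script could then publish artifacts only when exact_tag() returns a tag name, and skip the upload for plain branch builds.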

comment:23 by lnicola, 7 months ago

> It would be interesting to understand WHY doing so would result in different tarballs.

ImreSamu figured that out, it's the gzip timestamp: https://unix.stackexchange.com/questions/438329/tar-produces-different-files-each-time. The file order is another source of nondeterminism, but it's unlikely to matter for consecutive runs.
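
The gzip timestamp effect is easy to demonstrate in isolation. The sketch below uses Python's gzip module rather than the gzip command line tool, and the payload and timestamps are arbitrary; on the command line, gzip -n similarly omits the name and timestamp from the header.

```python
import gzip
import hashlib

payload = b"identical tar stream" * 100


def gz_md5(data, mtime):
    """MD5 of `data` gzip-compressed with the given header timestamp."""
    return hashlib.md5(gzip.compress(data, mtime=mtime)).hexdigest()


# Same input, header timestamps one minute apart: the archives differ.
differs = gz_md5(payload, 1_700_000_000) != gz_md5(payload, 1_700_000_060)

# Pinned timestamp (0 means "no timestamp"): the output is stable.
stable = gz_md5(payload, 0) == gz_md5(payload, 0)

print("differs with changing mtime:", differs)
print("stable with mtime=0:", stable)
```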


comment:24 by strk, 7 months ago

We still get different timestamps on the files in the tar archive. In order to make that part reproducible we should enforce a timestamp on those, after build.
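
As an illustration of what enforcing a timestamp means, here is a small Python sketch that sets one timestamp on every entry of a source tree. The function name is hypothetical, and the timestamp would have to come from somewhere stable, for instance the committer date printed by git log -1 --format=%ct.

```python
import os


def enforce_mtime(root, timestamp):
    """Set atime and mtime of `root` and everything below it to `timestamp`."""
    os.utime(root, (timestamp, timestamp))
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # follow_symlinks=False stamps the link itself; this needs
            # platform support (available on Linux and macOS).
            os.utime(path, (timestamp, timestamp), follow_symlinks=False)
```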

comment:25 by Sandro Santilli <strk@…>, 7 months ago

In 1590723/git:

Make source package creation reproducible

Removes timestamp of tarball, enforces timestamp of source files
to timestamp of top commit

References #5736

comment:26 by strk, 7 months ago

I've made the make_dist.sh script enforce the timestamp of the last commit on all files. Could you check whether you get the same MD5 as me for a tarball created from commit [1590723c4a8ace27ad13aa3b1ec54e317b4276e4/git]? I get this:

37884c7c4bfa03ebd525b4660c4939ba  postgis-3.5.0dev-3.4.0rc1-1145-g1590723c4.tar.gz

comment:27 by strk, 7 months ago

From another machine I got a different result :(

5e695262e833b6c83acf1177dcaf4913  postgis-3.5.0dev-3.4.0rc1-1145-g1590723c4.tar.gz

There's more work to be done. I guess uid/gid of files.

comment:28 by strk, 7 months ago

I've pushed another commit enforcing ownership and modes of files. New test is (after pull):

./make_dist.sh f69d77e4e3d7f16aca2797fb53ce5021d7b09e82
cat postgis-f69d77e4e3d7f16aca2797fb53ce5021d7b09e82.tar.gz.md5

I get:

b1875f4e593ade72fe59ac7c089fe8c8  postgis-f69d77e4e3d7f16aca2797fb53ce5021d7b09e82.tar.gz
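
For reference, the ingredients discussed in this ticket (no gzip timestamp, fixed file timestamps, fixed ownership and modes, stable member order) can be combined as in the Python sketch below. It is an analogue written for illustration, not what make_dist.sh does; the function names and the 0755/0644 mode policy are assumptions.

```python
import gzip
import io
import tarfile


def normalize(info, timestamp):
    """Strip machine-specific metadata from one archive member."""
    info.mtime = timestamp
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    # Keep only the executable distinction: 0755 for directories and
    # owner-executable entries, 0644 for everything else.
    info.mode = 0o755 if (info.isdir() or info.mode & 0o100) else 0o644
    return info


def make_tarball(src_dir, arcname, timestamp):
    """Return .tar.gz bytes for `src_dir` that depend only on its content."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w", format=tarfile.GNU_FORMAT) as tar:
        # TarFile.add recurses in sorted order (Python 3.7+), so member
        # order does not depend on the filesystem.
        tar.add(src_dir, arcname=arcname,
                filter=lambda info: normalize(info, timestamp))
    # mtime=0 keeps the creation time out of the gzip header.
    return gzip.compress(raw.getvalue(), mtime=0)
```

Two checkouts of the same commit with different file owners, permissions or modification times should then yield byte-identical archives, and therefore the same MD5.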