#4330 closed patch (fixed)
postgis_restore fails when output piped to an intermediate process
| Reported by: | hranalli | Owned by: | strk |
|---|---|---|---|
| Priority: | medium | Milestone: | PostGIS 3.0.0 |
| Component: | build | Version: | 2.5.x -- EOL |
| Keywords: | | Cc: | |
Description
We are upgrading a database from PostgreSQL 8.2.23 and PostGIS 1.4 to PostgreSQL 11.2 and PostGIS 2.5. As part of our process, we pipe the output of postgis_restore.pl to sed, which makes some changes to the SQL before it is piped to psql.
When we do this, postgis_restore.pl buffers the output in memory and is eventually killed by the OS when it runs out of memory. I've created a patch that eliminates this buffering in the places where it is unnecessary (the culprit is the handling of TABLE DATA commands, as these are the most memory-intensive).
Once postgis_restore is run with the patch in place, it does a fine job of upgrading the database.
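The idea behind the patch can be sketched as follows (a Python analogy, not the actual Perl code; the function names here are invented for illustration):

```python
import sys

def restore_buffered(lines, out=sys.stdout):
    """Anti-pattern: collect every line of a (possibly huge) block in
    memory, then write the whole buffer at the end."""
    buf = []
    for line in lines:
        buf.append(line)      # memory grows with the size of the data
    out.write("".join(buf))

def restore_streaming(lines, out=sys.stdout):
    """Patched behaviour: write each line as soon as it is processed,
    so peak memory stays flat regardless of table size."""
    for line in lines:
        out.write(line)       # nothing is retained between iterations
```

Both variants produce identical output; with multi-gigabyte TABLE DATA sections, the difference is that the first one's `buf` list grows with the input while the second writes in constant space.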
Attachments (1)
Change History (11)
by , 6 years ago
| Attachment: | postgis_restore-print-lines-as-processed.patch added |
|---|---|
comment:2 by , 6 years ago
I'm OK with the autoflush setting, but I don't see why there was a need to change the other lines. hranalli? Is autoflush alone enough?
comment:3 by , 6 years ago
Unfortunately, the autoflush setting is not enough. That is what I tried first, and I figured it didn't hurt to leave it in, but it did not solve the problem. It was necessary to output lines on each loop iteration, rather than letting them build up, in order to resolve the issue.
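The distinction can be shown with a rough Python analogy (the real setting is Perl's `$|` autoflush; this function is invented for illustration). Flushing controls when already-written output leaves the stdio buffer; it says nothing about data the script holds in its own variables:

```python
import sys

def process(lines):
    for line in lines:
        sys.stdout.write(line)
        # Flushing every iteration (the autoflush analogue) empties the
        # stdio buffer promptly, which matters when output is piped...
        sys.stdout.flush()
        # ...but flushing alone cannot bound memory: if the code instead
        # appended `line` to a list and wrote it only after the loop,
        # that list would still grow with the input, flushed or not.
```

This is why both changes were needed: writing inside the loop bounds memory, while autoflush keeps the piped consumer fed promptly.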
comment:4 by , 6 years ago
I see the change is in the block parsing CREATE TABLE and in the one parsing COMMENT ON. How do these affect the TABLE DATA block that you reference in the original submission?
I suspect the problem is elsewhere. Could it be the lack of an anchor in the CREATE TABLE pattern? Might it be catching TABLE DATA values?
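The anchoring concern raised here can be sketched like this (Python regexes for illustration; the real script uses Perl, and these listing lines are invented, not taken from the actual dump):

```python
import re

# Hypothetical archive-listing lines (invented for illustration):
lines = [
    "CREATE TABLE roads",
    "TABLE DATA roads",
]

# An unanchored pattern intended for CREATE TABLE entries also fires on
# TABLE DATA entries, because "TABLE" appears in both:
unanchored = re.compile(r"TABLE (\w+)")
loose_hits = [bool(unanchored.search(l)) for l in lines]    # [True, True]

# Anchoring the pattern to the start of the line restricts it to the
# intended entries:
anchored = re.compile(r"^CREATE TABLE (\w+)")
strict_hits = [bool(anchored.search(l)) for l in lines]     # [True, False]
```

If the CREATE TABLE branch were matching TABLE DATA entries this way, the data would flow through the buffering code path, which would explain the memory blow-up described above.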
comment:5 by , 6 years ago
It's been a while since I made this patch, as I didn't want to submit it until after several test runs, but I am pretty sure the CREATE TABLE statement also processes the table data. To figure out the problem, I added logging to see where it crashed. And it was in the middle of the CREATE TABLE loop that it ran out of memory. The change I made there resolves the problem. I've run the script several times on our database, and it always fails without the patch, and succeeds with it.
I adjusted the COMMENT area because no intermediate processing was required there, so it seemed reasonable to remove a similar potential failure point (though I agree an unlikely one, given the data sizes involved). The other sections that buffer data need the buffer to make decisions, and so would be harder to adjust.
You can certainly leave out the patch on the COMMENTS section. I do know the patch on CREATE TABLE solves the problem. There may be a more fundamental issue, but I'm not a Perl programmer or PostGIS expert.
comment:7 by , 6 years ago
OK, I applied the patch in r17318, on the 3.0 branch. I'm betting on having enough time for others to scream if it breaks anything.
comment:8 by , 6 years ago
strk, do you want to apply this to 2.5.2 or just stick with 3.0? You closed the ticket but didn't flip the milestone to 3.0, so I'm not sure.
comment:9 by , 6 years ago
| Milestone: | PostGIS 2.5.2 → PostGIS 3.0.0 |
|---|---|
strk, I'm flipping this to 3.0. If you do back-commit to 2.5, then flip it to 2.5.2.
I'll release 2.5.2 later tonight or tomorrow.
comment:10 by , 6 years ago
I feel the postgis_restore.pl path is not very well tested, so any change should really be tested thoroughly by hand before backporting. I can't remember whether we have a ticket open about adding such testing to our many bots; it would be great to have one.
Patch to output statements as they are processed, rather than buffering and running out of memory