Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#2877 closed task (fixed)

tracsvn down out of disk space

Reported by: robe Owned by: sac@…
Priority: normal Milestone: Sysadmin Contract 2023-I
Component: SysAdmin Keywords:
Cc:

Description

Trac shut down because of out-of-disk-space issues again.

I cleared some snapshots and added another 100GB.

Checking to see if it's just a snapshot or something else eating the disk space.

Change History (6)

comment:1 by robe, 2 years ago

Okay, confirmed: most of the space is taken up by /home/git/gitea-data/repo-archive.
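For reference, a minimal sketch of how such a check could be scripted, assuming Python 3 is available on the host. The function names `dir_size_bytes` and `largest_children` are illustrative only and not part of any existing tooling; the sizes are apparent file sizes, so they can differ somewhat from what `du` reports.

```python
import os


def dir_size_bytes(root):
    """Sum the apparent sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.lstat(path).st_size
            except OSError:
                # File vanished or is unreadable; ignore it.
                pass
    return total


def largest_children(root, top=5):
    """Return up to `top` (size_in_bytes, name) pairs for the biggest subdirectories of root."""
    sizes = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((dir_size_bytes(entry.path), entry.name))
    sizes.sort(reverse=True)
    return sizes[:top]
```

Calling `largest_children("/home/git/gitea-data")` would list the subdirectories that take the most space, largest first.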

Clicking "Delete all repositories' archives (ZIP, TAR.GZ, etc..)" at https://git.osgeo.org/gitea/admin cleared 200GB of space, so we are down to only 97GB used.

strk thinks it might be bots crawling, as Gitea creates a physical file whenever an archive is requested.

Also, it seems we might not have the Gitea cron job enabled. Going to enable that, close this out, and see how it goes.

comment:2 by robe, 2 years ago

Gitea is running crons, but not for

"Delete all repositories archives(Zip, tar, gz, etc)"

It's running another one, "Delete all repository archives". I think we need to change the app.ini to enable the first one, as it's turned off by default.

But maybe we should just see how much this builds up. This job has never been run, so if it took many years to get up to 200GB, maybe it doesn't need to run that often.

comment:3 by strk, 2 years ago

It's running the "delete old archives" job, where "old" can be defined in app.ini (24h by default, which is what we use). The schedule can also be defined; the default is midnight, which we're using. Keep an eye on that directory and please verify that the oldest archive was created not more than 24h before "now".
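The check on the oldest archive could be scripted along these lines. This is only a sketch: `oldest_file_age_seconds` is a hypothetical helper, and it uses each file's modification time as a stand-in for its creation time, since a creation timestamp is not portably available on Linux.

```python
import os
import time


def oldest_file_age_seconds(root, now=None):
    """Return the age in seconds of the oldest file under root, or None if there are no files.

    Age is measured from the file's mtime, which is assumed here to
    approximate the time the archive was created.
    """
    now = time.time() if now is None else now
    oldest_mtime = None
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                mtime = os.lstat(os.path.join(dirpath, name)).st_mtime
            except OSError:
                continue  # file removed while scanning
            if oldest_mtime is None or mtime < oldest_mtime:
                oldest_mtime = mtime
    return None if oldest_mtime is None else now - oldest_mtime
```

If the job only removes files older than 24h and only runs once a day, an archive created just after a run could plausibly be close to 48h old right before a later run, so an alert threshold somewhat above 24h may be more realistic.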

Possible improvement: schedule it more frequently than every midnight (once an hour?) and make "old" happen sooner (8 hours?).

The "delete all archives" job is only needed if we think someone could create a lot more archives in a short period of time. This IS technically possible, and I don't see any way to prevent it other than some limiting factor at the nginx level. It would be worth an upstream issue asking Gitea to allow NEVER saving those files, or to provide other tasks that decide when to clean based on the SIZE of the archive directory rather than the AGE of those archives.
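To illustrate what size-based cleanup could mean, here is a hypothetical sketch; it is not a Gitea feature. `files_to_prune` is an invented name, and the function only selects candidates (oldest first, by mtime) without deleting anything. Removing archive files behind Gitea's back may leave its own bookkeeping out of sync, so treat this purely as a description of the policy.

```python
import os


def files_to_prune(root, max_bytes):
    """Return paths, oldest first, whose removal would bring root down to max_bytes or less.

    Nothing is deleted here; the caller decides what to do with the list.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file removed while scanning
            entries.append((st.st_mtime, path, st.st_size))

    total = sum(size for _mtime, _path, size in entries)
    victims = []
    for _mtime, path, size in sorted(entries):  # oldest mtime first
        if total <= max_bytes:
            break
        victims.append(path)
        total -= size
    return victims
```

Selecting oldest-first keeps recently requested archives around, which is the closest analogue to the existing age-based job while still putting a ceiling on the directory size.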

comment:4 by robe, 2 years ago

Looks like it hasn't run yet as I still see N/A for previous time.

I'll check in the next 24 hrs to see if things have changed. I assume it goes by the tracsvn clock time, which is set to PST and is currently Fri 13 Jan 2023 04:44:06 PM PST.

Right now the repo-archive folder has grown to 53G.

I did up the container size by an additional 100G, just in case taking a snapshot makes it run out of disk space.

Total space used at the moment is 144G.

comment:5 by robe, 2 years ago

Resolution: fixed
Status: new → closed

Seems to be working: the repo-archive folder is now at 16G and space used in the container is 107G. So a reduction of about 37G since I last checked.

The monitoring page shows it last ran at Sat, 14 Jan 2023 00:00:00 -0800 (and has run 2 times).

I've changed the tracsvn container snapshots to run at 08:32 UTC.

Before, they were running at 18:01 UTC.

That way the snapshots are taken shortly after the midnight cleanup (00:00 -0800 is 08:00 UTC), so they can be as small as they can be.
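The timing can be double-checked with a small calculation. This is a sketch that assumes a fixed -0800 offset, as shown on the monitoring page, so it does not account for daylight saving time; `minutes_after_cleanup` is a name made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset taken from the monitoring page timestamp (-0800); DST is ignored.
PST = timezone(timedelta(hours=-8))

# Last cleanup run reported by the monitoring page, converted to UTC.
cleanup_local = datetime(2023, 1, 14, 0, 0, tzinfo=PST)
cleanup_utc = cleanup_local.astimezone(timezone.utc)


def minutes_after_cleanup(snapshot_hhmm, cleanup=cleanup_utc):
    """Minutes between the cleanup and the next snapshot at (hour, minute) UTC."""
    hour, minute = snapshot_hhmm
    snapshot = cleanup.replace(hour=hour, minute=minute)
    if snapshot < cleanup:
        snapshot += timedelta(days=1)  # snapshot falls on the following day
    return int((snapshot - cleanup).total_seconds() // 60)


print(cleanup_utc.strftime("%H:%M UTC"))
print(minutes_after_cleanup((8, 32)))   # new snapshot time
print(minutes_after_cleanup((18, 1)))   # previous snapshot time
```

The shorter the gap between the cleanup and the snapshot, the less time archives have to accumulate before the snapshot is taken.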

comment:6 by robe, 2 years ago

Thinking about this more, maybe it would be better to run the cleanup every 12 hrs, but I'll leave it alone for now. The reason I'm thinking that is that right now the last osgeo4 backup snapshot runs at 5:13 UTC, so the archive folder would be at its heftiest at that time.

I suppose I could just change that backup schedule (though that one applies to all osgeo7 containers). Not a huge deal though, since it is much smaller now and 40G here and there doesn't matter much. The only problem case is when a bot comes along and crawls all the archives.
