Opened 2 years ago

Last modified 2 years ago

#2705 new defect

Download rates from grass.osgeo.org extremely low

Reported by: neteler Owned by: sac@…
Priority: normal Milestone: Unplanned
Component: SysAdmin Keywords:
Cc:

Description

I am trying to download a sample data package from grass.osgeo.org, which takes "forever": about 80 KB/s (!)

USA (Portland) --> Germany

wget https://grass.osgeo.org/sampledata/north_carolina/nc_spm_full_v2alpha2.tar.gz
--2022-01-24 16:18:42--  https://grass.osgeo.org/sampledata/north_carolina/nc_spm_full_v2alpha2.tar.gz
Resolving grass.osgeo.org (grass.osgeo.org)... 140.211.15.30
Connecting to grass.osgeo.org (grass.osgeo.org)|140.211.15.30|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 166928373 (159M) [application/x-gzip]
Saving to: ‘nc_spm_full_v2alpha2.tar.gz’

nc_spm_full_v2alpha2.tar.gz        0%[                ] 551.71K  79.6KB/s    eta 34m 4s

For comparison, I get about 6 MB/s from the mirror site in South Africa:

South-Africa --> Germany

mneteler@caddy: ~/tmp$ wget https://grass.mirror.ac.za/sampledata/north_carolina/nc_spm_full_v2alpha2.tar.gz
--2022-01-24 16:28:49--  https://grass.mirror.ac.za/sampledata/north_carolina/nc_spm_full_v2alpha2.tar.gz
Resolving grass.mirror.ac.za (grass.mirror.ac.za)... 2001:4200:fffc::103, 155.232.191.103
Connecting to grass.mirror.ac.za (grass.mirror.ac.za)|2001:4200:fffc::103|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 166928373 (159M) [application/octet-stream]
Saving to: ‘nc_spm_full_v2alpha2.tar.gz.1’

nc_spm_full_v2alpha2.tar.gz.1     21%[===========>     ]  34.62M  6.44MB/s    eta 46s    

Is anything throttling connectivity on the grasslxd container, especially towards Europe?
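To put the two rates in perspective, here is a minimal sketch that turns the figures from the two wget transcripts above into full-file transfer times. It assumes a constant rate and that wget's "K" and "M" mean 1024 and 1024² bytes; the helper name `transfer_seconds` is made up for illustration.

```python
# Figures copied from the two wget transcripts in the description.
FILE_SIZE_BYTES = 166_928_373  # "Length:" reported by both servers
KIB = 1024                     # assumption: wget's "K" is 1024 bytes
MIB = 1024 * KIB


def transfer_seconds(size_bytes: int, rate_bytes_per_s: float) -> float:
    """Time needed to move size_bytes at a constant rate (hypothetical helper)."""
    return size_bytes / rate_bytes_per_s


slow_rate = 79.6 * KIB  # grass.osgeo.org, as shown by wget
fast_rate = 6.44 * MIB  # grass.mirror.ac.za, as shown by wget

slow_s = transfer_seconds(FILE_SIZE_BYTES, slow_rate)
fast_s = transfer_seconds(FILE_SIZE_BYTES, fast_rate)

print(f"grass.osgeo.org:    {slow_s / 60:.1f} min")
print(f"grass.mirror.ac.za: {fast_s:.1f} s")
print(f"speed ratio:        {fast_rate / slow_rate:.0f}x")
```

Under those assumptions the whole file needs roughly 34 minutes from grass.osgeo.org versus under half a minute from the mirror.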

Change History (11)

comment:1 by robe, 2 years ago

No, no throttling. Checking now to see how it is from here.

comment:2 by robe, 2 years ago

Confirmed pretty slow from Boston too.

nc_spm_full_v2alpha2.tar.gz            4%[==>                                                                    ]   7.58M   287KB/s    eta 10m 3s

The download server is slow as well, so this looks like a general issue with osgeo7.

Going to check the other servers.

comment:3 by robe, 2 years ago

It does seem specific to osgeo7.

I ran a comparison against osgeo4:

git clone https://dev.git.osgeo.org/gitea/postgis/postgis.git postgis-test

100% (106594/106594), 53.91 MiB | 15.30 MiB/s, done.

vs. osgeo7

remote: Total 109688 (delta 86395), reused 106890 (delta 83727)

Receiving objects: 100% (109688/109688), 56.17 MiB | 520.00 KiB/s, done.

osgeo7 is due for a reboot, so I'll do that later today and then investigate whether something specific is eating the bandwidth.
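A small sketch for comparing the two git clone rates quoted above. It assumes binary units (KiB = 1024 bytes) and treats wget-style "KB" the same way; `parse_rate` is a hypothetical helper, not part of git or wget.

```python
import re

# Assumption: K/M/G prefixes are binary (1024-based), as git and wget report them.
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_rate(text: str) -> float:
    """Convert a rate such as '520.00 KiB/s' or '79.6KB/s' to bytes per second."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([KMG]?)i?B/s\s*", text)
    if match is None:
        raise ValueError(f"unrecognised rate: {text!r}")
    value, prefix = match.groups()
    return float(value) * _UNITS[prefix]


osgeo4 = parse_rate("15.30 MiB/s")   # clone rate reported for osgeo4
osgeo7 = parse_rate("520.00 KiB/s")  # clone rate reported for osgeo7

print(f"osgeo4 is about {osgeo4 / osgeo7:.0f}x faster than osgeo7 for this clone")
```

With the figures from this comment, the ratio comes out at roughly 30x.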

comment:4 by robe, 2 years ago

I installed vnstat on the different servers to get a sense of how much traffic each gets. It's still building up stats, but already I see osgeo7 does get 10 times more traffic than osgeo3 (and osgeo4 gets very little comparatively).

I think there is a per-server limit from the shared Ethernet card alone, but OSUOSL might also set a limit. I will ask OSUOSL about this.

comment:5 by neteler, 2 years ago

Thanks, @robe, for inspecting this. It is an issue I have observed multiple times over many months (I only wrote this ticket today).

comment:6 by neteler, 2 years ago

Right now this takes "forever" (about 87 KB/s):

wget http://download.osgeo.org/gdal/3.4.1/gdal-3.4.1.tar.gz
gdal-3.4.1.tar.gz   63%[===========================>           ] 12.11M 87.1KB/s  

(download server --> Bonn, Germany)

Is there anything that could be done about it? Thanks.

comment:7 by robe, 2 years ago

@neteler,

Sorry about that. I'm planning to round-robin download.osgeo.org this month (across osgeo8 and osgeo4), but I'm still testing a few things before I do. That way the load would be split across servers. I know, for example, that osgeo4 is pretty fast, so the problem is just the terabytes of traffic osgeo7 is handling.

One change I have to do before then is make sure everyone uploads to upload.osgeo.org instead of download.osgeo.org. I'll send out a note in a week or so about that.

comment:8 by robe, 2 years ago

@neteler,

I discovered I can set up a light nginx proxy on the other servers without having to pull over all the data, since the speed between servers is fast. I tried downloading:

wget http://download-cache.osgeo.org/gdal/3.4.1/gdal-3.4.1.tar.gz

and it was about 11 MB/s.

This is just a temporary name.

My plan is to balance the traffic on download.osgeo.org across the servers and then eventually add some CDNs too.

I can't switch download yet, since I think a lot of the rsyncs are set to use download.osgeo.org and many people are using the download.osgeo.org name for uploading. I need everyone to use upload.osgeo.org for uploading instead. Something similar can be done for other sites like live.osgeo.org and grass.osgeo.org, as long as you aren't using that name for rsync; rsync would need to use a different name.
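For anyone who wants to repeat the comparison between the direct name and the proxy name, here is a rough sketch that samples the first few megabytes of a URL and reports the mean rate. The function names, the 5 MiB sample size and the timeout are assumptions for illustration, and download-cache.osgeo.org is only a temporary name, so it may no longer resolve.

```python
import time
import urllib.request
from typing import BinaryIO, Tuple


def read_rate(stream: BinaryIO, limit_bytes: int,
              chunk_size: int = 64 * 1024) -> Tuple[int, float]:
    """Read up to limit_bytes from stream; return (bytes_read, elapsed_seconds)."""
    start = time.monotonic()
    total = 0
    while total < limit_bytes:
        chunk = stream.read(min(chunk_size, limit_bytes - total))
        if not chunk:  # end of stream before the limit was reached
            break
        total += len(chunk)
    return total, time.monotonic() - start


def sample_url(url: str, limit_bytes: int = 5 * 1024 * 1024,
               timeout: float = 30.0) -> float:
    """Mean download rate in KiB/s over the first limit_bytes of url."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        n, secs = read_rate(response, limit_bytes)
    return n / 1024 / secs if secs > 0 else float("inf")


# Example call (needs network access, so it is left commented out):
# for host in ("download.osgeo.org", "download-cache.osgeo.org"):
#     print(host, sample_url(f"http://{host}/gdal/3.4.1/gdal-3.4.1.tar.gz"))
```

Sampling only the start of the file keeps the test short, at the cost of including connection setup in the measured time.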

comment:9 by robe, 2 years ago

Okay, I added osgeo9 as a backup for download.osgeo.org, so it may not be as fast anymore if you test. I'm monitoring it very closely and will also investigate whether we need to kill some traffic coming to download.osgeo.org.

comment:10 by neteler, 2 years ago

What's not clear to me: my report refers to grass.osgeo.org (= osgeo7.osgeo.org), isn't that a different box than download.osgeo.org?

comment:11 by robe, 2 years ago

They are both on the same host, osgeo7, but in different containers. They use the same IP.
