Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#2724 closed defect (fixed)

Downloads from repo.osgeo.org interrupted

Reported by: juanluisrp
Owned by: sac@…
Priority: normal
Milestone: Unplanned
Component: SysAdmin
Keywords: nexus, repo.osgeo.org, repository, maven, connection
Cc:

Description

Maven artifact downloads from repo.osgeo.org are failing, at least for large files. The connection gets interrupted partway through the transfer. For example:

 curl -v https://repo.osgeo.org/repository/release/org/geoserver/web/gs-web-app/2.20.3/gs-web-app-2.20.3.jar -o gs-web-app-2.20.3.jar
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 140.211.15.6:443...
* Connected to repo.osgeo.org (140.211.15.6) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [228 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [102 bytes data]
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [4038 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [300 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [37 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=nexus.osgeo.org
*  start date: Jan 10 21:50:40 2022 GMT
*  expire date: Apr 10 21:50:39 2022 GMT
*  subjectAltName: host "repo.osgeo.org" matched cert's "repo.osgeo.org"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fab6080d600)
> GET /repository/release/org/geoserver/web/gs-web-app/2.20.3/gs-web-app-2.20.3.jar HTTP/2
> Host: repo.osgeo.org
> user-agent: curl/7.77.0
> accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
< HTTP/2 200
< server: nginx/1.18.0
< date: Wed, 02 Mar 2022 09:21:41 GMT
< content-type: application/java-archive
< content-length: 107447694
< x-content-type-options: nosniff
< content-security-policy: sandbox allow-forms allow-modals allow-popups allow-presentation allow-scripts allow-top-navigation
< x-xss-protection: 1; mode=block
< last-modified: Wed, 23 Feb 2022 23:54:12 GMT
< etag: "{SHA1{d086b48d0c373b37abf1b13b6065543962d9d5be}}"
< content-disposition: inline
< front-end-https: on
<
{ [7713 bytes data]
  3  102M    3 4139k    0     0   163k      0  0:10:42  0:00:25  0:10:17  192k* HTTP/2 stream 0 was not closed cleanly: INTERNAL_ERROR (err 2)
* stopped the pause stream!
  4  102M    4 4319k    0     0   164k      0  0:10:38  0:00:26  0:10:12  191k
* Connection #0 to host repo.osgeo.org left intact
curl: (92) HTTP/2 stream 0 was not closed cleanly: INTERNAL_ERROR (err 2)

or

 curl -v https://repo.osgeo.org/repository/release/org/geonetwork-opensource/web-app/3.10.3-0/web-app-3.10.3-0.war -o gn.war
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 140.211.15.6:443...
* Connected to repo.osgeo.org (140.211.15.6) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [228 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [102 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [4038 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [300 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [37 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=nexus.osgeo.org
*  start date: Jan 10 21:50:40 2022 GMT
*  expire date: Apr 10 21:50:39 2022 GMT
*  subjectAltName: host "repo.osgeo.org" matched cert's "repo.osgeo.org"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7f7afa80f200)
> GET /repository/release/org/geonetwork-opensource/web-app/3.10.3-0/web-app-3.10.3-0.war HTTP/2
> Host: repo.osgeo.org
> user-agent: curl/7.77.0
> accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0< HTTP/2 200
< server: nginx/1.18.0
< date: Wed, 02 Mar 2022 09:30:12 GMT
< content-type: application/java-archive
< content-length: 199722063
< x-content-type-options: nosniff
< content-security-policy: sandbox allow-forms allow-modals allow-popups allow-presentation allow-scripts allow-top-navigation
< x-xss-protection: 1; mode=block
< last-modified: Tue, 11 May 2021 11:33:59 GMT
< etag: "{SHA1{a0bb8403b4049de9cd9ca08f80795d199e2584fa}}"
< content-disposition: inline
< front-end-https: on
<
{ [7713 bytes data]
  2  190M    2 4087k    0     0   197k      0  0:16:27  0:00:20  0:16:07  243k* HTTP/2 stream 0 was not closed cleanly: INTERNAL_ERROR (err 2)
* stopped the pause stream!
  2  190M    2 4103k    0     0   198k      0  0:16:23  0:00:20  0:16:03  254k
* Connection #0 to host repo.osgeo.org left intact
curl: (92) HTTP/2 stream 0 was not closed cleanly: INTERNAL_ERROR (err 2)
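
To confirm a truncated transfer without reading the whole verbose log, the bytes on disk can be compared with the content-length header the server sent. A minimal sketch; the helper name check_size is made up for illustration and is not part of curl:

```shell
# Hypothetical helper: compare a downloaded file's size with the
# content-length value advertised by the server.
check_size() {
  file=$1
  expected=$2
  # wc -c pads its output on some platforms, so strip the spaces
  actual=$(wc -c < "$file" | tr -d ' ')
  if [ "$actual" -eq "$expected" ]; then
    echo "complete: $actual bytes"
  else
    echo "truncated: got $actual of $expected bytes"
    return 1
  fi
}

# Usage, with the content-length from the first transfer above:
# check_size gs-web-app-2.20.3.jar 107447694
```

For the first transfer above this would report a truncated file, since curl stopped at roughly 4 MB of 102 MB.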

Change History (10)

comment:1 by jive, 2 years ago

Oh thanks Juan; this explains why my geoserver release was failing to deploy … it is failing when deploying the first large jar.

I expect some type of network monitoring tool was introduced here and is causing problems.


comment:2 by jive, 2 years ago

This is causing lots of automations to fail:

Error:  Failed to execute goal on project gt-grib: Could not resolve dependencies for project org.geotools:gt-grib:jar:27-SNAPSHOT: Could not transfer artifact edu.ucar:grib:jar:4.6.15 from/to osgeo (https://repo.osgeo.org/repository/release/): GET request of: edu/ucar/grib/4.6.15/grib-4.6.15.jar from osgeo failed: Premature end of Content-Length delimited message body (expected: 5,319,148; received: 4,455,935) -> [Help 1]

comment:3 by robe, 2 years ago

jive looking at this now. I don't think we put in any new tools to cause this though.

How long has this been going on?

comment:4 by robe, 2 years ago

Milestone: Unplanned → Sysadmin Contract 2022-II

comment:5 by jive, 2 years ago

Milestone: Sysadmin Contract 2022-II → Unplanned

Some progress with darkblue and Juan troubleshooting:

  • nexus itself is okay: logging onto the machine and fetching via localhost:8081 downloads content without issue
  • the nginx (osgeo3-nginx) configuration has likely introduced a size limit affecting routing to nexus
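
The direct-versus-proxied comparison described above could be repeated along these lines. This is only a sketch: the helper name fetch_size is made up, and it assumes nexus serves the same /repository/... path on port 8081 as the public URL does, which the ticket does not confirm.

```shell
# Hypothetical helper: download a URL, discard the body, and print how
# many bytes actually arrived (curl's %{size_download} write-out variable).
fetch_size() {
  curl -s -o /dev/null -w '%{size_download}' "$1"
}

# Assumption: the path is identical on the nexus port and behind nginx.
path=repository/release/org/geoserver/web/gs-web-app/2.20.3/gs-web-app-2.20.3.jar

# direct=$(fetch_size "http://localhost:8081/$path")    # run on the nexus host
# proxied=$(fetch_size "https://repo.osgeo.org/$path")  # through nginx
# echo "direct=$direct proxied=$proxied"
```

If the direct number matches the content-length and the proxied one falls short, the problem sits in front of nexus rather than in nexus itself.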

in reply to:  3 comment:6 by jive, 2 years ago

Replying to robe:

jive looking at this now. I don't think we put in any new tools to cause this though.

How long has this been going on?

Let me check the geoserver deploy logs, I can at least get the correct date range.

  • Feb 23rd: deploy worked
  • March 1st: deploy failed

The public is starting to notice it in build jobs and integration tests today.

So sometime on the weekend?

comment:7 by robe, 2 years ago

Looks like osgeo3-nginx ran out of disk space. It was provisioned with 100 GB. I've upped it to 200 GB and will review to make sure nothing unusual is going on. Can you try again and see if that resolves the issue?

comment:8 by robe, 2 years ago

Okay, determined the cause of the disk filling up. I forgot to turn on log rotation on osgeo3-nginx when I set it up. As a result, 90 GB of space is taken up by nginx logs going back to the beginning. I'll enable log rotation, which should prevent this issue going forward.
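
A check along these lines could flag a filling disk before downloads start breaking. It is a sketch only: the function names and the 90% default are illustrative, and /var/log/nginx in the usage line is the common default log location, not something confirmed for osgeo3-nginx.

```shell
# Print the percentage of space used on the filesystem holding a path.
# df -P gives the POSIX column layout, where field 5 is the capacity.
used_pct() {
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Warn when the filesystem holding a path is at or above a threshold
# (illustrative default: 90%).
check_disk() {
  path=$1
  limit=${2:-90}
  pct=$(used_pct "$path")
  if [ "$pct" -ge "$limit" ]; then
    echo "WARNING: $path is ${pct}% full"
    return 1
  fi
  echo "OK: $path is ${pct}% full"
}

# Usage (path is an assumption):
# check_disk /var/log/nginx 90
# du -sh /var/log/nginx    # how much of that is nginx logs
```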

comment:9 by jive, 2 years ago

Resolution: fixed
Status: new → closed

Deploy to nexus completed successfully.

Thanks for your assistance everyone!

comment:10 by robe, 2 years ago

Okay, logrotate is all set and I forced a rotation with:

logrotate -vf /etc/logrotate.d/nginx

to confirm I have the configs right, so this shouldn't be a problem moving forward.
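
For reference, a typical nginx rotation policy looks roughly like the one written out below. This is not the config actually deployed on osgeo3-nginx, which the ticket does not show; the function name, the log path, the pid file location, and the retention values (daily, 14 copies) are all assumptions.

```shell
# Hypothetical helper: write a sample logrotate policy for nginx to the
# given file so it can be reviewed or dry-run before being installed.
write_sample_rotation() {
  cat > "$1" <<'EOF'
/var/log/nginx/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    sharedscripts
    postrotate
        # nginx reopens its log files when it receives USR1
        if [ -f /var/run/nginx.pid ]; then
            kill -USR1 "$(cat /var/run/nginx.pid)"
        fi
    endscript
}
EOF
}

# Usage: write the sample, then let logrotate parse it in debug mode,
# which reports what would happen without rotating anything.
# write_sample_rotation /tmp/nginx-rotation.sample
# logrotate -d /tmp/nginx-rotation.sample
```

The -d flag is the dry-run counterpart of the forced rotation with -vf used above.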
