Opened 16 years ago

Closed 16 years ago

#344 closed task (fixed)

GeoTools auto builds failing due to maven repository webdav folder not responding

Reported by: jive
Owned by: warmerdam
Priority: normal
Milestone:
Component: SysAdmin
Keywords: download
Cc: aaime

Description

Our auto build box (running hudson) is failing maven builds randomly due to the osgeo webdav folder for geotools (where we publish our jars and pom.xml files) not being able to keep up.

Here is an example error:

Error transferring file

org.opengis:geoapi-pending:jar:2.3-SNAPSHOT

from the specified remote repositories:

central (http://repo1.maven.org/maven2), osgeo (http://download.osgeo.org/webdav/geotools/), maven2-repository.dev.java.net (http://download.java.net/maven/2), geotools (http://maven.geotools.fr/repository)

Path to dependency:

1) org.geotools:gt-mysql:jar:2.6-SNAPSHOT
2) org.geotools:gt-jdbc:jar:2.6-SNAPSHOT
3) org.geotools:gt-api:jar:2.6-SNAPSHOT
4) org.geotools:gt-referencing:jar:2.6-SNAPSHOT
5) org.geotools:gt-metadata:jar:2.6-SNAPSHOT
6) org.opengis:geoapi-pending:jar:2.3-SNAPSHOT

This problem is intermittent. We have set up a second repository on an opengeo server, but this creates some confusion for the user community (and we do not like "do what we say, not what we do" as a policy).

Change History (14)

comment:1 by jive, 16 years ago

Summary: changed from "GeoTools auto builds failing due to maven repository timing out" to "GeoTools auto builds failing due to maven repository webdav folder not responding"

comment:2 by aaime, 16 years ago

Btw, I observed 503 errors as well. Hypotheses so far:

  • the webdav crashes and is automatically restarted after a while
  • the webdav has some DoS prevention mechanism and starts answering 503 if too many requests are being made

Mind that a Maven repo is bound to have relatively high traffic.

comment:3 by aaime, 16 years ago

Cc: aaime added

comment:4 by aaime, 16 years ago

Our builds on the build server (http://gridlock.openplans.org:8080/hudson) keep on failing on 503 errors when trying to retrieve: http://download.osgeo.org/webdav/geotools/org/opengis/geoapi-pending/2.3-SNAPSHOT/geoapi-pending-2.3-SNAPSHOT.jar

A 503 is described as:

The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. If known, the length of the delay MAY be indicated in a Retry-After header. If no Retry-After is given, the client SHOULD handle the response as it would for a 500 response.

      Note: The existence of the 503 status code does not imply that a
      server must use it when becoming overloaded. Some servers may wish
      to simply refuse the connection.

If this is not solved soon we'll have to look into different hosting; it's been over a week already (we waited a few days before reporting to make sure it was not a temporary issue).

A 503 error seems to point to some sort of server issue, but if you can look into your webdav logs and tell us there is a problem in the way we hit it, we can have a look on our side.
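
The Retry-After behaviour in the 503 description quoted above could be handled on the client side roughly as follows. This is only a sketch, not something Maven or hudson is known to do: the function names `retry_delay` and `fetch_with_retry`, the 1-second base delay and the 60-second cap are all assumptions made for illustration.

```python
import email.utils
import time
import urllib.error
import urllib.request


def retry_delay(retry_after, attempt, base=1.0, cap=60.0):
    """Return the number of seconds to wait before the next attempt.

    Uses the server's Retry-After value (delta-seconds or an HTTP-date)
    when present; otherwise falls back to capped exponential backoff.
    The base and cap defaults are illustrative assumptions.
    """
    if retry_after:
        value = retry_after.strip()
        if value.isdigit():
            return min(float(value), cap)
        try:
            when = email.utils.parsedate_to_datetime(value)
            return max(0.0, min(when.timestamp() - time.time(), cap))
        except (TypeError, ValueError):
            pass  # unparsable header: fall through to backoff
    return min(base * (2 ** attempt), cap)


def fetch_with_retry(url, attempts=5):
    """Download url, retrying only when the server answers 503."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code != 503 or attempt == attempts - 1:
                raise
            time.sleep(retry_delay(err.headers.get("Retry-After"), attempt))
```

A client like this would only mask the symptom; it does not explain why the server returns 503 in the first place.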

comment:5 by aaime, 16 years ago

For the record, I ran this script from the same machine that hosts the build:

for i in $(seq 1 100); do wget http://download.osgeo.org/webdav/geotools/org/opengis/geoapi-pending/2.3-SNAPSHOT/geoapi-pending-2.3-SNAPSHOT.jar; done

and I didn't see a single 503, so I'm wondering what's different about our build server. A server-side log, if there is one, could shed some light on the issue.

comment:6 by tmitchell, 16 years ago

Component: changed from General to SAC
Owner: changed from tmitchell to sac@…

Changing component to SAC instead of general. Can someone in SAC look at this?

comment:7 by warmerdam, 16 years ago

Keywords: download added

I would note that download.osgeo.org has machinery in place to refuse too many connections from the same source at the same time. Is it possible that maven is (sometimes) making requests in parallel?

Some information on this is available in #216.

If this is the issue, we may be able to raise the maxip count for the maven directories or even make this a distinct virtual server on the same machine.
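
One way to check the parallel-request theory would be a probe that keeps several requests in flight at once, unlike the sequential wget loop in comment:5, which never has more than one connection open. The sketch below is an assumption-laden illustration: the names `http_status` and `probe`, the 30-second timeout and the default of 6 parallel requests are not taken from Maven or from the server configuration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request

# The artifact named in comment:4.
URL = ("http://download.osgeo.org/webdav/geotools/org/opengis/geoapi-pending/"
       "2.3-SNAPSHOT/geoapi-pending-2.3-SNAPSHOT.jar")


def http_status(url):
    """Fetch url and return its HTTP status code, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            response.read()
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def probe(url, total=100, parallel=6, fetch=http_status):
    """Issue `total` requests with up to `parallel` in flight; tally the codes."""
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return Counter(pool.map(fetch, [url] * total))
```

Calling `probe(URL, parallel=1)` corresponds to the sequential loop, while `probe(URL, parallel=6)` would exceed a per-source limit of three connections. If 503s show up only in the second tally, that would support the connection-limit explanation.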

comment:8 by jive, 16 years ago

Yes, that is very possible. I also note that our build box (a fast machine with a very fast internet connection) is experiencing more problems than others.

Can we try raising the maxip count for these directories and perform a trial?

comment:9 by warmerdam, 16 years ago

Owner: changed from sac@… to warmerdam

I have increased the MaxIP count parameter from three to six in /etc/httpd/conf.d/sites/download.osgeo.org.conf on the server. Could you try again and see if this seems to solve it? If so, I'll try to restrict this setting to the maven repository portion of the site, though I'm not exactly sure how to do this.

in reply to:  9 comment:10 by groldan, 16 years ago

That seems to work. The GeoTools build at hudson is back with no complaints.

comment:11 by jive, 16 years ago

Gabriel, the hudson build box is currently hitting your own repo, and only touching on the osgeo module for this one file. Could we try asking it to hit osgeo for everything in order to stress the system? Or would you rather just close this, since we have identified the problem and know what to ask for if teams run into trouble in the future?

comment:12 by groldan, 16 years ago

Jody, I'm not sure I understand what you mean by hitting our own repo, though I don't manage the hudson instance and that might be the reason. AFAIK the repos being hit are: codehaus-snapshot-plugins, osgeo and maven2-repository.dev.java.net

Does that make any sense? Doesn't the osgeo one replace the RR one for us? Btw, I'm running a full build on a clean box right now, if that serves as a stress test.

comment:13 by jive, 16 years ago

Just watched this failure crop up again: http://hudson.opengeo.org/hudson/job/geotools-trunk/1563/console

Error transferring file

org.opengis:geoapi-pending:jar:2.3-SNAPSHOT

from the specified remote repositories:

central (http://repo1.maven.org/maven2), osgeo (http://download.osgeo.org/webdav/geotools/), maven2-repository.dev.java.net (http://download.java.net/maven/2), geotools (http://maven.geotools.fr/repository)

Path to dependency:

1) org.geotools:gt-legacy:jar:2.6-SNAPSHOT
2) org.geotools:gt-referencing:jar:2.6-SNAPSHOT
3) org.geotools:gt-metadata:jar:2.6-SNAPSHOT
4) org.opengis:geoapi-pending:jar:2.3-SNAPSHOT

We are going to try to make use of a repository on opengeo hardware for SNAPSHOTS, and save the osgeo repository for stable releases. I am a bit worried that this just offloads the problem onto client projects :-(

Could we try bumping up the MaxIP count again? That may help.

comment:14 by warmerdam, 16 years ago

Resolution: fixed
Status: changed from new to closed

I have updated the /etc/httpd/conf.d/sites/download.osgeo.org.conf file to include:

<Directory "/osgeo/download">
  Options FollowSymLinks Indexes
  MaxConnPerIP 3
</Directory>
<Directory "/osgeo/download/webdav">
  Options FollowSymLinks Indexes
  MaxConnPerIP 12
</Directory>

This seems to give 12 connections in the webdav area and only 3 in the rest of the repository.
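
The intended effect of the two Directory sections can be pictured as a longest-prefix lookup, since Apache merges Directory sections from the shortest path to the longest and the more specific one wins. The sketch below is only a model of that rule for the two values in this ticket. The helper `effective_limit` and the `LIMITS` table are hypothetical names, and this is not how the module enforcing MaxConnPerIP is implemented.

```python
def effective_limit(path, limits, default=None):
    """Return the limit of the longest configured directory containing path."""
    best = None
    for directory, limit in limits.items():
        prefix = directory.rstrip("/")
        # Match whole path segments only, so /osgeo/download/webdav-old
        # is not treated as being inside /osgeo/download/webdav.
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, limit)
    return best[1] if best else default


# The two sections from the configuration above.
LIMITS = {"/osgeo/download": 3, "/osgeo/download/webdav": 12}
```

Under this model a request for anything below /osgeo/download/webdav is allowed 12 concurrent connections per source address, and everything else under /osgeo/download stays at 3.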

I'm tentatively closing this ticket - reopen if there are still problems.
