Opened 16 years ago
Closed 9 years ago
#277 closed task (fixed)
Robots Are Attacking!
Reported by: | warmerdam | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | SysAdmin | Keywords: | trac |
Cc: |
Description
Today we were able to catch one of our load spikes in action. The server-status report indicated:
Srv PID Acc M CPU SS Req Conn Child Slot Client VHost Request
0-0 28743 0/909/2477 W 162.91 92 0 0.0 10.29 21.07 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.4?old_path=%2f&format=
1-0 2426 0/44/1752 W 11.97 133 0 0.0 2.51 23.90 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.4?old_path=%2f&format=
2-0 2876 0/2/1699 W 1.35 120 0 0.0 0.01 14.26 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.5?old_path=%2f&format=
3-0 2880 0/8/2075 W 3.45 77 0 0.0 0.01 91.60 70.91.111.164 trac.osgeo.org GET /gdal/log/ HTTP/1.0
4-0 2882 0/11/2494 W 4.84 0 0 0.0 0.14 32.74 70.91.111.164 trac.osgeo.org GET /gdal/log/sandbox/ajolma/swig HTTP/1.0
5-0 2883 0/6/1292 W 1.81 10 0 0.0 0.03 17.24 70.91.111.164 trac.osgeo.org GET /gdal/log/trunk?rev=14376 HTTP/1.0
6-0 540 0/279/952 W 53.38 109 0 0.0 6.77 14.25 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.5?old_path=%2f&format=
7-0 543 0/276/1812 W 55.07 109 0 0.0 2.62 14.39 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.4?old_path=%2f&format=
8-0 20939 0/2031/2508 W 390.31 200 0 0.0 25.80 30.36 198.253.49.6 trac.osgeo.org GET /ossim/doxygen/classossimImageData.html HTTP/1.1
9-0 2890 0/20/2507 W 4.27 5 0 0.0 0.27 14.41 74.6.22.97 trac.osgeo.org GET /fdo/wiki/WikiFormatting HTTP/1.0
10-0 2893 0/0/1744 W 181.85 101 0 0.0 0.00 42.93 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.4?old_path=%2f&format=
11-0 26129 0/1332/1966 W 212.63 0 0 0.0 9.06 25.59 70.91.111.164 trac.osgeo.org GET /gdal/changeset/13196/sandbox/ajolma HTTP/1.0
12-0 546 0/277/785 W 56.27 115 0 0.0 1.47 5.29 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.5?old_path=%2f&format=
13-0 2895 0/18/609 W 4.95 0 0 0.0 0.44 5.58 67.195.37.123 osgeo1.osgeo.org GET /switchuilocale/id?destination=node%2F723 HTTP/1.0
14-0 548 0/283/982 W 59.52 74 0 0.0 1.76 7.41 70.91.111.164 trac.osgeo.org GET /gdal/log/trunk HTTP/1.0
15-0 2896 0/0/591 W 34.98 96 0 0.0 0.00 4.39 70.91.111.164 trac.osgeo.org GET /gdal/log/branches/1.4 HTTP/1.0
16-0 2897 0/7/733 W 3.37 0 0 0.0 0.18 5.03 209.169.157.146 osgeo1.osgeo.org GET / HTTP/1.0
17-0 551 0/273/2312 W 49.57 128 0 0.0 5.62 26.73 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.4?old_path=%2f&format=
18-0 552 0/262/1491 W 44.07 127 0 0.0 1.06 22.71 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.5?old_path=%2f&format=
19-0 2898 0/9/295 W 4.06 0 0 0.0 0.22 2.45 70.91.111.164 trac.osgeo.org GET /gdal/browser/sandbox/crschmidt?order=size HTTP/1.0
20-0 2899 0/5/433 W 1.43 20 0 0.0 0.10 2.69 70.91.111.164 trac.osgeo.org GET /gdal/changeset/13196/sandbox/ajolma HTTP/1.0
21-0 20959 0/2073/2346 W 382.95 9 0 0.0 27.21 28.29 70.91.111.164 trac.osgeo.org GET /gdal/changeset/13196/sandbox/ajolma HTTP/1.0
22-0 2900 0/3/456 W 1.17 20 0 0.0 0.08 2.68 70.91.111.164 trac.osgeo.org GET /gdal/log/trunk?rev=14376 HTTP/1.0
23-0 20966 0/2043/2121 W 362.15 3 0 0.0 39.28 40.03 70.91.111.164 trac.osgeo.org GET /gdal/changeset/13196/sandbox/ajolma HTTP/1.0
24-0 2901 0/1/377 W 0.00 94 0 0.0 0.000 5.31 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.5?old_path=%2f&format=
25-0 20968 0/2090/2137 W 406.93 1 0 0.0 48.82 49.00 70.91.111.164 trac.osgeo.org GET /gdal/changeset/13273/sandbox/crschmidt HTTP/1.0
26-0 2904 0/9/209 W 3.43 2 0 0.0 0.15 0.97 70.91.111.164 trac.osgeo.org GET /gdal/changeset/11871/sandbox/hobu HTTP/1.0
27-0 558 0/265/519 W 54.33 116 0 0.0 1.25 3.52 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.4?old_path=%2f&format=
28-0 559 0/282/438 W 46.89 77 0 0.0 2.26 4.42 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.4?old_path=%2f&format=
29-0 20982 0/2112/2125 W 394.18 1 0 0.0 22.55 22.84 70.91.111.164 trac.osgeo.org GET /gdal/changeset/11871/sandbox/hobu HTTP/1.0
30-0 2906 0/22/79 W 6.23 0 0 0.0 0.44 1.24 74.6.18.233 osgeo1.osgeo.org GET /pipermail/mapserver-users/2003-December/047445.html HTTP/1
31-0 2907 0/12/1450 W 2.25 58 0 0.0 0.09 8.68 74.6.22.97 trac.osgeo.org GET /grass/query?status=new&status=assigned&status=reopened&mil
32-0 19340 0/2268/2293 W 429.97 78 0 0.0 29.11 29.57 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.5?old_path=%2f&format=
33-0 2910 0/15/177 W 5.81 8 0 0.0 0.20 1.93 70.91.111.164 trac.osgeo.org GET /gdal/log/trunk?rev=14376 HTTP/1.0
34-0 2911 0/10/642 W 2.71 0 0 0.0 0.36 4.87 24.61.22.108 trac.osgeo.org GET /mapguide/ HTTP/1.1
35-0 19351 0/2075/2088 W 567.14 102 0 0.0 143.37 143.43 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.5?old_path=%2f&format=
36-0 2912 0/5/2090 W 2.28 2 0 0.0 0.21 22.41 70.91.111.164 trac.osgeo.org GET /gdal/changeset/13273/sandbox/crschmidt HTTP/1.0
37-0 2913 0/11/1972 W 3.02 0 0 0.0 0.15 14.52 209.85.238.11 trac.osgeo.org GET /gdal/timeline?milestone=on&ticket=on&changeset=on&wiki=on&
38-0 20988 0/2101/2118 W 369.99 139 0 0.0 38.18 38.77 192.5.156.252 svn.osgeo.org REPORT /ossim/!svn/vcc/default HTTP/1.1
39-0 2914 0/9/219 W 3.37 7 0 0.0 0.16 1.77 70.91.111.164 trac.osgeo.org GET /gdal/changeset/13196/sandbox/ajolma HTTP/1.0
40-0 2915 0/10/18 W 3.79 15 0 0.0 0.21 0.79 70.91.111.164 trac.osgeo.org GET /gdal/log/trunk?rev=14376 HTTP/1.0
41-0 2916 0/13/81 W 3.42 7 0 0.0 0.07 0.42 74.6.22.97 trac.osgeo.org GET /grass/query?status=new&status=assigned&status=reopened&mil
42-0 2917 0/8/20 W 2.45 0 0 0.0 0.23 0.79 72.171.0.144 trac.osgeo.org GET /server-status HTTP/1.1
43-0 2918 0/10/39 W 3.26 10 0 0.0 0.21 0.92 70.91.111.164 trac.osgeo.org GET /gdal/changeset/13196/sandbox/ajolma HTTP/1.0
44-0 2919 0/10/160 W 5.24 7 0 0.0 0.25 1.13 70.91.111.164 trac.osgeo.org GET /gdal/log/trunk?rev=14376 HTTP/1.0
45-0 2920 0/9/54 W 2.17 15 0 0.0 0.11 0.61 70.91.111.164 trac.osgeo.org GET /gdal/changeset/13196/sandbox/ajolma HTTP/1.0
46-0 18139 0/2315/2315 W 539.51 123 0 0.0 153.35 153.35 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.4?old_path=%2f&format=
47-0 2921 0/22/50 W 5.08 0 0 0.0 0.29 0.87 70.91.111.164 trac.osgeo.org GET /gdal/browser/sandbox/crschmidt?order=date HTTP/1.0
48-0 2928 0/0/100 W 8.54 88 0 0.0 0.00 0.36 70.91.111.164 trac.osgeo.org GET /gdal/log/ HTTP/1.0
49-0 2929 0/2/90 W 0.87 76 0 0.0 0.01 0.96 70.91.111.164 trac.osgeo.org GET /gdal/log/ HTTP/1.0
Of note: a robot was hammering Trac with changeset requests (at about 5 requests per second), and Trac was not able to keep up -- possibly because the client was unable to consume the results we were sending back fast enough.
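For anyone wanting to spot this kind of skew quickly, here is a rough sketch of tallying server-status rows by client address. It assumes each worker row is whitespace-separated with the client IP in the eleventh field, as in the dump above; the function name `count_by_client` and the four-row sample are only for illustration.

```python
from collections import Counter

# Index of the "Client" column in a whitespace-split server-status row
# (Srv PID Acc M CPU SS Req Conn Child Slot Client VHost Request).
CLIENT_FIELD = 10


def count_by_client(rows):
    """Count busy workers per client IP (hypothetical helper)."""
    counts = Counter()
    for row in rows:
        fields = row.split()
        if len(fields) > CLIENT_FIELD:
            counts[fields[CLIENT_FIELD]] += 1
    return counts


# A few rows copied from the report above.
SAMPLE_ROWS = [
    "0-0 28743 0/909/2477 W 162.91 92 0 0.0 10.29 21.07 70.91.111.164 trac.osgeo.org GET /gdal/changeset/14384/branches/1.4?old_path=%2f&format=",
    "8-0 20939 0/2031/2508 W 390.31 200 0 0.0 25.80 30.36 198.253.49.6 trac.osgeo.org GET /ossim/doxygen/classossimImageData.html HTTP/1.1",
    "9-0 2890 0/20/2507 W 4.27 5 0 0.0 0.27 14.41 74.6.22.97 trac.osgeo.org GET /fdo/wiki/WikiFormatting HTTP/1.0",
    "14-0 548 0/283/982 W 59.52 74 0 0.0 1.76 7.41 70.91.111.164 trac.osgeo.org GET /gdal/log/trunk HTTP/1.0",
]

counts = count_by_client(SAMPLE_ROWS)
for ip, busy in counts.most_common():
    print(ip, busy)
```

Run against the full report, the same tally would list the busiest client first.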
It is proposed that we put in place "maximum connections per IP" limits on trac.osgeo.org, similar to what we did on download.osgeo.org for #216.
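To make the idea concrete, here is a minimal in-memory sketch of a per-IP cap on simultaneous connections. It is not how the limit is configured on the server (that would be done in the web server itself); the class name `PerIPLimiter` and its methods are hypothetical, and the cap of 8 is only an example value.

```python
import threading
from collections import defaultdict


class PerIPLimiter:
    """Hypothetical cap on simultaneous connections per client IP."""

    def __init__(self, max_per_ip):
        self.max_per_ip = max_per_ip
        self._active = defaultdict(int)
        self._lock = threading.Lock()

    def acquire(self, ip):
        """Return True and count the connection, or False if the IP is at its cap."""
        with self._lock:
            if self._active[ip] >= self.max_per_ip:
                return False
            self._active[ip] += 1
            return True

    def release(self, ip):
        """Mark one connection from this IP as finished."""
        with self._lock:
            if self._active[ip] > 0:
                self._active[ip] -= 1
            if self._active[ip] == 0:
                del self._active[ip]


limiter = PerIPLimiter(max_per_ip=8)  # example cap, an assumption
results = [limiter.acquire("70.91.111.164") for _ in range(10)]
print(results.count(True), "accepted,", results.count(False), "refused")
```

A refused connection would typically get an error response rather than tying up a worker, so one greedy client can no longer occupy most of the slots.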
Change History (6)
comment:1 by , 16 years ago
comment:2 by , 16 years ago
I wonder if it would be worth setting crawl-delay for the major spiders?
Yahoo and Microsoft support this directive in robots.txt, while for Google you have to set up a Webmasters Tools account and tell it to slow down in there.
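As a rough illustration of what that would look like, the sketch below parses a hypothetical robots.txt with Python's standard `urllib.robotparser` and reads back the delay each crawler would see. The 10-second value is made up, and the user-agent tokens `Slurp` (Yahoo) and `msnbot` (Microsoft) are assumptions about which names to target.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the delay value is illustrative only.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 10

User-agent: msnbot
Crawl-delay: 10

User-agent: *
Disallow:
"""


def crawl_delays(robots_txt, agents):
    """Return the Crawl-delay each agent would be given (None if unset)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.crawl_delay(agent) for agent in agents}


delays = crawl_delays(ROBOTS_TXT, ["Slurp", "msnbot", "Googlebot"])
print(delays)
```

Crawlers that do not match a listed user agent fall through to the `*` entry, which sets no delay here.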
comment:3 by , 16 years ago
Jason:
What problem are you trying to solve? The 'crawler' causing problems in this case was crawling from a Comcast internet connection: clearly not one of the 'big 3' search spiders, which are typically well behaved, according to all of my log-reading and observations.
Anything that opens 45 different connections to your server at once is simply a broken crawler, in my mind, no questions asked.
comment:4 by , 16 years ago
I guess that answers my question :)
I wasn't trying to solve a particular problem; you have dealt with that nicely. Just wondering if setting those values would help conserve server resources in general.
comment:5 by , 16 years ago
Yeah. In general, well-behaved bots are not a problem (so far as I can observe) -- only poorly behaved bots which would ignore our "please be polite" requests anyway.
comment:6 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Since we even kind of survived the actual spam storm, closing.
This matches the default of '8' max server connections in Firefox about:config on my Mac.
We may want to apply this to other services if we see other problems like this occurring. For now, I'd like to leave it on Trac only and see what happens.