Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#1694 closed task (fixed)

IndexError: list index out of range on trac submission

Reported by: strk Owned by: strk
Priority: blocker Milestone:
Component: SysAdmin/Trac Keywords: spam
Cc: neteler, martin, wenzeslaus

Description

It's been reported by Martin Spott, Vaclav Petras and Markus Neteler that attempts to post a trac comment often fail with errors like this:

*Trac detected an internal error:* 

IndexError: list index out of range 

There was an internal error in Trac. It is recommended that you notify your  
local Trac administrator with the information needed to reproduce the issue. 

To that end, you could create a ticket.
The action that triggered the error was:
  
POST: /ticket/3034 
                
TracGuide <https://trac.osgeo.org/grass/wiki/TracGuide> — The Trac User and Administration Guide

It looks like it has to do with the spam filter because, at least in Martin Spott's case, he got the error upon posting to /osgeo/ticket/1693 and that attempt is found in the osgeo spam admin page, showing it was considered "spam" due to StopForumSpam not liking his username:

https://trac.osgeo.org/osgeo/admin/spamfilter/monitor

StopForumSpam (-4): StopForumSpam says this is spam (username [99.23])

Martin's post having a -4 score from StopForumSpam and no positive score made things go bad.

For now I've marked his post as ham, but we still need to get back to trusting authenticated users more, I guess.

I'm not sure if the training will also communicate that we like Martin's and others' nicknames...

Change History (20)

comment:1 by strk, 8 years ago

I _think_ the error is related to the Captcha mechanism not working, as I believe that when a negative score is assigned, a captcha should kick in to give the user another chance to raise the karma (a positive captcha can raise the karma again).

comment:2 by wenzeslaus, 8 years ago

Priority: normal → critical

This starts to be really critical. I can't submit URLs now.

comment:3 by strk, 8 years ago

wenzeslaus, you were again being considered a spammer (for too many links). May I suggest neteler gives you some SPAM permissions so that you can at least train the bayes database? Chances are you are _never_ considered a spammer while having some SPAM permission (very likely so with SPAM_ADMIN, not sure about the others) -- full list of spam permissions: https://trac.edgewall.org/wiki/SpamFilter#Permissions

in reply to:  3 comment:4 by wenzeslaus, 8 years ago

Replying to strk:

Chances are you are _never_ considered a spammer while having some SPAM permission

I can already see the "Report spam" link on the Trac wiki pages. (I had to wait with the answer until I switched to a different network :-)

comment:5 by strk, 8 years ago

Then maybe it takes SPAM_ADMIN to avoid being considered a spammer. Do you also have that? It'd give you an "admin" link with a panel to see which entries were considered spam and which were not, and a way to train the bayes database (SPAM_TRAIN and SPAM_MONITOR should be enough for that).

comment:6 by neteler, 8 years ago

Currently on the GRASS GIS trac:

  • strk + wenzeslaus = SPAM_ADMIN

besides the other existing TRAC_ADMINs.

Changed now:

  • wenzeslaus = SPAM_ADMIN + SPAM_MONITOR

comment:7 by neteler, 8 years ago

"BlogSpam says content is spam (IP previously blocked: Listed in HTTP;bl [20-ip.js])"

strk: how can I unblock wenzeslaus' IP?

comment:8 by neteler, 8 years ago

Priority: critical → blocker

I guess I will deactivate something in the spam management since it blocks our workflow.

comment:9 by neteler, 8 years ago

Switching entirely off https://trac.osgeo.org/grass/admin/spamfilter/external leads to

Internal Server Error

wow...

in reply to:  7 comment:10 by neteler, 8 years ago

Replying to neteler:

"BlogSpam says content is spam (IP previously blocked: Listed in HTTP;bl [20-ip.js])"

strk: how can I unblock wenzeslaus' IP?

ok I have now changed the Blogspam related setting in:

https://trac.osgeo.org/grass/admin/spamfilter/config

  • BlogSpamFilterStrategy 5 --> 50

Let's see.

in reply to:  10 comment:11 by neteler, 8 years ago

Replying to neteler:

  • BlogSpamFilterStrategy 5 --> 50

Of course the opposite:

  • BlogSpamFilterStrategy 5 --> 0

In a first test wenzeslaus is finally able to use trac again. Hopefully the same holds for the other devs with blacklisted IPs.

comment:12 by strk, 8 years ago

I understand "BlogSpam" is an external service working both ways: we receive info from it and we _send_ info to it. Chances are the IP blockage was sent from us to BlogSpam (due to excessive posts per minute, due to the automatic preview).

For the time being it is ok to not use BlogSpam. I could make that a global setting. Will look up how to manage BlogSpam entries, if possible.

comment:13 by strk, 8 years ago

In https://trac.osgeo.org/osgeo/admin/spamfilter/external I see we can specify a list of tests to be skipped. It looks like "20-ip.js" is the offending test, so we could be skipping that. I also see that the external services are automatically skipped when the bayes database contains enough spam (how much is "enough" can be configured, for now it is 20 ham / 20 spam, but we only have ham entries).
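
A minimal sketch of the skip rule described above, under stated assumptions: the function name `should_skip_external` and its parameters are hypothetical, the counts are made up, and it is assumed that both the ham and the spam minimum must be reached. The plugin's real logic may differ.

```python
def should_skip_external(ham_count, spam_count, min_ham=20, min_spam=20):
    """Return True when the bayes database looks trained enough.

    Assumption: both minimums must be met before external services
    are skipped; names and defaults are illustrative only.
    """
    return ham_count >= min_ham and spam_count >= min_spam


# Hypothetical counts: only ham has been trained so far,
# so the external services would still be consulted.
print(should_skip_external(ham_count=35, spam_count=0))   # False
print(should_skip_external(ham_count=35, spam_count=20))  # True
```

With only ham entries trained, a rule of this shape never triggers, which would match the external services still being consulted.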

For now I made the karma=0 global.

I confirm the Internal Server Error on attempts to disable the external service.

comment:14 by strk, 8 years ago

The Internal Server Error is related to the non-working captcha. The file captcha/image.py calls get_pkginfo(PIL)['VERSION'] while get_pkginfo(PIL) returns an empty map. Sounds like a version compatibility issue.
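
To illustrate why indexing an empty map fails, here is a small sketch. The plain dict `pkginfo` stands in for what get_pkginfo(PIL) reportedly returned, and the helper names `version_strict` and `version_tolerant` are hypothetical. This is not the fix that was applied on the server, which the ticket does not show.

```python
# Stand-in for the empty mapping that get_pkginfo(PIL) reportedly returned.
pkginfo = {}


def version_strict(info):
    """Mimic the failing pattern: direct indexing of a missing key."""
    return info['VERSION']  # raises KeyError on an empty mapping


def version_tolerant(info, default='unknown'):
    """Fall back to a default when the key is absent (illustrative only)."""
    return info.get('VERSION', default)


try:
    version_strict(pkginfo)
except KeyError as exc:
    print("strict lookup failed, missing key:", exc)

print(version_tolerant(pkginfo))  # unknown
```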

comment:15 by strk, 8 years ago

After fixing the PIL issue, I moved on to figure out the "list index out of range" error. It is due to the captcha/api.py script calling SELECT value FROM system WHERE name='spamfilter_lastclean' and expecting it to return at least one record, while it returns none.

Sounds like TracSpamFilter is really not stable enough yet for real use. I'm trying to figure out how to effectively submit a patch upstream, which is not easy as all trac plugins live together in the same trac instance...
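
The failure mode can be reproduced in isolation. The sketch below uses an in-memory sqlite3 database as a stand-in for Trac's database layer; the simplified `system` table schema and the function names `last_clean_unguarded` and `last_clean_guarded` are assumptions made for illustration, not code from the plugin.

```python
import sqlite3

QUERY = "SELECT value FROM system WHERE name='spamfilter_lastclean'"

# In-memory stand-in for the Trac database; the row is deliberately missing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE system (name TEXT PRIMARY KEY, value TEXT)")


def last_clean_unguarded(conn):
    """Index the result directly, as the failing code path did."""
    rows = conn.execute(QUERY).fetchall()
    return int(rows[0][0])  # IndexError when no row matches


def last_clean_guarded(conn, default=0):
    """Return a default when the row does not exist yet."""
    rows = conn.execute(QUERY).fetchall()
    return int(rows[0][0]) if rows else default


try:
    last_clean_unguarded(conn)
except IndexError as exc:
    print("unguarded:", exc)  # unguarded: list index out of range

print("guarded:", last_clean_guarded(conn))  # guarded: 0
```

Checking for an empty result before indexing avoids the error without needing a bare `except`.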

Last edited 8 years ago by strk

comment:16 by strk, 8 years ago

Captcha is finally fixed with this patch:

--- api.py.000  2016-06-03 07:49:53.000000000 -0700
+++ api.py      2016-06-03 07:52:23.000000000 -0700
@@ -241,8 +241,7 @@
             row = self.env.db_query("SELECT value FROM system "
                                     "WHERE name='spamfilter_lastclean'")
+            last = int(row[0][0])
         except:
             pass
-        else:
-            last = int(row[0][0])
         tim = int(time.time())
         if last+self.captcha_cleantime < tim:

I'll try to find a way to submit it upstream, but meanwhile it should be working. Spam-looking entries will be blocked and the user will be prompted to solve a captcha.

comment:17 by strk, 8 years ago

Resolution: fixed
Status: new → closed

So I'm finally closing this as the IndexError was found and fixed.

I've now given karma +3 to a solved captcha, and turned BlogSpam karma back to -2. This means, theoretically, that if BlogSpam catches you, you can still pass by solving the captcha. If that's not working for you, wenzeslaus, please file a _new_ ticket.
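
The arithmetic behind that can be sketched as follows. The acceptance threshold of 0 and the names `MIN_KARMA`, `total_karma` and `is_accepted` are assumptions for illustration; only the -2 and +3 values come from the settings above.

```python
MIN_KARMA = 0  # assumed acceptance threshold, not taken from this ticket


def total_karma(scores):
    """Sum the per-strategy karma points of one submission."""
    return sum(scores.values())


def is_accepted(scores, min_karma=MIN_KARMA):
    """Accept the submission when the total reaches the threshold."""
    return total_karma(scores) >= min_karma


blogspam_only = {"BlogSpam": -2}
with_captcha = {"BlogSpam": -2, "Captcha": 3}

print(is_accepted(blogspam_only))  # False
print(is_accepted(with_captcha))   # True
```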

comment:18 by strk, 8 years ago

For the record, patch filed upstream: https://trac.edgewall.org/ticket/12503

comment:19 by wenzeslaus, 8 years ago

Just wanted to report back: I'm now being prompted for captcha every time, so the system works.

comment:20 by strk, 8 years ago

Thanks for the confirmation. Sorry, I dunno how to fix the BlogSpam record on your blocked IP. That problem should be fixed by NOT using the external service (which would happen as soon as we have enough spam records in the bayes database...)
