Maximum PC

All times are UTC - 8 hours




 Post subject: Forum Software
PostPosted: Sat Oct 06, 2012 9:48 am 
Bitchin' Fast 3D Z8000*

Joined: Tue Jun 29, 2004 11:32 pm
Posts: 2555
Location: Somewhere between compilation and linking
Hey guys,

Those of you who aren't moderators may not realize it, but the forum has been clobbered with hundreds of spam posts from new users, all pending approval. There are literally pages of these posts in every forum category. I had a brief discussion with SpiderMonkey about the tools he has at his disposal for fighting spam, and the situation doesn't look very promising. So what do you say we write a script or two to clean this place up?

My initial thought is to write a script that identifies any user with > 4 pending posts and simply deletes all of that user's posts, on the theory that only a spammer would have five or more posts pending approval. Just throwing out an idea... if you have a different approach that might work, let's hear it.
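
A rough sketch of that rule in Python, just to pin down the logic. The data shape here is an assumption: pending posts are treated as hypothetical (username, title) pairs, and nothing below talks to the real forum or its mod queue.

```python
from collections import Counter

# Assumption: flag any user with MORE than 4 posts pending approval.
PENDING_THRESHOLD = 4


def find_suspected_spammers(pending_posts, threshold=PENDING_THRESHOLD):
    """Return the set of usernames with more than `threshold` pending posts.

    `pending_posts` is a hypothetical list of (username, title) pairs.
    """
    counts = Counter(user for user, _title in pending_posts)
    return {user for user, n in counts.items() if n > threshold}


def split_queue(pending_posts, threshold=PENDING_THRESHOLD):
    """Split the queue into (posts to delete, posts left for manual review)."""
    spammers = find_suspected_spammers(pending_posts, threshold)
    to_delete = [post for post in pending_posts if post[0] in spammers]
    to_review = [post for post in pending_posts if post[0] not in spammers]
    return to_delete, to_review
```

Splitting the queue instead of deleting outright means a moderator can still eyeball the delete list before anything is actually removed.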


 
 Post subject: Re: Forum Software
PostPosted: Sun Oct 07, 2012 7:54 pm 
Million Club - 5 Plus*

Joined: Sun Sep 12, 2004 6:37 pm
Posts: 4745
Location: In the monkey's litterbox
I'd actually been toying with a similar idea myself, mostly looking at how to tie a Bayesian classifier into the mod queue list.

We could probably also auto-deny any non-English titles (Cyrillic characters) or common phrases. (Why would we care about handbags on this forum?)

EDIT: I started working on a Greasemonkey script that checks the disapproval checkboxes and changes the row color for me whenever my spam check function evaluates to true. I also check for common approval words and turn those rows green so I can find them in a quick skim. This would work much better if I could figure out how to get more than 100 posts per page.
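
For anyone who wants to play with the idea outside the browser, here is a minimal Python sketch of that kind of check. It is not the Greasemonkey script itself: both word lists are made up for illustration (only "handbags" comes from this thread), and the "deny" / "approve" / "unknown" labels stand in for ticking the disapproval checkbox, turning the row green, or leaving the row alone.

```python
import re

# Hypothetical word lists; tune these against the real mod queue.
SPAM_WORDS = frozenset({"handbags", "replica", "cheap"})
APPROVE_WORDS = frozenset({"compiler", "python", "linux"})


def tokenize(title):
    """Lowercase the title and split it into simple word tokens."""
    return set(re.findall(r"[a-z0-9#+]+", title.lower()))


def classify_title(title):
    """Return 'deny', 'approve', or 'unknown' for a post title.

    Spam words win over approval words, so a title containing both
    is still marked for denial.
    """
    words = tokenize(title)
    if words & SPAM_WORDS:
        return "deny"
    if words & APPROVE_WORDS:
        return "approve"
    return "unknown"
```

Anything that comes back "unknown" would still need a human to look at it, which is the point: the check only sorts the obvious cases to the top of the skim.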


 
 Post subject: Re: Forum Software
PostPosted: Wed Oct 10, 2012 9:59 am 
Bitchin' Fast 3D Z8000*

Joined: Tue Jun 29, 2004 11:32 pm
Posts: 2555
Location: Somewhere between compilation and linking
smartcat99s wrote:
I'd actually been toying with a similar idea myself, mostly looking at how to tie a Bayesian classifier into the mod queue list.

I was thinking of something that wasn't tied into the existing forum software that would flush out all of the current spam. It sounds like you're working on something that is a bit more permanent that would hopefully prevent this situation from occurring again. You do realize that setting up a Bayesian filter does take quite a bit of time, right? Part of the reason that Google's spam filter is so effective is that they are able to distribute the work across millions of users.

I just noticed that most of the spam has now been cleared out of the Programmer's Paradise. Is that your script at work?

smartcat99s wrote:
We could probably also auto-deny any non-English titles (Cyrillic characters) or common phrases. (Why would we care about handbags on this forum?)

We have to be careful there because people will sometimes use a valid technology word that doesn't appear to be English and/or occasionally use goofy titles as a joke/gag: 4 out of 5 dentists agree that Clojure kills 90% of F# germs.

I'm pretty booked at the moment, but I should be able to start a script soon (maybe tomorrow night). I haven't done this type of programming in C# yet, so I'll probably use it just to gain some more network programming experience in that language. I had started building a network scanner, but that can be put on hold.

Also, are you able to block IP addresses or is SM the only person able to do this?


 
 Post subject: Re: Forum Software
PostPosted: Wed Oct 10, 2012 3:35 pm 
Million Club - 5 Plus*

Joined: Sun Sep 12, 2004 6:37 pm
Posts: 4745
Location: In the monkey's litterbox
Gadget wrote:
smartcat99s wrote:
I'd actually been toying with a similar idea myself, mostly looking at how to tie a Bayesian classifier into the mod queue list.

I was thinking of something that wasn't tied into the existing forum software that would flush out all of the current spam. It sounds like you're working on something that is a bit more permanent that would hopefully prevent this situation from occurring again. You do realize that setting up a Bayesian filter does take quite a bit of time, right? Part of the reason that Google's spam filter is so effective is that they are able to distribute the work across millions of users.

I just noticed that most of the spam has now been cleared out of the Programmer's Paradise. Is that your script at work?


I went and manually cleaned the spam out of the folder as I occasionally do. I sort of worked on developing a script that helps classify the topics right on the page in-browser, but it's by no means permanent -- it's just meant to reduce the pain a little bit. I know a Bayesian filter is a lot of work, and that's why it's future wish list work instead of present work. ;)

Gadget wrote:
smartcat99s wrote:
We could probably also auto-deny any non-English titles (Cyrillic characters) or common phrases. (Why would we care about handbags on this forum?)

We have to be careful there because people will sometimes use a valid technology word that doesn't appear to be English and/or occasionally use goofy titles as a joke/gag: 4 out of 5 dentists agree that Clojure kills 90% of F# germs.

I'm pretty booked at the moment, but I should be able to start a script soon (maybe tomorrow night). I haven't done this type of programming in C# yet, so I'll probably use it just to gain some more network programming experience in that language. I had started building a network scanner, but that can be put on hold.

Also, are you able to block IP addresses or is SM the only person able to do this?


For the strange characters, I was thinking more of flagging titles that are > 50% Cyrillic characters rather than using a dictionary to filter things.
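
Something like this Python sketch is what I have in mind. Two assumptions to flag: the ratio is taken over alphabetic characters only (so digits, spaces, and punctuation don't count either way), and "Cyrillic" means the basic Unicode Cyrillic block, U+0400 through U+04FF.

```python
def cyrillic_ratio(title):
    """Fraction of the alphabetic characters in `title` that are Cyrillic.

    Assumption: only the basic Cyrillic block (U+0400..U+04FF) is counted.
    Returns 0.0 for titles with no alphabetic characters at all.
    """
    letters = [ch for ch in title if ch.isalpha()]
    if not letters:
        return 0.0
    cyrillic = sum(1 for ch in letters if "\u0400" <= ch <= "\u04ff")
    return cyrillic / len(letters)


def is_mostly_cyrillic(title, threshold=0.5):
    """True when more than `threshold` of the letters are Cyrillic."""
    return cyrillic_ratio(title) > threshold
```

A gag title like "4 out of 5 dentists agree that Clojure kills 90% of F# germs" has no Cyrillic letters, so it passes untouched; only titles written mostly in Cyrillic get flagged.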

I have the menu link to ban by IP in the mod control panel, but it just gives me a pure white page.


 
 Post subject: Re: Forum Software
PostPosted: Wed Oct 10, 2012 3:51 pm 
Bitchin' Fast 3D Z8000*

Joined: Tue Jun 29, 2004 11:32 pm
Posts: 2555
Location: Somewhere between compilation and linking
smartcat99s wrote:
I have the menu link to ban by IP in the mod control panel, but it just gives me a pure white page.

In Chrome, I get a page that suggests the server isn't configured correctly. I believe that I can ban usernames and email addresses, though. I also noticed that the email ban list is pretty broad, including:

*@*.de
*@*.ru
*@*.ua
*@*.info

We're banning all German (.de), Russian (.ru), Ukrainian (.ua), and .info email addresses at this point.
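
To show how wide that net is, here is a small Python sketch that treats the four entries as shell-style wildcards. That is an assumption on my part: I'm guessing at how the forum software matches them, and `fnmatchcase` is only a stand-in for whatever it really does. The sample addresses are invented.

```python
from fnmatch import fnmatchcase

# The four patterns from the ban list above.
BAN_PATTERNS = ["*@*.de", "*@*.ru", "*@*.ua", "*@*.info"]


def is_banned(email, patterns=BAN_PATTERNS):
    """True if `email` matches any ban pattern (glob-style, case-insensitive).

    Assumption: '*' matches any run of characters, as in shell globbing.
    """
    address = email.strip().lower()
    return any(fnmatchcase(address, pattern) for pattern in patterns)
```

Under that reading, every address whose domain ends in one of those four suffixes is rejected, whatever the mailbox or provider, while something like `user@de.example.com` still gets through because only the final suffix is checked.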

