- Mon Jul 29, 2013 5:21 pm
#170488
After sitting through yet another six-year-old spamming and avoiding the helpbot filter, I wanted to present this idea for nailing a few more of the idiots, more quickly.
The two predominant mechanisms for avoiding the spam filter seem to be doubling letters and adding punctuation. My suggestion is that punctuation and double letters be removed before applying the filter.
Let's say that "nice" is a word in the spam filter. This would nail people who do things like:
nnniiiccceeee or n i c e or even nnii_C_e. (I'm assuming that upper-case is already dealt with).
Obviously words like "happy" would have to be stored as "hapy."
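To make the idea concrete, here is a rough sketch in Python of what the normalization step could look like. I have no idea how the actual helpbot filter is written, so the names `normalize` and `is_spam` are made up, and dropping everything except the letters a-z (digits and accented letters included) is my own simplifying assumption.

```python
import re


def normalize(text: str) -> str:
    """Lower-case, strip everything except a-z, then collapse repeated letters."""
    lowered = text.lower()
    # Assumption: anything that is not a plain ASCII letter is treated as padding.
    letters_only = re.sub(r"[^a-z]", "", lowered)
    # Collapse any run of the same letter into a single letter ("nnnii" -> "ni").
    return re.sub(r"(.)\1+", r"\1", letters_only)


def is_spam(message: str, banned_words: list[str]) -> bool:
    """Hypothetical check: compare normalized banned words to the normalized message."""
    squashed = normalize(message)
    # Banned words go through the same normalization, so "happy" is matched as "hapy".
    return any(normalize(word) in squashed for word in banned_words)


if __name__ == "__main__":
    for sample in ["nnniiiccceeee", "n i c e", "nnii_C_e", "happy"]:
        print(sample, "->", normalize(sample))
    print(is_spam("have a nnii_C_e day", ["nice"]))
```

Running the banned words through the same function means the stored list never has to be collapsed by hand.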
This may not work for some words, especially since you are matching within one long string after the spaces are removed. For instance, the sentence "As someone once said...." would trigger a false positive, because the end of one word runs into the start of the next. The solution would be to match some short words against only the raw data. I assume this is already done in some cases.
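Here is one way the short-word exception might be handled, again only as a sketch. The function `matches_filter`, the cutoff of 4 letters, and reading "raw data" as whole-word matching on the unmodified message are all my assumptions, and "eon" is just a harmless stand-in for a short filtered word.

```python
import re


def normalize(text: str) -> str:
    """Lower-case, strip everything except a-z, then collapse repeated letters."""
    letters_only = re.sub(r"[^a-z]", "", text.lower())
    return re.sub(r"(.)\1+", r"\1", letters_only)


def matches_filter(message: str, banned_words: list[str], min_len: int = 4) -> bool:
    """Hypothetical filter: short words match raw whole words only, long ones match normalized text."""
    raw_words = set(re.findall(r"[a-z]+", message.lower()))
    squashed = normalize(message)
    for word in banned_words:
        collapsed = normalize(word)
        if len(collapsed) < min_len:
            # Short word: too likely to appear by accident once spaces are gone,
            # so only accept it as a complete word in the raw message.
            if word.lower() in raw_words:
                return True
        elif collapsed in squashed:
            # Longer word: safe enough to match against the squashed string.
            return True
    return False


if __name__ == "__main__":
    sentence = "As someone once said...."
    print("eon" in normalize(sentence))        # naive squashed match on a short word
    print(matches_filter(sentence, ["eon"]))   # raw-only rule for short words
    print(matches_filter("n i c e", ["nice"]))
```

The trade-off is that a spaced-out or doubled version of a short word slips through, which is probably the price of not flagging ordinary sentences.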
Let's be honest. Staff can't be on all the time and these trolls can be very persistent and annoying.