[BozemanLUG] Best spam filter

Jordan Schatz jordan at noionlabs.com
Sun Nov 18 20:54:59 MST 2012



> unless it were to be like a stateful firewall... where it can tell the
> difference between a new connection attempt, and one that is part of an
> existing conversation. That seems really, really hard to me.
I believe some spam filters try to be semi-stateful, keeping track of things
like: have I ever emailed this address, has this address appeared in a CC or
From that I've seen before, have I received mail from this domain before, etc.
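
To make that concrete, here is a rough sketch of the kind of bookkeeping I
mean. It is not how any particular filter works; the class name
CorrespondentHistory, the method names, and the one-point-per-signal scoring
are all made up for illustration.

```python
from dataclasses import dataclass, field


def domain_of(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


@dataclass
class CorrespondentHistory:
    """Hypothetical per-user memory of past correspondents."""
    sent_to: set = field(default_factory=set)       # addresses I have emailed
    seen_senders: set = field(default_factory=set)  # From addresses I have received
    seen_cc: set = field(default_factory=set)       # addresses seen in CC
    seen_domains: set = field(default_factory=set)  # domains I have received mail from

    def record_outgoing(self, recipients):
        for addr in recipients:
            self.sent_to.add(addr.lower())

    def record_incoming(self, sender, cc=()):
        self.seen_senders.add(sender.lower())
        self.seen_domains.add(domain_of(sender))
        for addr in cc:
            self.seen_cc.add(addr.lower())

    def familiarity(self, sender):
        """Score 0..4: one point per signal that matches (an arbitrary choice)."""
        s = sender.lower()
        return sum([
            s in self.sent_to,
            s in self.seen_senders,
            s in self.seen_cc,
            domain_of(s) in self.seen_domains,
        ])


history = CorrespondentHistory()
history.record_outgoing(["alice@example.org"])
history.record_incoming("bob@example.com", cc=["carol@example.net"])
```

A filter would ask for familiarity() before recording the new message, and
treat a score of 0 as "new connection attempt" rather than as proof of spam.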

> I think the reason the big companies seem to be successful at it is because
> they have millions of customers... and I'm guessing they monitor when an email
> with a certain pattern goes to a lot of users and how the first batch of
> receivers react to it...
I've thought that the bigger companies were using their users' feedback as well,
and I still think they are probably taking it into account. But services like
MailRoute.net seem to do a very good job at filtering, and don't get those
signals.

I have heard that this http://en.wikipedia.org/wiki/Sender_Policy_Framework may
be the end of spam.
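
For anyone who has not looked at it: an SPF record is a DNS TXT entry listing
which hosts may send mail for a domain. Below is a minimal sketch of evaluating
one, assuming the record text has already been fetched. The function name
check_spf is hypothetical, the addresses are documentation ranges, and it only
understands the ip4, ip6 and all mechanisms; a, mx, include and friends need
DNS lookups and are skipped here.

```python
import ipaddress

# SPF qualifiers and the result each one produces when its mechanism matches.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_spf(record, client_ip):
    """Evaluate a simplified SPF record against the connecting IP address."""
    ip = ipaddress.ip_address(client_ip)
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier is "pass"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return QUALIFIERS[qualifier]
        elif name == "all":
            return QUALIFIERS[qualifier]
        # Other mechanisms are ignored in this sketch (assumption, see above).
    return "neutral"  # nothing matched


record = "v=spf1 ip4:192.0.2.0/24 -all"
```

Note that this only says whether the sending host is authorized for the domain;
it says nothing about whether the content is wanted.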

Personally I think the answer is in good AI / machine learning. I understand
that my spam may be another person's ham, but my own preference about whether
certain content is ham or spam is very consistent, and should be identifiable
by an ML algorithm.
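
As one possible illustration (naive Bayes is my pick for the sketch, not a
claim about what any real service uses), a per-user classifier trained only on
my own ham/spam markings could look like this. The class name PersonalBayes and
the tiny training set are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class PersonalBayes:
    """Hypothetical per-user naive Bayes filter with add-one smoothing."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.words[label].update(tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        vocab = set(self.words["spam"]) | set(self.words["ham"])
        total_msgs = sum(self.messages.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed prior: how often this user marks mail with this label.
            log_p = math.log((self.messages[label] + 1) / (total_msgs + 2))
            total_words = sum(self.words[label].values())
            for tok in tokenize(text):
                # Smoothed likelihood; the extra +1 leaves room for unseen words.
                count = self.words[label][tok]
                log_p += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = log_p
        # Normalize the two log scores into a probability of spam.
        top = max(log_score.values())
        weight = {k: math.exp(v - top) for k, v in log_score.items()}
        return weight["spam"] / (weight["spam"] + weight["ham"])


mine = PersonalBayes()
mine.train("cheap pills buy now", "spam")
mine.train("buy cheap watches now", "spam")
mine.train("LUG meeting agenda for Thursday", "ham")
mine.train("kernel patch review for the meeting", "ham")
```

Because the model only ever sees one person's labels, a newsletter I signed up
for can score as ham for me and as spam for someone else.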

- Jordan


On Fri, 16 Nov 2012 12:07:40 -0700 (MST), Scott Dowdle <dowdle at montanalinux.org> wrote:
> Jordan.
> 
> ----- Original Message -----
> > I think the problem is artificial intelligence. A human can quickly
> > identify spam, but from what I've seen most software is STILL bad at it. That
> > is what needs to be fixed.
> 
> The difference here is that spam is unwanted unsolicited email.  If you want a
> newsletter from a product vendor, hopefully because you intentionally signed
> up for it, it isn't unwanted and it isn't unsolicited... but the email looks
> exactly the same if it is spam or if it isn't spam.  There is no way to make
> AI know what you wanted and what you didn't want... unless it were to be like
> a stateful firewall... where it can tell the difference between a new
> connection attempt, and one that is part of an existing conversation.  That
> seems really, really hard to me.
> 
> I think the reason the big companies seem to be successful at it is because
> they have millions of customers... and I'm guessing they monitor when an email
> with a certain pattern goes to a lot of users and how the first batch of
> receivers react to it... and maybe harness their judgement to figure out what
> to do with it... but given the typical speedy delivery times of mails... that
> won't work either... but it could give patterns for future emails.  I think
> the more users you have and the more you can muck with their incoming, reading
> and marking patterns... the more data you have to base decisions on.  Most
> small to mid-size companies only have access to the incoming patterns... and
> not the reading and marking... but since people like Google have the vast
> majority of their account holders using their web-based client (gmail), they
> have access to so much (reading, marking, deleting) and more including
> addressbook, all sent emails, etc.  Our simple SMTP servers just don't have
> access to all of those I/O paths.
> 
> TYL,
> -- 
> Scott Dowdle
> 704 Church Street
> Belgrade, MT 59714
> (406)388-0827 [home]
> (406)994-3931 [work]
> _______________________________________________
> Discuss mailing list
> Discuss at bozemanlug.org
> http://lists.bozemanlug.org/mailman/listinfo/discuss

