Mail.app Spam filtering
Norbert Liecfeldt - 23 Feb 2008 10:01 GMT
Hello
I've been training this thing for months now and it still gets the most basic spam wrong, doesn't mark them as spam (I don't get any false negatives, so that's something I suppose).
Things which gmail and Thunderbird instantly pick up, like
. FreeViagraPills ..
Phentrimine Tramadol FemaleViagra & 400 more meds to choose from
Please find your meds on our site
www.xcv
or
Get Out of Debt Today. Avoid Bankruptcy. Save Thousands... The Professional Way!! http://xyz
pass through the spam filter without being marked. Headers (not sure what to hide, so triple-xxx'd quite a bit!) are as follows. The X-Spam-Flag is set to YES, but I believe this is set by our ISP rather than by Mail.
Return-Path: <x@y..com>
Received: from gate1.gate.sat.xxx.com (gate1.gate.sat.xxx.com [xxx.xxx.x.xx])
    by mail27a.mail.sat.xxx.com (SMTP Server) with ESMTP id 46DE71B4004
    for <nl+spam@xxx.org>; Fri, 22 Feb 2008 22:39:56 -0500 (EST)
Received: from gate3.r1.iad.xxx.com (iad3.xxx.com [xxx.97.xxx.217])
    (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by gate1.gate.sat.xxx.com (SMTP Server) with ESMTP id 3823F8582D2
    for <nl+spam@xxx.org>; Fri, 22 Feb 2008 22:39:56 -0500 (EST)
X-Envelope-From: <AshleyrollinsKemp@xxxx.com>
X-Envelope-To: <xxx@gmail.com>, <nl@xxx.org>, <research@xxx.org>
X-Quarantine-ID: <KnEQmW83yd+h>
X-Spam-Flag: YES
X-Spam-Score: 10.001
X-Spam-Level: **********
X-Spam-Status: Yes, score=10.001 tag=-100 tag2=6 kill=6 tests=[CLOUDMARK=10.001]
Received: from gate3.r1.iad.xxx.com ([127.0.0.1])
    by localhost (gate3.r1.iad.xxx.com [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id KnEQmW83yd+h; Fri, 22 Feb 2008 22:39:55 -0500 (EST)
X-Originating-Ip: [xxx.28.xxx.70]
Received: from wolf (adsl190-28-145-70.epm.net.co [190.28.145.70])
    by gate3.r1.iad.xxx.com (SMTP Server) with SMTP id 00CAD44C732;
    Fri, 22 Feb 2008 22:39:50 -0500 (EST)
Message-ID: <280cc01c875bb$eaef13d0$0801a8c0@wolf>
From: "Carey Hartman" <xxx@xxx.com>
To: <research@xxx.org>
Subject: erase Your Credit Card Debt
Date: Fri, 22 Feb 2008 22:30:15 +0300
MIME-Version: 1.0
Content-Type: multipart/alternative;
    boundary="----=_NextPart_000_280C8_01C875BB.EAEF13D0"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.2180
X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2900.2180

This is a multi-part message in MIME format.

------=_NextPart_000_280C8_01C875BB.EAEF13D0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Get Out of Debt Today. Avoid Bankruptcy. Save Thousands... The =
Professional Way!! http://initse.com.cn/
------=_NextPart_000_280C8_01C875BB.EAEF13D0
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=3DContent-Type content=3D"text/html; =
charset=3Diso-8859-1">
<META content=3D"MSHTML 6.00.2900.2180" name=3DGENERATOR>
<STYLE></STYLE>
</HEAD>=20
<BODY bgColor=3D#ffffff>
<DIV align=3Dleft><FONT face=3DArial size=3D2>Get Out of Debt Today. =
Avoid=20
Bankruptcy. Save Thousands... The Professional Way!!</FONT></DIV>
<DIV align=3Dleft><FONT face=3DArial size=3D2><A=20
href=3Dhttp://initse.com.cn/>http://initse.com.cn/</A></FONT></DIV>
</BODY></HTML>

------=_NextPart_000_280C8_01C875BB.EAEF13D0--
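
The headers above show that an upstream filter had already tagged this message (X-Spam-Flag: YES, X-Spam-Score: 10.001) before it reached the mail client. As a minimal sketch of how a client-side script could read that tag, assuming the ISP always adds these two headers, the following uses Python's standard `email` module. The function names and the abbreviated `SAMPLE` message are made up for illustration; this is not how Mail.app itself works.

```python
from email import message_from_string
from typing import Optional

# Abbreviated, hypothetical copy of the message above: only the headers we need.
SAMPLE = (
    "X-Spam-Flag: YES\n"
    "X-Spam-Score: 10.001\n"
    "Subject: erase Your Credit Card Debt\n"
    "\n"
    "Get Out of Debt Today.\n"
)


def flagged_upstream(raw: str) -> bool:
    """Return True if an upstream filter set X-Spam-Flag to YES."""
    msg = message_from_string(raw)
    return (msg.get("X-Spam-Flag") or "").strip().upper() == "YES"


def upstream_score(raw: str) -> Optional[float]:
    """Return the numeric X-Spam-Score, or None if it is missing or malformed."""
    msg = message_from_string(raw)
    value = msg.get("X-Spam-Score")
    try:
        return float(value) if value is not None else None
    except ValueError:
        return None


if __name__ == "__main__":
    print(flagged_upstream(SAMPLE), upstream_score(SAMPLE))
```

A rule that files anything with this flag into a junk folder would only be as reliable as the ISP's tagging, so it complements a trained filter rather than replacing it.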
Marc Heusser - 23 Feb 2008 13:57 GMT
> I've been training this thing for months now and it still gets the most
> basic spam wrong, doesn't mark them as spam (I don't get any false
> negatives, so that's something I suppose).
Try SpamSieve, http://c-command.com/spamsieve/
HTH
Marc
 Signature remove bye and from mercial to get valid e-mail <http://www.heusser.com>
Wim - 23 Feb 2008 14:11 GMT
In article <marc.heusser-8BAF31.14573823022008@news.uzh.ch>, Marc Heusser <marc.heusser@byeheusser.commercialspammers.invalid> wrote:
> > I've been training this thing for months now and it still gets the most
> > basic spam wrong, doesn't mark them as spam (I don't get any false
> > negatives, so that's something I suppose).
>
> Try SpamSieve, http://c-command.com/spamsieve/
Good idea, I have SS also and these are my stats:
Filtered Mail
1,082 Good Messages
216 Spam Messages (17%)
2 Spam Messages Per Day

SpamSieve Accuracy
17 False Negatives
8 False Positives (32%)
98.1% Correct

Corpus
162 Good Messages
221 Spam Messages (58%)
21,388 Total Words

Rules
57 Blocklist Rules
761 Whitelist Rules

Showing Statistics Since 7/12/07 19:09
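
The percentages in these statistics can be reproduced from the raw counts. The sketch below is a worked check, assuming that "correct" means messages classified without error out of all filtered messages and that the 32% is the false positives' share of all errors. That reading is an interpretation that happens to match the numbers, not something SpamSieve's documentation is quoted on here.

```python
# Counts copied from the statistics above.
good_messages = 1082
spam_messages = 216
false_negatives = 17   # spam that was not caught
false_positives = 8    # good mail wrongly marked as spam

total = good_messages + spam_messages       # all filtered messages
errors = false_negatives + false_positives  # all misclassifications

spam_share = 100 * spam_messages / total           # share of mail that is spam
accuracy = 100 * (total - errors) / total          # share classified correctly
false_positive_share = 100 * false_positives / errors

if __name__ == "__main__":
    print(f"spam share: {spam_share:.0f}%")
    print(f"accuracy: {accuracy:.1f}%")
    print(f"false positives among errors: {false_positive_share:.0f}%")
```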
 Signature Please do not top-post. Your answer belongs after (or intermixed with) the quoted material to which you reply, after snipping all irrelevant material.
Marc Heusser - 23 Feb 2008 16:56 GMT
> In article <marc.heusser-8BAF31.14573823022008@news.uzh.ch>,
> Marc Heusser <marc.heusser@byeheusser.commercialspammers.invalid>
[quoted text clipped - 7 lines]
>
> Good idea, i have SS also and there are my stats:
Have been using it ever since it became available (at the time for Eudora, currently using Mail) - and the statistics are *after* the server's spam filters :-)
Filtered Mail
44'591 Good Messages
29'498 Spam Messages (40%)
37 Spam Messages Per Day

SpamSieve Accuracy
0 False Positives
0 False Negatives
100.0% Correct

Corpus
4'373 Good Messages
23'345 Spam Messages (84%)
123'941 Total Words

Rules
74 Blocklist Rules
2'849 Whitelist Rules

Showing Statistics Since 01.01.06 12:00
Michael Tsai updates SpamSieve as soon as new spammer's tricks get en vogue - it has not failed me in 5.5 years of heavy use :-)
Marc
 Signature remove bye and from mercial to get valid e-mail <http://www.heusser.com>
Wim - 23 Feb 2008 18:48 GMT
In article <marc.heusser-7DC66A.17563823022008@news.uzh.ch>, Marc Heusser <marc.heusser@byeheusser.commercialspammers.invalid> wrote:
> Have been using it ever since it became available (at the time for
> Eudora, currently using Mail - and the statistics are *after* the
> server's spam filters :-)
Here also: first the mail server of my ISP catches most of the crap, and then SpamSieve does the rest :-)
> Michael Tsai updates SpamSieve as soon as new spammer's tricks get en
> vogue - it has not failed me in 5.5 years of heavy use :-)
Indeed. He is really a good dev-er :-)
I just don't like the ugly Dock icon. And a toolbar icon would be nice too.
 Signature Please do not top-post. Your answer belongs after (or intermixed with) the quoted material to which you reply, after snipping all irrelevant material.
Megadave - 26 Feb 2008 04:10 GMT
> I've been training this thing for months now and it still gets the most
> basic spam wrong, doesn't mark them as spam (I don't get any false
> negatives, so that's something I suppose).
I've noticed the same thing. In fact spam trapping seems to have gotten *worse* in 10.5 (it was actually pretty good in 10.4).
 Signature [P]eople [E]ating [T]asty [A]nimals
Barry Margolin - 27 Feb 2008 06:05 GMT
> > I've been training this thing for months now and it still gets the most
> > basic spam wrong, doesn't mark them as spam (I don't get any false
> > negatives, so that's something I suppose).
>
> I've noticed the same thing. In fact spam trapping seems to have gotten
> *worse* in 10.5 (it was actually pretty good in 10.4).
Or maybe the spammers have gotten better. It's a constant arms race.
 Signature Barry Margolin, barmar@alum.mit.edu Arlington, MA *** PLEASE post questions in newsgroups, not directly to me *** *** PLEASE don't copy me on replies, I'll read them in the group ***
Megadave - 27 Feb 2008 06:57 GMT
> > > I've been training this thing for months now and it still gets the most
> > > basic spam wrong, doesn't mark them as spam (I don't get any false
[quoted text clipped - 4 lines]
>
> Or maybe the spammers have gotten better. It's a constant arms race.
Possible...
 Signature [P]eople [E]ating [T]asty [A]nimals
gtr - 27 Feb 2008 15:54 GMT
>>>> I've been training this thing for months now and it still gets the most
>>>> basic spam wrong, doesn't mark them as spam (I don't get any false
[quoted text clipped - 6 lines]
>
> Possible...
Time to mention again that SpamSieve is a greatly appreciated program.
 Signature Thank you and have a nice day.
Megadave - 01 Mar 2008 03:48 GMT
> Time to mention again that SpamSieve is a greatly appreciated program.
=D
 Signature [P]eople [E]ating [T]asty [A]nimals
erilar - 27 Feb 2008 17:02 GMT
> > > I've been training this thing for months now and it still gets the most
> > > basic spam wrong, doesn't mark them as spam (I don't get any false
[quoted text clipped - 4 lines]
>
> Or maybe the spammers have gotten better. It's a constant arms race.
The spammers have gotten better. I have a really good spam filter at the ISP level and stuff still gets through that mail.app doesn't spot, either.
 Signature Mary Loomer Oliver (aka Erilar)
You can't reason with someone whose first line of argument is that reason doesn't count. --Isaac Asimov
Erilar's Cave Annex: http://www.chibardun.net/~erilarlo
Marc Heusser - 27 Feb 2008 23:56 GMT
> > > > I've been training this thing for months now and it still gets the most
> > > > basic spam wrong, doesn't mark them as spam (I don't get any false
[quoted text clipped - 8 lines]
> the ISP level and stuff still gets through that mail.app doesn't spot,
> either.
A server-side spam filter can never be as good as one on your own computer; see http://c-command.com/spamsieve/manual#identifying-spam, and http://www.paulgraham.com/spam.html for an in-depth explanation. Server-side filters can be circumvented by spammers, whereas client-side statistical analysis cannot. The principle is that spammers cannot know what your particular good e-mails look like. A server-side filter cannot use that knowledge. That is why filters such as SpamSieve are needed and work well.
HTH
Marc
 Signature remove bye and from mercial to get valid e-mail <http://www.heusser.com>
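
The principle described in the post above (a statistical filter trained on one user's own good and spam mail) can be illustrated with a toy naive Bayes classifier. This is a minimal sketch under stated assumptions: the class name `TinyBayes`, the tokenizer, the add-one smoothing and the four training messages are all invented for illustration, and it is not SpamSieve's or Mail.app's actual algorithm.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class TinyBayes:
    """Toy per-user naive Bayes filter with add-one smoothing."""

    def __init__(self):
        self.words = {"spam": Counter(), "good": Counter()}
        self.messages = {"spam": 0, "good": 0}

    def train(self, text, label):
        self.words[label].update(tokens(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        vocab = len(set(self.words["spam"]) | set(self.words["good"]))
        total_messages = sum(self.messages.values())
        log_score = {}
        for label in ("spam", "good"):
            # Smoothed prior: how common this class is in the user's own mail.
            score = math.log((self.messages[label] + 1) / (total_messages + 2))
            word_total = sum(self.words[label].values())
            for word in tokens(text):
                count = self.words[label][word]
                score += math.log((count + 1) / (word_total + vocab + 1))
            log_score[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(-700.0, min(700.0, log_score["good"] - log_score["spam"]))
        return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    f = TinyBayes()
    f.train("Get out of debt today avoid bankruptcy", "spam")
    f.train("free meds viagra pills choose from our site", "spam")
    f.train("meeting notes for the research group attached", "good")
    f.train("lunch on friday? the usual place", "good")
    print(f.spam_probability("avoid bankruptcy get out of debt"))
    print(f.spam_probability("research group meeting on friday"))
```

The "good" word counts come only from the user's own mail, which is the knowledge a spammer does not have.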
Lewis - 28 Feb 2008 00:35 GMT
In article <marc.heusser-92888A.00564628022008@news.uzh.ch>, Marc Heusser <marc.heusser@byeheusser.commercialspammers.invalid> wrote:
> A server side spam filter can never be as good as one on your own
> computer, see http://c-command.com/spamsieve/manual#identifying-spam, and
[quoted text clipped - 4 lines]
> e-mails look like. A server-side filter cannot use that knowledge.
> That is why filters such as SpamSieve are needed and work well.
Erm... you seem to think that server-side filters cannot use per-user settings. This is, of course, wrong. The reason SpamAssassin works so well is because it uses its own rules and the user's own Bayes score to tag spam.
 Signature Bart: This is the worst day of my life. Homer: This is the worst day of your life SO FAR.
Marc Heusser - 28 Feb 2008 03:12 GMT
> In article <marc.heusser-92888A.00564628022008@news.uzh.ch>,
> Marc Heusser <marc.heusser@byeheusser.commercialspammers.invalid>
[quoted text clipped - 10 lines]
> Erm... you seem to think that server-side filters cannot use per-user
> settings. This is, of course, wrong.
It is typically right, because of the use of an ISP's filter. That is what the OP has. But of course you are right - I should have used the terms per-user and purely statistical analysis.
> The reason SpamAssassin works so
> well is because it uses its own rules and the user's own Bayes score to
> tag spam.
If it uses rules, quite likely it is either too aggressive or it will let some spam mails pass. I have found truly probabilistic (Bayesian) filters to be the only ones that catch the last few spam mails. ISP/rule-based filters are OK to cut down the spam load by a factor of 5 maybe (for me from some 250 to some 50 spam messages a day). SpamSieve takes care of the rest.
YMMV
Marc
 Signature remove bye and from mercial to get valid e-mail <http://www.heusser.com>
Jerry Kindall - 28 Feb 2008 03:30 GMT
> > In article <marc.heusser-92888A.00564628022008@news.uzh.ch>,
> > Marc Heusser <marc.heusser@byeheusser.commercialspammers.invalid>
[quoted text clipped - 25 lines]
> from some 250 to some 50 spam messages a day). SpamSieve takes care of
> the rest.
Gmail's spam filter is sufficiently good that I actually route some of my mail through it when my server-side rules think it's spammy enough. Result: SpamSieve is lonely because I typically only ever get a couple spams a week.
I am pretty sure that when you have as many users as Gmail does, just knowing how many of them got a particular piece of mail is a fairly strong spam indicator.
They do seem to go further than your typical ISP, or mail provider for that matter, so in general your advice is sound.
 Signature Jerry Kindall, Seattle, WA <http://www.jerrykindall.com/>
Send only plain text messages under 32K to the Reply-To address. This mailbox is filtered aggressively to thwart spam and viruses.
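
The guess in the post above, that a large provider can treat "how many users received the same message" as a spam signal, can be sketched as follows. This is purely illustrative: the `BulkCounter` class, the whitespace-insensitive SHA-256 fingerprint and the threshold of 3 recipients are assumptions made for this example, and nothing here describes how Gmail actually works.

```python
import hashlib
from collections import defaultdict


def fingerprint(body):
    """Hash the body after lower-casing and collapsing whitespace."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class BulkCounter:
    """Track how many distinct recipients have seen each message body."""

    def __init__(self, threshold=3):
        self.threshold = threshold            # hypothetical cut-off
        self.recipients = defaultdict(set)    # fingerprint -> set of recipients

    def observe(self, recipient, body):
        """Record one delivery and return the distinct-recipient count so far."""
        key = fingerprint(body)
        self.recipients[key].add(recipient)
        return len(self.recipients[key])

    def looks_bulk(self, body):
        """True once enough distinct recipients got the same body."""
        seen = self.recipients.get(fingerprint(body), set())
        return len(seen) >= self.threshold


if __name__ == "__main__":
    counter = BulkCounter(threshold=3)
    spam = "Get Out of Debt Today. Avoid Bankruptcy."
    for user in ("a@example.org", "b@example.org", "c@example.org"):
        counter.observe(user, spam)
    print(counter.looks_bulk(spam))
    print(counter.looks_bulk("Lunch on Friday?"))
```

A single mailbox sees each message once, so this kind of signal is only available to a filter that sits in front of many users.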
Lewis - 28 Feb 2008 05:18 GMT
In article <marc.heusser-F99916.04122728022008@news.uzh.ch>, Marc Heusser <marc.heusser@byeheusser.commercialspammers.invalid> wrote:
> > The reason SpamAssassin works so
> > well is because it uses its own rules and the user's own Bayes score to
[quoted text clipped - 3 lines]
> let pass some spam mails. I found the true probabilities (Bayesian)
> filters the only ones that filter the last spam mails.
SpamAssassin uses BOTH rules and Bayes. As I said before.
I do no spam filtering on my client, it is all done by SpamAssassin on the server.
 Signature Bart: This is the worst day of my life. Homer: This is the worst day of your life SO FAR.
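
The idea of combining fixed rules with a per-user Bayes score can be sketched as a simple additive scoring scheme. Everything specific here is hypothetical: the three rules, their point values and the mapping from Bayes probability to points are invented for illustration and are not SpamAssassin's real rule set. The default threshold of 6.0 merely mirrors the `tag2=6` value visible in the headers posted at the top of the thread.

```python
import re

# Hypothetical rules: (name, pattern, points). Not SpamAssassin's real rules.
RULES = [
    ("DEBT_WORDS", re.compile(r"debt|bankruptcy", re.I), 2.5),
    ("PHARMA_WORDS", re.compile(r"viagra|tramadol|meds", re.I), 3.0),
    ("HAS_URL", re.compile(r"https?://\S+", re.I), 1.0),
]


def bayes_points(p_spam):
    """Map a Bayes spam probability (0..1) to points in the range -4..+4."""
    return (p_spam - 0.5) * 8.0


def score_message(text, p_spam, threshold=6.0):
    """Add rule points and Bayes points; return (total, is_spam, rule names)."""
    hits = [(name, points) for name, pattern, points in RULES if pattern.search(text)]
    total = sum(points for _, points in hits) + bayes_points(p_spam)
    return total, total >= threshold, [name for name, _ in hits]


if __name__ == "__main__":
    text = "Get Out of Debt Today. Avoid Bankruptcy. http://xyz"
    print(score_message(text, p_spam=0.95))
    print(score_message(text, p_spam=0.50))
```

In this sketch the rules alone do not reach the threshold for the debt message; the per-user Bayes contribution is what pushes it over, and a low Bayes probability can likewise pull a rule-heavy message back down.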
Dave Balderstone - 28 Feb 2008 05:43 GMT
> In article <marc.heusser-F99916.04122728022008@news.uzh.ch>,
> Marc Heusser <marc.heusser@byeheusser.commercialspammers.invalid>
[quoted text clipped - 12 lines]
> I do no spam filtering on my client, it is all done by SpamAssassin on
> the server.
SpamAssassin at the server. SpamSieve at the client.
It doesn't get much better if you're prepared to spend some time configuring.
 Signature Help improve usenet. Kill-file Google Groups. http://improve-usenet.org/
erilar - 29 Feb 2008 00:01 GMT
In article <marc.heusser-F99916.04122728022008@news.uzh.ch>, Marc Heusser <marc.heusser@byeheusser.commercialspammers.invalid> wrote:
> > In article <marc.heusser-92888A.00564628022008@news.uzh.ch>,
> > Marc Heusser <marc.heusser@byeheusser.commercialspammers.invalid>
[quoted text clipped - 25 lines]
> from some 250 to some 50 spam messages a day). SpamSieve takes care of
> the rest.
Postini gets most of them before they get to me, but more have been sneaking through with my actual address lately.
 Signature Mary Loomer Oliver (aka Erilar)
You can't reason with someone whose first line of argument is that reason doesn't count. --Isaac Asimov
Erilar's Cave Annex: http://www.chibardun.net/~erilarlo
erilar - 29 Feb 2008 00:00 GMT
In article <marc.heusser-92888A.00564628022008@news.uzh.ch>, Marc Heusser <marc.heusser@byeheusser.commercialspammers.invalid> wrote:
> > > > > I've been training this thing for months now and it still gets the
> > > > > most
[quoted text clipped - 19 lines]
> e-mails look like. A server-side filter cannot use that knowledge.
> That is why filters such as SpamSieve are needed and work well.
Just saved this to my desktop, which is getting cluttered because I've done this several times of late, for later consideration.
 Signature Mary Loomer Oliver (aka Erilar)
You can't reason with someone whose first line of argument is that reason doesn't count. --Isaac Asimov
Erilar's Cave Annex: http://www.chibardun.net/~erilarlo
Robert Peirce - 29 Feb 2008 15:59 GMT
> > > I've been training this thing for months now and it still gets the most
> > > basic spam wrong, doesn't mark them as spam (I don't get any false
[quoted text clipped - 4 lines]
>
> Or maybe the spammers have gotten better. It's a constant arms race.
Try SpamSieve. It really works.
 Signature Robert B. Peirce, Venetia, PA 724-941-6883 bob AT peirce-family.com [Mac] rbp AT cooksonpeirce.com [Office]
Blackjack Joe - 01 Mar 2008 21:45 GMT
> > > > I've been training this thing for months now and it still gets the most
> > > > basic spam wrong, doesn't mark them as spam (I don't get any false
[quoted text clipped - 6 lines]
>
> Try SpamSieve. It really works.
I second that, I've been using SpamSieve for a long time now.