filter valid email addresses

Hoang

Does anyone know of an algorithm to separate real email addresses from
computer-generated ones? I have been going through past email
archives in order to find my friends' email addresses. Unfortunately, about 75% of
the addresses are junk or spammer addresses. It's quite obvious which ones to delete when you
look at them, but I don't want to do it by hand.
 
Karlheinz klingbeil

Hoang said:
anyone know of an algorithm to filter out real email addresses as opposed to
computer generated email addresses? I have been going through past email
archives in order to find friends email address. Unfortunately about 75% of
them are junk addresses or spammer addresses. It's quite obvious when you
look at it and delete it... but you don't want to do it by hand.

The only way to check whether an email address is valid is to send a mail to it
and ask for a reply. If the syntax is right, you cannot tell which addresses
exist and which don't.

I have made a POP3 filter that checks the emails in your inbox and deletes
messages using multi-staged regular expressions. The Python script and documentation
are available at http://www.lunqual.de/poppers.zip
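
To give a rough idea of what a multi-staged regular expression check could look like when applied to addresses alone, here is a minimal sketch. It is not the poppers script: the function name `reject_reason` and all of the patterns are assumptions made up for illustration, and the heuristics will misjudge some real addresses. As noted above, passing these checks says nothing about whether the mailbox exists.

```python
import re

# Stage 0: a loose syntax check (deliberately much simpler than the full RFC grammar).
SYNTAX = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")

# Later stages: hypothetical heuristics for machine-generated local parts.
# They are tried in order; the first one that matches rejects the address.
REJECT_STAGES = [
    ("long digit run", re.compile(r"\d{5,}")),
    ("long consonant run", re.compile(r"[bcdfghjklmnpqrstvwxz]{6,}", re.I)),
    ("hex-like local part", re.compile(r"^[0-9a-f]{12,}$", re.I)),
]


def reject_reason(address):
    """Return the name of the stage that rejects the address, or None if it passes."""
    address = address.strip()
    if not SYNTAX.match(address):
        return "bad syntax"
    local = address.split("@", 1)[0]
    for name, pattern in REJECT_STAGES:
        if pattern.search(local):
            return name
    return None


if __name__ == "__main__":
    for candidate in ["jane.doe@example.com", "x8473920184@example.com", "not an address"]:
        print(candidate, "->", reject_reason(candidate))
```

Keeping the stages in a list makes it easy to add, remove, or reorder patterns after looking at what the filter gets wrong on your own archive.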
 
Andrew Dalke

Hoang:
anyone know of an algorithm to filter out real email addresses as opposed to
computer generated email addresses? I have been going through past email
archives in order to find friends email address. Unfortunately about 75% of
them are junk addresses or spammer addresses.

Why just look at the email addresses? Since you have the emails
themselves, try this. Get SpamBayes or any of the other systems you
can use to recognize ham/spam. Find the emails where the addresses
are used more than once. These are much more likely to be from
your friends. Use these emails as ham. From the remaining addresses,
identify some of the spam. Train SpamBayes on this and use it
to classify the remaining emails. These can be sorted from most
ham-like to most spam-like, making it easier to identify valid emails
and hence valid email addresses.

Andrew
(e-mail address removed)
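
The first step of that suggestion, finding addresses that are used more than once, might be sketched as follows using only the Python standard library. The function names `sender_counts` and `split_by_repeat`, the `min_count` parameter, and the choice to look only at the From header are assumptions for illustration. Training and running SpamBayes on the resulting groups is not shown here.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses


def sender_counts(raw_messages):
    """Count how often each From address occurs across raw RFC 822 message strings."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # getaddresses parses headers such as 'Ann <ann@example.com>'.
        for _name, addr in getaddresses(msg.get_all("From", [])):
            if addr:
                counts[addr.lower()] += 1
    return counts


def split_by_repeat(raw_messages, min_count=2):
    """Split senders into those seen at least min_count times and the rest."""
    counts = sender_counts(raw_messages)
    repeated = sorted(a for a, n in counts.items() if n >= min_count)
    single = sorted(a for a, n in counts.items() if n < min_count)
    return repeated, single


if __name__ == "__main__":
    archive = [
        "From: Ann <ann@example.com>\n\nhi",
        "From: ann@example.com\n\nhello again",
        "From: x9@spam.example\n\nbuy now",
    ]
    repeated, single = split_by_repeat(archive)
    print("likely friends:", repeated)
    print("still unclassified:", single)
```

Messages from the repeated senders would be the candidate ham set. The single-use senders are the ones left for manual spot checks and for the classifier to sort.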
 
