The battle against junk e-mail (SPAM)

1. What is SPAM?
Besides being a jelly-like ham substance, SPAM is unsolicited, irrelevant junk e-mail.

2. Why is junk e-mail called SPAM?
It was named after a Monty Python sketch in which SPAM was included with every meal at a restaurant. The sketch can be found at: http://www.ironworks.com/comedy/python/spam.htm
Because the sketch repeats the word SPAM over and over, the name was attached, many years later, to the repetitiveness of junk e-mail.

3. How do Spammers get my e-mail address?
Spammers can obtain your e-mail address in many ways.
- They can purchase it (20 million names for $20).
- They can guess it using software that concatenates first and last names, such as PeteSmith@somecompany.com, or first initial and last name, such as psmith@somecompany.com.
- They can harvest it from public forums.
- They can collect it from websites that require your e-mail address before you can gain access or win things.
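
To show why a predictable address is so easy to guess, here is a minimal sketch of the two naming patterns mentioned above. The function name candidate_addresses is made up for this illustration, and real guessing software presumably tries many more patterns than these two.

```python
def candidate_addresses(first, last, domain):
    """Return the two address patterns described above for one person."""
    first, last = first.lower(), last.lower()
    return [
        f"{first}{last}@{domain}",     # first name + last name
        f"{first[0]}{last}@{domain}",  # first initial + last name
    ]


print(candidate_addresses("Pete", "Smith", "somecompany.com"))
```

If your address follows one of these patterns, assume it can be guessed even if you have never published it anywhere.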

4. What is a SPAM Filter?
SPAM filters evaluate each message to determine whether it is junk and, if so, delete it before it hits your inbox.

Some filters are rules-based. A rules-based SPAM filter analyzes each message and assigns points for characteristics that typically mark a message as SPAM. If the total exceeds a given threshold, the filter classifies the message as SPAM and deletes it. Here’s an excerpt from the report SAproxyPro produced for an actual message advising me to quit smoking:

Content analysis details: (7.9 points, 5.0 required)

pts  rule name              description
---- ---------------------- ---------------------------------------------
 0.4 TO_ADDRESS_EQ_REAL     To: repeats address as real name
 0.1 FREE_TRIAL             BODY: Free Trial
 0.9 OFFERS_ETC             BODY: Stop the offers, coupons, discounts
 0.1 HTML_LINK_CLICK_HERE   BODY: HTML link text says "click here"
 1.5 HTML_IMAGE_ONLY_06     BODY: HTML: images with 400-600 bytes of
 1.1 HTML_WEB_BUGS          BODY: Image tag intended to identify you
 0.2 HTML_MESSAGE           BODY: HTML included in message
 0.4 HTML_70_80             BODY: Message is 70% to 80% HTML
 0.2 MIME_QP_LONG_LINE      RAW: Quoted-printable line longer than 76
 2.8 RATWARE_STORM_URI      URI: Bulk email fingerprint (StormPost) found
 0.1 CLICK_BELOW            Asks you to click below

As you can see, this message scored 7.9 points, well over the 5.0 points required for the filter to classify it as SPAM. The following message was able to get past my SPAM filter by limiting the number of rule violations within the message text and header.

Content analysis details: (4.4 points, 5.0 required)

pts  rule name              description
---- ---------------------- ---------------------------------------------
 0.0 OPT_IN                 BODY: Talks about opting in (lowercase)
 0.7 EXCUSE_19              BODY: Claims you opted-in or registered
 0.4 EXCUSE_1               BODY: Gives a lame excuse about why msg sent
 0.3 EXCUSE_14              BODY: Tells you how to stop further spam
 0.8 HTML_30_40             BODY: Message is 30% to 40% HTML
 0.2 HTML_MESSAGE           BODY: HTML included in message
 0.4 HTML_TITLE_EMPTY       BODY: HTML title contains no text
 1.1 MIME_HTML_NO_CHARSET   RAW: Message text in HTML without charset
 0.4 NORMAL_HTTP_TO_IP      URI: Uses a dotted-decimal IP address in URL
 0.1 CLICK_BELOW            Asks you to click below
 

This message is obviously SPAM, but its score fell short of the threshold my filter requires before classifying a message as such. I know what you’re thinking: “lower your threshold, idiot”. There’s just one problem with lowering the threshold: almost all messages have rule violations. Simply creating a message with some HTML tags breaks a rule. A message that contains a link to a graphic image breaks another rule. Finding the optimum threshold value is what makes rules-based SPAM filters difficult to configure.
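
The threshold trade-off is easier to see with the numbers in hand. The following is a minimal sketch, not how SAproxyPro is actually implemented: the rule scores are copied from the two reports above, the function names are made up for this illustration, the "newsletter" message is a hypothetical legitimate e-mail, and I am assuming a message counts as SPAM when its score meets or exceeds the threshold.

```python
THRESHOLD = 5.0  # points required before a message is classified as SPAM

# Rule hits copied from the two reports above. The per-rule values are the
# displayed ones, so their sum may differ slightly from the reported total.
quit_smoking = {
    "TO_ADDRESS_EQ_REAL": 0.4, "FREE_TRIAL": 0.1, "OFFERS_ETC": 0.9,
    "HTML_LINK_CLICK_HERE": 0.1, "HTML_IMAGE_ONLY_06": 1.5,
    "HTML_WEB_BUGS": 1.1, "HTML_MESSAGE": 0.2, "HTML_70_80": 0.4,
    "MIME_QP_LONG_LINE": 0.2, "RATWARE_STORM_URI": 2.8, "CLICK_BELOW": 0.1,
}
opt_in = {
    "OPT_IN": 0.0, "EXCUSE_19": 0.7, "EXCUSE_1": 0.4, "EXCUSE_14": 0.3,
    "HTML_30_40": 0.8, "HTML_MESSAGE": 0.2, "HTML_TITLE_EMPTY": 0.4,
    "MIME_HTML_NO_CHARSET": 1.1, "NORMAL_HTTP_TO_IP": 0.4, "CLICK_BELOW": 0.1,
}
# Hypothetical legitimate HTML newsletter (not from the article).
newsletter = {
    "HTML_30_40": 0.8, "HTML_MESSAGE": 0.2,
    "MIME_HTML_NO_CHARSET": 1.1, "CLICK_BELOW": 0.1,
}


def total_score(hits):
    """Add up the points for every rule the message triggered."""
    return round(sum(hits.values()), 1)


def is_spam(hits, threshold=THRESHOLD):
    """Assumed convention: SPAM when the score meets or exceeds the threshold."""
    return total_score(hits) >= threshold


for name, hits in [("quit smoking", quit_smoking), ("opt-in", opt_in),
                   ("newsletter", newsletter)]:
    print(name, total_score(hits),
          [is_spam(hits, t) for t in (5.0, 4.0, 2.0)])
```

Dropping the threshold to 4.0 catches the opt-in message, but by the time it reaches 2.0 the harmless newsletter is deleted too.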

A good SPAM filter will also check each message against a blacklist of known bad senders, and allow you to build your own whitelist of known good senders. Keep in mind that most professional Spammers spoof their sender information so that the message appears to come from a legitimate sender.
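
A list check might look something like the sketch below. The function name, the addresses, and the order of the checks (whitelist first, then blacklist, then normal scoring) are my own assumptions for illustration, not the behavior of any particular product.

```python
def classify_sender(sender, whitelist, blacklist):
    """Return 'accept', 'reject', or 'score' for a sender address.

    Each list may contain full addresses or bare domains.
    """
    sender = sender.strip().lower()
    domain = sender.rpartition("@")[2]  # text after the last "@"
    if sender in whitelist or domain in whitelist:
        return "accept"   # known good sender: skip the other checks
    if sender in blacklist or domain in blacklist:
        return "reject"   # known bad sender: delete the message
    return "score"        # unknown sender: fall back to normal filtering


# Hypothetical lists for the example.
whitelist = {"friend@example.org"}
blacklist = {"bulkmailer.example"}

print(classify_sender("Friend@example.org", whitelist, blacklist))
print(classify_sender("deals@bulkmailer.example", whitelist, blacklist))
print(classify_sender("stranger@example.com", whitelist, blacklist))
```

Because the sender address can be spoofed, a whitelist match is only as trustworthy as the header it was read from.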

Another method of SPAM filtering is called “Bayesian” filtering, named after Thomas Bayes, an 18th-century cleric who formulated Bayes' theorem. Bayes' theorem describes how to update the probability that a hypothesis is true as new evidence comes in. In other words, the filter compares the contents of the current message against e-mail messages known to be good and messages known to be bad, and estimates how likely the current message is to be junk. What makes the Bayesian filter so powerful is its ability to learn from the e-mail messages that you’ve received. The only drawback is that Bayesian filters require your input as to which messages are good and which are bad. Bayesian filters are becoming more and more popular.
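
To make the idea concrete, here is a minimal sketch of a word-based ("naive") Bayesian filter. It is a toy under stated assumptions, not the algorithm any specific product uses: the class name, the add-one smoothing, the whitespace word splitting, and the four training messages are all my own choices for illustration.

```python
import math
from collections import Counter


class BayesFilter:
    """Toy naive Bayes filter that learns from messages you label."""

    def __init__(self):
        self.words = {"good": Counter(), "bad": Counter()}
        self.messages = {"good": 0, "bad": 0}

    def train(self, text, label):
        """Record one message that you marked as 'good' or 'bad'."""
        self.messages[label] += 1
        self.words[label].update(text.lower().split())

    def spam_probability(self, text):
        """Estimate the probability (0 to 1) that the text is junk."""
        vocab = set(self.words["good"]) | set(self.words["bad"])
        total = sum(self.messages.values())
        log_score = {}
        for label in ("good", "bad"):
            # Prior: how common this kind of message has been (smoothed).
            score = math.log((self.messages[label] + 1) / (total + 2))
            n_words = sum(self.words[label].values())
            for word in text.lower().split():
                # Add-one smoothing so an unseen word never gives zero.
                count = self.words[label][word] + 1
                score += math.log(count / (n_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(min(log_score["good"] - log_score["bad"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


f = BayesFilter()
f.train("lunch meeting tomorrow at noon", "good")
f.train("project report attached for review", "good")
f.train("free trial click below", "bad")
f.train("click here for free offers", "bad")

print(f.spam_probability("free trial offers click here") > 0.5)
print(f.spam_probability("meeting report tomorrow") < 0.5)
```

The more messages you label, the better the word counts reflect your own mail, which is why these filters improve with use.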

There are other types of SPAM filters; however, most work similarly to the two examples I’ve given above. All filters, regardless of the theory used to create them, evaluate messages to determine the likelihood that the messages are junk.

5. How does junk e-mail get through my SPAM filter?
Have you ever received a piece of snail mail at home that appeared to be a letter from a friend or relative? It looked so legitimate that you opened it only to discover a sales pitch for a home loan?

Spammers are just as clever as the junk mailer in the scenario above. They know how SPAM filters work, and they create messages that can sneak through your filter. They will disguise words, for example writing Viagra as Vi@gra. They “spoof” the message header with known valid sender e-mail addresses and servers. They have many tools in their toolbox and they aren’t afraid to use them.
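
A filter can fight back against disguised words by undoing common character swaps before it looks for blocked words. The sketch below is only an illustration: the substitution table, the function names, and the blocked-word list are assumptions of mine, and a table this crude will also mangle legitimate text that contains digits or symbols.

```python
# Hypothetical table of look-alike characters and the letters they imitate.
LOOKALIKES = str.maketrans({
    "@": "a", "0": "o", "1": "i", "!": "i", "3": "e", "$": "s",
})


def normalize(word):
    """Lowercase a word and replace look-alike characters with letters."""
    return word.lower().translate(LOOKALIKES)


def contains_blocked_word(text, blocked):
    """Return True if any word in the text normalizes to a blocked word."""
    return any(normalize(w.strip(".,;:?")) in blocked for w in text.split())


blocked = {"viagra"}
print(normalize("Vi@gra"))
print(contains_blocked_word("Cheap Vi@gra today.", blocked))
print(contains_blocked_word("See you at lunch", blocked))
```

Spammers respond by inventing new disguises, which is one reason a fixed word list never stays effective for long.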

       
Copyright © 2007 Imagine Systems, Inc.