30.05.2013 Views

Internet Security Threat Report Government 2013


p. 126

Symantec Corporation

Internet Security Threat Report 2013 :: Volume 18

SPAM AND FRAUD ACTIVITY TRENDS

Spam by Category

Background

Spam is created in a variety of styles and levels of complexity. Some spam is plain text with a URL; some is cluttered with images and/or attachments. Some contains very little text, perhaps only a URL. And, of course, spam is distributed in a variety of languages. It is also common for spam to contain “Bayes poison”: random text, haphazardly scraped from websites, that is added to “pollute” the spam with words bearing no relation to the intent of the spam message itself. Bayes poison is used to thwart spam filters, which typically try to identify spam based on a database of words that are frequently repeated in spam messages.
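The effect of Bayes poison on a word-based filter can be illustrated with a toy naive Bayes score. This is a minimal sketch, not any vendor's filter: the word list and every probability in `WORD_PROBS` are invented for illustration, and `spam_score` is a hypothetical helper.

```python
import math

# Hypothetical per-word probabilities, invented for illustration only.
# Each entry is (P(word | spam), P(word | legitimate mail)).
WORD_PROBS = {
    "pills":   (0.20, 0.01),
    "cheap":   (0.15, 0.02),
    "meeting": (0.01, 0.10),
    "garden":  (0.01, 0.08),
    "report":  (0.02, 0.12),
}

def spam_score(words, prior_spam=0.5):
    """Return P(spam | words) under a naive Bayes model.

    Words missing from WORD_PROBS are ignored.
    """
    log_spam = math.log(prior_spam)
    log_ham = math.log(1.0 - prior_spam)
    for word in words:
        if word in WORD_PROBS:
            p_spam, p_ham = WORD_PROBS[word]
            log_spam += math.log(p_spam)
            log_ham += math.log(p_ham)
    # Convert the log-odds back into a probability.
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))

plain = ["cheap", "pills"]
# Same message with unrelated, innocuous words appended as "poison".
poisoned = plain + ["meeting", "garden", "report"]

print(f"plain:    {spam_score(plain):.3f}")
print(f"poisoned: {spam_score(poisoned):.3f}")
```

With these made-up numbers, the appended words pull the score of the same message from well above one half to below it, which is the dilution the poison is meant to achieve.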

Any automated process to classify spam into categories would need to overcome this randomness. For example, the word “watch” may appear in the random text included in a pharmaceutical spam message, making it hard to decide whether the message belongs in the pharmaceutical category or in the watches/jewelry category. Another challenge occurs when a pharmaceutical spam message contains no obvious pharmaceutical-related words, but only an image and a URL.
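Both difficulties show up even in a very small keyword-based categorizer. The sketch below is an assumption-laden illustration: the keyword sets in `CATEGORY_KEYWORDS` and the `classify` function are hypothetical, and only two of the report's categories are modeled.

```python
import re
from collections import Counter

# Hypothetical keyword lists, invented for illustration only.
CATEGORY_KEYWORDS = {
    "Pharmaceutical": {"pills", "pharmacy", "prescription"},
    "Watches/Jewelry": {"watch", "watches", "jewelry", "bracelet"},
}

def keyword_votes(text):
    """Count how many words in the text match each category's keyword list."""
    words = re.findall(r"[a-z]+", text.lower())
    votes = Counter()
    for category, keywords in CATEGORY_KEYWORDS.items():
        votes[category] = sum(1 for w in words if w in keywords)
    return votes

def classify(text):
    """Return the winning category, "Ambiguous" on a tie, or "Unknown/Other"."""
    ranked = keyword_votes(text).most_common()
    best, best_count = ranked[0]
    if best_count == 0:
        return "Unknown/Other"      # e.g. a message that is only an image and a URL
    if len(ranked) > 1 and ranked[1][1] == best_count:
        return "Ambiguous"          # e.g. random text containing "watch"
    return best

print(classify("Cheap pills from our pharmacy"))
print(classify("Cheap pills. Watch the garden grow"))
print(classify("http://example.com/abc"))
```

The second message is pharmaceutical spam whose random filler happens to contain "watch", so the keyword counts tie; the third carries no usable words at all.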

Spammers attempt to get their messages through to recipients without revealing too many clues that the message is spam. Clues found in the plain text content of the email can be examined using automated anti-spam techniques. A common way to defeat automated techniques is to add random text. An equally effective way is to include very little extra text in the spam, and instead include only a URL in the body of the message.

Spam detection services often resist classifying spam into different categories because it is difficult to do (for the reasons above) and because the purpose of spam detection is to determine whether the message is spam and to block it, rather than to identify its subject matter. The most accurate way to overcome the ambiguity of automated classification is to have someone classify unknown spam manually. While time-consuming, this process provides much more accurate results. An analyst can read the message, understand the context of the email, view images, follow URLs, and view websites in order to build a fuller picture of the spam message.
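One way to combine automated and manual classification is to accept the automated label only when it is confident and to queue everything else for an analyst. The sketch below assumes a hypothetical upstream classifier that yields a category and a confidence between 0 and 1; the `triage` function, the `Decision` type, and the 0.8 threshold are all illustrative assumptions, not a description of any real service.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Decision:
    category: str
    needs_analyst: bool

def triage(auto_category: Optional[str], confidence: float,
           threshold: float = 0.8) -> Decision:
    """Accept a confident automated label; otherwise route to manual review.

    The threshold value is an arbitrary assumption chosen for illustration.
    """
    if auto_category is None or confidence < threshold:
        # Left as Unknown/Other until an analyst reads the message,
        # views its images, and follows its URLs.
        return Decision("Unknown/Other", needs_analyst=True)
    return Decision(auto_category, needs_analyst=False)

print(triage("Pharmaceutical", 0.95))
print(triage("Pharmaceutical", 0.40))
print(triage(None, 1.0))
```

Lowering the threshold reduces analyst workload at the cost of more misclassified messages; raising it does the opposite.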

Methodology

Once per month, several thousand random spam samples are collected and classified by Symantec.cloud using a combination of electronic and human analysis into one of the following categories:

• Casino/Gambling
• Degrees/Diplomas
• Diet/Weight Loss
• Jobs/Money Mules
• Malware
• Mobile Phones
• Pharmaceutical
• Phishing
• Scams/Fraud/419s
• Sexual/Dating
• Software
• Unknown/Other
• Unsolicited Newsletters
• Watches/Jewelry
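The sampling and tallying step can be sketched as follows. This is an illustration under stated assumptions, not Symantec.cloud's pipeline: `monthly_breakdown` is a hypothetical helper, and `classify` stands in for whatever combination of electronic and human analysis assigns each message to one category.

```python
import random
from collections import Counter

CATEGORIES = [
    "Casino/Gambling", "Degrees/Diplomas", "Diet/Weight Loss",
    "Jobs/Money Mules", "Malware", "Mobile Phones", "Pharmaceutical",
    "Phishing", "Scams/Fraud/419s", "Sexual/Dating", "Software",
    "Unknown/Other", "Unsolicited Newsletters", "Watches/Jewelry",
]

def monthly_breakdown(messages, classify, sample_size, seed=None):
    """Draw a random sample and return each category's share in percent."""
    rng = random.Random(seed)
    sample = rng.sample(messages, min(sample_size, len(messages)))
    counts = Counter(classify(m) for m in sample)
    unexpected = set(counts) - set(CATEGORIES)
    if unexpected:
        raise ValueError(f"unexpected categories: {sorted(unexpected)}")
    total = len(sample)
    if total == 0:
        return {}
    return {c: 100.0 * counts[c] / total for c in CATEGORIES}

# Toy data: the message contents and this stand-in classifier are made up.
messages = ["pills"] * 30 + ["bank login"] * 70
toy_classify = lambda m: "Pharmaceutical" if m == "pills" else "Phishing"

shares = monthly_breakdown(messages, toy_classify, sample_size=100, seed=1)
print(shares["Pharmaceutical"], shares["Phishing"])
```

Because every message lands in exactly one category, the shares always sum to 100 percent.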
