Internet Security Threat Report Government 2013
Symantec Corporation<br />
Internet Security Threat Report <strong>2013</strong> :: Volume 18<br />
SPAM AND FRAUD ACTIVITY TRENDS<br />
Spam by Category<br />
Background<br />
Spam varies widely in style and complexity.<br />
Some spam is plain text with a URL; some is cluttered with<br />
images and/or attachments. Some comes with very little in<br />
terms of text, perhaps only a URL. And, of course, spam is<br />
distributed in a variety of different languages. It is also common<br />
for spam to contain “Bayes poison” (random text added to<br />
messages that has been haphazardly scraped from websites to<br />
“pollute” the spam with words bearing no relation to the intent<br />
of the spam message itself). Bayes poison is used to thwart spam<br />
filters that typically identify spam using a database of<br />
words that appear frequently in spam messages.<br />
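The dilution effect of Bayes poison can be illustrated with a minimal sketch. It assumes a toy naive-Bayes-style filter; the word table, the `UNKNOWN_PROB` value, and the `spam_score` function are hypothetical and are not taken from any Symantec product.

```python
import math

# Hypothetical per-word spam probabilities. A real filter would learn
# these from a large labelled corpus.
WORD_SPAM_PROB = {
    "viagra": 0.99, "pills": 0.95, "cheap": 0.90,
    "meeting": 0.05, "garden": 0.10, "weather": 0.08, "recipe": 0.10,
}
UNKNOWN_PROB = 0.4  # assumed value for words missing from the table


def spam_score(text: str) -> float:
    """Combine per-word probabilities naive-Bayes style, in log space."""
    log_spam = 0.0
    log_ham = 0.0
    for word in text.lower().split():
        p = WORD_SPAM_PROB.get(word, UNKNOWN_PROB)
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    # Convert the log-odds back to a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


plain = "cheap pills viagra"
# The same message with unrelated words appended as "poison".
poisoned = plain + " meeting garden weather recipe"

print(round(spam_score(plain), 3), round(spam_score(poisoned), 3))
```

In this toy model, the appended innocuous words pull the combined score of the poisoned message well below that of the plain message, even though the spam content is unchanged.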
Any automated process to classify spam into categories would<br />
need to overcome this randomness issue. For example, the<br />
word “watch” may appear in the random text included in<br />
a pharmaceutical spam message, making it unclear whether the<br />
message belongs in the pharmaceutical category or the<br />
watches/jewelry category. Another challenge occurs when a<br />
pharmaceutical spam message contains no obvious pharmaceutical-related<br />
words, only an image and a URL.<br />
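Both challenges can be reproduced with a naive keyword matcher. The sketch below is illustrative only: the keyword sets, the `classify` function, and the "Ambiguous" label are assumptions made for this example and are not part of the report's methodology.

```python
import re

# Hypothetical keyword lists for two of the categories.
CATEGORY_KEYWORDS = {
    "Pharmaceutical": {"pills", "pharmacy", "prescription", "meds"},
    "Watches/Jewelry": {"watch", "watches", "rolex", "jewelry"},
}


def keyword_hits(text: str) -> dict:
    """Count how many distinct keywords of each category occur in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {cat: len(words & kws) for cat, kws in CATEGORY_KEYWORDS.items()}


def classify(text: str) -> str:
    """Pick the category with the most keyword hits, if there is a clear winner."""
    hits = keyword_hits(text)
    best = max(hits.values())
    if best == 0:
        return "Unknown/Other"  # no textual clue at all
    winners = [cat for cat, n in hits.items() if n == best]
    return winners[0] if len(winners) == 1 else "Ambiguous"


# Pharmaceutical spam padded with random text that mentions watches.
print(classify("Cheap meds online. watch the garden grow rolex weather"))
# Spam whose body is nothing but a URL.
print(classify("http://example.com/offer"))
```

The first message is assigned to Watches/Jewelry because the random padding outnumbers the single pharmaceutical keyword, and the URL-only message falls through to Unknown/Other.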
Spammers attempt to get their messages through to recipients<br />
without revealing too many clues that the message is spam.<br />
Clues found in the plain text content of the email can be<br />
examined using automated anti-spam techniques. A common<br />
way to evade these techniques is to add random text.<br />
An equally effective way is to include very little<br />
extra text in the spam and instead place a URL in the body of<br />
the message.<br />
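A message that carries little beyond a URL can at least be flagged for closer inspection. The following is a rough sketch of such a check; the regular expression, the `is_url_only` helper, and the `max_words` threshold are assumptions chosen for illustration, not a documented detection rule.

```python
import re

# Simplified URL pattern, good enough for this illustration.
URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def text_outside_urls(body: str) -> list:
    """Return the words that remain after every URL is removed."""
    return URL_RE.sub(" ", body).split()


def is_url_only(body: str, max_words: int = 3) -> bool:
    """True when the body has a URL and at most max_words other words."""
    has_url = URL_RE.search(body) is not None
    return has_url and len(text_outside_urls(body)) <= max_words


print(is_url_only("http://example.com/offer"))
print(is_url_only("Hi team, the agenda for Monday is at http://example.com/agenda"))
```

A flag like this says nothing about the subject matter of the spam; it only indicates that text-based clues are missing.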
Spam detection services often resist classifying spam into<br />
different categories because it is difficult to do (for the reasons<br />
above) and because the purpose of spam detection is to<br />
determine whether the message is spam and to block it, rather<br />
than to identify its subject matter. The most accurate way to<br />
resolve the ambiguity that automated classification leaves is to<br />
have an analyst classify unknown spam manually. Although<br />
time-consuming, manual review lets the analyst read the message,<br />
understand the context of the email, view images, follow URLs,<br />
and view websites to build a fuller picture of<br />
the spam message.<br />
Methodology<br />
Once per month, several thousand random spam samples are<br />
collected and classified by Symantec.cloud using a combination<br />
of electronic and human analysis into one of the following<br />
categories:<br />
• Casino/Gambling<br />
• Degrees/Diplomas<br />
• Diet/Weight Loss<br />
• Jobs/Money Mules<br />
• Malware<br />
• Mobile Phones<br />
• Pharmaceutical<br />
• Phishing<br />
• Scams/Fraud/419s<br />
• Sexual/Dating<br />
• Software<br />
• Unknown/Other<br />
• Unsolicited Newsletters<br />
• Watches/Jewelry
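The monthly process described above can be outlined as a small pipeline: draw a random sample, try an automated classifier first, and hand anything it cannot place to an analyst. This is only a sketch under stated assumptions. The function names, the convention that the automated step returns `None` when undecided, and the fallback to Unknown/Other are hypothetical; only the category names come from the list above.

```python
import random
from collections import Counter

CATEGORIES = [
    "Casino/Gambling", "Degrees/Diplomas", "Diet/Weight Loss",
    "Jobs/Money Mules", "Malware", "Mobile Phones", "Pharmaceutical",
    "Phishing", "Scams/Fraud/419s", "Sexual/Dating", "Software",
    "Unknown/Other", "Unsolicited Newsletters", "Watches/Jewelry",
]


def sample_messages(messages, k, seed=None):
    """Draw up to k messages at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(messages, min(k, len(messages)))


def classify_sample(messages, auto_classify, manual_classify):
    """Count messages per category, using manual review as a fallback."""
    counts = Counter()
    for msg in messages:
        label = auto_classify(msg)
        if label not in CATEGORIES:
            # Automated step was undecided: ask the analyst.
            label = manual_classify(msg)
        if label not in CATEGORIES:
            label = "Unknown/Other"
        counts[label] += 1
    return counts


# Stand-in classifiers for demonstration only.
def demo_auto(msg):
    return "Pharmaceutical" if "pills" in msg else None


def demo_manual(msg):
    return "Casino/Gambling" if "casino" in msg else "Unknown/Other"


corpus = ["cheap pills", "pills today", "win at our casino", "http://example.com"]
print(classify_sample(sample_messages(corpus, 4, seed=0), demo_auto, demo_manual))
```

Keeping the manual step as a fallback rather than a first pass reflects the trade-off noted earlier: human review is more accurate but time-consuming, so it is reserved for messages the automated step cannot place.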