By: Lawrence Abrams
For the last year, one of the more annoying developments in mass spamming has been a technique in which the spam message is embedded in an image. This makes it much more difficult for spam filters to determine whether an email with an image is spam or legitimate. Over time, filters countered this technique by performing optical character recognition (OCR) on the attached images to recover the text encoded in them. The spam filter could then analyze that text and determine whether it was spam.
This week a new technique started being used in which the spam image is embedded in an attached PDF document, as shown below.

One way for spam filters to combat this new technique would be to extract the images from the PDF and then run their normal image analysis on the extracted images. Unfortunately, the spammers have altered the PDF documents so that they are damaged but still viewable. As a result, at least three open source PDF converters I tested (Xpdf, ImageMagick, and PDF-111) can't properly extract the image from the spam PDFs, even though they work perfectly with normal PDF files.
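As a rough illustration of the extraction step, here is a minimal Python sketch. It is not how any of the converters mentioned above work, and it has not been tried against this spam. It assumes the embedded image is stored as a plain JPEG (DCTDecode) stream, and it scans the raw file bytes for JPEG start and end markers instead of parsing the PDF structure, so damage to the document's structure would not stop it. The function name carve_jpegs is hypothetical.

```python
# Assumption: the image sits in the PDF as raw JPEG data (a DCTDecode stream).
# Images stored with other filters (for example FlateDecode) are not handled.

JPEG_SOI = b"\xff\xd8\xff"  # JPEG start-of-image marker plus the next marker's 0xFF
JPEG_EOI = b"\xff\xd9"      # JPEG end-of-image marker


def carve_jpegs(pdf_bytes):
    """Return a list of byte strings that look like JPEG files inside pdf_bytes.

    This is a heuristic: a JPEG that contains its own embedded thumbnail
    (and therefore an early end marker) will be cut short.
    """
    images = []
    pos = 0
    while True:
        start = pdf_bytes.find(JPEG_SOI, pos)
        if start == -1:
            break
        end = pdf_bytes.find(JPEG_EOI, start + len(JPEG_SOI))
        if end == -1:
            break
        stop = end + len(JPEG_EOI)
        images.append(pdf_bytes[start:stop])
        pos = stop
    return images
```

To try it, read the attachment with open(path, "rb").read(), pass the bytes to carve_jpegs, and write each result out as a .jpg file. The carved images could then go through the same OCR and image analysis a filter already uses for image spam.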
Over time, a method will be found to correct these PDF files so that they can be parsed, but until then keep a lookout for this type of spam.















