Understanding Spam Traps And Sensor Networks

June 6, 2017


As a technologist, I hold the core belief that most difficult, tedious, and even dangerous tasks can be made less so through rigorous and innovative application of engineering know-how. After spending time in the trenches to get a truly visceral feel for what’s good and what can be improved, it should then be possible to step back, think, and experiment, and then do it some more.

It seems to always be the case that such an effort yields something simplifying and useful.

When I look at the email ecosystem, I see a playground; people and things sending messages back and forth, telemetry spitting off of everything like hot fat from a frying sausage, a cast of good and bad guys, a web of interconnected systems doing everything imaginable on a scale measured with more zeroes than the US national debt.

It’s so cool and that’s why I love what I do. But.

There’s this other thing, and it drives me absolutely nuts. I hear it at conferences, combat it in support tickets, read it in blogs, have it preached at me in webinars. It has little to no grounding in reality or fact, it has a terrible smell; it’s the email world’s eat-everything-and-lose-weight pill. I call it the Sender Mythology, and its practitioners have created their own set of oral traditions to explain the allegedly mysterious, dangerous, and convoluted path to the holiest of holy lands: The Inbox.

Things like how to get around one or another ISP’s filtering systems, which days are better for sending mail, 10 things to do in your Black Friday campaigns, how to scrub that list you bought, how Hotmail throttles really worked (that especially used to tickle me), five bounces is a very bad thing, a magical sending cadence for re-engagement campaigns, dee-kay-eye-emm encrypts your email, spammers don’t authenticate, use this weird trick to triple customer engagement, and so on.

Sigh.

Last month at the Certified Senders Alliance (CSA) conference in Cologne, I was asked if in five years I’d still be giving the same basic advice about successfully using email. It killed me to say yes, if for no other reason than that the power of email marketing isn’t declining, which brings a continuous influx of new people into the industry.

I also realized it’s because outside of the dedicated and hard-working subset of deliverability consultants, product marketers, and other folks who really truly get it, there’s no real counter-narrative, no plain-sense education to combat the Sender Mythology. So, I’m going to get off my ass and start with a discussion about email data, in particular about spam traps and reputation.

Mo’ Data, Mo’ Problems

The biggest problem confounding senders today is that they have too much data and almost no context to tie together the limited visibility they get from the dozen or more tools they’ve duct-taped together (1), and they certainly don’t have the time to build custom data integration and analytics tools from scratch.

This problem is compounded when dealing with types of data that are vague or low-fidelity by definition, as is sometimes the case when talking about data for monitoring email reputation and specifically with trap data.

What is a “trap”?

A trap (or spam trap) is an email address that should exist on exactly zero customer lists. Trap Network Operators (TNOs) such as SORBS, Spamhaus, Proofpoint, and others acquire trap addresses in a few different ways:

  1. Recycling previously valid addresses. A common-ish practice at major ISPs and private domains (2); the addresses are often “loaned” to TNOs via partnership. An address that used to belong to a human was inactive for some period of time, reclaimed by the domain owner, put through a conditioning (3) procedure, and then used officially as a trap.
  2. Brand-new addresses, also known as pristine traps. The TNO creates a never-before-seen address. The clever ones make these addresses things that could be guessed by automated bulk-mailing programs such as those operated by The Bad Guys.

Messages received by a trap (also called trap hits) are processed to extract certain metadata:

  • Sending identities (e.g., from domain and IP)
  • Associated infrastructure (e.g., compromised DNS servers, other MXs)
  • Message content (e.g., link hosts, content signatures)

This information is commonly used to add the sender to one or more blacklists used by domain owners to filter messages from known bad senders.
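To make the extraction step concrete, here is a minimal sketch of pulling those three kinds of metadata out of a single trap hit using only Python’s standard library. Everything specific in it is an assumption for illustration: the sample message, the `extract_trap_metadata` name, and the choice of a SHA-256 hash over whitespace-normalized text as a “content signature.” Real trap networks almost certainly do something more sophisticated, and this sketch does not decode transfer-encoded bodies.

```python
import hashlib
import re
from email import message_from_string
from email.utils import parseaddr
from urllib.parse import urlparse

# Hypothetical raw message, as a trap mailbox might receive it.
RAW_TRAP_HIT = """\
Received: from mail.example.net (mail.example.net [192.0.2.25])
\tby trap.example.org with ESMTP; Tue, 6 Jun 2017 10:00:00 +0000
From: Deals <offers@news.example.com>
To: pristine-trap@example.org
Subject: Big sale
Content-Type: text/plain

Click http://click.example.com/a?id=1 or https://cdn.example.net/img now!
"""

URL_RE = re.compile(r"https?://[^\s\"'<>]+")
IPV4_IN_BRACKETS_RE = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def extract_trap_metadata(raw: str) -> dict:
    """Extract sending identity, link hosts, and a crude content signature."""
    msg = message_from_string(raw)

    # Sending identity: the domain of the From address.
    _, from_addr = parseaddr(msg.get("From", ""))
    from_domain = from_addr.rpartition("@")[2].lower()

    # Sending identity: the connecting IP, taken from the topmost Received
    # header (the one added by the receiving server itself).
    received = msg.get_all("Received") or []
    ip_match = IPV4_IN_BRACKETS_RE.search(received[0]) if received else None
    sending_ip = ip_match.group(1) if ip_match else None

    # Message content: concatenate text parts (no transfer decoding here).
    body = "\n".join(
        part.get_payload()
        for part in msg.walk()
        if not part.is_multipart() and part.get_content_maintype() == "text"
    )
    hosts = {urlparse(url).hostname for url in URL_RE.findall(body)}
    link_hosts = sorted(h for h in hosts if h)

    # Assumed signature scheme: hash of lowercased, whitespace-collapsed body.
    normalized = " ".join(body.lower().split())
    content_signature = hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    return {
        "from_domain": from_domain,
        "sending_ip": sending_ip,
        "link_hosts": link_hosts,
        "content_signature": content_signature,
    }


metadata = extract_trap_metadata(RAW_TRAP_HIT)
```

Each field in the result is something a blacklist could key on; associated infrastructure (DNS servers, other MXs) would need lookups that are out of scope for this sketch.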

The great thing about a trap hit from a traditional TNO is that it’s an almost perfect signal that a sender has list acquisition and/or hygiene issues. The quality of that signal has nothing to do with how many trap hits there are; what matters is whether they keep happening.

In other words, whether you have five or 500 hits, you have a problem. If you keep having those hits, you failed to address your problem, and persistent failure to resolve trap hits will get you blacklisted. Senders often cite TNOs’ lack of clear guidance, specifically which traps they’re hitting, as the chief reason remediating trap hits is so hard.

I assert this is a distraction from what we should really be talking about: Senders hit traps because they pay insufficient attention to engagement data.

As a rule, TNOs do their best to protect their trap addresses. Should they become well-known, the value of the network would rapidly approach zero. The same rationale applies to most TNOs not providing message samples: There are too many ways for senders to hide data that would lead them to the trap address.

A relative of traps worth mentioning is something called black holes. A black hole is a domain whose email is routed not to people, but to a system that, like a trap network, collects and analyzes messages. These systems are commonly referred to as sensor networks.

Because sensor networks acquire domains in bulk, often automatically via registrars and domainers (4), they aren’t conditioned into traps. Unconditioned trap results have incredibly high false-positive rates and would destroy the credibility of any list they’re included in.

A very important thing to note: TNOs are reputation brokers. They succeed and have a massive positive impact when they maintain unimpeachable integrity and quickly deliver highly accurate data. It can be a very difficult and personally taxing job (do you really want the Russian Mob to consider you a revenue threat?), so you don’t just wake up one morning and build a trap network. You do it because you’re on a mission to stop spam. We love you guys.

Sensor network data can nonetheless be incredibly useful. With a little cash and elbow grease, anyone can set up the collection piece of the network. The difficulty comes in extracting value from a sensor network: How do you turn all that raw data into useful information? What does “useful” actually mean in this context?

More on that in a moment.

When close enough ain’t good enough

In 2015, we launched a product called Reputation Informant, recently rebranded as 250ok Reputation. The goal was to combine spam complaint (FBL), SNDS, trap, and phishing data into something senders could use to quickly identify problems that, left unchecked, would reduce or destroy deliverability.

We launched with data provided by ThreatWave, who built a massive sensor network. After working with their data for more than a year, and despite a very successful collaboration, we began to chafe under dataset limitations that mirrored traditional trap networks: No access to the full message or the recipient address (5).

Here we saw an opportunity: Rather than constrain our ability to innovate and wed the efficacy of 250ok Reputation to what our partners were willing to disclose, we could build and operate our own sensor network at a fraction of the expense we incurred buying highly-redacted external data.

So we did. MailboxPark, the 250ok sensor network, launched in November 2016 and exceeded our wildest expectations. Back to the question of how one values sensor network data: when we launched 250ok Reputation, we defined three metrics to guide our data acquisition efforts.

  • Relevance: Does the data contain content or originate from a source that’s interesting to us?
  • Coverage: Does the relevant data represent enough of our customer base to provide broad value?
  • Frequency: Does the relevant data appear often enough that we’re able to accurately measure trends?
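The three questions above don’t come with formulas, so the following is only one possible way to put numbers on them, not how 250ok actually scores data. The sketch assumes each sensor hit is a `(sender_domain, day)` pair and that “interesting” simply means the sender domain belongs to a customer; the `score_dataset` function and all three ratios are hypothetical.

```python
from collections import defaultdict
from datetime import date


def score_dataset(hits, customer_domains, window_days):
    """Score a sensor dataset on assumed relevance/coverage/frequency ratios.

    hits: iterable of (sender_domain, day) pairs observed in the window.
    customer_domains: set of domains we care about.
    window_days: length of the observation window in days.
    """
    hits = list(hits)
    relevant = [(dom, day) for dom, day in hits if dom in customer_domains]

    # Distinct days on which each customer domain appeared in the data.
    days_seen = defaultdict(set)
    for dom, day in relevant:
        days_seen[dom].add(day)

    # Relevance: share of all hits that involve a customer domain.
    relevance = len(relevant) / len(hits) if hits else 0.0
    # Coverage: share of customers that show up in the data at all.
    coverage = len(days_seen) / len(customer_domains) if customer_domains else 0.0
    # Frequency: for covered customers, average share of days with data.
    frequency = (
        sum(len(days) for days in days_seen.values()) / (len(days_seen) * window_days)
        if days_seen
        else 0.0
    )
    return {"relevance": relevance, "coverage": coverage, "frequency": frequency}


customers = {"a.example", "b.example", "c.example", "d.example"}
sample_hits = [
    ("a.example", date(2017, 1, 1)),
    ("a.example", date(2017, 1, 2)),
    ("b.example", date(2017, 1, 1)),
    ("spam.example", date(2017, 1, 1)),
]
scores = score_dataset(sample_hits, customers, window_days=2)
```

Comparing two data sources then comes down to running the same scoring over the same window and customer set for each.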

MailboxPark reached parity with ThreatWave across all three metrics so quickly that we stopped using their data in production in late January 2017. The timing couldn’t have been better, as Return Path announced their acquisition of ThreatWave on January 30th, 2017.

Myth or Math: Your Choice

Since then, our esteemed colleagues at Return Path have written about the unique advantages afforded their customers by exclusive access to ThreatWave’s data. Their sales personnel are fond of cold-calling our customers and saying 250ok has been measured for a pine box to be delivered any day.

We have a saying where I grew up to describe folks who make their point that way: All hat, no horse. Let’s just cut to the chase.

Myth #1: Size matters.

Raw data is a commodity, and as we mentioned when talking about how reputation data is valued, volume doesn’t really factor into it.

Let’s say you’re driving home. Your friend’s following you back in a separate car and notices your taillight is out. How many times do they need to tell you before you understand there’s a problem? Five times? Probably not 5,000 times.

Whether you consistently send to a handful or several hundred addresses you shouldn’t, you have the same underlying list hygiene and program management problems.
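The taillight point can be sketched in a few lines: classify a sender by whether hits exist and whether they keep recurring, and deliberately ignore the raw count. The `classify_trap_signal` name, the labels, and the two-distinct-weeks threshold for “persistent” are assumptions chosen for illustration, not a rule any TNO publishes.

```python
from datetime import date


def classify_trap_signal(hit_dates, persistent_weeks=2):
    """Classify trap hits by recurrence, not by volume.

    hit_dates: list of dates on which a sender hit a trap.
    persistent_weeks: assumed threshold of distinct ISO weeks with hits.
    """
    if not hit_dates:
        return "clean"
    # Collapse hits to the distinct (ISO year, ISO week) pairs they fall in,
    # so 5 hits and 500 hits in the same week look identical.
    weeks_with_hits = {d.isocalendar()[:2] for d in hit_dates}
    if len(weeks_with_hits) >= persistent_weeks:
        return "persistent"
    return "problem"


five_hits = [date(2017, 6, 5)] * 5
five_hundred_hits = [date(2017, 6, 5)] * 500
recurring_hits = [date(2017, 6, 5), date(2017, 6, 12), date(2017, 6, 19)]

labels = [
    classify_trap_signal(five_hits),
    classify_trap_signal(five_hundred_hits),
    classify_trap_signal(recurring_hits),
]
```

Five hits and 500 hits in the same week get the same label; only the sender who keeps hitting traps week after week gets escalated.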

Myth #2: Return Path’s sensor data is a unique and proprietary resource.

Sure, it’s only available from Return Path. However, if you don’t feel like giving them any more money, we can show you how to build one yourself. If you don’t have the resources to do that, consider 250ok Reputation.

Myth #3: Return Path’s sensor data is an early warning system.

Sensor networks aren’t early warning systems. They’re exactly the same thing as a trap network, minus the consequences for trap hits. With respect to brand damage, they’re the equivalent of the tree falling in the forest with no one around to hear it.

The messages that hit both systems are sent at the same time. Further, beyond this thought exercise, our own data has shown conclusively that if you’re hitting a sensor network, you are definitely hitting trap networks as well. If it’s brand damage you’re concerned about, you’re far better served by analyzing DMARC reports. This is getting tedious, but yes… we do that too.

A hit to either type of network means it’s time to consider culling the low-to-no engagement addresses from your list. Only by continuously monitoring engagement alongside delivery and negative feedback metrics do you gain a 360-degree view of your sending health.

Your current tools (Return Path, maybe?) don’t do that? We built something amazing that does.
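For the culling step mentioned above, here is a minimal sketch of splitting a list by recency of engagement. It assumes you can already get a last-engagement date (open, click, purchase, whatever you count) per address; the `split_by_engagement` name and the 180-day cutoff are placeholders, and the right window depends on your sending cadence.

```python
from datetime import date, timedelta


def split_by_engagement(subscribers, today, max_inactive_days=180):
    """Split addresses into (keep, cull) by last engagement date.

    subscribers: dict of address -> last engagement date, or None if the
    address has never engaged.
    max_inactive_days: assumed cutoff; tune to your own program.
    """
    cutoff = today - timedelta(days=max_inactive_days)
    keep, cull = [], []
    for address, last_engaged in subscribers.items():
        if last_engaged is not None and last_engaged >= cutoff:
            keep.append(address)
        else:
            # Never engaged, or not recently enough: a candidate for removal
            # (or for a one-time re-permission message before removal).
            cull.append(address)
    return keep, cull


subscribers = {
    "active@example.com": date(2017, 5, 20),
    "lapsed@example.com": date(2016, 3, 1),
    "never@example.com": None,
}
keep, cull = split_by_engagement(subscribers, today=date(2017, 6, 6))
```

Recycled traps are, by definition, addresses that stopped engaging a long time ago, which is why a filter like this tends to remove them along with the rest of the dead weight.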

The Winner is… You!

Ten or fifteen years ago, when ISPs were locked in an arms race with spammers, it might have been fair to say that knowing how to build a sustainably-performing messaging program approached a black art.

For you, the well-intentioned professionals, the serious senders, getting email to perform isn’t rocket science, and it’s certainly not something in a dusty old book in a language understood only by a few.

Myths are fantastic stories about make-believe heroes doing amazing things.

At 250ok we bring together a huge diversity of data, process the hell out of it, present in plain and direct terms what it actually means, and make it clear what needs to be done to achieve and maintain outstanding email performance.

One sender at a time, 250ok is making everyone a hero. Isn’t that the story you want to be written about you?


Footnotes

  1. Source: https://blog.hubspot.com/agency/tools-data-complexity-marketing-technology
  2. Domains that aren’t ISPs, such as those belonging to businesses or individuals.
  3. To give anyone sending it fair warning that it is no longer in service.
  4. People and organizations in the business of speculating on and trading domain names.
  5. Again, a totally legitimate practice. Not knocking anybody.

Author: Paul Midgen

Before joining 250ok, Paul was CEO of Message Bus, ran Inbound Delivery & Anti-Spam at Hotmail, co-authored the DMARC specification, and has spent a lifetime working on things that allow machines to talk to each other for the benefit of humans.
