Peter N. M. Hansteen
It finally happened. Today, I added the three hundred thousandth (yes, 300,000th) spamtrap address to my greytrapping setup, for the most part fished out of incoming traffic here, for spammers to consume.
A little more than fifteen years after I first published a note about the public spamtrap list for my greytrapping setup in a piece called Hey, spammer! Here's a list for you!, the total number of imaginary friends has now reached three hundred thousand. I suppose that is an anniversary of sorts.
If this all sounds a bit unfamiliar, you can find a brief explanation of the data collected and the list itself on the traplist home page.
And yes, the whole thing has always been a bit absurd.
That said, in the mid noughties, when this greytrapping setup was announced, we had for some years been battling scammy spam email and malicious software that also abused email to spread, and we were eagerly looking for new ways to combat the spam problem, which tended to eat into time and resources we would rather have used on other things entirely.
With that backdrop, collecting made up or generated, invalid email addresses in our home domains from various logs as traps for spammers seemed like an excellent joke and a fun way to strike back at the undesirables who did their damnedest to flood our users' mailboxes.
The initial announcement shows the early enthusiasm, as does a followup later in the same month, Harvesting the noise while it's still fresh; SPF found potentially useful. With a small helping of scepticism towards some of the other methods and ideas that circulated at the time, of course.
The various followups (search on the site using "spam", "antispam" or for that matter "spamd" and you will find quite a few) reveal that we went to work on collecting, feeding to spamdb and publishing with a grin for quite a while.
I even gave a talk at BSDCan 2007 about the experience up to that point around the time the traplist became public.
A few years later I posted a slightly revised version of that somewhat overweight paper as a blog post called Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools that has also grown some addenda and updates over the years.
I have revisited the themes of spam and maintaining blocklists generated from the traffic that hits our site a few times over the years.
The most useful entries are probably Maintaining A Publicly Available Blacklist - Mechanisms And Principles (April 2013) and In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes In Play - A Full Recipe (May 2012), while the summary articles Badness, Enumerated by Robots (August 2018) and Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting offer some more detail on the life that includes maintaining blocklists and pass lists.
However, by the time the largest influx of new spamtraps, or imaginary friends if you will, happened during February through April of 2019 I was fresh out of ideas on how to write something entertaining and witty about the episode.
What happened was that the collection that at the time had accumulated somewhat more than fifty thousand entries, at a rate of no more than a few tens of entries per day for years, started swelling by several thousand a day, harvesting again from the greylist.
The flood went on for weeks, and forced me to introduce a bit more automation in the collecting process. I also tried repeatedly to write about the sudden influx, but failed to come up with an interesting angle and put off writing that article again and again.
As I later noted in that year's only blog entry The Year 2019 in Review: This Was, Once Again, Weirder Than the Last One, starting January 30th 2019
"I noticed via my scriptery that reports on such things that a large number of apparent bounce message deliveries to addresses made up of "Western-firstname.Chinesefirstname.lastname@example.org", such as email@example.com or firstname.lastname@example.org, had turned up, in addition to a few other varieties with no dot in the middle, possibly indicating separate sources."
The IP addresses of the sending hosts were all in Chinese address ranges, and some weeks later, in April, we had ended up harvesting at least 120 000 unique new entries of a very similar kind before the volume went down rather abruptly to roughly what it had been before the incident.
It is likely that what we were seeing was backscatter from one or more phishing campaigns targeting Chinese users where for reasons only known to the senders they had chosen addresses in our domains as faked sender addresses.
Fortunately, by the time this incident occurred I had started keeping a log of spamtraps by date added, and the actual greylist dumps generated by the blocklist generating script can be retrieved, so more detailed data can be assembled when and if someone can find the time to do so.
As I have kept repeating over the years, maintaining the spamtrap list and the blocklists sometimes turns up bizarre phenomena. Among the things that keep getting added to the spamtraps list are the products of SMTP callbacks, and another source of new variants seems to be simply shoddy data handling at the sender end. We keep seeing things that more likely than not are oddly truncated versions of existing spamtraps.
And finally, while the number of trapped hosts at any time seems to have stabilized over the last couple of years at the mid to low four digits, we seem to be seeing that low number of hosts aggressively targeting existing spamtraps, as detailed in the February 2020 sextortion article.
I have at times been astonished by what appears to be taken as useful addresses to send mail to, and I am sure the collecting and blocking activity will turn up further absurdities unheard of going forward. It is also quite possible that I have forgotten about or skipped over one or more weird episodes in the saga of the spamtraps and blocklists. I hope to be able to deliver, at odd intervals, writeups that are interesting, useful, funny -- at least one and hopefully all.
If you are interested in the issues I touch on here or if the data I accumulate would be useful in your research, please let me know via comments or email.
And yes, since I know you have been dying to ask, this is the entry, collected in the evening (CEST) of 7 September 2022, which took our population of imaginary friends over the 300 000 line:
Sep 7 19:52:18 skapet sshd: Failed password for invalid user ftpshared from 126.96.36.199 port 45876 ssh2
which by the obvious processing we do here from failed login attempt to official spamtrap becomes
Date        Source  Original   Spamtrap
2022-09-07  SSH     ftpshared  email@example.com
and joins the collection as entry number 300,000 (three hundred thousand).
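For readers who want to try something similar, here is a minimal sketch of that step from failed login attempt to spamtrap candidate. This is not the script that runs here; the function name, the regular expression (written against the one log line shown above) and the example.com placeholder domain are all assumptions made for illustration.

```python
import re

# Matches the sshd message format seen in the log line above.
# Written against that single example; other sshd versions may differ.
FAILED_LOGIN = re.compile(
    r"Failed password for invalid user (?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)

# Placeholder domain (assumption); substitute a domain you actually control.
TRAP_DOMAIN = "example.com"


def spamtrap_from_log_line(line, domain=TRAP_DOMAIN):
    """Return a spamtrap address built from the invalid user name,
    or None if the line is not a failed login for an invalid user."""
    match = FAILED_LOGIN.search(line)
    if match is None:
        return None
    return f"{match.group('user')}@{domain}"
```

Anything the function returns would still need to be checked against the list of valid addresses in the domain before it is allowed anywhere near a spamtrap list.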
By the time you read this, the total is likely to have increased yet again.
On a relevant mailing list it has been suggested that if you run a large scale email service, our list of spamtraps could be useful in filtering outgoing mail. If a customer tries to contact one of our imaginary friends, you probably need to pay extra attention to that customer.
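If you want to experiment with that idea, a rough sketch could look like the following. It assumes you have fetched the traplist as plain text with one address per line (check the actual file before relying on that), and the function names are made up for this example.

```python
def load_traplist(lines):
    """Build a set of lowercased addresses from an iterable of text lines,
    skipping blank lines and lines starting with '#'.
    The one-address-per-line layout is an assumption."""
    traps = set()
    for raw in lines:
        entry = raw.strip()
        if entry and not entry.startswith("#"):
            traps.add(entry.lower())
    return traps


def trapped_recipients(recipients, traps):
    """Return the recipients that appear in the spamtrap set."""
    return [r for r in recipients if r.strip().lower() in traps]


# Usage with made-up sample data:
traps = load_traplist(["# sample", "ftpshared@example.com", ""])
flagged = trapped_recipients(
    ["Ftpshared@Example.com", "real.person@example.net"], traps
)
```

A non-empty result does not prove the customer is a spammer, but it is a strong hint that the account deserves a closer look.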