
Finding and blocking spambots and other unwanted guests


Since I finished my 5-part E-Mail Done My Way series, I have made a few changes, some minor and some bigger (replacing fail2ban with crowdsec being one of the bigger ones). So today we will take a look at how I stay connected with the hard work my little e-mail server is doing day in and day out.

One of the things I like to know is who is trying to spam or bruteforce their way in. My defence mechanisms work reliably, but still — I am a curious nerd. So I came up with a way to find the baddies hiding in my log files.

Checking logfiles with the Power of RegEx

This is the gist of the command I run daily as a cronjob to find and block probing bots and spam bots on my mailserver (postfix/dovecot on RHEL, as explained in the series):

journalctl --since "24 hours ago" -o cat -l SYSLOG_FACILITY=2 |
  grep -E "UGFzc3dvcmQ6|not resolve|reject" |
  grep -Eo '([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})' | sort | uniq

It finds bots that try to brute-force logins, bots that show up with a fake hostname, and bots that are rejected for other valid reasons. 99% of such bots come in via IPv4 addresses, so I just ignore IPv6 for now. But keep reading, we will address this later on!

Should you wonder about that UGFzc3dvcmQ6 entry — it is the base64 encoded version of Password: ;) It shows login attempts that made it past the TLS negotiations and are now trying to randomly check user/password combinations to get access to my mailserver.
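If you want to check this yourself, here is a quick sketch. It assumes a base64 command that understands --decode, as the GNU coreutils one does:

```shell
# Encode the prompt string the way it shows up in the log ...
printf '%s' 'Password:' | base64

# ... and decode the log entry back into readable text.
printf '%s' 'UGFzc3dvcmQ6' | base64 --decode
echo   # base64 --decode prints no trailing newline
```

The first command should print UGFzc3dvcmQ6, the second one Password:.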

This is an example from my log file of such an attempt:

warning: unknown[170.239.136.25]: SASL LOGIN authentication
failed: UGFzc3dvcmQ6

That IP address is now blocked via firewall (I use nftables) for a few days :) And here is an example for the “not resolve” part of the regex:

warning: hostname 90.5.90.110.broad.fz.fj.dynamic.163data.com.cn does
not resolve to address 110.90.5.90: Name or service not known

Which was a spambot trying to flood my mailserver. It failed miserably. And an example for the “reject” part of the regex:

NOQUEUE: reject: RCPT from unknown[134.73.96.166]: 450 4.7.25 Client
host rejected: cannot find your hostname, [134.73.96.166];
from=<20446-29237-190506-2703-USER=MYDOMAIN@mail.dizziness.shop>
to=<USER@MYDOMAIN> proto=ESMTP helo=<mono.dizziness.shop>

(I obscured user and domain.) A typical spam attempt that made it quite far through my checks but was ultimately rejected. The NOQUEUE tells us the mail was refused before it was ever accepted into the queue, and the cannot find your hostname tells us it used a fake hostname.
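To see what the IPv4 pipeline does with these three kinds of entries, here is a small self-contained sketch that runs it on the sample lines from this post instead of on journalctl output. The function names sample_log and extract_ips are made up for this illustration, the third log line is shortened, and the fourth line is an invented harmless entry (using a documentation address) that the filter should skip:

```shell
# Sample log lines, joined onto single lines. The last one is a
# hypothetical harmless entry that must not show up in the result.
sample_log() {
  printf '%s\n' \
    'warning: unknown[170.239.136.25]: SASL LOGIN authentication failed: UGFzc3dvcmQ6' \
    'warning: hostname 90.5.90.110.broad.fz.fj.dynamic.163data.com.cn does not resolve to address 110.90.5.90: Name or service not known' \
    'NOQUEUE: reject: RCPT from unknown[134.73.96.166]: 450 4.7.25 Client host rejected: cannot find your hostname, [134.73.96.166];' \
    'connect from mail.example.org[192.0.2.10]'
}

# Same two grep stages as the command above; sort -u replaces sort | uniq.
extract_ips() {
  grep -E "UGFzc3dvcmQ6|not resolve|reject" |
    grep -Eo '([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})' |
    LC_ALL=C sort -u
}

sample_log | extract_ips
```

On this sample the result has four entries: 110.90.5.90, 134.73.96.166, 170.239.136.25 and 90.5.90.110. The last one is not a client address. It is the run of digits at the start of the unresolvable hostname, which also happens to look like an IPv4 address. So it is worth glancing over the list before blocking everything the pipeline prints.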

DNS checks everywhere

Doing all these DNS lookups is frowned upon by admins of big e-mail servers, as they are quite expensive and can lead to stalling connections, wasting compute resources. They typically use other checks that avoid network traffic and latency for obvious reasons.

But for small e-mail servers like mine, it is a very effective way to reduce the impact of spambots and keep my mailserver happy. Did I mention I don’t use SpamAssassin or stuff like that? Because with these DNS checks, spam is already reduced to almost zero. Nice :)

If you want to read more on how I use these DNS checks in my postfix config, scroll down to the “keep connections clean” section of https://jan.wildeboer.net/2022/08/Email-1-Postfix-2022/ in my little series on how to run your own e-mail server.
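To make the idea behind a message like “does not resolve to address” a bit more tangible, here is a rough sketch of such a check done by hand: look up the name for an IP address, then check that this name points back to the same address. Postfix does this internally; the code below is only an illustration. The function names are hypothetical and it assumes that dig is installed:

```shell
# Reverse lookup: print the PTR name(s) for an address, without trailing dot.
reverse_lookup() {
  dig +short -x "$1" | sed 's/\.$//'
}

# Forward lookup: print the A and AAAA records for a name.
forward_lookup() {
  dig +short A "$1"
  dig +short AAAA "$1"
}

# Succeeds only if at least one PTR name of the address resolves back
# to exactly that address (compared as plain text).
fcrdns_ok() {
  ip=$1
  for name in $(reverse_lookup "$ip"); do
    if forward_lookup "$name" | grep -Fxq -- "$ip"; then
      return 0
    fi
  done
  return 1
}
```

A call like fcrdns_ok 110.90.5.90 || echo "looks fishy" would then flag addresses that fail the round trip. Because the comparison is purely textual, IPv6 addresses written in different notations would need normalising first.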

But what about IPv6?

Now in practice, more than 95% of these attempts come from IPv4 addresses, which is why I have ignored IPv6 for many years. But today I was inspired enough (bored, would be more fitting ;) to finally also take care of that.

The regex needed to reliably find IPv6 addresses in log files is just slightly more complex. My sincere thanks to Farhan Khan for the IPv6 part of this one:

journalctl --since "24 hours ago" -o cat -l SYSLOG_FACILITY=2 |
  grep -E "UGFzc3dvcmQ6|not resolve|reject" |
  grep -Eo '([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})|(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))(\/((1(1[0-9]|2[0-8]))|([0-9][0-9])|([0-9])))?' |
  sort | uniq

I put all of this together in a shell script with the added luxury of accepting two parameters: -c to show counts per found IP address and -h X to limit the search to entries from the past X hours (defaults to 10). You can find that script and also one for checking the log file of sshd in this gist on my codeberg pages.
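For readers who just want to see how such a wrapper could look, here is a minimal sketch. It is not the script from the gist. The function names are made up, it only handles IPv4 to stay short, and it uses the shell's getopts builtin for the two options:

```shell
#!/bin/sh
# Hypothetical sketch of a -c / -h wrapper, not the actual script.

FILTER_RE='UGFzc3dvcmQ6|not resolve|reject'
IP4_RE='([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})'

# Reads log lines on stdin. With 1 as first argument it prints
# "count address" pairs, busiest first; otherwise unique addresses only.
summarise_ips() {
  if [ "${1:-0}" -eq 1 ]; then
    grep -E "$FILTER_RE" | grep -Eo "$IP4_RE" | LC_ALL=C sort | uniq -c | sort -rn
  else
    grep -E "$FILTER_RE" | grep -Eo "$IP4_RE" | LC_ALL=C sort -u
  fi
}

# Parses -c and -h X, then feeds the mail log of the past X hours
# to summarise_ips.
main() {
  counts=0
  hours=10   # default mentioned in the text
  OPTIND=1
  while getopts 'ch:' opt; do
    case "$opt" in
      c) counts=1 ;;
      h) hours=$OPTARG ;;
      *) echo "usage: $0 [-c] [-h HOURS]" >&2; return 2 ;;
    esac
  done
  journalctl --since "$hours hours ago" -o cat -l SYSLOG_FACILITY=2 |
    summarise_ips "$counts"
}

# A real script would end with:  main "$@"
```

Keeping the log source (journalctl) separate from the filtering makes it easy to try the filter on a saved log file first, for example with summarise_ips 1 < some-saved-log.txt.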

I run these scripts every 24 hours (so obviously with -h 24) and get the results as e-mail. I then add them manually to crowdsec and feel happy :)
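If you would rather not type each address by hand, one cautious option is to let the shell print the matching cscli commands so you can review them before running anything. This is only a sketch under a few assumptions: it relies on cscli decisions add with its --ip, --duration and --reason options, and the 72h duration, the reason text and the function name are placeholders I picked for the example:

```shell
# Placeholder ban length; override by setting BAN_DURATION beforehand.
BAN_DURATION=${BAN_DURATION:-72h}

# Reads one IP address per line on stdin and prints (does not run!)
# one cscli command per address.
print_ban_commands() {
  while IFS= read -r ip; do
    [ -n "$ip" ] || continue   # skip empty lines
    printf "cscli decisions add --ip %s --duration %s --reason 'mail log scan'\n" \
      "$ip" "$BAN_DURATION"
  done
}

# Example with one of the addresses from the log excerpts above.
echo '170.239.136.25' | print_ban_commands
```

Printing instead of executing keeps the manual review step: after a look at the list, the lines you agree with can be pasted into a root shell.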

Results

So with crowdsec and these manual interventions, what do I get? I installed pflogsumm, another relic from The Good Ole Days, that goes through the postfix logs and also mails me every day with its findings. Here’s a typical result:

[...]

message reject detail
---------------------
 RCPT
   blocked using dbl.spamhaus.org (total: 1)
          1   poa.busa.rest
   cannot find your hostname (total: 52)
         37   23.247.31.232
          6   134.73.96.166
          3   134.73.96.164
          2   134.73.96.165
          1   149.106.151.228
          1   147.78.103.91
          1   113.81.14.4
          1   92.222.130.248

[...]

As you can see — my e-mail server doesn’t get a lot of traffic ;) But a big chunk gets rejected based purely on DNS checks. And that is one of the reasons I get maybe 1-3 spam mails a week. So — job done!

Disclaimer

And the usual disclaimer for the “yes, but” crowd, eagerly waiting to punch holes in my solution in the comments: I am not really an expert on all the nuts and bolts of SMTP/IMAP. Please do share your advice, but in a friendly and respectful way, deal?

What I do have is a bit of a talent to find patterns in log files, based on a lot of experience. I share my findings in the hope that others also want to give running a mail-server a try so we can decentralise e-mail again — the way it is supposed to be :)
