Hello, ladies and gentlemen. In this article, I will tell you how to block Amazonbot on your website. It’s a highly annoying robot that can put a load on your server comparable to a small DDoS attack.

So let’s block its access to the site to prevent Amazonbot from causing problems.

Amazonbot – a harmful bot that needs to be blocked

If you wish to skip ahead to the instructions, scroll to the next heading.

One fine day, I logged into my hosting account and noticed a level of server load that exceeded all reasonable limits.

Increased server load due to Amazonbot operation
Over half of my allocated CPU time was already exhausted by morning.

Some context: I’m using Beget hosting with a daily CPU-time quota of 300 units, while a basic hosting plan typically offers only 65. A load like this could easily get a smaller website suspended by the hosting provider: hosts don’t appreciate excessive server load, and they may cut off the service if it continues.

I began investigating the issue. Initially, I suspected a hacking attempt, but the logs revealed a different picture.

Amazonbot action logs
My site was persistently scanned by Amazonbot.

The robot was crawling every page on the site, even re-crawling pages it had already visited on a second pass. This drove the server load up dramatically.

First, I decided to find out what kind of bot it was. It turned out that Amazonbot is the crawler behind Amazon’s Alexa service. Alexa may well be useful — it works as a voice assistant, plays podcasts, and reads websites aloud — but I decided that wasn’t worth the excessive load on my site.

First, I visited the bot’s page: https://developer.amazon.com/support/amazonbot. It was mentioned there that the crawler respects directives in the robots.txt file. So I added the following directives to my robots.txt file:

User-agent: Amazonbot
Disallow: /

However, this did not help; the bot continued to hammer the site. So I decided to block it at the server level through the .htaccess file, and this solution worked. The bot kept trying to access the site for a few days afterward, but it always received a 403 response.

Amazonbot action logs

A reverse DNS query confirmed that it was indeed Amazonbot, as the response came from Amazon’s servers.

Reverse DNS query to identify Amazonbot
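If you want to script this check rather than run it by hand, here is a sketch in Python. The hostname suffix below is a placeholder assumption on my part — Amazon’s Amazonbot page documents what its crawler hostnames actually look like, so verify the suffix there before relying on this:

```python
import socket

# ASSUMPTION: placeholder suffix; consult
# https://developer.amazon.com/support/amazonbot for the domain
# Amazon actually publishes for its crawler hostnames.
EXPECTED_SUFFIX = ".amazon"

def hostname_matches(hostname: str, suffix: str = EXPECTED_SUFFIX) -> bool:
    """True if a reverse-DNS hostname ends with the expected suffix."""
    return hostname.rstrip(".").lower().endswith(suffix)

def verify_crawler_ip(ip: str, suffix: str = EXPECTED_SUFFIX) -> bool:
    """Forward-confirmed reverse DNS: reverse-resolve the IP, check the
    hostname suffix, then resolve the hostname back and confirm the
    original IP is among its addresses (guards against spoofed rDNS)."""
    try:
        hostname, _aliases, _addrs = socket.gethostbyaddr(ip)
    except OSError:
        return False
    if not hostname_matches(hostname, suffix):
        return False
    try:
        return ip in socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return False
```

The forward-confirmation step matters because anyone can set a reverse-DNS record claiming to be Amazon; only the owner of the hostname can make it resolve back to the same IP.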

After implementing the block, the server load decreased, but Amazonbot still attempts to access the site. Even consistent server responses of 403 do not deter it.

Amazonbot action logs

Now, let me explain how to block Amazonbot’s access to your site.

Blocking Amazonbot: Step-by-Step Guide

So, even though the Amazonbot page claims to adhere to directives in the robots.txt file, the Disallow directive didn’t work for me. The bot continued to create a massive server load.

Cloudflare also doesn’t help against Amazonbot, because it doesn’t classify it as a harmful bot; the crawler will continue to roam the site undisturbed. Bot-blocking plugins didn’t work either, as they only react to specific bot behaviors, so I found only one reliable way to block Amazonbot.

I prohibited Amazonbot from accessing the site at the server level using the .htaccess file. You can do this with the following code:

# Flag any request whose User-Agent contains "Amazonbot" (case-insensitive)
SetEnvIfNoCase User-Agent "Amazonbot" blocked_bot
<Limit GET POST HEAD>
    Order Allow,Deny
    Allow from all
    Deny from env=blocked_bot
</Limit>
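One caveat: the Order/Allow/Deny directives above are the old Apache 2.2 syntax, which Apache 2.4 only understands when the mod_access_compat module is enabled. If the snippet fails on your server, the equivalent in the newer Require syntax should look roughly like this (reusing the same blocked_bot variable):

```apache
# Flag any request whose User-Agent contains "Amazonbot" (case-insensitive)
SetEnvIfNoCase User-Agent "Amazonbot" blocked_bot

<RequireAll>
    Require all granted
    Require not env blocked_bot
</RequireAll>
```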

This code will prevent Amazonbot from visiting the site, and each time Amazonbot attempts to access it, it will receive a server response of 403 (Forbidden).
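To make the mechanics concrete, here is a toy sketch (Python, purely illustrative — this is not how Apache works internally) of the same rule: any request whose User-Agent contains “amazonbot”, case-insensitively, gets a 403, and everyone else gets a 200:

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class BlockingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        ua = self.headers.get("User-Agent", "")
        if "amazonbot" in ua.lower():   # same test SetEnvIfNoCase performs
            self.send_response(403)
            self.end_headers()
            self.wfile.write(b"Forbidden")
        else:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"Hello")

    def log_message(self, *args):       # keep the demo quiet
        pass

def fetch_status(url, user_agent):
    """Request the URL with a given User-Agent and return the HTTP status."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code

# Start the toy server on a free local port in a background thread
server = HTTPServer(("127.0.0.1", 0), BlockingHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

print(fetch_status(f"http://127.0.0.1:{port}/", "Amazonbot/0.1"))  # 403
print(fetch_status(f"http://127.0.0.1:{port}/", "Mozilla/5.0"))    # 200
```

You can run the same kind of check against your real site with curl by overriding the User-Agent header — a blocked site should answer 403 to a matching UA.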

403 Forbidden for Amazonbot

Of course, this won’t eliminate the server load completely, but serving Amazonbot a lightweight 403 error page instead of rendering full pages reduces the load significantly.

This solution is the most reasonable one, at least better than doing nothing. You can certainly block individual Amazonbot IP addresses, but since it identifies itself, this is unnecessary. Blocking it as suggested above is simpler.

I also recommend reading articles on blocking bots and crawlers, as well as lists of harmful and useful bots. There, you will learn more about bots that should be blocked and those that should not.

If you are using a web server based on nginx, there might be some challenges, as hosting users typically cannot access the nginx configuration. However, if you have a VDS or VPS, there should be no issues.

In the /etc/nginx folder, you need to create a file, for example, badbot.conf. You can name it as you like, but make sure the name is unique, and you remember it.

Now, add the following line to this file:

if ($http_user_agent ~* "Amazonbot") { return 403; }

This way, we will return a server response of 403 to all user agents that match “Amazonbot.”
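As a side note, if you later want to block several bots, the more idiomatic nginx approach is a map directive rather than a chain of ifs. A sketch (any bot names beyond Amazonbot would be your own additions):

```nginx
# In the http {} context (e.g. a file under /etc/nginx/conf.d/):
map $http_user_agent $is_bad_bot {
    default      0;
    ~*Amazonbot  1;
    # add more user-agent patterns here as needed
}

# In the server {} context:
if ($is_bad_bot) { return 403; }
```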

The next step is to add the following directive inside your server block:

include /etc/nginx/badbot.conf;
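In context, the include goes inside the server block so that blocked requests are rejected before any other processing. The paths and names here are illustrative:

```nginx
server {
    listen 80;
    server_name example.com;

    # Return 403 to blocked user agents before serving anything
    include /etc/nginx/badbot.conf;

    root /var/www/example.com;
    index index.html;
}
```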

After that, check the configuration for syntax errors and apply it:

nginx -t

service nginx reload

A reload is enough to pick up the change; use service nginx restart only if a reload doesn’t work. Remember to run these commands as the root user (or via sudo).

That’s it. If you are on shared hosting, you will need to contact your hosting support to perform this for you.

Blocking Amazonbot is an easy task

Challenges may only arise if you are using nginx-based hosting, where the process is a bit more complicated. However, if you have a VPS or VDS, there’s nothing complex about it.

Now you know how to block Amazonbot on your website. With that, I bid you farewell and wish you success. All the best!


If the materials on this site have been useful and you would like to support the blog, you can use the form at this link: Donate to support the blog.