Replies: 6 comments 9 replies
-
Do you put anything in front of your server? I use Cloudflare, as it's the only way I can manage the bots.
-
I think some crawlers see the layered nav links as new pages to visit, which creates a massive number of "unique" links by eventually hitting every possible combination of attributes. I coded an "anti-abuse" module long ago for a client and it basically set rate limits for things like layered nav visits, number of products added to the cart, add to compare, advanced search, etc. Basically anything that was potentially resource intensive and could be abused. But the simplest protection would be something like an nginx rate limit with a reasonable "sustained" limit, for example 1 r/s with a burst of 100 to avoid accidental trips. You gotta be careful that static assets don't count towards this, though.
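A minimal sketch of what that could look like in nginx; the zone name, zone size, and the static-asset extensions are my assumptions, not a tested config:

```nginx
http {
    # One bucket per client IP, refilled at 1 request/second sustained.
    limit_req_zone $binary_remote_addr zone=perip:10m rate=1r/s;

    server {
        # Static assets are matched here first so they never count
        # against the limit.
        location ~* \.(css|js|png|jpe?g|gif|svg|ico|woff2?)$ {
            try_files $uri =404;
        }

        location / {
            # burst=100 absorbs accidental spikes; nodelay serves the
            # burst immediately instead of queueing it. Past that, the
            # client gets rejected (503 unless limit_req_status says
            # otherwise).
            limit_req zone=perip burst=100 nodelay;
            # proxy_pass / fastcgi_pass to the application goes here
        }
    }
}
```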
-
If it's not actually a malicious bot, you can set Disallow rules in robots.txt.
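For a shop with layered navigation, that might look something like the following; the filter parameter names are placeholders for the real attribute names, and only well-behaved crawlers honor this (Google and Bing support the `*` wildcard, and Google ignores `Crawl-delay`):

```text
User-agent: *
# Keep crawlers out of filtered/layered-nav URLs
Disallow: /*?*color=
Disallow: /*?*size=
Disallow: /*?*price=
Crawl-delay: 10
```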
-
The service provider assured us they have a high-performance physical firewall, but when I discussed the problem with him, he told me these IPs behave like human visitors, not crawlers or bots. They make a few accesses to the filters in a category, add something to the comparison, and that's it. The accesses are random in time. We identified 11,200 IP addresses that participated in this action. We have blocked most of them, and for the moment there are no problems.
-
I'm back with new information. Clearly, the website was harvested beforehand so that all available filters were known. Then the visits started: each IP address makes only one visit and that's it. There are tens of thousands of them, from Saudi Arabia, Indonesia, the Philippines, China, Brazil, and the United States. The fact that they come directly to the website, without any intermediate page, helped us observe the behavior and block all these addresses, but it didn't end there. We went further and blocked the entire range around each IP address to be more sure. The intensity decreased. A request generally uses 2-3 filters in the URL. We'll think about other solutions after we manage to solve this definitively. For now we use Varnish, which serves all requests of this type from the cache.
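The Varnish part could look roughly like this in VCL 4.0; the filter parameter names, backend address, and the one-hour TTL are assumptions, since serving these URLs from cache usually means stripping the cookies that would otherwise force a miss:

```vcl
vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Layered-nav URLs: drop cookies so the request is cacheable.
    if (req.url ~ "[?&](color|size|price)=") {
        unset req.http.Cookie;
    }
}

sub vcl_backend_response {
    # Make sure the backend can't opt these responses out of the cache.
    if (bereq.url ~ "[?&](color|size|price)=") {
        unset beresp.http.Set-Cookie;
        set beresp.ttl = 1h;
    }
}
```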
-
I solved the problem with a combination of conditions in .htaccess and fail2ban + ipset. After analyzing the behavior of these bots I found some common elements, so a condition was created in .htaccess that serves a 403 error, which is recorded in a log file. From that log, fail2ban collects the IP addresses and bans the whole /28 around each one (16 addresses, A.B.C.D/28). ipset takes care of keeping the addresses in a set. In two hours, 2,700 IP addresses were collected.
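To make the moving parts concrete, here is a sketch of how such a setup could be wired together. The actual .htaccess condition the author found is not shown above, so the rewrite rule below is only a placeholder, and the filter name, log path, and ban times are likewise my assumptions:

```apache
# .htaccess: serve 403 to requests matching some common element of the
# bot traffic (this query-string test is a stand-in for the real one)
RewriteEngine On
RewriteCond %{QUERY_STRING} (color|size|price)=.*(color|size|price)= [NC]
RewriteRule ^ - [F]
```

The 403s land in the Apache log, where a fail2ban filter picks them up and a custom action widens each ban to a /28 via ipset (`hash:net` masks the host bits, so `<ip>/28` covers the 16 surrounding addresses):

```ini
# /etc/fail2ban/filter.d/htaccess-403.conf (regex assumes the client IP
# leads each line, as in the common/combined log formats)
[Definition]
failregex = ^<HOST> .*" 403

# /etc/fail2ban/jail.d/htaccess-403.conf
[htaccess-403]
enabled  = true
filter   = htaccess-403
logpath  = /var/log/apache2/access.log
maxretry = 1
findtime = 3600
bantime  = 86400
action   = ipset-ban-28

# /etc/fail2ban/action.d/ipset-ban-28.conf (hypothetical custom action)
[Definition]
actionstart = ipset create bots hash:net -exist
              iptables -I INPUT -m set --match-set bots src -j DROP
actionban   = ipset add bots <ip>/28 -exist
actionunban = ipset del bots <ip>/28 -exist
actionstop  = iptables -D INPUT -m set --match-set bots src -j DROP
              ipset destroy bots
```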
-
Recently I noticed that the daily number of unique visitors in a shop I am managing exploded from 800 to 20,000. It was clear to me that the website was under attack, so I analyzed the webserver logs.
I noticed that the path to a specific category (for example ID:190) was being visited intensely. There, all the addresses started using the available filters. As there were around 24 filters in the category, a single IP address would start combining them, hence a record number of daily visits: over 200,000.
For the moment I had no better idea than to disable filtering in that category, and everything calmed down (for now). I should mention that none of the IP addresses had the signs of a bot; the fingerprint in the logs looked like a real visit. All came from Amazon, Huawei and US-based owners.
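For anyone who wants to run the same log analysis, here is a quick shell sketch against a combined-format access log; the log path and the category URL pattern are placeholders for the real ones:

```sh
# Count hits per IP on filtered URLs under the attacked category and
# list the heaviest offenders. In combined log format, $1 is the client
# IP and $7 is the request path.
awk '$7 ~ /category-190.*\?/ { print $1 }' /var/log/nginx/access.log \
  | sort | uniq -c | sort -rn | head -50
```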
We will certainly be facing AI-driven attacks on websites in the near future.