
Honeypots

Posted Feb 14, 2025 21:59 UTC (Fri) by malmedal (subscriber, #56172)
In reply to: Honeypots by mb
Parent article: Fighting the AI scraperbot scourge

> How do you identify a bot as an actual bot that only hits you once per week?

Typically because it followed a honeypot link. At that point you serve it a web page consisting only of such links.

The idea is that the bot will spread these links to other members of the botnet, so subsequent bots from other IPs will be immediately recognised and get the same treatment. Hopefully, over time, this should direct most of the botnet over to the sacrificial server and leave the real one alone.
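A minimal sketch of how such self-identifying links could work, assuming Python. The names `SECRET_KEY`, `TRAP_PREFIX`, `make_trap_path`, `is_trap_path` and `trap_page` are hypothetical, and signing the links with an HMAC is an assumption made here for illustration, not something described in this thread. The point it shows is that a request can be recognised from the URL alone, whichever IP it comes from:

```python
import hashlib
import hmac
import secrets

# Hypothetical values, chosen only for illustration.
SECRET_KEY = b"change-me"
TRAP_PREFIX = "/trap/"


def _tag(nonce: str) -> str:
    """Short keyed signature over the nonce."""
    return hmac.new(SECRET_KEY, nonce.encode(), hashlib.sha256).hexdigest()[:16]


def make_trap_path(nonce: str) -> str:
    """Build a honeypot path that can later be verified without stored state."""
    return f"{TRAP_PREFIX}{nonce}-{_tag(nonce)}"


def is_trap_path(path: str) -> bool:
    """Return True if the path is a honeypot link generated by make_trap_path."""
    if not path.startswith(TRAP_PREFIX):
        return False
    nonce, sep, tag = path[len(TRAP_PREFIX):].rpartition("-")
    if not sep or not nonce:
        return False
    # Compare as bytes so non-ASCII input cannot raise an exception.
    return hmac.compare_digest(tag.encode(), _tag(nonce).encode())


def trap_page(n_links: int = 20) -> str:
    """Return an HTML page that contains nothing but fresh honeypot links."""
    paths = [make_trap_path(secrets.token_hex(8)) for _ in range(n_links)]
    items = "\n".join(f'<li><a href="{p}">{p}</a></li>' for p in paths)
    return f"<html><body><ul>\n{items}\n</ul></body></html>"
```

A front end could call `is_trap_path(request_path)` first and, on a match, answer with `trap_page()` (or hand the connection to the sacrificial server) instead of touching the real site.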



Honeypots

Posted Feb 14, 2025 22:05 UTC (Fri) by mb (subscriber, #50428)

>at that point you give it

But it's already over. You served the request and you spent the resources.
That is the problem.
The CPU/traffic load has already happened by the time you identify the bot. And then it will basically never hit again, unless you keep a multi terabyte timeout-less database, with the risk of putting your real users into that ban database.

Honeypots

Posted Feb 14, 2025 22:27 UTC (Fri) by malmedal (subscriber, #56172)

> But it's already over.

It's not. The bot will report the links it found back to the rest of the botnet and then other bots will come for those links.

> multi terabyte timeout-less database

No database is needed.

Honeypots

Posted Feb 14, 2025 22:28 UTC (Fri) by mb (subscriber, #50428)

>and then other bots will come for those links.

And consume traffic and CPU.
Lost.

Honeypots

Posted Feb 14, 2025 23:01 UTC (Fri) by malmedal (subscriber, #56172)

> And consume traffic and CPU.

From the sacrificial server, yes. So the real one gets less load.

Honeypots

Posted Feb 14, 2025 23:15 UTC (Fri) by mb (subscriber, #50428)

>From the sacrificial server, yes

Which costs real, non-sacrificial money. Why would it cost less money than the "real" server?

This is a real problem.
It is a real problem for my machines, too.
And I really don't see a solution that is not
a) buy more resources or
b) potentially punish real users

This is a real threat to users. I am currently selecting b), because I think I can't win a).

Honeypots (and tarpits, oh my!)

Posted Feb 15, 2025 0:32 UTC (Sat) by dskoll (subscriber, #1630)

The sacrificial server can be less beefy than the real server because it doesn't have to generate real content that might involve DB lookups and such. And it can dribble out responses very slowly (like 10 bytes per second) to keep the bots connected but not tie up a whole lot of bandwidth, using something like this.
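A rough sketch of the slow-dribble idea, assuming Python's `asyncio`. This is not the tool referred to in the comment above; `dribble`, `handle`, `main`, `RESPONSE`, the listening address and the filler body are all hypothetical. At 10 bytes per second, a 1 KiB response keeps a connection open for roughly 102 seconds, and 1,000 concurrent connections add up to only about 10 kB/s of outbound traffic:

```python
import asyncio

BYTES_PER_SECOND = 10  # rate taken from the comment above

# Hypothetical filler response; a real tarpit might serve honeypot links here.
RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html><body>" + b"<p>loading</p>" * 50 + b"</body></html>"
)


async def dribble(writer, payload: bytes,
                  bytes_per_second: float = BYTES_PER_SECOND,
                  chunk_size: int = 1) -> None:
    """Send payload in small chunks, sleeping so the average rate is respected."""
    delay = chunk_size / bytes_per_second
    for i in range(0, len(payload), chunk_size):
        writer.write(payload[i:i + chunk_size])
        await writer.drain()
        await asyncio.sleep(delay)


async def handle(reader, writer) -> None:
    """Read the request line, ignore it, and answer very slowly."""
    try:
        await reader.readline()
        await dribble(writer, RESPONSE)
    except ConnectionError:
        pass  # the client gave up; nothing to clean up beyond closing
    finally:
        writer.close()


async def main(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Serve the tarpit until cancelled. Host and port are placeholders."""
    server = await asyncio.start_server(handle, host, port)
    async with server:
        await server.serve_forever()

# To try it locally: asyncio.run(main())
```

Each held connection costs the server one sleeping coroutine rather than a thread or a database query, which is why a setup like this can run on much smaller hardware than the real site.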

Honeypots (and tarpits, oh my!)

Posted Feb 15, 2025 0:49 UTC (Sat) by mb (subscriber, #50428)

If the "sacrificial servers" don't exhaust the bots, then the bots will just go back to the real servers.
Bot administrators are not stupid. Bots are optimized for maximal throughput, no matter what.

Honeypots (and tarpits, oh my!)

Posted Feb 15, 2025 1:58 UTC (Sat) by dskoll (subscriber, #1630)

Yes, sure, but you might be able to tie some of them up in the tar pit for a while. Ultimately, a site cannot defend against a DDoS on its own; it has to rely on its upstream provider(s) to do their part.

My reply was for the OP who asked how the sacrificial server could be run more cheaply than the real server.

Honeypots (and tarpits, oh my!)

Posted Feb 15, 2025 10:28 UTC (Sat) by malmedal (subscriber, #56172)

> If the "sacrificial servers" don't exhaust the bots, then the bots will just go back to the real servers.

Yes, obviously. That's why I called this a "mitigation", not a "cure".

LWN does not want to do things like CAPTCHAs, JS challenges, or putting everything behind a login; can you think of a better approach while adhering to the stated constraints?

Honeypots (and tarpits, oh my!)

Posted Feb 15, 2025 10:35 UTC (Sat) by mb (subscriber, #50428)

>can you think of a better approach while adhering to the stated constraints?

No. That was my original point.

Honeypots (and tarpits, oh my!)

Posted Feb 15, 2025 10:53 UTC (Sat) by malmedal (subscriber, #56172)

Then I don't understand what we are quarreling about. I think a sacrificial server is going to be a cheaper solution than expanding real capacity, and I don't see a third option.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds