this post was submitted on 08 Apr 2026
10 points (100.0% liked)

Forgejo


This is a community dedicated to Forgejo.


My instance is getting pummeled by scrapers crawling nonsense. For example, issue and pull request searches with every possible combination of labels.

Everything's coming from a shitload of different residential IPs at a very fast cadence.

There's just not that much content on my instance to warrant this traffic. At this request rate, a legitimate crawler could have scraped all of it in a minute or two.

[–] Kissaki@programming.dev 8 points 15 hours ago (2 children)

Possibly AI company crawlers. When they came up there was a lot of bad publicity and reports of actively malicious and toxic crawling behavior, including ban evasion.

You could lock some URL paths behind valid login sessions, or use a proof-of-work proxy guard.

Anubis is the popular tool for that. I've seen maybe three alternatives, one of which is from Cloudflare.
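For anyone unfamiliar with the proof-of-work idea, here is a rough hashcash-style sketch in Python. It is not how Anubis or any other specific tool is implemented; the function names, the `challenge:nonce` format, and the difficulty value are all assumptions made up for illustration. The point is only that the client has to burn CPU to find a nonce, while the server can check it with a single hash.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side: issue a random, unpredictable challenge string."""
    return secrets.token_hex(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one SHA-256 call decides whether the work was done."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce (about 2**difficulty tries on average)."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, difficulty=16)  # hypothetical difficulty
    print("accepted:", verify(challenge, nonce, difficulty=16))
```

A single visitor barely notices the cost, but a crawler requesting thousands of pages from thousands of IPs has to pay it for every fresh challenge.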

See also the related Codeberg ticket (Codeberg is a Forgejo instance): https://codeberg.org/forgejo/discussions/issues/319

If you search, you can find various blog posts about these issues. Not just with Forgejo.
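As for the first suggestion above (locking some URL paths behind login), the decision logic could look roughly like the sketch below. It assumes issue and pull listings live at `/<owner>/<repo>/issues` and `/<owner>/<repo>/pulls` and that filters arrive as query parameters; check your own access logs before relying on that. `needs_login` and the example paths are hypothetical, and in practice you would express the same rule in your reverse proxy or middleware.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumed URL shape: /<owner>/<repo>/issues or /<owner>/<repo>/pulls
LISTING_PATH = re.compile(r"^/[^/]+/[^/]+/(issues|pulls)/?$")


def needs_login(url: str, has_valid_session: bool) -> bool:
    """Return True when an anonymous request should be sent to the login page."""
    if has_valid_session:
        return False
    parts = urlsplit(url)
    if not LISTING_PATH.match(parts.path):
        return False
    # Any query string on a listing page is treated as a search or filter.
    return bool(parse_qs(parts.query, keep_blank_values=True))


if __name__ == "__main__":
    # Hypothetical requests for illustration
    print(needs_login("/alice/repo/issues?labels=1,2", False))
    print(needs_login("/alice/repo/issues/12", False))
```

Plain listing pages and individual issues stay public under this rule; only the filtered searches, which are what explode combinatorially, require a session.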

[–] treadful@lemmy.zip 4 points 15 hours ago

> Possibly AI company crawlers. When they came up there was a lot of bad publicity and reports of actively malicious and toxic crawling behavior, including ban evasion.

That was kind of what I was thinking, but if that's true, they're wasting so much bandwidth and compute. Going through every combination of issue labels does not get them any useful code to hoover up. They could've just cloned my repos and been done with it.

> You can think about locking some url paths behind valid login sessions, or use a proof of work proxy guard.
>
> Anubis is the popular tool for that. I’ve seen maybe three alternatives, one of which from Cloudflare.

Really don't want to use Cloudflare, but Anubis is interesting. If I can't shake these bots, maybe I'll consider it. Thanks.

[–] Eezyville@sh.itjust.works 1 points 14 hours ago

If you think it's AI crawlers, then maybe you can get another AI to write bad code and poison their training data.