this post was submitted on 03 Dec 2025
3 points (80.0% liked)

Meta (slrpnk.net)


Here we can discuss anything about this Lemmy instance/server itself.

Our XMPP support chat: Movim or XMPP client.

Please also refer to our Wiki


I wondered about the robots.txt

I can see the case for it, but I could also see the case for allowing at least Google to index the site.

Has there been some discussion about this previously?

[–] poVoq@slrpnk.net 7 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

At this point we try to block pretty much everything even remotely related to AI companies.

Soon we will probably have to block Chrome browsers, once AI companies start using them to scrape websites without their users knowing. (Yes, that is why AI companies started to make their own browsers, and Mozilla is planning the same, proudly proclaiming how "stealthy" they can be with that.)

Google search results have become so useless that I see little point left in trying to accommodate their search bot 🤷

Yes, I am bitter and can't wait for the AI bubble to pop.

[–] sam_uk@slrpnk.net 2 points 3 weeks ago (1 children)

It's any day now, I think. EU pension funds are moving out: https://www.removepaywall.com/search?url=https%3A%2F%2Fwww.ft.com%2Fcontent%2F9d90d557-48e5-4f4b-a927-88071cef8ea9

Would you be up for re-enabling Google indexing? It is crappy, but still..

[–] poVoq@slrpnk.net 1 points 3 weeks ago (1 children)

Not very motivated, but I can look into it.

[–] sam_uk@slrpnk.net 1 points 3 weeks ago (1 children)

I think it would just be:

# Block every crawler by default
User-agent: *
Disallow: /

# Make an exception for Google's search crawler
User-agent: Googlebot
Allow: /
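
A quick way to sanity-check rules like these is to feed them to a robots.txt parser. The following is only a sketch using Python's standard `urllib.robotparser`. The URL and the `SomeOtherBot` agent name are illustrative assumptions, and the result only reflects how this particular parser reads the file, not how any real crawler will behave.

```python
from urllib.robotparser import RobotFileParser

# The rules proposed in the comment above, with a blank line between groups.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if this parser thinks user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative URL; any path on the site is covered by these rules.
    url = "https://slrpnk.net/c/meta"
    # "SomeOtherBot" is a made-up agent standing in for any non-listed crawler.
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, allowed(agent, url))
```

Keep in mind that robots.txt is advisory: it only affects crawlers that choose to honor it.
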
[–] poVoq@slrpnk.net 1 points 3 weeks ago* (last edited 3 weeks ago)

OK, I tried to allow-list some search engine spiders in the robots.txt. However, they will probably still just run into the AI scraper block if they act too shady.
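
For illustration, an allow-list style robots.txt could be generated along these lines. This is a hypothetical sketch, not the instance's actual configuration: the `build_robots_txt` helper and the agent names in the usage example are assumptions, and a separate server-side scraper block would still apply regardless of what the file says.

```python
def build_robots_txt(allowed_agents: list[str]) -> str:
    """Build a robots.txt that disallows everyone except the listed agents."""
    # Default group: every crawler is asked to stay out.
    groups = ["User-agent: *\nDisallow: /"]
    # One explicit group per allow-listed crawler.
    for agent in allowed_agents:
        groups.append(f"User-agent: {agent}\nAllow: /")
    # Groups are separated by a blank line; the file ends with a newline.
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    # Example agent names only; the real allow-list is not given in the thread.
    print(build_robots_txt(["Googlebot", "Bingbot"]))
```
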

But honestly, I highly doubt we will get much traffic from Google search. It's completely gone to shit these days.

[–] Nemo@slrpnk.net 1 points 3 weeks ago

There has, and IIRC it's to help prevent scraping.