this post was submitted on 19 Aug 2025
818 points (99.0% liked)

Technology

[–] poopkins@lemmy.world -1 points 1 week ago* (last edited 1 week ago) (2 children)

I've developed my own agent for assisting me with researching a topic I'm passionate about, and I ran into the exact same barrier: Cloudflare intercepts my request and is clearly checking if I'm a human using a web browser. (For my network requests, I've defined my own user agent.)

So I use that as a signal that the website doesn't want automated tools scraping their data. That's fine with me: my agent just tells me that there might be interesting content on the site and gives me a deep link. I can extract the data and carry on my research on my own.
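
A minimal sketch of that "treat the challenge as a no" fallback, not the actual agent described above. The names `FetchOutcome`, `looks_like_challenge`, and `handle_response` are hypothetical. It relies on the `cf-mitigated: challenge` response header that Cloudflare documents for challenge pages; the extra `Server: cloudflare` plus 403/503 check is an assumption added for illustration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class FetchOutcome:
    content: Optional[str]    # page body when automated access was allowed
    deep_link: Optional[str]  # URL handed back to the human when it was not


def looks_like_challenge(status: int, headers: Mapping[str, str]) -> bool:
    """Heuristic: is this response a bot challenge rather than the page itself?"""
    normalized = {k.lower(): v.lower() for k, v in headers.items()}
    # Cloudflare sets `cf-mitigated: challenge` on challenge-page responses.
    if normalized.get("cf-mitigated") == "challenge":
        return True
    # Assumption: a 403/503 served by Cloudflare is also treated as a block.
    return status in (403, 503) and "cloudflare" in normalized.get("server", "")


def handle_response(
    url: str, status: int, headers: Mapping[str, str], body: str
) -> FetchOutcome:
    """Back off when challenged: return a deep link instead of scraped content."""
    if looks_like_challenge(status, headers):
        return FetchOutcome(content=None, deep_link=url)
    return FetchOutcome(content=body, deep_link=None)
```

The function takes an already-received status, headers, and body, so it can sit behind whatever HTTP client the agent uses. It makes no attempt to solve or bypass the challenge; the only outcome of a challenge is a link for the human to follow.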

I completely understand where Perplexity is coming from, but at scale, implementations like ~~this~~ Perplexity's are awful for the web.

(Edited for clarity)

[–] xxce2AAb@feddit.dk -2 points 1 week ago

Ooh, that's tough, sweetheart. If the owners of those servers want you to visit, they'll just choose a WAF other than CF's.

All zero of them.

[–] FauxLiving@lemmy.world -5 points 1 week ago* (last edited 1 week ago) (21 children)

The number of people just reacting to the headline in the comments on these kinds of articles is always surprising.

Your browser acts as an agent too; you don’t manually visit every script link, image source and CSS file. Everyone has experienced how annoying it is to have your browser be targeted by Cloudflare.

There’s a pretty major difference between a human user loading a page and having it summarized and a bot that is scraping 1500 pages/second.

Cheering for Cloudflare to be the arbiter of what technologies are allowed is incredibly shortsighted. They exist to provide their clients with services, including bot mitigation. But a user-initiated operation isn’t the same as a bot.

Which is the point of the article and the article’s title.

It isn’t clear why OP had to alter the headline to bait the anti-AI crowd.

[–] unpossum@sh.itjust.works -2 points 1 week ago (1 children)

Thank you for trying to fight the irrational anti-AI brainrot on Lemmy! It’s probably a lost cause, but your efforts are appreciated :)
