LLMs can't do math well. Add in the need to understand the question before doing the math, and it might work better than you think.
Scrapers aren't using the LLM to scrape. They just gather data the old-fashioned way, by spoofing a web browser. Then the LLM can use that data, but that step comes later.
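For anyone wondering what "spoofing a web browser" looks like in practice, here is a rough sketch using only the Python standard library. The URL, the User-Agent string, and the function name are placeholders made up for illustration, and the request is only built here, never sent.

```python
import urllib.request

# Hypothetical browser identity, chosen only for illustration.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0"


def build_browser_like_request(url):
    """Build a GET request whose headers resemble a desktop browser's."""
    headers = {
        "User-Agent": BROWSER_UA,
        "Accept": "text/html,application/xhtml+xml",
        "Accept-Language": "en-US,en;q=0.5",
    }
    # No data argument is given, so urllib treats this as a GET.
    return urllib.request.Request(url, headers=headers)


# example.com is a placeholder target; nothing is fetched here.
req = build_browser_like_request("https://example.com/")
```

From the server's side, a request like this carries the same headers a real browser would send, and no LLM is involved at any point.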
Also, modern LLMs often have tool APIs available to them, which will likely include a calculator. So even if an LLM is reading a page directly, it likely won't be flummoxed by math problems.
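As a rough illustration of the calculator-tool idea, here is a minimal sketch of what the tool side might look like. The names `calculator_tool`, `TOOLS`, and `handle_tool_call` are hypothetical and not taken from any real LLM API; the assumption is simply that the model emits a tool name plus an expression string and ordinary code does the arithmetic.

```python
import ast
import operator

# Only plain arithmetic is allowed; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def calculator_tool(expression):
    """Evaluate a basic arithmetic expression without using eval()."""

    def evaluate(node):
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](evaluate(node.operand))
        raise ValueError("unsupported expression")

    return evaluate(ast.parse(expression, mode="eval"))


# Hypothetical tool registry: the model names a tool, this code runs it.
TOOLS = {"calculator": calculator_tool}


def handle_tool_call(name, argument):
    """Dispatch a tool call by name and return the tool's result."""
    return TOOLS[name](argument)
```

The point is that the model only has to translate the question into an expression; the arithmetic itself is done by regular code.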
By making things worse, I was referring to the fact that AI centers already require too much energy.
It's not a perfect solution by any means. It doesn't protect user data. It doesn't do anything to help with the energy problem. It merely makes it possible for someone to run their server without getting taken offline by automated systems.
Anything you do to inhibit LLM scrapers is by definition going to cost more energy in the short term. The idea is to drive them away by making it too costly. And realistically, in the short term, the only thing you can do to make AI farms use less energy is to have their maintainers turn them off. I'm not aware of anything we can do to make that happen.
The energy being spent on web scraping is a fraction of a percent of the energy costs to train an LLM. It's a negligible increase.
This process is happening before the LLM is involved; it's probably a standard Python-based script.