this post was submitted on 27 Jul 2025

Technology




Another AI fail. Letting AI write code and modify your file system without sandboxing or backups. What could go wrong?

[–] megopie@beehaw.org 7 points 5 days ago* (last edited 5 days ago)

It’s insane to me that people are actually trying to get these LLMs to do things, let alone outside of an experimental setting. Like, it’s a non-starter at a fundamental conceptual level.

It reminds me of an experiment where they had a few of these models try to run simulated vending machines.

It was pretty clear from the results that none of the LLMs were capable of consistently performing basic tasks. They routinely introduced irrelevant or incorrect information that would derail things: ordering nonexistent products, assuming capabilities they were never given, and generally failing to recall information properly or keep the values they were given consistent. Some of the failures were quite spectacular, ranging from a model insisting it had gone bankrupt and trying to sell the vending machine, to threatening to nuke suppliers and trying to contact the FBI.