this post was submitted on 29 Aug 2025
491 points (98.4% liked)
Technology
Can someone who understands this better explain to me how this thing actually places the order into whatever POS they use? Like if LLMs are just advanced auto-complete, I get how they can do "fuzzy" tasks like answering questions or carrying on a conversation, but how do they do rigid tasks like entering the tacos into whatever system the cash register and kitchen use?
The LLM isn't limited to just what it does. It can interact with other programs.
There are a ton of audio recognition systems available, and almost all of them predate this LLM bubble. There's already an API for interacting with the ordering system. So it's just down to having the LLM pull out what was ordered, then do the corresponding action for the order.
This is so simple it doesn't require anything nearly as complicated as an LLM. The old phone assistants like Siri and Alexa could do this type of thing. It's literally the same as telling Alexa to place an order for something, and that's been an ability for years.
So the output from the LLM is just a text description that's fed into another, smarter piece of software that interprets that text into an order? What task is the LLM actually doing in this case?
The LLM is taking the order: interpreting what people say into that simple text description. Not everyone talks the same or describes things the same. That, I believe, is where the LLM is doing the bulk of the work. Then I'm sure there is some background stock management and health checks it manages as well.
What's wrong with an input machine with buttons or touch screen?
Takes too long to hold down the button for 18,000 waters.
OT4G
(Order Time For Grandma)
Not futuristic enough or something.
They are not able to answer questions or change simply via a software update.
We have apps for that, and they're typically a pita. They certainly take longer than just talking through your order.
I don't think there is an LLM in this application. Not all AI tools involve an LLM.
I think the role of the LLM is just to make the system understand the order more accurately.
Probably something like this. Except not trained to be a rebellious troll. Part of her training set is his chat, hehe. Though despite this one being "evil" neuro, I think normal neurosama is more of a troll now, lol.
https://youtu.be/AFtryxMDJQs
This is clipped segments from a live stream, so it jumps ahead at times. It has links to the source channel if you would prefer a full video. This one is probably already too long for most people though.
He does end up figuring out why she has so much trouble correctly inserting code in the right places later.
Edit: also, every time she says "filtered", it means whatever she was gonna say would have broken YouTube or Twitch rules. He has two filters, one on the generated text and one on the text to speech. If the text one catches it, it just outputs "filtered" instead. If the speech one catches it, she'll still type something terrible, but only say roughly the first syllable or two before the speech is cut off.
It's just an API.
There are a few ways they could go about it. They could have part of the prompt be something like "when the customer is done placing their order, create a JSON file with the order contents" and set up what is essentially a dumb register that looks for those files and adds each order like a standard POS would.
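A rough sketch of that "dumb register" idea, assuming the model is told to emit an object with an `items` list. The menu, the field names (`name`, `quantity`), and every function here are made up for illustration, not taken from any real POS:

```python
import json
from pathlib import Path

# Hypothetical menu: item name -> price in cents. Not from any real POS.
MENU = {"taco": 199, "burrito": 499, "water": 0}


def parse_order(json_text):
    """Turn the model's JSON text into a list of (item, quantity) pairs.

    Raises ValueError if the text is not the shape this sketch expects.
    """
    try:
        data = json.loads(json_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or not isinstance(data.get("items"), list):
        raise ValueError("expected an object with an 'items' list")

    lines = []
    for entry in data["items"]:
        if not isinstance(entry, dict):
            raise ValueError("each item must be an object")
        name = entry.get("name")
        qty = entry.get("quantity")
        if not isinstance(name, str) or name not in MENU:
            raise ValueError(f"unknown item: {name!r}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(qty, bool) or not isinstance(qty, int) or qty < 1:
            raise ValueError(f"bad quantity for {name}: {qty!r}")
        lines.append((name, qty))
    return lines


def order_total(lines):
    """Total price in cents for a parsed order."""
    return sum(MENU[name] * qty for name, qty in lines)


def process_order_dir(directory):
    """Read every *.json file in a directory, as the dumb register would.

    Returns {filename: parsed lines or the ValueError that rejected it}.
    """
    results = {}
    for path in sorted(Path(directory).glob("*.json")):
        try:
            results[path.name] = parse_order(path.read_text(encoding="utf-8"))
        except ValueError as exc:
            results[path.name] = exc
    return results
```

The point of the strict parsing is that the register never trusts the model: anything that isn't on the menu or isn't a positive whole number gets rejected instead of rung up.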
They could spell out a tutorial in the prompt, like "to order a number 6 meal, type 'system.order.meal(6)'", calling the same functions that a POS system would, and have that output go right to a terminal.
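Something like this is how I'd picture that second option, minus piping raw model output into a terminal. It's only a sketch: `FakePOS`, the `ALLOWED` set, and the exact command format are my assumptions, based on the `system.order.meal(6)` example above.

```python
import re


class FakePOS:
    """Stand-in for the real POS functions; entirely hypothetical."""

    def __init__(self):
        self.lines = []

    def meal(self, number):
        # Record the numbered meal on the current order.
        self.lines.append(("meal", number))


# Matches text like system.order.meal(6): a method name and one integer.
COMMAND = re.compile(r"system\.order\.(\w+)\((\d+)\)")

# Only these method names may be called, whatever the model prints.
ALLOWED = {"meal"}


def run_commands(model_output, pos):
    """Find command-like strings in the model's text and call the POS.

    Returns the number of commands that were actually executed.
    Anything not on the allow-list is silently skipped.
    """
    executed = 0
    for name, arg in COMMAND.findall(model_output):
        if name not in ALLOWED:
            continue
        getattr(pos, name)(int(arg))
        executed += 1
    return executed
```

Usage would look like `run_commands("Sure! system.order.meal(6)", FakePOS())`. Matching a fixed pattern and checking an allow-list means the model's text is never evaluated as code, so a customer talking it into printing some other command does nothing.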
They could have their POS system be open on an internal screen, and have a model that can process images, and have it specify a coordinate pair, to simulate a touch screen, and make it manually enter an order that way as an employee would.
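For the touch screen route, the glue code mostly has to turn the model's coordinate pair into a button press. A toy version, where the screen layout, the button names, and the "x,y" reply format are all invented for the example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Button:
    label: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px, py):
        """True if the point (px, py) falls inside this button."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


# Hypothetical 400x200 POS screen layout.
SCREEN = [
    Button("Taco", 0, 0, 200, 100),
    Button("Burrito", 200, 0, 200, 100),
    Button("Pay", 0, 100, 400, 100),
]


def parse_tap(text):
    """Parse an 'x,y' reply from the model; return (x, y) or None."""
    parts = text.strip().split(",")
    if len(parts) != 2:
        return None
    try:
        return int(parts[0]), int(parts[1])
    except ValueError:
        return None


def button_at(tap):
    """Return the label of the button under the tap, or None if it misses."""
    if tap is None:
        return None
    for button in SCREEN:
        if button.contains(*tap):
            return button.label
    return None
```

A tap that misses every button, or a reply that isn't a coordinate pair at all, just comes back as `None`, which is where a real system would have to decide whether to retry or hand off to an employee.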
There's lots of ways to hook up the AI, and it's not actually that different from hooking up a normal POS system in the first place, although just because one method does allow an AI to interact doesn't mean it'll go about it correctly.
LLMs, with a little coaxing, perform well at returning well formed JSON.
They do, my concern is more about if that JSON is correct, not just well-formed.
Also, 18000 waters might be correct JSON, but makes an AI a bad cashier.
There is a lot more that goes into it than just being correct. 18000 waters may have been the actual order, because somebody decided to screw with the machine. A human who isn't terminally autistic would reliably interpret that as a joke and would simply refuse to punch that in. The LLM will likely do what a human tells it to do, since it has no contextual awareness, it only has the system prompt and whatever interaction with the user it had so far.
That's part of correctness to me. Delivering an order that Taco Bell actually would make is important.
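That kind of check doesn't have to live in the LLM at all. A minimal sketch of a sanity pass that sits between the model and the kitchen, with limits I picked arbitrarily (a real store would choose its own):

```python
# Assumed limits, chosen only for illustration.
MAX_PER_ITEM = 20
MAX_TOTAL_ITEMS = 50


def review_order(lines):
    """Check a list of (item, quantity) pairs against simple limits.

    Returns a list of human-readable problems. An empty list means the
    order can go through; anything else should go to a human.
    """
    problems = []
    total = 0
    for name, qty in lines:
        if qty < 1:
            problems.append(f"{name}: quantity must be at least 1")
        elif qty > MAX_PER_ITEM:
            problems.append(f"{name}: {qty} exceeds the per-item limit of {MAX_PER_ITEM}")
        total += max(qty, 0)
    if total > MAX_TOTAL_ITEMS:
        problems.append(f"order has {total} items, limit is {MAX_TOTAL_ITEMS}")
    return problems
```

With this, `review_order([("water", 18000)])` comes back with complaints no matter how well-formed the JSON was, and the order gets kicked to an employee instead of the kitchen.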
Semantics aside, though, we agree. That's very important.
So they just trim the instructions so it doesn't take joke orders, so it can make more reasonable decisions, like:
"May I take your order?"
"Two double whoppers with extra mayo and a chocolate cherry banana sundae"
"Oh you've GOTTA be joking!"
It's trivial to get LLMs to act against their instructions.