this post was submitted on 29 Aug 2025

846 points (98.6% liked)

Taco Bell rethinks AI drive-through after man orders 18,000 waters (www.bbc.com)

submitted 7 months ago by Davriellelouna@lemmy.world to c/technology@lemmy.world

210 comments

[–] ch00f@lemmy.world 15 points 7 months ago* (last edited 7 months ago) (3 children)

Can someone who understands this better explain to me how this thing actually places the order into whatever POS they use? Like if LLMs are just advanced auto-complete, I get how they can do "fuzzy" tasks like answering questions or carrying on a conversation, but how do they do rigid tasks like entering the tacos into whatever system the cash register and kitchen use?

[–] halcyoncmdr@lemmy.world 41 points 7 months ago (2 children)

The LLM isn't limited to generating text. It can interact with other programs.

There are a ton of audio recognition systems available, and almost all of them predate this LLM bubble. There's already an API for interacting with the ordering system. So it's just down to having the LLM pull out what was ordered and then perform the corresponding action in the ordering system.

This is so simple it doesn't require anything nearly as complicated as an LLM. The old phone assistants like Siri and Alexa could do this type of thing. It's literally the same as telling Alexa to place an order for something, and that's been an ability for years.
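As a rough illustration of the "doesn't need an LLM" point, here is a minimal rule-based sketch that turns a speech-to-text transcript into item codes and quantities. The menu, item codes, number words, and the `parse_order` name are all made up for the example; nothing here reflects Taco Bell's actual system.

```python
import re

# Hypothetical menu and number words; a real system would load these from the POS.
MENU = {"taco": "TACO", "burrito": "BURRITO", "water": "WATER"}
NUMBER_WORDS = {"a": 1, "an": 1, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5}


def parse_order(transcript: str) -> dict[str, int]:
    """Extract {item_code: quantity} from a transcript using keyword matching."""
    # Drop commas so "18,000" is read as the single number 18000.
    text = transcript.lower().replace(",", "")
    tokens = re.findall(r"[a-z0-9]+", text)
    order: dict[str, int] = {}
    for i, tok in enumerate(tokens):
        # Accept simple plurals such as "tacos" by stripping trailing "s".
        name = tok if tok in MENU else tok.rstrip("s")
        if name not in MENU:
            continue
        prev = tokens[i - 1] if i > 0 else ""
        if prev.isdigit():
            qty = int(prev)
        else:
            # Fall back to 1 when no quantity word precedes the item.
            qty = NUMBER_WORDS.get(prev, 1)
        code = MENU[name]
        order[code] = order.get(code, 0) + qty
    return order


print(parse_order("Can I get two tacos and a water?"))
```

A parser like this only handles phrasings it was written for, which is presumably the gap an LLM is meant to cover. It also happily accepts "18,000 waters", so it needs the same sanity checks as any other front end.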

[–] ch00f@lemmy.world 11 points 7 months ago (2 children)

So the output from the LLM is just a text description that's fed into another, smarter piece of software that interprets that text into an order? What task is the LLM actually doing in this case?

[–] Dashi@lemmy.world 15 points 7 months ago (1 children)

The LLM is taking the order, interpreting what people say into that simple text description. Not everyone talks the same or describes things the same. That is, I believe, where the LLM is doing the bulk of the work. Then I'm sure there is some background stock management and health checks it manages as well.

[–] Vanth@reddthat.com -1 points 7 months ago

I don't think there is an LLM in this application. Not all AI tools involve an LLM.

[–] danc4498@lemmy.world 1 points 7 months ago

I think the role of the LLM is just to make the system understand the order more accurately.

[–] Khanzarate@lemmy.world 18 points 7 months ago (1 children)

It's just an API.

There's a few ways they could go about it. They could have part of the prompt be something like "when the customer is done placing their order, create a JSON file with the order contents" and set up what is essentially a dumb register that looks for those files and adds the order like a standard POS would.
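A minimal sketch of the "dumb register" side of that first approach, assuming the model was prompted to emit JSON shaped like `{"items": [{"item": "...", "quantity": N}]}`. That shape and the `read_order` name are assumptions made up for illustration, not a known format.

```python
import json


def read_order(raw: str) -> list[dict]:
    """Parse model output into a list of {"item": str, "quantity": int} lines.

    Raises ValueError if the text is not JSON or does not have the assumed shape.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc

    items = data.get("items") if isinstance(data, dict) else None
    if not isinstance(items, list):
        raise ValueError('expected an object with an "items" list')

    lines = []
    for entry in items:
        if not isinstance(entry, dict):
            raise ValueError(f"order line is not an object: {entry!r}")
        item = entry.get("item")
        qty = entry.get("quantity")
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(item, str) or not isinstance(qty, int) or isinstance(qty, bool):
            raise ValueError(f"malformed order line: {entry!r}")
        lines.append({"item": item, "quantity": qty})
    return lines


print(read_order('{"items": [{"item": "taco", "quantity": 2}]}'))
```

The register never trusts the model's text directly: anything that does not parse or does not match the expected shape is rejected before it reaches the kitchen.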

They could spell out a tutorial in the prompt, such as "to order a number 6 meal, type system.order.meal(6)", calling the same functions that a POS system would, and have that output go right to a terminal.
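A sketch of how that second approach could be wired up without passing model output to a real terminal or `eval`. The `Register` class, its `meal` and `drink` methods, and the allow-list are hypothetical; only the `system.order.meal(6)` syntax comes from the comment above.

```python
import re


class Register:
    """Stand-in for the POS functions the prompt tells the model about."""

    def __init__(self) -> None:
        self.lines: list[tuple[str, int]] = []

    def meal(self, number: int) -> None:
        self.lines.append(("meal", number))

    def drink(self, number: int) -> None:
        self.lines.append(("drink", number))


# Only these method names may be invoked by the model.
ALLOWED = {"meal", "drink"}
# Matches exactly "system.order.<name>(<digits>)" and nothing else.
CALL = re.compile(r"system\.order\.(\w+)\((\d+)\)")


def dispatch(line: str, register: Register) -> bool:
    """Run one model-emitted call if it matches the allow-list; return success."""
    match = CALL.fullmatch(line.strip())
    if match is None:
        return False
    name, arg = match.groups()
    if name not in ALLOWED:
        return False
    getattr(register, name)(int(arg))
    return True


register = Register()
print(dispatch("system.order.meal(6)", register), register.lines)
```

Parsing the text against a fixed pattern, rather than executing it, means a confused or manipulated model can at worst produce a line that gets ignored.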

They could have their POS system be open on an internal screen, and have a model that can process images, and have it specify a coordinate pair, to simulate a touch screen, and make it manually enter an order that way as an employee would.

There's lots of ways to hook up the AI, and it's not actually that different from hooking up a normal POS system in the first place, although just because one method does allow an AI to interact doesn't mean it'll go about it correctly.

[–] BootLoop@sh.itjust.works 6 points 7 months ago (1 children)

LLMs, with a little coaxing, perform well at returning well-formed JSON.

[–] Khanzarate@lemmy.world 11 points 7 months ago (1 children)

They do, my concern is more about if that JSON is correct, not just well-formed.

Also, 18000 waters might be correct JSON, but makes an AI a bad cashier.
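To make the well-formed versus correct distinction concrete, here is a small sketch of a plausibility check that runs after the JSON has already parsed. The menu, the per-item limit of 25, and the `check_order` name are arbitrary assumptions; a real limit would come from whoever runs the store.

```python
# Hypothetical menu and per-item cap, chosen only for illustration.
MENU = {"taco", "burrito", "water"}
MAX_PER_ITEM = 25


def check_order(lines: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means the order looks plausible."""
    problems = []
    for line in lines:
        item, qty = line["item"], line["quantity"]
        if item not in MENU:
            problems.append(f"unknown item: {item}")
        elif not 1 <= qty <= MAX_PER_ITEM:
            problems.append(f"implausible quantity for {item}: {qty}")
    return problems


print(check_order([{"item": "water", "quantity": 18000}]))
```

An order that fails the check could be handed to a human employee instead of being rejected outright, since a fixed rule like this cannot tell a prank from an unusually large catering order.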

[–] staph@sopuli.xyz 7 points 7 months ago* (last edited 7 months ago) (2 children)

There is a lot more that goes into it than just being correct. 18000 waters may have been the actual order, because somebody decided to screw with the machine. A human who isn't terminally autistic would reliably interpret that as a joke and would simply refuse to punch that in. The LLM will likely do what a human tells it to do, since it has no contextual awareness, it only has the system prompt and whatever interaction with the user it had so far.

[–] Khanzarate@lemmy.world 5 points 7 months ago

That's part of correctness to me; delivering an order that Taco Bell would actually make is important.

Semantics aside, though, we agree. That's very important.

[–] tomiant@programming.dev 1 points 7 months ago* (last edited 7 months ago) (1 children)

So they just trim the instructions so it doesn't take joke orders, so it can make more reasonable decisions, like:

"May I take your order?"

"Two double whoppers with extra mayo and a chocolate cherry banana sundae"

"Oh you've GOTTA be joking!"

[–] staph@sopuli.xyz 2 points 7 months ago

It's trivial to get LLMs to act against their instructions.

[–] Tarquinn2049@lemmy.world 1 points 7 months ago* (last edited 7 months ago)

Probably something like this. Except not trained to be a rebellious troll. Part of her training set is his chat, hehe. Though despite this one being "evil" neuro, I think normal neurosama is more of a troll now, lol.

https://youtu.be/AFtryxMDJQs

This is clipped segments from a live stream, so it jumps ahead at times. It has links to the source channel if you would prefer a full video. This one is probably already too long for most people though.

He does end up figuring out why she has so much trouble correctly inserting code in the right places later.

Edit: also, every time she says "filtered", it means whatever she was gonna say would have broken YouTube or Twitch rules. He has two filters, one on the text generated and one on the text to speech. If the text one catches it, it just outputs "filtered" instead. If the speech one catches it, she'll still type something terrible, but only say roughly the first syllable or two before the speech is cut off.