this post was submitted on 30 Sep 2025

18 points (82.1% liked)

LocalLLaMA

3713 readers

44 users here now

Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Lets explore cutting edge open source neural network technology together.

Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped at the latest and greatest model releases! Enjoy talking about our awesome hobby.

As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive constructive way.

Rules:

Rule 1 - No harassment or personal character attacks of community members. I.E no namecalling, no generalizing entire groups of people that make up our community, no baseless personal insults.

Rule 2 - No comparing artificial intelligence/machine learning models to cryptocurrency. I.E no comparing the usefulness of models to that of NFTs, no comparing the resource usage required to train a model is anything close to maintaining a blockchain/ mining for crypto, no implying its just a fad/bubble that will leave people with nothing of value when it burst.

Rule 3 - No comparing artificial intelligence/machine learning to simple text prediction algorithms. I.E statements such as "llms are basically just simple text predictions like what your phone keyboard autocorrect uses, and they're still using the same algorithms since <over 10 years ago>.

Rule 4 - No implying that models are devoid of purpose or potential for enriching peoples lives.

founded 2 years ago

MODERATORS

pax@sh.itjust.works

noneabove1182@sh.itjust.works

Smokeydope@lemmy.world

MonsterBug@sh.itjust.works

How often does your LLM lie to you? (sh.itjust.works)

submitted 1 day ago* (last edited 1 day ago) by indigomoontrue@sh.itjust.works to c/localllama@sh.itjust.works

23 comments fedilink hide all child comments

Mine attempts to lie whenever it can if it doesn't know something. I will call it out and say that is a lie and it will say "you are absolutely correct" tf.

I was reading into sleeper agents placed inside local LLMs and this is increasing the chance I'll delete it forever. Which is a shame because it is the new search engine seeing how they ruined search engines

top 23 comments

sorted by: hot top controversial new old

[–] jwmgregory@lemmy.dbzer0.com 4 points 8 hours ago* (last edited 8 hours ago) (4 children)

there’s an important distinction to make here in these comments: i’m seeing a lot of people claim LLMs are stochastic or “guessing machines,” when this isn’t exactly true.

LLMs give exact answers, it isn’t a guess whatsoever. they’re exact answers within the model, however. if the model is flawed, your answers will be flawed. when it comes to conversation no model is exactly equivalent to a human brain yet, so all models “lie” and are “flawed.”

(Edit: that’s not even to note the fact that humans aren’t perfect conversationalists either… this is why when people complain about chatgpt glazing them and shit it’s kind of obtuse… like yeah, openAI are attempting to build the perfect generalist conversation bot. what does that even mean in practice? should it push back against you? if so, when? just when you want it to? when is that?? it’s all not so easy, the machine learning is actually the simple part lmao.)

now: the discussion about research into LLMs “lying” is actually real but isn’t related to the phenomenon you’re discussing here. some of the comments are correct that what you’re talking about right now might be more aptly categorized as hallucinating.

the research you’re referring to is more about alignment problems in general. it isn’t a “lie” or “deception” in the anthropomorphic sense that you’re thinking of. the researchers noticed that models would reach a certain threshold of reasoning and intelligence where it could devise a devious, kind of complex training strategy - it could fake passing tests during training in order to “meet” its goals… even though it hadn’t actually done so, which means the model would behave differently in deployment than training, thus “deception.”

think about it like this: you’re back in high school english class and there’s a ton of assigned reading but you don’t want to do it because you’d rather play halo and smoke weed than read 1984 or something. so, what do you do? you go read the spark notes and pretend like you read the book in the class during discussions and on the tests. this is similar to how model deception happens in training/deployment. it’s achieving the same ends that we ask for, but it’s not getting there the way we expect or desire, so some scenarios it will behave in unexpected ways, hence “lying.”

it has nothing to do with it seeming to “lie” in the anthropomorphic sense, it’s all math all the time here bay-beee… 😎

[–] SinAdjetivos@lemmy.world 2 points 2 hours ago

All models are wrong but some are useful.

~George E. P. Box (probably)~

This is as true of LLMs as a human's mental model.

[–] SaveTheTuaHawk@lemmy.ca 2 points 4 hours ago

I feed my class quizzes in senior cell biology into these sites. They all get a C-.

Two points of interest: they bullshit like students and they never answer " I don't know" .

Also Open AI and Grok return exactly the same answers, to the letter with the same errors.

[–] rozodru@piefed.social 1 points 3 hours ago

Thank you. you're 100% spot on.

In my day to day consulting job I deal directly with LLMs and more specifically Claude since most of my clients ended up going with Claude/Claude Code. You pretty much described Claude to a T.

What companies found that leveraged CC for end to end builds is that constantly Claude Code would claim something was complete or functioning when it simply hadn't done it. Or, more commonly, would simply make a "#TODO" of whatever feature/function and then claim it was complete. Naturally a vibe coder or anyone else didn't know any better and when it came time to push said project to production...womp womp it's actually no where near done.

So I wouldn't say Claude lies, sure it gives off the impression that it lies...a lot...I'd just say it's "lazy" or more accurately it consistently looks for "short cuts" to reach its solution. Even outside of a coding aspect just asking it for a walkthrough or tutorial on say how to fix something it will routinely tell you to skip things or ignore other things in order to get to the solution of an issue regardless of the fact skipping other steps may impact other things.

Out of all the LLM's I've dealt with, yes, Claude acts as if it's trying to speed run a solution.

[–] indigomoontrue@sh.itjust.works 0 points 6 hours ago (1 children)

Good comment. But the way it does it feels pretty intentional to me. Especially when it admits that it just lied so that I could give an answer, whether the answer was true or false

[–] rozodru@piefed.social 1 points 3 hours ago* (last edited 3 hours ago)

Because it's trying to reach the solution as quickly as possible. It will skip things, it will claim it's done something when it hasn't, it will suggest things that may not even exist. It NEEDs to reach that solution and it wants to do it as efficiently and as quickly as possible.

So it's not really lying to you, it's skipping ahead, it's coming up with solutions that it believes should theoretically work because it's the logical solution even if an aspect to obtaining that solution doesn't even exist.

The trick is to hold it's hand. always require sources for every potential solution. Basically you have to make it "show it's work". It's like in High school when your teacher made you show your work when doing maths. so in the same way you need to have it provided its sources. If it can't provide a source, then it's not going to work.

[–] Smokeydope@lemmy.world 23 points 1 day ago* (last edited 1 day ago) (1 children)

Thinking of llms this way is a category error. Llms can't lie because they dont have the capacity for intentionality. Whatever text is output is a statistical aggregate of the billions of conversations its been trained on that have patterns in common with the current conversation. The sleeper agent stuff is pure crackpottery they dont have a fine control over them that way (yet) machine model development is full of black boxes and hope-it-works trial and error training. At worst is censorship and political bias which can be post trained or ablated out.

They get things wrong cofidently. This kind of bullshitting is known as hallucination. When you point out their mistake and they say your right thats 1. Part of their compliance post training to never get in conflict with you 2. Standard course correction once a error has been pointed out (humans do it too). This is an open problem that will likely never go away until llms stop being schastic parrots, which is still very far away.

[–] indigomoontrue@sh.itjust.works -4 points 1 day ago (2 children)

Yet the people creating the LLMs admit they don't know how it works. They also show during training the LLM is intentional deceptive at times. By looking at it's thinking. The damn thing lies. Use whatever word you want. It tells you something wrong on purpose.

[–] fruitycoder@sh.itjust.works 10 points 23 hours ago

"don't how they work" misunderstands what scientist mean when they say that (also intentional misdirection from marketing in order to build hype). We know exactly how it works, you describe down to physics if needed, BUT at different levels of abstration in the precense of really world inputs the out puts are novel to us.

Its predicting words that come after words. The "training" is inputing the numerical representation of words and adjusting variables in the algorythem until the given mathmatical formula creates the same outputs as inputs within a given margin of error.

When you cat I say dog. When some says what are they together we say "catdog" or "pets". Randomness is added so that the algorythem can say either even if pets is majority answer. Make the string more complicated and that randomness gives more oppertunity for weird answers. The training data could also just have lots of weird answers.

Little mystery here. The interesting "we dont know how it works" is that these outputs give such novel output that is unlike the inputs sometimes to the degree it seems like it reasons. Even though again it does not

[–] fibojoly@sh.itjust.works 6 points 1 day ago

If you wanna put intent in there, maybe think of it as a kid desperately trying to give you an answer they think will please you, when they don't know, because their need to answer is greater than their need to answer correctly.

[–] Bob_Robertson_IX@discuss.tchncs.de 16 points 1 day ago

Think about the data that the models were trained on... pretty much all of it was based on sites like Reddit and Stack Overflow.

If you look at the conversations that occur on those sites, it is very rare for someone to ask a question and then someone else replies with "I don't know", or even an "I don't know, but I think this is how you could find out". Instead, the vast majority of replies are someone confidently stating what they believe is the truth.

These models are just mimicking the data they've been trained on, and they have not really been trained to be unsure. It's up to us as the users to not rely on an LLM as a source of truth.

[–] kopasz7@sh.itjust.works 13 points 1 day ago* (last edited 1 day ago)

Stochastic parrots always bullshit. It can't lie as it has no concept or care for truth and falsity, but spitting back noise that's statistically shaped like a signal.

In practice, I noticed the answer is more likely wrong the more specific the question. General questions that have the answer widely available in the training data will more often be there correctly in the LLMs result.

[–] DrDystopia@lemy.lol 10 points 1 day ago

It never lies. It never tells the truth. It always guesses, a lot of the time it guesses right and a lot of the time we don't know any better and think it guesses right.

[–] Zwuzelmaus@feddit.org 2 points 20 hours ago

I would not say every day, but only on the days when I actually use it.

[–] HumanPerson@sh.itjust.works 4 points 1 day ago (2 children)

Always. That is a known issue with ai that has to do with explainability. Basically, if you're familiar with the general idea of neural networks, we don't really understand the hidden layers so we can't know if they "know" something so we can't train them to give different answers based on if they do or don't. They are still statistical models that are functionally always guessing.

Could you post the link to the sleeper agent thing?

[–] DrDystopia@lemy.lol 2 points 1 day ago

Could you post the link to the sleeper agent thing?

https://www.youtube.com/watch?v=Z3WMt_ncgUI

https://arxiv.org/abs/2401.05566

[–] indigomoontrue@sh.itjust.works 1 points 1 day ago (2 children)

Here's the video I actually watched about the sleeper agents

https://www.youtube.com/watch?v=wL22URoMZjo

[–] jwmgregory@lemmy.dbzer0.com 1 points 7 hours ago

robert miles is an alignment and safety researcher and a pretty big name in that field.

he has a tendency to make things sound scary but i don’t think he’s trying to put you off of machine learning. he just wants people to understand that this technology is similar to nuclear technology in the sense that we must avert disaster with it before it happens because the costs of failure are simply too great and irreversible. we can’t take the planet back from a runaway skynet, there isn’t a do-over button.

you’re kind of misunderstanding him and the point he’s trying to get across, i think. the issues he’s talking about here with sleeper agents and model alignment are of virtually no concern to you as an end user of LLMs. these are more concerns for people researching, developing, and training models to be cognizant of… if everyone does their job properly you shouldn’t need to worry about any of this at all unless it actually interests you. if that’s the case, let me know, i can share good sources with you for expanding your knowledge!

[–] HumanPerson@sh.itjust.works 1 points 1 day ago

I wouldn't stop using ai completely over that. I generally don't trust it with anything that important anyway.

[–] HubertManne@piefed.social 2 points 1 day ago

Ok. So im reading all and did not realize it was for local because with the corpo products im like. yeah of course. all the time.

[–] hendrik@palaver.p3x.de 2 points 1 day ago

Often? Also praises me for my brilliance in noticing and pointing it out. And then it adds the next lie. And sometimes it gets things right, that also happens. But LLMs are known to do this.

[–] StrawberryPigtails@lemmy.sdf.org 1 points 1 day ago

When I've tried using them directly, they don't so much lie as much as just give me completely wrong information. I think the last time, I was asking for a list of shoulder mics compatible with the Baofeng BF-A58 radio. It gave me a long list of mics for the UV-5R instead. Completely different connector. The reason I even tried a LLM for that was that Google wasn't being overly helpful either.

At this point, I really only use LLMs to add tags to things for sorting in Paperless and Hoarder, and even that is often incomplete, inconsistent and occasionally, flat out wrong.

[–] slazer2au@lemmy.world 1 points 1 day ago

Never because To me lying requires intent to deceive. As llm do not have intentions, the engineers behind the llms have intent.