LocalLLaMA

3713 readers

44 users here now

Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Lets explore cutting edge open source neural network technology together.

Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped at the latest and greatest model releases! Enjoy talking about our awesome hobby.

As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive constructive way.

Rules:

Rule 1 - No harassment or personal character attacks of community members. I.E no namecalling, no generalizing entire groups of people that make up our community, no baseless personal insults.

Rule 2 - No comparing artificial intelligence/machine learning models to cryptocurrency. I.E no comparing the usefulness of models to that of NFTs, no comparing the resource usage required to train a model is anything close to maintaining a blockchain/ mining for crypto, no implying its just a fad/bubble that will leave people with nothing of value when it burst.

Rule 3 - No comparing artificial intelligence/machine learning to simple text prediction algorithms. I.E statements such as "llms are basically just simple text predictions like what your phone keyboard autocorrect uses, and they're still using the same algorithms since <over 10 years ago>.

Rule 4 - No implying that models are devoid of purpose or potential for enriching peoples lives.

founded 2 years ago

MODERATORS

pax@sh.itjust.works

noneabove1182@sh.itjust.works

Smokeydope@lemmy.world

MonsterBug@sh.itjust.works

How often does your LLM lie to you? (sh.itjust.works)

submitted 1 day ago* (last edited 1 day ago) by indigomoontrue@sh.itjust.works to c/localllama@sh.itjust.works

23 comments fedilink hide all child comments

Mine attempts to lie whenever it can if it doesn't know something. I will call it out and say that is a lie and it will say "you are absolutely correct" tf.

I was reading into sleeper agents placed inside local LLMs and this is increasing the chance I'll delete it forever. Which is a shame because it is the new search engine seeing how they ruined search engines

you are viewing a single comment's thread
view the rest of the comments

[–] jwmgregory@lemmy.dbzer0.com 4 points 8 hours ago* (last edited 8 hours ago) (5 children)

there’s an important distinction to make here in these comments: i’m seeing a lot of people claim LLMs are stochastic or “guessing machines,” when this isn’t exactly true.

LLMs give exact answers, it isn’t a guess whatsoever. they’re exact answers within the model, however. if the model is flawed, your answers will be flawed. when it comes to conversation no model is exactly equivalent to a human brain yet, so all models “lie” and are “flawed.”

(Edit: that’s not even to note the fact that humans aren’t perfect conversationalists either… this is why when people complain about chatgpt glazing them and shit it’s kind of obtuse… like yeah, openAI are attempting to build the perfect generalist conversation bot. what does that even mean in practice? should it push back against you? if so, when? just when you want it to? when is that?? it’s all not so easy, the machine learning is actually the simple part lmao.)

now: the discussion about research into LLMs “lying” is actually real but isn’t related to the phenomenon you’re discussing here. some of the comments are correct that what you’re talking about right now might be more aptly categorized as hallucinating.

the research you’re referring to is more about alignment problems in general. it isn’t a “lie” or “deception” in the anthropomorphic sense that you’re thinking of. the researchers noticed that models would reach a certain threshold of reasoning and intelligence where it could devise a devious, kind of complex training strategy - it could fake passing tests during training in order to “meet” its goals… even though it hadn’t actually done so, which means the model would behave differently in deployment than training, thus “deception.”

think about it like this: you’re back in high school english class and there’s a ton of assigned reading but you don’t want to do it because you’d rather play halo and smoke weed than read 1984 or something. so, what do you do? you go read the spark notes and pretend like you read the book in the class during discussions and on the tests. this is similar to how model deception happens in training/deployment. it’s achieving the same ends that we ask for, but it’s not getting there the way we expect or desire, so some scenarios it will behave in unexpected ways, hence “lying.”

it has nothing to do with it seeming to “lie” in the anthropomorphic sense, it’s all math all the time here bay-beee… 😎

[–] indigomoontrue@sh.itjust.works 0 points 6 hours ago (1 children)

Good comment. But the way it does it feels pretty intentional to me. Especially when it admits that it just lied so that I could give an answer, whether the answer was true or false

[–] rozodru@piefed.social 1 points 3 hours ago* (last edited 3 hours ago)

Because it's trying to reach the solution as quickly as possible. It will skip things, it will claim it's done something when it hasn't, it will suggest things that may not even exist. It NEEDs to reach that solution and it wants to do it as efficiently and as quickly as possible.

So it's not really lying to you, it's skipping ahead, it's coming up with solutions that it believes should theoretically work because it's the logical solution even if an aspect to obtaining that solution doesn't even exist.

The trick is to hold it's hand. always require sources for every potential solution. Basically you have to make it "show it's work". It's like in High school when your teacher made you show your work when doing maths. so in the same way you need to have it provided its sources. If it can't provide a source, then it's not going to work.

load more comments (3 replies)