this post was submitted on 10 Oct 2025
110 points (100.0% liked)

Fuck AI

4289 readers
1276 users here now

"We did it, Patrick! We made a technological breakthrough!"

A place for all those who loathe AI to discuss things, post articles, and ridicule the AI hype. Proud supporter of working people. And proud booer of SXSW 2024.

founded 2 years ago
MODERATORS
 

Just 250 malicious training documents can poison a 13B parameter model - that's 0.00016% of a whole dataset Poisoning AI models might be way easier than previously thought if an Anthropic study is anything to go on. …

you are viewing a single comment's thread
view the rest of the comments
[–] stabby_cicada@slrpnk.net 11 points 1 day ago* (last edited 1 day ago) (1 children)

Yeah, and, as the article points out, the trick would be getting those malicious training documents into the LLM's training material in the first place.

What I would wonder is whether this technique could be replicated using common terms. The researchers were able to make their AI spit out gibberish when it heard a very rare trigger term. If you could make an AI spit out, say, a link to a particular crypto-stealing scam website whenever a user put "crypto" or "Bitcoin" in a prompt, or content promoting anti-abortion "crisis pregnancy centers" whenever a user put "abortion" in a prompt ...

[–] IMALlama@lemmy.world 4 points 1 day ago

I've seen this described before, but as AI ingests content written by a prior AI for training things will get interesting.