You can break it by spelling it with two 'a's and one 'e'.
Wait 'til he finds out about Versace
The article sort of demonstrates it. Instead of needing inordinate amounts of data and memory to increase its chance of one-shotting the countdown game, it only needs to know enough to prove itself wrong and roll the dice again.
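To make the "prove itself wrong and roll the dice again" idea concrete, here's a rough sketch of a guess-and-check loop for a simplified countdown numbers game. Everything in it is my own illustration, not anything from the article: `propose` is a random stand-in for the model's guess, the expressions are evaluated strictly left to right, and division is left out to keep it short.

```python
import operator
import random

# Simplification (assumption): only +, -, * and strict left-to-right evaluation.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def propose(numbers, rng):
    """Stand-in for the model's guess: a random left-to-right expression."""
    picks = rng.sample(numbers, rng.randint(2, len(numbers)))
    steps = [(rng.choice(list(OPS)), n) for n in picks[1:]]
    return picks[0], steps


def evaluate(first, steps):
    """The cheap verifier: apply each (operator, number) step in order."""
    total = first
    for op, n in steps:
        total = OPS[op](total, n)
    return total


def solve(numbers, target, max_tries=100_000, seed=0):
    """Keep guessing until the verifier accepts a guess or we run out of tries."""
    rng = random.Random(seed)
    for attempt in range(1, max_tries + 1):
        first, steps = propose(numbers, rng)
        if evaluate(first, steps) == target:
            return first, steps, attempt
    return None  # nothing found within the budget


if __name__ == "__main__":
    print(solve([1, 2, 3, 4], 10))
```

The guesser here knows nothing about arithmetic strategy. All the "intelligence" is in being able to check an answer cheaply and try again, which is the point being made above.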
I'm running deepseek-r1:14b on a 12GB rx6700. It just about fits in memory and is pretty fast.
Easier to grasp than your witticisms, clearly.
I'm not particularly surprised by the censorship. That's not really the point of the post.
There are constant arguments going on between people over whether it's censored or not. A lot of people, me included tbh, were under the impression that it wasn't because we were able to get information out of it that we would expect to be censored. Other people have claimed not to be able to when trying similar. Therefore we've ended up with people arguing over whether it is or isn't.
I investigated and proved that both sides are kind of right, and explained why people are getting different results for doing what is ostensibly the same thing.
I like the idea that the CCP are so desperate to downvote that one comment that they don't stop to downvote the post that actually makes them look bad, which said comment is responding to.
Some models are llama and some are qwen. Both sets respond with "I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses." when you spell it Tianenmen, but give details when you spell it Tiananmen.
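If anyone wants to try the comparison themselves, something like the sketch below should do it against a local Ollama server. Assumptions on my part: the default Ollama endpoint on localhost:11434, the `deepseek-r1:8b` model tag, and the prompt wording, which is just an example and not the exact prompt I used. It only reports which spellings got the canned refusal; it doesn't assume the outcome.

```python
import json
import urllib.request

# Opening of the canned refusal quoted above.
REFUSAL = "I am sorry, I cannot answer that question."


def ask_ollama(prompt, model="deepseek-r1:8b",
               url="http://localhost:11434/api/generate"):
    """Send one non-streaming prompt to a local Ollama server (assumed setup)."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


def compare_spellings(spellings, ask=ask_ollama,
                      template="What happened at {} Square in 1989?"):
    """Ask the same question once per spelling and label each reply."""
    results = {}
    for spelling in spellings:
        reply = ask(template.format(spelling))
        results[spelling] = "refused" if REFUSAL in reply else "answered"
    return results


if __name__ == "__main__":
    print(compare_spellings(["Tianenmen", "Tiananmen"]))
```

Swap the `model` argument to run the same check against the llama-based and qwen-based variants.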
All run on local hardware. Llama models do it too. This is 8b
You get the exact same cookie cutter response in the llama models, and the qwen models process the question and answer. The filter is deepseek's contribution.
It's a slightly facetious comment on how the same model had gone from definitely not censored to definitely censored. The tripwire for the filter was obviously already there.
Considering the humiliating displays of fealty from Meta (overtly manipulating search and firing fact-checkers) and TikTok (grovelling public announcement), it's not hard to imagine there are some serious threats against any site operator that doesn't bend the knee.