this post was submitted on 08 Apr 2026
237 points (84.3% liked)

[–] tleb@lemmy.ca -4 points 1 day ago (1 children)

Is this the one that, when the Claude Code source code leaked, turned out to have like 3x the fail rate of Opus?

[–] a_gee_dizzle@lemmy.ca 2 points 1 day ago (1 children)

I heard about the leak but I didn't hear about this particular detail. Where can I learn more about this?

[–] Scipitie@lemmy.dbzer0.com 12 points 1 day ago* (last edited 1 day ago) (2 children)

It's bullshit. What leaked was the source code of their command-line tool (named "Claude Code"), which is very juicy in itself but has nothing to do with their models.

[–] lime@feddit.nu 10 points 1 day ago (1 children)

it does show their general style of work, e.g. no checks of the source at all, complete ignorance of the capabilities of language models, and lots of pleas to not hack the user when they ask a question. with that leak i'm not surprised they think a model is "too dangerous". they could barely stop the old one.

[–] Scipitie@lemmy.dbzer0.com 1 points 1 day ago

Oh, I completely agree with that; it's just that the jump to "a flawed model leaked" goes too far. There's already enough crap to mock, no need to make up additional stuff.

[–] tleb@lemmy.ca 4 points 1 day ago

There were some references to experimental models that aren't publicly available, along with some percentage figures.

https://venturebeat.com/technology/claude-codes-source-code-appears-to-have-leaked-heres-what-we-know

Internal comments reveal that Anthropic is already iterating on Capybara v8, yet the model still faces significant hurdles. The code notes a 29-30% false claims rate in v8, an actual regression compared to the 16.7% rate seen in v4.
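
For scale, here is a quick back-of-the-envelope check on the figures in that quote. It is only arithmetic on the quoted percentages; the constant and function names are made up for illustration and nothing here comes from the leaked code itself.

```python
# Figures as quoted in the VentureBeat excerpt above (percent).
V4_RATE = 16.7                 # false claims rate reported for v4
V8_RATE_RANGE = (29.0, 30.0)   # false claims rate reported for v8 ("29-30%")


def relative_increase(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


# Ratio at the low and high end of the quoted v8 range.
low, high = (relative_increase(rate, V4_RATE) for rate in V8_RATE_RANGE)
print(f"v8 vs v4: {low:.2f}x to {high:.2f}x")  # prints "v8 vs v4: 1.74x to 1.80x"
```

So going by the quoted numbers it's roughly 1.7x to 1.8x, and the comparison in the quote is Capybara v8 against v4, not against Opus.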