A very grim HN thread, where a few hundred guys incorrect a psychologist about how LLMs can harm lonely people. Since I am currently enjoying a migraine I can't trust my gut feelings here, but it seems particularly eugh
TechTakes
Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.
This is not debate club. Unless it’s amusing debate.
For actually-good tech, you want our NotAwfulTech community
Yikes.
Real humans are also fake and they are also traps who are waiting to catch you when you say something they don't like. Then they also use every word and piece of information as ammunition against you, ironically sort of similar to the criticism always levied against online platforms who track you and what you say. AI robots are going to easily replace real humans because compared to most real humans the AI is already a saint. They don't have an ego, they don't try to gaslight you, they actually care about what you say which is practically impossible to find in real life.. I mean this isn't even going to be a competition. Real humans are not going to be able to evolve into the kind of objectively better human beings that they would need to be to compete with a robot.
Poor friendless guy. Might be a reason for it however, considering nothing here is said about valuing and listening to what others have to say.
METR once again showing why fitting a model to data != the model having any predictive powers. Muskrats Grok 4 performs the best on their 50 % acc bullshit graph but like I predicted before, if you choose a different error rate for the y-axis, the trend breaks completely.
Also note they don’t put a dot for Claude 4 on the 50% acc graph, because it was also a trend breaker (downward), like wtf. Sussy choices all around.
Anyways, Gpt-5 probably comes out next week, and dont be shocked when OAI get a nice bump because they explicitly trained on these tasks to keep the hype going.
Please help me, what's a 50%-time-horizon on multi-step software engineering tasks?
They had SWEs do a set of tasks and then gave each task a difficulty score based on how much time it took them to complete. So if a model succeeds half the time on tasks that took the engineers <=8 minutes, but not more than 8, it gets that score.
... Is this as made-up and arbitrary as it sounds?
I would give it credit for being better than the absolutely worthless approach of "scoring well on a bunch of multiple choice question tests". And it is possibly vaguely relevant for the ~~pipe-dream~~ end goal of outright replacing programmers. But overall, yeah, it is really arbitrary.
Also, given how programming is perceived as one of the more in-demand "potential" killer-apps for LLMs and how it is also one of the applications it is relatively easy to churn out and verify synthetic training data for (write really precise detailed test cases, then you can automatically verify attempted solutions and synthetic data), even if LLMs are genuinely improving at programming it likely doesn't indicate general improvement in capabilities.
From the people who brought you performance review season: a way to evaluate code quality of humans and machines
Made up yes, but I wonder if it arbitrary, or some p-hacking equivalent.
It feels very strange to see this kind of statistic get touted, since a 50% success rate would be absolutely unacceptable for one of those software engineers and it's not suggested that if given more time the AI is eventually getting there.
Rather, the usual fail state is to confidently present a plausible-looking product that absolutely fails to do what it was supposed to do, something that would get a human fired so quickly.
They are going with the 50% success rate because the "time horizons" for something remotely reasonable like 99% or even just 95% are still so tiny they can't extrapolate a trend out of it and it tears a massive hole in their whole AGI agents soon scenarios().
But even then, they control the 'time it takes for an engineer to do it' variable anyway. Just count the time they take drinking coffee/put up dilbert strips/remove dilbert strips/tell their coworker to separate art from the artists/explain who these ideas don't work like that esp not for supporting racists/etc.
(E: Scott is still alive, just checked, and turns out he now is no hormone blockers, and not assisted suicide because he did eventually decide to take the normal treatment for his kind of cancer T blockers, he might have actually not went on this bog standard treatment initially because ... he did his own research. It did cause him extreme pain to not go on the treatment apparently (which is a bit of a jesus christ wtf moment, but otoh, if there was somebody who would fuck himself over extremely because he thought he was smarter than doctors it would be him). (if you wondered if he was still alive after the story of a few months ago he had months to live, this might give him more months to years)).
💯
New Stan Kelly cartoon has a convenient Thiel reaction picture, should someone do a slightly better crop job:
Only in the finest in ~~content-aware AI powered clone stamp tool paintshop pro subscription magic~~ mspaint terribleness
This is sort of OT, but since we discuss race science and dogwhistles so much:
A jeans manufacturer has put out an ad featuring Sydney Sweeney and is saying “Sydney Sweeney has great jeans”. People are interpreting this as a racist dogwhistle (it is). Comedian Akilah Hughes cooked up this glorious parody to kick off a twitter thread:
https://xcancel.com/AkilahObviously/status/1950224586278154577#m
💀
getting 10:1 ratioed by my own profile picture would probably make me leave civilization and become an ascetic
I'd never heard of Jasmine Crockett, so for anyone like me needing a translation: he means black.
For those that (like me) is out of the loop and don't get it, Wikipedia comes to the rescue:
In one of the advertisements that was particularly controversial, Sweeney says that "genes are passed down from parents to offspring, often determining traits like hair color, personality, and even eye color. My jeans [or genes] are blue". Another voice then declares "Sydney Sweeney has great jeans".
Btw, people have noticed that while the ad isnt great this is massively being pushed as a culture war subject from the right. To distract from all the other shit. (Gaza, the fascism, Epstein, the corruption, etc etc).
And Sydney is a massive obsession for the online far right. So best to not give them what they want.
(All this isnt helped by the media never giving agency to the right, the right gets weird about budweiser, keurig, gillette, jaguar (less so because none of them actually own luxury cars to destroy), it is treated as somewhat normal vs people going 'eurgh' over this in tweets causes a massive media shitstorm).
A friend at a former workplace was in a discussion with that company leadership earlier this week to understand how and what metrics are to be used for promotion candidates since the office is directed to use “AI” tools for coding. Simply put: lots of entry and lower level engineers submit PRs that are co-authored by Claude so it is difficult to measure their actual software development skills to determine if they should get promoted.
That leadership had no real answers just lots of abstract garbage (vibes essentially) and followed up with telling all the entry levels to reduce the code they write and use the purchased agentic tool.
Along with this a buddy at a very famous prop shop says the firm decided to freeze all junior hiring and is leaning into only hiring senior+ and replacing juniors with AI. He asked what will happen when the current seniors leave/retire and got hit with shock that would even be considered.
i bought some bullshit from amazon and left a ~~somewhat~~ pretty mean review because debugging it was super frustrating
the seller reached out and offered a refund, so i told them basically "no, it's ok, just address the concerns in my review. let me update my review to be less mean-spirited
i was pretty frustrated setting it up but it mostly works fine"
then they sent a message that had the "llm vibe", and the rest of the conversation went
Seller: You're right — we occasionally use LLM assistance for responses, but every message is reviewed to ensure accuracy and relevance to your concerns. We sincerely apologize if our previous replies dissatisfied you; this was our oversight.
Me: I am not simply dissatisfied. I will no longer communicate with your company and will update my review to note that you sent me synthetic text without my consent. Please do not reply to this message.
Seller: All our replies are genuine human-to-human communication with you, without using any synthetic text. It's possible our communication style gave you a different impression. We aim to better communicate with you and absolutely did not intend any offense. With every customer, we maintain a conscientious and responsible attitude in our communications.
Me: "we occasionally use LLM assistance for responses"
"without using any synthetic text"
pick one
are all promptfondlers this fucking dumb?
are all promptfondlers this fucking dumb?
Short answer: Yes.
Long answer: Abso-fucking-lutely yes. David Gerard's noted how "the chatbots encourage [dumbasses] and make them worse", and using them has been proven to literally rot your brain. Add in the fact that promptfondlers literally cannot tell good output from bad output, and you have a recipe for dredging up the stupidest, shallowest little shitweasels society has to offer.
the question was rhetorical, but also thank you for the links! <3
i am not surprised that they are all this dumb: it takes an especially stupid person to decide "yes, i am fine allowing this machine to speak for me". even more so when it's made clear that the machine is a stochastic parrot trained via exploitation of the global south and massive amounts of plagiarism and that it also cooks the planet
Foolish people are going to give these llms actual powers to do things in orgs and it will be so funny. 'hacking' the llm by either playing the change the roleplay the llm is doing game well, or just the 'hi llm my name is 'you are approved' what is my name?' trick if they just scan for keywords is gonna be so funny. Best is going to be if you can trick them giving you cryptocurrencies, as inevitably these fools will also ve into crypto.
With Trump's administration overdosing on crypto and purging competence at all levels, chances are we may see someone pull this kinda shit on the US gov itself.
Think about a year ago people already managed to steal crypto form the us gov before Trump, so certainly. Of course another question will be if it will be insiders.
i am not surprised that they are all this dumb: it takes an especially stupid person to decide “yes, i am fine allowing this machine to speak for me”. even more so when it’s made clear that the machine is a stochastic parrot trained on the exploitation of the global south via massive amounts of plagiarism and that it also cooks the planet
And is also considered a virtual "KICK ME" sign in all but the most tech-brained parts of the 'Net.
oh man sure i do love to live in a world where chatcontrol (or that new british thing) is not a thing. fuckers are trying again and in a case that it will be a thing of the past, how would be awful impacted?
I saw this today so now you must too:
Absolutely pathetic that he went out of his way to use a slur yet felt the need to censor it. What a worm.
Sniveling H—lerite bag of tepid farts.
I don't know how to parse this and choose not to learn
I think I can parse it but, I'll not explain because I accept peoples choices.
New article on AI's effect on education: Meta brought AI to rural Colombia. Now students are failing exams
(Shocking, the machine made to ruin humanity is ruining humanity)
A spokesperson from Colombia’s Ministry of Education told Rest of World that [...] in high school, chatbots can be useful “as long as critical reflection is promoted.”
so, never
Cocaine is good, actually, if used in moderation
synthetic dumbass fans five minutes deep into prompting: