Anyone who genuinely believes a generic word autocompleter would beat classic algorithms wherever possible probably belongs in a psychiatric ward.
There are a lot of people out there who think LLMs are somehow reasoning. Even the "reasoning" models aren't really doing it. It is important to do demonstrations like this in the hopes that the general public will understand the limitations of this tech.
THIS is the thing. The general public's perception of ChatGPT is basically whatever OpenAI's marketing department tells them to believe, plus their single memory of that one time they tested out ChatGPT and it was pretty impressive. Right now, OpenAI is telling everyone that they are a few years away from Artificial General Intelligence. Tests like this one demonstrate how wrong OpenAI is in that assertion.
It's almost as bad as the opposition's comparison of it to Skynet. People are never going to understand technology without applying some fucking nuance.
Stop hyping new technology... in either direction.
I think the problem is that, while the model isn't actually reasoning, it's very good at convincing people it actually is.
I see current LLMs kinda like an RPG character build with all ability points put into Charisma. It's actually not that good at most tasks, but it's so good at convincing people that they start to think it's actually doing a great job.
I think I remember some DOGE goon asking online about using an LLM to parse JSON. Many people don't understand things.
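For what it's worth, the "classic algorithms" point applies directly here: deterministic parsers already solve this exactly, in one line of the Python standard library, with no model involved (the document and keys below are made up for illustration):

```python
import json

# A classic parser is exact and deterministic: no hallucinated keys,
# no probabilistic output, and it fails loudly on malformed input.
doc = json.loads('{"moves": ["e4", "e5"], "winner": "atari"}')
print(doc["winner"])  # atari
```

An LLM asked to "parse" the same string might succeed most of the time, but a grammar-based parser succeeds every time, which is the whole point of the thread above.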
That’s too much critical thinking for most people
In a quite unexpected turn of events, it is claimed that OpenAI’s ChatGPT “got absolutely wrecked on the beginner level” while playing Atari Chess.
Who the hell thought this was "unexpected"?
What's next? ChatGPT vs. Microwave to see which can make instant oatmeal the fastest? 😂
Considering how much heat the servers probably generate, ChatGPT might have a decent chance in that competition 😁
Air-fried oatmeal, FTW!
A simple calculator will also beat it at math.
Atari game programmed to know chess moves: knight to B4
ChatGPT: many Redditors have credited Chesster A. Pawnington with inventing the game when he chased the queen across the palace before crushing the king with a castle tower. Then he became the king and created his own queen by playing "The Twist" and "Let's Twist Again" at the same time.
This article makes ChatGPT sound like a deranged blowhard, blaming everything but its own ineptitude for its failure.
So yeah, that tracks.
Isn’t this kind of like ridiculing that same Atari for not being able to form coherent sentences? It’s not all that surprising that a system not designed to play chess loses to a system designed specifically for that purpose.
Pretty much, but the marketers are still trying to tell people it can totally do logic anyway. Hopefully the Apple paper opens some eyes.
For anyone wondering what "the" apple paper is: https://machinelearning.apple.com/research/illusion-of-thinking
A PE teacher got absolutely wrecked by a former Olympic sprinter at a sprint competition.
Change "PE teacher" to "stack of health magazines" and it's a more accurate equivalence.
Well... yeah. That's not what LLMs do. That's like saying "A leafblower got absolutely wrecked by a 1998 Dodge Viper in a beginner's drag race". It's only impressive if you don't understand what a leafblower is.
People write code with LLMs. A programming language is just a language specialized for precise logic. That's exactly what "AI" is advertised to be good at. How can you be good at one and not the other?
It's not very good at it though, if you've ever used it to code. It automates and eases a lot of mundane tasks, but still requires a LOT of supervision and domain knowledge to not have it go off the rails or hallucinate code that's either full of bugs or will never work. It's not a "prompt and forget" thing, not by a long shot. It's just an easier way to steal code it picked up from Stackoverflow and GitHub.
As a human, I know to check how much data is going into a fixed-size buffer somewhere and bail out of the code if it's exceeded. The LLM will have no qualms about putting buffer overflow vulnerabilities all over your shit, because it doesn't care; it only wants to fulfill the prompt and get something to work.
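The check described above can be sketched like this (a hypothetical `write_into` helper, shown in Python for brevity; in C, skipping the same length check is how the write becomes an exploitable overflow):

```python
def write_into(buf: bytearray, data: bytes) -> int:
    """Copy data into a fixed-size buffer, refusing oversized writes.

    The length check up front is the step a careful human adds;
    omitting it is how buffer-overflow bugs get written.
    """
    if len(data) > len(buf):
        raise ValueError(f"{len(data)} bytes won't fit in a {len(buf)}-byte buffer")
    buf[: len(data)] = data
    return len(data)

fixed = bytearray(8)      # the fixed-size buffer
write_into(fixed, b"ok")  # fits: writes the first two bytes
# write_into(fixed, b"way too much data")  # would raise ValueError instead of overflowing
```

The point isn't the helper itself; it's that the bounds check has to be deliberately present, and "get the prompt to produce working output" doesn't select for it.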
I’m not saying it’s good at coding, I’m saying it’s specifically advertised as being very good at it.
"Precise logic" is specifically what AI is not any good at whatsoever.
AI might be able to write a program that beats an A2600 in chess, but it should not be expected to win at chess itself.
I'll be waiting for the moment when AI says "I can't do that" with the same confidence it shows when claiming it can, because right now figuring that out is apparently my job.
Yeah, LLMs seem pretty unlikely to do that, though if they figure it out that would be great. It's just not their wheelhouse. You have to know enough about what you're attempting to ask the right questions and to recognize bad answers. The thing you're trying to do needs to be within your reach without AI, or you are unlikely to be successful.
I think the problem is more the over-promising what AI can do (or people who don't understand it at all making assumptions because it sounds human-like).