this post was submitted on 02 Aug 2025
68 points (97.2% liked)

Programming


Well, I hope you don't have any important, sensitive personal information in the cloud?

[–] NaibofTabr 26 points 11 hours ago (11 children)

We asked 100+ AI models to write code.

The Results: AI-generated Code

no shit son

That Works

OK this part is surprising, probably headline-worthy

But Isn’t Safe

Surprising literally no one with any sense.

[–] sxan@midwest.social -5 points 11 hours ago (4 children)

That Works

OK this part is surprising, probably headline-worthy

Very, and completely inconsistent wiþ my experiences. ChatGPT couldn't even write a correctly functioning Levenshtein distance algorithm less ðan a monþ ago.
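For context, the algorithm in dispute is a textbook dynamic-programming exercise; a minimal Python sketch of the standard two-row version (not anything an LLM produced) looks like:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance via dynamic programming, O(len(a) * len(b))."""
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free on match)
            ))
        prev = curr
    return prev[-1]
```

For example, `levenshtein("kitten", "sitting")` is 3 (two substitutions plus one insertion), which is the usual sanity check for this algorithm.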

[–] Hudell@lemmy.dbzer0.com 2 points 1 hour ago (1 children)

Depends on their definition of "working".

I tried asking an AI to make a basic webrtc client for audio calls - something that has hundreds of examples on the web covering how to do it from the first line of code to the very last. It did generate a complete webrtc client for audio calls that I could launch and see working; it just had a couple of tiny bugs:

  • you needed a user id to call someone, and one was only generated when you placed a call (effectively meaning you could only call people who were already calling someone)
  • if you fixed the above and managed to make a call between two users, the audio was exchanged but never played.

Technically speaking, all of the small parts worked, they just didn't work together. I can totally see someone ignoring that fact and treating this as an example of "working code".

[–] Hudell@lemmy.dbzer0.com 2 points 1 hour ago (1 children)

Btw, I tried asking the AI to fix those problems in its own code, but from that point forward it just kept drifting farther and farther from a working solution.

[–] sxan@midwest.social 1 points 53 minutes ago

That's the broken behavior I see. It's evidence of a missing understanding that's going to need another evolutionary bump to get over.

[–] Womble@piefed.world 5 points 8 hours ago* (last edited 8 hours ago) (1 children)

I find that very difficult to believe, if for no other reason than that there is an implementation on the wiki page for Levenshtein distance (and Wikipedia is known to be very prominent in the training sets used for foundation models), and that trying it just now gave a perfectly functional implementation.

[–] sxan@midwest.social 1 points 3 hours ago (1 children)

You find it difficult to believe LLMs can fuck up even simple tasks a first-year programmer can do?

Did you verify the results in what it gave you? If you're sure it's correct, you got better results than I did.

Now ask it to adjust the algorithm to support the "*" wildcard, ranking the results by best match. See if what it gives you is the output you'd expect to see.

Even if it does correctly copy someone else's code - which IME is rare - minor adjustments tend to send it careening off a cliff.
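The challenged adjustment could be sketched like this (one plausible reading, assuming "*" in the pattern should match any run of characters at zero cost, and that "ranking" means sorting candidates by the resulting distance - the function names are illustrative, not from the thread):

```python
def wildcard_distance(pattern: str, text: str) -> int:
    """Edit distance where '*' in the pattern matches any run of
    characters (including the empty run) at zero cost."""
    m, n = len(pattern), len(text)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(1, n + 1):
        dp[0][j] = j  # empty pattern: delete every text character
    for i in range(1, m + 1):
        star = pattern[i - 1] == "*"
        dp[i][0] = dp[i - 1][0] + (0 if star else 1)
        for j in range(1, n + 1):
            if star:
                dp[i][j] = min(dp[i - 1][j],   # '*' matches empty run
                               dp[i][j - 1])   # '*' absorbs text[j-1]
            else:
                cost = pattern[i - 1] != text[j - 1]
                dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                               dp[i][j - 1] + 1,        # insertion
                               dp[i - 1][j - 1] + cost) # substitution
    return dp[m][n]

def rank_matches(pattern: str, candidates: list[str]) -> list[str]:
    """Best match (lowest wildcard distance) first."""
    return sorted(candidates, key=lambda c: wildcard_distance(pattern, c))
```

So `wildcard_distance("f*n", "function")` is 0, while a candidate that doesn't fit the pattern ranks lower - exactly the kind of "small twist on a known algorithm" task being described.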

[–] Womble@piefed.world 1 points 27 minutes ago

Yes, I find it difficult to believe that they mess up a dozen-line algorithm that sits in a prominent place in their training set, with no complicating factors. Despite what a lot of people here think, LLMs do have value for coding, even if the companies selling them make ridiculous claims about what they can do.

[–] HaraldvonBlauzahn@feddit.org 3 points 9 hours ago (1 children)

I was surprised by that sentence, too.

But I see from my AI-using coworkers that there are different values in use for "it works".

[–] sxan@midwest.social 1 points 3 hours ago

Yeah, for me it's more than just "produces correct output." I don't expect to see 5 pages of sequential if-statements (which, ironically, is pretty close to an LLM's internal design), but also no unnecessary nested loops. "Correct" means producing the right results, but also not being O(n²) (or worse) when it's avoidable.

The thing that puts me off most, though, is how it usually expands code for clarified requirements in the worst possible way. Like, you start with simple specs and make consecutive clarifications, and the code gets worse. And if you ask it to refactor for cleanliness, it'll often make the code look better, but it'll no longer produce the correct output.

Several times I've asked it for code in a language where I don't know the libraries well, and it'll give me code using functions that don't exist. And when I point out they don't exist, I get an apology and sometimes a different function call that also doesn't exist.

It's really wack how people are using this in their jobs.

[–] astronaut_sloth@mander.xyz 1 points 7 hours ago (1 children)

Yeah, I've found AI-generated code to be hit or miss. It's been fine to good for boilerplate stuff that I'm too lazy to do myself but that is super-easy, CS 101-type stuff. Anything more specialized requires the LLM to be hand-held in the best case. More often than not, though, I just take the wheel and code the thing myself.

By the way, I think it's cool that you use Old English characters in your writing. In school I used to do the same in my notes to write faster and smaller.

[–] sxan@midwest.social 0 points 3 hours ago* (last edited 3 hours ago)

Thanks! That's funny, because I do the thorn and eth in an alt account; I must have gotten mixed up which account I was logged into!

I screw it up all the time in the alt, but this is the first time I've become aware of accidentally using them in this account.

We're not too far from AGI. I figure one more innovation, probably in 5-10 years, on the scale ChatGPT achieved over its Bayesian-filter predecessors, and computers will code better than people. At that point, they'll be able to improve themselves better and faster than people can, and human programming will be obsolete. I figure we have a few more years, though.
