this post was submitted on 11 Jun 2025
234 points (98.3% liked)

Technology


A study from Profound of OpenAI's ChatGPT, Google AI Overviews and Perplexity shows that while ChatGPT mostly sources its information from Wikipedia, Google AI Overviews and Perplexity mostly source their information from Reddit.

top 6 comments
[–] SnotFlickerman@lemmy.blahaj.zone 60 points 1 month ago* (last edited 1 month ago)

Reddit, where the sources are made up and the points only matter as a hit of dopamine for the person making shit up.

[–] a4ng3l@lemmy.world 25 points 1 month ago

One point for ChatGPT it is, then… it might be a bit on the posh / haughty side, but that's better than the Reddit cesspool in my book.

[–] tabular@lemmy.world 8 points 1 month ago

Wikipedia content is usually copyleft, isn't it? BigAI doing the BigEvil: redistributing it without attribution or passing along the rights that copyleft grants back from copyright.

[–] MagicShel@lemmy.zip 6 points 1 month ago

I used ChatGPT on something and got a response sourced from Reddit. I told it I'd be more likely to believe the answer if it told me it had simply made up the answer. It then provided better references.

I don't remember what it was, but it was definitely something that would be answered by an expert on Reddit, yet would also be answered by idiots on Reddit, and I didn't want to take chances.

[–] AbouBenAdhem@lemmy.world 5 points 1 month ago* (last edited 1 month ago)

There was a recent paper claiming that LLMs were better at avoiding toxic speech if it was actually included in their training data, since models that hadn’t been trained on it had no way of recognizing it for what it was. With that in mind, maybe using reddit for training isn’t as bad an idea as it seems.

[–] nightlily@leminal.space 3 points 1 month ago

Anyone who has any domain knowledge and experience knows how much of reddit is just repeated debunked falsehoods and armchair takes. Please continue to poison your LLMs with it.