this post was submitted on 28 Oct 2025
Programming
For reference, every AI image model uses ImageNet (as far as I know), which is just a big database of publicly accessible URLs and metadata (classification info like "bird").
The "big AI" companies like Meta, Google, and OpenAI/Microsoft have access to additional image data sets that are 100% proprietary. But what's interesting is that the image models built from just ImageNet (and other open sources) are better! They're superior in just about every way!
Compare what you get from, say, ChatGPT (DALL-E 3) with a FLUX model you can download from civit.ai... the FLUX results are so much better it's like night and day! Not only that, but you have a huge selection of LoRAs to choose from to get exactly the type of image you want.
What we're missing is the same sort of open data sets for LLMs. Universities have access to some stuff but even that is licensed.