LocalLLaMA
Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Let's explore cutting-edge open-source neural network technology together.
Get support from the community! Ask questions, share prompts, discuss benchmarks, and get hyped about the latest and greatest model releases! Enjoy talking about our awesome hobby.
As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive, constructive way.
Rules:
Rule 1 - No harassment or personal character attacks of community members. I.e. no name-calling, no generalizing about entire groups of people that make up our community, no baseless personal insults.
Rule 2 - No comparing artificial intelligence/machine learning models to cryptocurrency. I.e. no comparing the usefulness of models to that of NFTs, no claiming the resources required to train a model are anything close to those needed to maintain a blockchain or mine crypto, no implying it's just a fad/bubble that will leave people with nothing of value when it bursts.
Rule 3 - No comparing artificial intelligence/machine learning to simple text prediction algorithms. I.e. no statements such as "LLMs are basically just simple text prediction, like what your phone keyboard autocorrect uses, and they're still using the same algorithms as <over 10 years ago>."
Rule 4 - No implying that models are devoid of purpose or of the potential to enrich people's lives.
I think you're a bit too focused on narratives. I mean, how am I supposed to share my perspective without sharing my perspective? Of course that's going to include stories about bad things that happened to me. I've handled some privacy and personal-information issues for not-so-tech-savvy people. You should feel privileged if you haven't had a lot of bad or complicated things happen to you, but I can assure you there are ordinary people with different stories. I didn't handle death threats, but there were other legitimate reasons, ranging from simple job-related concerns to the bad and disgusting. And we can't just throw those people under the bus and say »yeah, your well-being just cuts into profit«...
This isn't copyright, so I'm going to move on. But this goes hand in hand with other regulations for datasets and online services.
Well, if I ask them about events and organizations I was part of, AI does seem to know details. And those were small and local things, no celebrities involved. The AI hallucinates a lot, however, and more than 80% of the names or details are currently made up. I bet AI is going to get better at this, though. It's definitely already able to connect some lesser-known names.
(There is some science here: https://arxiv.org/pdf/2506.17185 )
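(If anyone wants to reproduce this kind of spot check on their own machine, here's a minimal sketch. It's all assumptions on my part, not the paper's method: it presumes a llama.cpp-style OpenAI-compatible server on localhost:8080, the placeholder entity names are hypothetical, and the "hallucination rate" is just fabricated claims divided by total claims, tallied by hand.)

```python
# Hypothetical spot check: ask a local model about small entities you know
# well, then count the fabricated claims by hand.
# Assumes a llama.cpp-style OpenAI-compatible server on localhost:8080.
import requests

ENDPOINT = "http://localhost:8080/v1/chat/completions"  # assumed local server
ENTITIES = ["<small local club>", "<regional event>"]   # fill in things you can verify

def ask(entity: str) -> str:
    """Ask the model for concrete details about a lesser-known entity."""
    resp = requests.post(ENDPOINT, json={
        "model": "local",  # many local servers ignore the model name
        "messages": [{
            "role": "user",
            "content": f"List the names and concrete details you know about {entity}.",
        }],
        "temperature": 0.0,  # deterministic output makes tallying easier
    }, timeout=120)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# Check every claimed name/detail against what you know to be true:
# hallucination_rate = fabricated_claims / total_claims.
for entity in ENTITIES:
    print(f"--- {entity} ---\n{ask(entity)}\n")
```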
Might be the same here. Maybe the free market will arrive there after things settle down. You're right, the content industry is a shitty corner of the market. I'd like to mention Spotify as a precedent: they are able to license pretty much all important music, despite paying next to nothing to artists. Or my university library, which was able to stock pretty much all important books for its students. This might be achievable in some way for AI, too. Other businesses seem to be able to obtain special licenses for use cases other than being a regular customer.
No one promised it has to be easy. Other products also cost extra because we have minimum requirements, for food safety, cars, fairground rides... I wouldn't want to do away with that, and I think it generally leads to something beneficial. We just need to strike a balance. Every now and then a rollercoaster crashes and people die. Nothing is perfect. We collectively decide what rate of rollercoaster crashes we deem acceptable, and then the experts write regulations to achieve that.
Pretty much what I'm arguing for here: discard the idea that causes it. It's obviously not working, likely because it's too simplistic.
As I said before, it has already happened. Three years ago, I, like any independent researcher, was able to use the Reddit API and YouTube. Now we're not. And the monopolists struck deals amongst themselves. Ever wondered why so many more paywalls have popped up at news outlets lately? Cloudflare and Anubis checks before a page loads? You get locked out of Codeberg for 24+ hours and can't update your server? Your alt account gets deactivated for "suspicious activity"? Those are all indications that something has happened behind the scenes. And it achieves the desired effect: more and more information is now under tighter control. For the AI companies and for everyone else.
And all of this happened to me, along with needing to take similar measures myself, since the scrapers also showed up at my front door. The rate at which this happens correlates perfectly with the rise of AI scraping. And from personal experience and from talking to other admins, I know bots and scraping are the cause.
What slogan? And what hobby project? ChatGPT certainly isn't a hobby project. That thing costs some three-digit millions of dollars per iteration. And they're not just taking a few screws, either. They're the employee who takes one screw out of every other packet, and with that throughput they run a nice side business in screws.
I was trying to make a point here: take away copyright, since neither of us likes it... Now what remains? I think the labour of the author.
And since we keep discussing feudalism and monopoly... Am I right that this describes the AI industry, or did I miss something? In my eyes, we currently have Google (a monopolist), Microsoft (another monopolist) plus the other 51% of OpenAI, which seems very well off; we have Apple (I think also a monopolist, and also among the top 10 richest companies). Nvidia does AI, has been catapulted to the top market cap by AI, and has monopoly-like margins. Then we have Meta and Elon Musk's companies in the business, also valued at a trillion dollars. Then we have "startups" funded by public money from the Chinese government. Anthropic (interestingly enough, now sued by Reddit for scraping their data), ElevenLabs, and in Europe: Mistral, Stability.ai and Black Forest Labs. (And a few other players like Stanford and other universities, smaller companies/startups, and quite an active fine-tuning community.)
That's pretty much what I read about. Many of them are simply the richest companies on planet Earth. Several of them are monopolists. Some happen to own the big platforms that make up the internet. So if we now say AI training inputs need to be cheaper, whether that's right or wrong... you know who 90% of that benefit goes to? ...Them.
And that's not wrong in itself. They have a legitimate business, and it's not wrong to make money selling GPUs or AI. It's just that you can't say you're against feudalism and monopolies, then devise a rule whose list of main beneficiaries is dominated by the monopolies and feudalism from before. The desired outcome exists, but only among the also-rans.
That's just being against monopolies where it suits you while being completely oblivious to them in other areas. By and large, probably enabling them.
Now, the content industry is bad as well. And we find Disney, Warner Bros and Netflix on the Fortune 500 list; the publishing houses don't even seem to make it. And now you want to redistribute resources, with the main chunk moving up the chain to the select top. Most of them have several rulings against them for having (for example) devised ecosystems to arrive at a monopoly and then abused the power that comes with it. You didn't level the playing field; we can tell from the last few years and from how AI law in the USA turned out that you mainly helped the big companies and monopolists. And if we look at the financial figures, they've mostly been posting record profits since Covid, while that's not the case for the average economy. So who do we seem to funnel value towards in practice? And why do these companies, by and large, happen to be identical to the internet feudalism from before gen-AI?
Well, I'm open to ideas other than mine. I mean, you propose a clear solution here: Fair Use. Now I would have expected you to have analysed the situation and to have some solution for how that content is supposed to get there. It's not created out of thin air, after all. And the other side of the coin has to be factored in as well once we're talking about introducing laws.
I don't think the entire content industry is a healthy model. (Edit: I'm not so focused on the middlemen and the resulting content owners; that business model is indeed shady. But content still has to be created.) And the average individuals working there aren't well off. Nor does it seem like we're on a path where this is going to improve in the future. So there aren't any "extra copies" to spare when it gets to these people; that's mainly a thing for copyright owners. The creators don't necessarily have gifts to hand out.
In some cases we already know AI directly takes business away. Freelancers like illustrators, maybe musicians... without an industry and other entities in between, they're the first to get fed upon, and the same technology directly takes away their business opportunities. It's the combination of the two that makes it bad. So what about content in the early 21st century and in the upcoming age of AI? Is it as easy as leaving everything as is and slapping Fair Use on top? Does that solve a single issue? Or is it just supposed to make business cheaper for some AI companies, with a random effect on everyone else? Do they contribute something of value, and how does that compare to the negative side effects and to the main thing they do, which is accumulate wealth for themselves? Does Fair Use even work, or have companies already turned it into the opposite in practice by (ab)using their power?