General_Effort

joined 2 years ago
[–] General_Effort@lemmy.world 1 points 2 hours ago

I'm changing the order some, because I want to get this off my chest first of all.

Ultimately, I’m not set on any ideology here. I’m regularly more concerned with making things work. And that’s my goal here, too.

That's not what I'm seeing. Here's what I'm seeing:

I wasn’t concerned with copyright here. Let’s say I’m politically active and someone leaks my address and now people start showing up, throwing eggs at my front door and threatening to kill me. Or someone spreads lies about me and that gets ingested. Or I’m a regular person and someone posted revenge porn of me. Or I’m a victim of a crime and that’s always the first thing that shows up when someone puts in my name and it’s ruining my life. That needs to be addressed/removed. Free of charge. And that has nothing to do with licensing fees for content or celebrities. When companies use data, they need to have a complaints department and that will immediately check whether the complaint is valid and then act accordingly. There needs to be a distinction between harmful content and copyright violations.

First, you start out with a little story. Remember my post about narratives?

You emphasize what "needs" to be achieved. You try to engage the reader's emotions. What's completely missing is any concern with how or if your proposed solution works.

There are reputation management companies that will scrub or suppress information for a fee. People who are professionally famous may also spend much time and effort to manipulate the available information about them. Ordinary people usually do not have the necessary legal or technical knowledge to do this. They may be unwilling to spend the time or money. Well, one could say that this is alright. Ordinary people do not rely on their reputation in the same way as celebrities, business people, and so on.

The fact is that your proposal gives famous and wealthy elites the power to suppress information they do not like. Ordinary people are on their own, limited by their capabilities (think about the illiterate, the elderly, and so on).

AIs generally do not leak their training data. Only fairly well-known people feature enough in the training data for an LLM to be able to answer questions about them. Having to make the data searchable on the net makes it much more likely that it is leaked with harmful consequences. On balance, I believe your proposal makes things worse for the average person while benefiting only certain elites.

It would have been straightforward to say that you wish to hold AI companies accountable for damage caused by their service. That's the case anyway; no additional laws needed. Yet, you make the deliberate choice to put the responsibility on individuals. Why is your first instinct to go this round-about route?


Selling/Buying something is a very common form of contract. In our economy, the parties themselves decide what’s in the contract. I can buy apples, cauliflower or wood screws per piece or per kilogram. That’s down to my individual contract between me and the supermarket (or hardware store) and nothing the government is involved in. It’s similar with licensing, that’s always arbitrary and a matter of negotiation.

But market prices aren't usually arbitrary. People negotiate but they usually come to predictable agreements. Whatever our ultimate goals are, we have rather similar ideas about "a good deal".

I’d do it like with shipments in the industry. If you receive a truck load of nuts and bolts, you take 50 of them out and check them before accepting the shipment and integrating the lot into your products.

All very reasonable ideas. Eventually, the question is what the effect on the economy is, at least as far as I'm concerned.

These tests mean that more labor and effort are necessary. Mistakes are costly. These costs fall on the consumer. The big picture view is that, on average, either people have less free time because more work is demanded, or they make do with less because the work does not produce anything immediately beneficial. So the question is whether this work does lead to something beneficial after all, in some indirect way. What do you think?

So I believe we first need to address the blatant piracy before talking about hypothetical scenarios.

No. That is the immediate hands-on issue. As you know, the web is full of unauthorized content.

All the while the internet gets more locked down, enshittified… And everyone who isn’t the big content industry or already a monopolist, loses.

Well? What's your pitch?

See my text above. Even if it was a nice idea, it leads to the opposite in the real world. A few big internet companies “win” in this war with technology, disregarding the idea behind the law, and everyone except them loses. Cementing monopolies, not helping with them.

That is not happening, though?

And Fair Use now says the labour of the small guy is free of charge for the big company.

You compare intellectual property to physical property. Except here, where it becomes "labor". I don't think you would point at a factory and say that it is the owner's labor. If some worker took some screws home for a hobby project, I don't think you would accuse him of stealing labor. Does it bother you how easily you regurgitate these slogans?

I mean what’s your idea here? I can’t really tell. Let’s say we’re not set on copyright. How do $90,000 arrive at a book author each year so it’s a viable job and they can create something full time? And I’d like a fair solution for society.

Good question. That's an economics question. It requires a bit of an analytical approach. Perhaps we should start by considering if your idea works. You are saying that AI companies should have to buy a copy before being allowed to train on the content. So: How many extra copies will an author sell? What would that mean for their income?

We should probably also extend the question beyond just authors. Publishers get a cut for each copy sold. How many extra copies will a publisher sell and what does that mean for their income?

Actually, the money will go to the copyright owner; often not the same person as the creator. In that way, it is like physical property. Ordinary workers don't own what they produce. A single daily newspaper contains more words than many books. The rights are typically owned by the newspaper corporation and not the author. What does that mean for their income?

Ok... Here's something you should know.

What happened there was suppressing personal data from Google's search engine. In the EU, that is regarded as a fundamental human right. The "right to be forgotten" is exactly about hiding a shady past. The GDPR gives you the right to demand that Google must omit certain links when people search for your name. Google does comply. You don't need a court order or anything.

So, you can't celebrate the GDPR while also condemning what happened here.

It was something like mashed pumpkin. I forget the exact variety.

I was at a friend's place for dinner. He gives me a bit of that pumpkin stuff, saying I have to taste it because it turned out so great. It was leftovers from the previous day. I take a spoonful and it tastes absolutely rotten. Well, ok. He is trying his best to be an amateur chef, but I do have doubts about some of his culinary judgments. So, I put on the polite face and just eat it.

After a few spoonfuls, I can't take it anymore. I say: "Sorry, this tastes absolutely rotten." He tastes it, nods and hurries out of the room to throw it away. So yeah. I ate spoiled food. I didn't get sick but I haven't eaten pumpkin since. The taste really stayed with me.

They can request data when the company operates in the US. Any company that doesn't want to stay out of the American market is subject to this.

Could well be. Namely §14 UrhG (the German Copyright Act):

The author has the right to prohibit any distortion or other impairment of their work that is liable to jeopardize their legitimate intellectual or personal interests in the work.

The author of the advertisement could well have a legitimate interest in it not being pasted over.

In practice, this is relevant in construction. A building with a reasonable amount of design also counts as a work. That means the architect may be able to prohibit alterations. In any case, one should hire the same architecture firm for the alterations and pay it generously, to avoid legal disputes. Or, best of all, commission only nondescript buildings from the start, because then this section does not apply.

[–] General_Effort@lemmy.world 2 points 2 days ago (2 children)

Alas, we have reached the max comment depth. I cannot reply to your latest comment.

Well, there is a distinction between use and obtaining it. For stealing, the use doesn’t matter. For later use, it does. That’s also what licenses are concerned with.

I see what you mean now. It's tricky. It's just another way in which copyright talking points cause problems.

You're saying that using/copying something you have in a database for AI training should always be legal. However, copying something to add it to the database should be judged as if it was done for enjoyment. EG everyone who torrents a movie should be treated the same, regardless of purpose. This will certainly cause problems for some scientific datasets.

Whether you downloaded a legal copy depends on whether the party offering the download had the right to do so. Whether that is the case may not be apparent. The first question is: What duty does someone have to check the provenance of content or data?

Torrents of current movies and the like are very obviously not authorized. For older movies, that becomes less clear. The web contains much unauthorized content. For example, the news stories that people copy/paste on Lemmy. What duty is there to determine the copyright status of the content before using such data?

When researchers and developers share datasets, what duty do they have to check how the contents were obtained by whoever assembled it?

What happens when something was wrongly included in a dataset? Is that a problem only for the original curator, or also for everyone who got a copy?

What about streams, live TV, radio, and such things? Are you allowed to record those for training or not?

While Fair Use is a broad limitation/exemption, it’s still concerned with specific exemptions.

That's not quite right. Ultimately, Fair Use derives from the US Constitution; from the copyright clause but also freedom of speech. Copyright law spells out 4 factors that must be taken into account. But courts may also consider other factors. There is also no set way in which these factors have to be weighed. It's very open.

Well, it is. In the United States, willful copyright infringement

There are minimum conditions before prosecution is possible. I think uploading can always be prosecuted.

No, copyright should be toned down. Preferably for regular citizens as well and not just the industry.

Well, over the last few decades it has only been going in the other direction.

How does this fit together with calling copyright infringement theft?

Let me make a suggestion. This is your real opinion. This is what you believe based on what you see. The rest is just slogans by the copyright industry, which you repeat without thinking. The problem is that you are basically shouting yourself down; your own opinion. The media, a big part of the copyright industry, puts these slogans out. Their lobbyists demand favors and harsher laws from politicians. And when the politicians look at what voters think, they hear these slogans. That's one thing I mean when I say the copyright industry defrauds us.

Airbus pays like 100x the price for the same set of nuts and bolts than someone else. A kitchen appliance for industrial use costs like 3x the price of an end user kitchen appliance. Because it’s more sturdy and made for 24/7 use.

Exactly, they don't pay more for the same thing. It's almost exclusive to the copyright industry.

People do have to pay more if they license a picture to show to their 20 million customers or use it in an advertising campaign, than I do for putting it up in the hallway.

Actually, even in the copyright industry, such terms are far from universal. Of course, you will have to pay more for the right to make copies than for a single copy. And even more for the exclusive copyright. Those things are different. However, it's usually a flat fee. Can you figure out what economic reasons might exist for a creator being paid per copy or per viewer?

No exceptions, no licensing, no fees. This is strictly to avoid bad things like doxxing, ruining people’s lives…

"No exceptions" means, for example, that an LLM would not be able to answer questions about politicians, actors, musicians, maybe not even about historical figures.

You said that there should be a way that you can remove your personal data from the training set. That implies that an AI company can offer money in exchange for people not removing their data. That's basically a licensing fee, however it is framed.

On second thought, I believe many celebrities, business people, politicians, ... will gladly offer more training data that makes them look good. They'd only remove data that makes them look bad. Sort of like how the GDPR works. Far from demanding a licensing fee, they'd pay money to be known by the AI.

I’ve told you how my server was targeted by Alibaba and it nearly took down the database. [...] But I’m prevented from exercising my rights.

I agree that the situation is far from ideal. But let me point out that you do not have a right to other people's computer services. That's the issue with Alibaba hitting your server, right? It's a difficult issue. Mind that an opt-out from AI training does not actually address this.

This application of Fair Use is in favour of the feudal lord companies and to the detriment of the average person.

How so?

[–] General_Effort@lemmy.world 1 points 2 days ago (1 children)

Maybe buying alcohol works differently where you live.

[–] General_Effort@lemmy.world 1 points 2 days ago (3 children)

In both your examples the government service has your full identity, then pinky promises to forget it.

It can be like buying alcohol in a store. They look at you and see your age. Or if it's unclear, the store clerk asks for your ID and promptly forgets all about it. Except you're not buying alcohol but a login for some age verifier.

[–] General_Effort@lemmy.world 1 points 3 days ago (5 children)

The reverse is also a necessity: the government approved service should not be allowed to know who and for what a proof of age is requested.

It would send the proof to you. It would not know what you do with it. I gave an example in the previous post how the identity of the user could be hidden from the service.

If the middle man government service knows when and who is requesting proof-of-age, it’s easy to de-anonymise for example users of gay porn sites.

It would be a lot easier to get that information from the ISP.

[–] General_Effort@lemmy.world 4 points 3 days ago

Another one of those? Put it with the others.

[–] General_Effort@lemmy.world 2 points 3 days ago (1 children)

That's a good start.

What I think doesn’t work is saying every normal citizen needs to buy books and Zuckerberg gets to pirate books. In a democracy law has to apply to everyone. And his use-case doesn’t matter here. I can also claim I pirated the 10TB of TV shows and movies for transformative or legitimate use. It’s still piracy.

The laws do apply to everyone equally, though few people are able to litigate for years against the copyright industry.

Your concern is obviously the use case. If the use case doesn't matter, then quotes and parody are illegal, as well as historical archiving and scientific analysis.

I guess you just want AI training to not be fair use. That raises the question of how this should work.

Maybe you think that different standards should be applied to Zuckerberg, after all. Your focus on him makes it seem a little like that.

Perhaps you simply have something more European in mind. Europe and in particular Germany do not have fair use. There is a short list of uses that do not require permission. That means that every time some new use becomes desirable, the law must be changed. This is obviously stifling for progress in science and culture. Think of HipHop with its use of samples. It's hard to imagine some artists successfully petitioning the government to legalize the practice before experimenting with it. You couldn't have developed a search engine that simply copies all web pages for indexing. Something like the Internet Archive, or the Wayback Machine, would be impossible. It would just be a few tech geeks against the copyright industry, including the media.

So, how should this be done?

And other law works the same way. If I steal chocolate in the supermarket, that’s also theft no matter what I was planning to do with it. So that’s out.

Actually, no. Theft is prosecuted by the government; police and courts. Copyright infringement is generally a civil matter. Damages are paid but there is no criminal prosecution.

The government only cares for large-scale, industrial infringement, like EG operating a Netflix-like streaming service. Small scale infringement is not even criminal in the US. I believe, even in Europe, people who torrent movies or such are rarely criminally prosecuted.

Maybe you would like to see copyright infringement punished more harshly and enforced more strictly?

A billion dollar company with a service used by millions of people should pay more than a single researcher doing it for 5 people.

That's an interesting idea. It's not how we do anything else. You don't usually have to pay more for the same thing, depending on who you are or how much you use it. I expect it would be quite devastating if that were the rule.

Should this policy idea apply only to copyright or generally? If only copyright, why?

And if they scraped my personal data, I need a way to get that deleted from the dataset.

Should there be exceptions for celebrities and such, or will they be able to demand licensing fees?

I’d also add an optional opt-out mechanism to appease to the people who hate AI. They can add some machine-readable notice, or file a complaint and their content will be discarded.

Then much public content can't be used, after all. The likes of Reddit, Facebook, or Discord will be able to charge licensing fees for their content. It's very typically European. You rage against Meta's monopoly but you also call for laws to enforce and strengthen it. I think it's the echo of feudalism in the culture.

[–] General_Effort@lemmy.world 1 points 3 days ago (7 children)

The site would only know that the user's age is being vouched for by some government-approved service. It would not be able to use this to track the user across different devices/IPs, and so on.

The service would only know that the user is requesting that their age be vouched for. It would not know for what. Of course, they would have to know your age somehow. EG they could be selling access in shops, like alcohol is sold in shops. The shop checks the ID. The service then only knows that you have login credentials bought in some shop. Presumably these credentials would not remain valid for long.

They could use any other scheme, as well. Maybe you do have to upload an ID, but they have to delete it immediately afterward. And because the service has to be in the EU, government-certified with regular inspections, that's safe enough.

In any case, the user would have to have access to some sort of account on the service. Activity related to that account would be tracked.


If that is not good enough, then your worries are not about data protection. My worries are not. I reject this for different reasons.
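
To make the flow above a bit more concrete, here is a minimal sketch of the idea, not a real scheme. Everything in it is an assumption for illustration: the names `issue_token` and `verify_token`, the 5-minute lifetime, and the use of an HMAC. With an HMAC, only the holder of the service's key can check a token, so a real deployment would need public-key (or blind) signatures so that the site can verify on its own. The point is only that the token carries a claim and an expiry, and nothing about who the user is or which site it is for.

```python
import hashlib
import hmac
import json
import secrets
import time

# Hypothetical signing key held by the age-verification service.
SERVICE_KEY = secrets.token_bytes(32)
# Assumed lifetime; the comment above only says credentials would not stay valid for long.
TOKEN_LIFETIME_SECONDS = 300


def issue_token(now=None):
    """Service side: vouch for 'over 18' without naming the user or the target site."""
    now = time.time() if now is None else now
    claims = {"over_18": True, "expires": int(now) + TOKEN_LIFETIME_SECONDS}
    payload = json.dumps(claims, sort_keys=True).encode()
    tag = hmac.new(SERVICE_KEY, payload, hashlib.sha256).hexdigest()
    return payload.hex() + "." + tag


def verify_token(token, now=None):
    """Check the signature and expiry. Stand-in for what a site would do with a public key."""
    now = time.time() if now is None else now
    try:
        payload_hex, tag = token.split(".")
        payload = bytes.fromhex(payload_hex)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(SERVICE_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return False  # tampered with, or not issued by this service
    claims = json.loads(payload)
    return bool(claims.get("over_18")) and claims["expires"] > now
```

The user would request a token from the service and hand it to the site themselves, so the service never sees where it ends up. This sketch does not try to stop people from passing tokens around, and it does not by itself prevent the service and a site from comparing notes.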

 

The most recent South Park episode, featuring a naked Donald Trump, may have violated the law.

 

With Tom Lehrer's passing, I suppose this is a moment to share the story of the prank he played on the National Security Agency, and how it went undiscovered for nearly 60 years.

https://bsky.app/profile/opalescentopal.bsky.social/post/3luxxx2a2f623

 

You fucked with squirrels, Morty!

 

An in depth look at a very narrow and specific set of norms, the consequences of which are rarely considered. I love stuff like this.

 

Somewhere in a government building in the UK: We did it, Patrick...
