There was actually just a big ruling on a case involving this; here’s an article about it. In short: a judge granted summary judgment establishing that training an AI does not require a license or any other permission from the copyright holder, that training an AI is not a copyright violation, and that copyright holders have no rights over the resulting model.
I’m assuming this case is why we have this news about Anthropic scanning books coming out right now too.
That’s disappointing to say the least. I’m sure there will be a few more lawsuits as big publishers like Disney try to get their share of the pie.
Funny, for me it was quite heartening. If it had gone the other way it could have been disastrous for freedom of information and culture and learning in general. This decision prevents big publishers like Disney from claiming shares of the pie - their published works are free for anyone with access to them to train on, they don’t need special permission or to pay special licensing fees.
As a photographer and the spouse of a writer, I see them making massive profits off of a product that wouldn’t exist if they hadn’t trained it on our work. By the very way the technology works, there’s a little bit of our work scattered in everything they do. If I included a sample of a piece of music in a song I recorded, or included a copyrighted painting in the background of a movie I was making, I would have to get a license. Why is this any different?
They should have done something more like the compulsory license that exists in music:
The composer of a song cannot prevent a new artist from recording a cover of their music once it has been previously released. The original composer is legally forced to grant them a license (hence “compulsory license”), but that license comes at a pre-set statutory rate. The new artist is free to try to negotiate a lower rate if the composer agrees, but the original composer can’t stop the new artist from recording a cover, and the new artist has to pay them for it.
Unfettered access is granted and the composer gets their share. Win-win.
The judgment in the article I linked goes into detail, but essentially you’re asking for the law to let you control something that has never been yours to control before.
If an AI generates something that does indeed provably contain a sample of a piece of music in a song you recorded, then yes, that output may be something you can challenge as a copyright violation. But if the AI’s output doesn’t contain an identifiable sample, then no, it’s not yours. That’s how copyright works: it’s about the actual tangible expression.
It’s not about the analysis of copyrighted works, which is what AI training is doing. That’s never been something that copyright holders have any say over.