infosec.place

Conversation

Jocelynephiliac

Imagine if back in the 2000s all the file sharing companies like Napster and Audio Galaxy had issued a statement that copyright laws were going to kill their industry.

HazzBenn

Hazzbenn@mastodon.social

1 year ago

Reply to @twipped@twipped.social

@twipped you just don’t get it. This is “different”.

NiceMicro

nicemicro@fosstodon.org

1 year ago

Reply to @Hazzbenn@mastodon.social

@Hazzbenn @twipped
It is different though, because copyright law doesn't let the IP owner control EVERY aspect of how a work is used.

Napster and co. were infringing the most BASIC copyright protection: the copying and distribution of the protected material.

AI companies don't do that: they analyze works using maths, and make mathematical representations of the statistical regularities in such works.

David Chisnall (Now with 50% more sarcasm!)

david_chisnall@infosec.exchange

1 year ago

Reply to @nicemicro@fosstodon.org

@nicemicro @Hazzbenn @twipped

The things shared on Napster were MP3s. People took CDs, created a lossily compressed version, and distributed it.

The things Tech Bros are creating are neural networks, which are a form of lossy compression. They are taking vast amounts of copyrighted material, lossily compressing it, and distributing it.

Any self-consistent argument that allows this would also permit sharing MP3s.

NiceMicro

nicemicro@fosstodon.org

1 year ago

Reply to @nicemicro@fosstodon.org

@Hazzbenn @twipped
What AI companies complain about is copyright law being made stricter, or reinterpreted more strictly, so that this kind of mathematical analysis, done with the goal of making a tool that can produce "new" work based on it, would be prohibited.

Which it currently is not, under the most common understanding of copyright law (at least in the US, where most of these corpos are located).

buherator

1 year ago

Reply to @nicemicro@fosstodon.org

@nicemicro @Hazzbenn @twipped I think this is the "copyright can't prevent learning from a book" argument which I like to respond to with a joke:

Little Girl: Ice Cream Man, how much is an empty cone?
Ice Cream Man: Oh, you can get an empty cone for free!
Little Girl: Great, then I'd like 5000 of them!

In other words, scale (that can imply intent) matters.

NiceMicro

nicemicro@fosstodon.org

1 year ago

Reply to @buherator

@buherator @Hazzbenn @twipped And what is the intent? Because unless the intent is to reproduce the original work in full or in significant part, it is irrelevant from a copyright perspective.

I could literally go and buy a hundred books, cut out a sentence from each of them, and make a book out of those, and it would be fine. I could even make a machine that reshuffles those sentences and makes books out of them automatically. Doing it with a thousand million books is not much different.

NiceMicro

nicemicro@fosstodon.org

1 year ago

Reply to @nicemicro@fosstodon.org

@buherator @Hazzbenn @twipped Copyright does not protect every single component and every single aspect of creative works; it protects specific things. E.g. just because you wrote a novel that has 1323 "E" letters and 1123 "A" letters, that doesn't stop others from counting those letters and writing new works with the same number of "E" and "A" letters. Making a graph out of your book showing which word is followed by which other word, and in what percentage of cases, is also totally fine.
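
For illustration, the two kinds of statistics mentioned in this post (letter counts and a "which word follows which" table) could be sketched in Python roughly as follows. The function names and the sample sentence are made up for this example, and it is a toy, not a description of how any actual AI system is built.

```python
from collections import Counter, defaultdict


def letter_counts(text, letters="EA"):
    """Count how often each of the given letters occurs, ignoring case."""
    upper = text.upper()
    return {ch: upper.count(ch) for ch in letters}


def next_word_percentages(text):
    """For each word, map every word that follows it to a percentage."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    # Pair each word with its successor and tally the pairs.
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    table = {}
    for word, counter in follows.items():
        total = sum(counter.values())
        table[word] = {nxt: 100.0 * n / total for nxt, n in counter.items()}
    return table


# Hypothetical sample text, standing in for "your book".
sample = "the cat sat on the mat and the cat slept"
print(letter_counts(sample))
print(next_word_percentages(sample)["the"])
```

The result holds only counts and percentages; whether the original text can be recovered from such a table at a much larger scale is exactly what the thread is arguing about.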

buherator

1 year ago

Reply to @nicemicro@fosstodon.org

@nicemicro @Hazzbenn @twipped "I could literally go, and buy a hundred books," -> The keyword here is "buy".

To elaborate on intent: Little Girl likely won't/can't eat all the empty cones but wants to resell them (or give them away to 5000 buddies at the expense of Ice Cream Man).

As for your second reply, doing statistics _at this scale_ allows producing cheap replacements for the original works, which is the CD ripping/compression problem discussed above.

NiceMicro

nicemicro@fosstodon.org

1 year ago

Reply to @buherator

@buherator @Hazzbenn @twipped Sure, but by "buying" the book, you purchase the right to read it, nothing more, which is a right any publicly available website gives you, too. So, no, the keyword is not "buy". You gain access to read it, then you are allowed to do certain things with the information. Again, copyright law doesn't give you 100% total control over your work: it gives you control over producing copies of it.

NiceMicro

nicemicro@fosstodon.org

1 year ago

Reply to @nicemicro@fosstodon.org

@buherator @Hazzbenn @twipped The ice cream cone analogy doesn't work, because if you do it in a way that doesn't result in the ice cream man losing his ice cream cones (you check the cone, go home, make your own and sell that), that is again, not illegal in general, and the analogy breaks down because patents on things and copyright on creative works are different laws.

NiceMicro

nicemicro@fosstodon.org

1 year ago

Reply to @nicemicro@fosstodon.org

@buherator @Hazzbenn @twipped
And yes, an AI allows for the creation of cheap replacements, and that is also not illegal, see:
straight-to-dvd movies with a vaguely similar premise and title to Hollywood movies.
young adult fantasy novels based on ideas found in "Twilight" and "Hunger Games".

Copyright does not protect you from "someone making something similar but cheaper". If you want copyright to do that you need to change it, and AI companies resisting that change is not weird.

buherator

1 year ago

Reply to @nicemicro@fosstodon.org

@nicemicro @Hazzbenn @twipped

(I'll try to reply to all three of your replies at once; I hope it won't cause confusion.)

First, I don't think I ever argued about scraping public online content; the original CD ripping analogy is about non-free works, and "AI" companies do scrape copyrighted works (e.g. OSS with non-commercial license clauses).

Second, my little joke is only an example of how scale can change how you want to do business with the other party, independently of the goods or services being exchanged (I.C.M. probably won't give away even 10 cones at once, even though their cost would still be negligible). And yes, copyright probably has to change in order to account for the fact that in 2025 information can be collected and processed at unprecedented scale.

NiceMicro

nicemicro@fosstodon.org

1 year ago

Reply to @buherator

@buherator My main point is that even non-free works aren't protected against scraping. If you have a license to extract the information from a work (i.e. to read it), then you can extract whatever information you want, however you want (e.g. by doing statistical analysis using machine learning), at least based on my understanding of existing copyright law.

Regulating scale is a problem imo, because that easily becomes a slippery slope.

Thank you for the good-faith engagement.

buherator

1 year ago

Reply to @nicemicro@fosstodon.org

@nicemicro Yes I also have concerns about how restrictions could be implemented in practice.

Thank you, it's good to see that civilized arguments are still possible online!
