Imagine if, back in the 2000s, all the file-sharing companies like Napster and Audiogalaxy had issued a statement that copyright laws were going to kill their industry.
@Hazzbenn @twipped
It is different though, because copyright laws don't let the IP owner control EVERY aspect of the use of the works.
Napster and co. were infringing the most BASIC copyright protection: the copying and distributing of the protected material.
AI companies don't do that: they analyze works using maths, and make mathematical representations of the statistical regularities in such works.
The things shared on Napster were MP3s. People took CDs, created a lossily compressed version, and distributed it.
The things Tech Bros are creating are neural networks, which are a form of lossy compression. They are taking vast amounts of copyrighted material, lossily compressing it, and distributing it.
Any self-consistent argument that allows this would also permit sharing MP3s.
@Hazzbenn @twipped
What AI companies complain about is copyright law being made stricter, or being reinterpreted more strictly, so that this kind of mathematical analysis, done with the goal of making a tool that can produce "new" work based on it, would be prohibited.
Which it is not, under the most common understanding of copyright law (at least for the US, where most of these corpos are located).
@buherator @Hazzbenn @twipped And what is the intent? Because unless the intent is to reproduce the original work in full or in significant part, it is irrelevant from a copyright perspective.
I could literally go and buy a hundred books, cut out a sentence from each of them, and make a book out of those, and it would be fine. I could even make a machine that reshuffles those sentences and makes books out of them automatically. Doing it with a thousand million books is not much different.
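To make the "machine" concrete, here is a rough sketch of what I mean. The function names (`pick_sentences`, `reshuffled_book`) and the naive split on "." are my own assumptions for illustration, not any real tool:

```python
import random

def pick_sentences(books: list[str], rng: random.Random) -> list[str]:
    """Pick one sentence at random from each book (naive split on '.')."""
    picked = []
    for book in books:
        sentences = [s.strip() for s in book.split(".") if s.strip()]
        if sentences:
            picked.append(rng.choice(sentences) + ".")
    return picked

def reshuffled_book(books: list[str], seed: int = 0) -> str:
    """Build a 'new book' from one sentence per source book, in random order."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    picked = pick_sentences(books, rng)
    rng.shuffle(picked)
    return " ".join(picked)

# Hypothetical toy input standing in for the hundred books.
books = ["A one. A two.", "B one.", "C one. C two. C three."]
print(reshuffled_book(books, seed=1))
```

Scaling this up changes the size of `books`, not the logic.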
@buherator @Hazzbenn @twipped Copyright does not protect every single component and every single aspect of creative works; it protects specific things. E.g. just because you wrote a novel that has 1323 "E" letters and 1123 "A" letters, that doesn't stop others from counting those letters and writing new works with the same number of "E" and "A" letters. Making a graph out of your book showing which word is followed by which other word, and in what percentage of cases, is also totally fine.
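For anyone who wants to see what that kind of counting looks like, here is a minimal sketch. The helper names (`letter_counts`, `follower_percentages`) and the sample text are made up for illustration, and real systems are of course far more elaborate than this:

```python
from collections import Counter, defaultdict

def letter_counts(text: str) -> Counter:
    """Count how often each letter occurs, ignoring case."""
    return Counter(ch.upper() for ch in text if ch.isalpha())

def follower_percentages(text: str) -> dict[str, dict[str, float]]:
    """For each word, the percentage of times each other word follows it."""
    words = text.lower().split()
    follows: defaultdict[str, Counter] = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return {
        word: {nxt: 100.0 * n / sum(counter.values()) for nxt, n in counter.items()}
        for word, counter in follows.items()
    }

# Hypothetical sample text standing in for "your book".
sample = "the cat sat on the mat and the cat slept"
counts = letter_counts(sample)
graph = follower_percentages(sample)
print(counts["E"], counts["A"])
print(graph["the"])
```

The output holds only counts and percentages; with a text of any real length, the original cannot be read back out of it.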
@buherator @Hazzbenn @twipped Sure, but by "buying" the book, you purchase the right to read it, nothing more, which is a right given to you by any publicly available website, too. So, no, the keyword is not "buy". You gain access to read it, then you are allowed to do certain things with the information. Again, copyright laws don't give you 100% total control over your work: they give you control over producing copies of it.
@buherator @Hazzbenn @twipped The ice cream cone analogy doesn't work, because if you do it in a way that doesn't result in the ice cream man losing his ice cream cones (you check the cone, go home, make your own and sell that), that is again, not illegal in general, and the analogy breaks down because patents on things and copyright on creative works are different laws.
@buherator @Hazzbenn @twipped
And yes, an AI allows for the creation of cheap replacements, and that is also not illegal, see:
straight-to-dvd movies with a vaguely similar premise and title to Hollywood movies.
young adult fantasy novels based on ideas found in "Twilight" and "Hunger Games".
Copyright does not protect you from "someone making something similar but for cheaper". If you want copyright to do that, you need to change it, and AI companies resisting that change is not weird.
@buherator My main point is that even non-free works aren't protected against scraping. If you have a license to extract the information from it (i.e. reading), then you can extract whatever information you want, however you want (e.g. by doing statistical analysis using machine learning), at least based on my understanding of standing copyright law.
Regulating scale is a problem imo, because that easily becomes a slippery slope.
Thank you for the good-faith engagement.