Are neural nets (the weights file) copyrightable?

Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Are neural nets (the weights file) copyrightable?

Post by Dann Corbit »

gonzochess75 wrote: Fri Feb 26, 2021 10:44 pm
Dann Corbit wrote: Fri Feb 26, 2021 10:32 pm IF it is true that a program's data is also covered by the GPL, then the GPL will be abandoned instantly by each and every commercial user.
...
If your HL7 medical data is held in a MySQL database, can I demand it be given to me?
If not, why not?
...
Hey Dann, did you see my answer to your hypothetical? I wrote http://talkchess.com/forum3/viewtopic.p ... QL#p884839

What matters for the sake of the GPL is whether the data is a mere aggregate as defined by the GPL or if it is a combined work. In the case of MySQL and the data it carries the answer is clear: it is merely aggregated. In the case of a GPL'd chess engine and the neural net weights I believe the answer is also clear: it is a combined work.

To see this we can ask: what is the purpose of MySQL? Does the data it is carrying fundamentally change the operation of MySQL? Is it an extension of MySQL? Is it combined with MySQL to form a larger program? No.

How about for a GPL'd chess engine and the neural network weights file? Does the data in the weights file fundamentally change the operation of the chess engine? Is it an extension of the chess engine? Is it combined with the chess engine to form a larger program? Yes.

Here is the language from the GPL that talks about this:
"...which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program..."

The weights file - if it is copyrightable - does not meet this definition of a mere aggregate. Rather, it meets the definition of a combined work.
No.
First, the weights are not necessary to make SF run, only to make it run in NNUE mode.
Second, you can make your own weights, just as Albert did.
There are sites all over where people are doing it right now for SF.
You have the SF code as modified by ChessBase if you like, and you can make your own set of weights.

And even if the data were necessary for the program to run at all, it would still be data.
If you have the program's code, you can figure out how to make your own data, even if it is necessary and even if it is not provided.
Data is not code.
Data is data.
The data is not covered by the GPL, the code is covered.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
gonzochess75
Posts: 208
Joined: Mon Dec 10, 2018 3:29 pm
Full name: Adam Treat

Re: Are neural nets (the weights file) copyrightable?

Post by gonzochess75 »

syzygy wrote: Fri Feb 26, 2021 10:50 pm
gonzochess75 wrote: Fri Feb 26, 2021 10:35 pm Here is another one for you. Specifically with Linux and a proprietary module:

https://wiki.fsfe.org/Migrated/GPL%20En ... nt%20Cases
What was at issue: Since the terms of the negotiations are undisclosed, the facts are not clear. What follows is a recollection of data from second-hand sources (listed below) and should be treated as such. The WRT54G is a home router manufactured by Linksys; it included a Broadcom chip that shipped with some embedded firmware; development of such firmware was outsourced by Broadcom to some independent contractor. Apparently, the contractor used some code released under the GPL licence without informing Broadcom. Additions to the Linux kernel were made, in the form of statically linked modules. This aspect is relevant because there is doubt if non-essential kernel modules constitute a derivative work (thus allowing for binary modules subject to licences different from GPL). That was not the case, since the modules developed by Linksys/Broadcom/the subcontractor (author is not really clear) were necessary for the working of the product and thus hardlinked to the kernel itself. Unaware of these issues, Cisco bought Linksys and started selling the WRT54G router without releasing the firmware source code or providing the GPL text. Someone noticed the licence infringement and reported to the FSF which, in turn, asked Cisco to respect the terms of the GPL.
So that's about static linking.
Static linking vs dynamic linking is irrelevant in terms of the GPL. It only becomes relevant when talking about the LGPL.

Here is Eben Moglen himself:

https://lwn.net/Articles/147070/
LWN: A while back, you said something about getting an answer from Linus on the Linux kernel license. Since there is a COPYING file that makes it clear that the kernel is governed under the GPL, where's the uncertainty?

If the kernel is pure GPL, then I think we would all agree that non-GPL, non-free loadable kernel modules represent GPL violations. Nonetheless, we all know that there are a large number of such modules and their existence is tolerated or even to some degree encouraged by the kernel maintainers, and I take that to mean that as an indication that there is some exception for those modules.

...

LWN: So, if the kernel is covered solely by the GPL, you would see proprietary modules as an infringement?

Yes. I think we would all accept that. I think that the degree of interpenetration between kernel modules and the remainder of the kernel is very great, I think it's clear that a kernel with some modules loaded is a "a work" and because any module that is dynamically loaded could be statically linked into the kernel, and because I'm sure that the mere method of linkage is not what determines what violates the GPL, I think it would be very clear analytically that non-GPL loadable kernel modules would violate the license if it's pure GPL.
So the intent of the author is pretty clear.
syzygy
Posts: 5774
Joined: Tue Feb 28, 2012 11:56 pm

Re: Are neural nets (the weights file) copyrightable?

Post by syzygy »

gonzochess75 wrote: Fri Feb 26, 2021 10:32 pm
syzygy wrote: Fri Feb 26, 2021 10:17 pm
1) There are numerous court cases where the GPL has been upheld and in particular with regard to linking to shared libraries
Let's find one.
https://en.wikipedia.org/wiki/BusyBox#GPL_lawsuits
http://www.softwarefreedom.org/news/200 ... plaint.pdf

Note the complaint specifically references the section (although in version 2 location) of the GPL we are discussing. Note also that a judge gave a default judgement in one of the suite of cases that arose out of the same/similar violations: http://www.groklaw.net/article.php?stor ... 3132055210
Nothing shocking there. If you distribute a derived work, you have to obey the GPL, obviously.
gonzochess75
Posts: 208
Joined: Mon Dec 10, 2018 3:29 pm
Full name: Adam Treat

Re: Are neural nets (the weights file) copyrightable?

Post by gonzochess75 »

Ok, I think my views are understood and it appears Dann and Syzygy at least don't find them convincing. That's fine.
syzygy
Posts: 5774
Joined: Tue Feb 28, 2012 11:56 pm

Re: Are neural nets (the weights file) copyrightable?

Post by syzygy »

gonzochess75 wrote: Fri Feb 26, 2021 11:00 pm
syzygy wrote: Fri Feb 26, 2021 10:50 pm
gonzochess75 wrote: Fri Feb 26, 2021 10:35 pm Here is another one for you. Specifically with Linux and a proprietary module:

https://wiki.fsfe.org/Migrated/GPL%20En ... nt%20Cases
What was at issue: Since the terms of the negotiations are undisclosed, the facts are not clear. What follows is a recollection of data from second-hand sources (listed below) and should be treated as such. The WRT54G is a home router manufactured by Linksys; it included a Broadcom chip that shipped with some embedded firmware; development of such firmware was outsourced by Broadcom to some independent contractor. Apparently, the contractor used some code released under the GPL licence without informing Broadcom. Additions to the Linux kernel were made, in the form of statically linked modules. This aspect is relevant because there is doubt if non-essential kernel modules constitute a derivative work (thus allowing for binary modules subject to licences different from GPL). That was not the case, since the modules developed by Linksys/Broadcom/the subcontractor (author is not really clear) were necessary for the working of the product and thus hardlinked to the kernel itself. Unaware of these issues, Cisco bought Linksys and started selling the WRT54G router without releasing the firmware source code or providing the GPL text. Someone noticed the licence infringement and reported to the FSF which, in turn, asked Cisco to respect the terms of the GPL.
So that's about static linking.
Static linking vs dynamic linking is irrelevant in terms of the GPL.
It is absolutely relevant as I have already explained. A statically linked executable includes library code, so if the library code is GPL, the executable has to be distributed under the GPL. A dynamically linked executable does not include library code.

Distributing a book that includes a large quotation from another book infringes the copyright on that other book.
Distributing a book that includes a reference to another book does not infringe the copyright on that other book.
Big difference.
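
To make the mechanical side of that distinction concrete, here is a minimal sketch in Python. It is only an analogy: strings of source text stand in for compiled object code, and the names `evalib`, `evaluate`, `build_static` and `build_dynamic` are hypothetical, invented for illustration. It shows what ends up inside the distributed artifact in each case and says nothing about how a court would treat either one.

```python
import sys
import types

# Hypothetical GPL'd "library"; its text stands in for compiled library code.
LIBRARY_SOURCE = "def evaluate(x):\n    return 2 * x + 1\n"

# The application's own code, which calls into the library.
APP_SOURCE = "result = evaluate(20)\n"


def build_static(app: str, library: str) -> str:
    """'Static link': copy the library text into the shipped artifact."""
    return library + app


def build_dynamic(app: str, module_name: str) -> str:
    """'Dynamic link': ship only a by-name reference, resolved at run time."""
    return f"from {module_name} import evaluate\n" + app


static_artifact = build_static(APP_SOURCE, LIBRARY_SOURCE)
dynamic_artifact = build_dynamic(APP_SOURCE, "evalib")

# What is actually inside each distributed artifact?
static_contains_library = LIBRARY_SOURCE in static_artifact
dynamic_contains_library = LIBRARY_SOURCE in dynamic_artifact

# The static artifact runs on its own.
static_ns = {}
exec(static_artifact, static_ns)

# The dynamic artifact only runs if the library is already installed
# on the user's machine (simulated here by registering a module).
installed = types.ModuleType("evalib")
exec(LIBRARY_SOURCE, installed.__dict__)
sys.modules["evalib"] = installed
dynamic_ns = {}
exec(dynamic_artifact, dynamic_ns)
```

Both artifacts compute the same result, but only the static one carries a copy of the library text, which is the "quotation versus reference" point in the book example.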

The FSF's FAQ contains many untruths.
gonzochess75
Posts: 208
Joined: Mon Dec 10, 2018 3:29 pm
Full name: Adam Treat

Re: Are neural nets (the weights file) copyrightable?

Post by gonzochess75 »

syzygy wrote: Fri Feb 26, 2021 11:04 pm Distributing a book that includes a large quotation from another book infringes the copyright on that other book.
Distributing a book that includes a reference to another book does not infringe the copyright on that other book.
Big difference.
A few last questions. Let's call the original book FOO and the other book BAR. If the copyright holder for FOO grants you a license for FOO stating that, if BAR contains a reference to FOO, you must license BAR in a compatible way... and you don't do that... do you still get to make copies of FOO and distribute FOO?

Can I make demands on how you license a non-derivative work in exchange for granting you a portion of my exclusive rights that you otherwise would not have?

Can Disney tell ViacomCBS that they can redistribute Star Wars all they want in exchange for ViacomCBS granting Disney the right to redistribute Star Trek?

Can that deal be made?

If ViacomCBS refuses and says that's unfair since Star Trek is not a derivative work of Star Wars, does that mean that ViacomCBS somehow automatically gets the right to distribute Star Wars?
towforce
Posts: 12536
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Re: Are neural nets (the weights file) copyrightable?

Post by towforce »

gonzochess75 wrote: Fri Feb 26, 2021 10:44 pm Hey Dann, did you see my answer to your hypothetical? I wrote http://talkchess.com/forum3/viewtopic.p ... QL#p884839

What matters for the sake of the GPL is whether the data is a mere aggregate as defined by the GPL or if it is a combined work. In the case of MySQL and the data it carries the answer is clear: it is merely aggregated. In the case of a GPL'd chess engine and the neural net weights I believe the answer is also clear: it is a combined work.

To see this we can ask: what is the purpose of MySQL? Does the data it is carrying fundamentally change the operation of MySQL? Is it an extension of MySQL? Is it combined with MySQL to form a larger program? No.

How about for a GPL'd chess engine and the neural network weights file? Does the data in the weights file fundamentally change the operation of the chess engine? Is it an extension of the chess engine? Is it combined with the chess engine to form a larger program? Yes.

Here is the language from the GPL that talks about this:
"...which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program..."

The weights file - if it is copyrightable - does not meet this definition of a mere aggregate. Rather, it meets the definition of a combined work.

Well written - very clear and understandable.
Human chess is partly about tactics and strategy, but mostly about memory
gonzochess75
Posts: 208
Joined: Mon Dec 10, 2018 3:29 pm
Full name: Adam Treat

Re: Are neural nets (the weights file) copyrightable?

Post by gonzochess75 »

Dann Corbit wrote: Fri Feb 26, 2021 10:55 pm And even if the data were necessary for the program to run at all, it would still be data.
If you have the program's code, you can figure out how to make your own data, even if it is necessary and even if it is not provided.
Data is not code.
Data is data.
The data is not covered by the GPL, the code is covered.
This dog won't hunt. It is trivial to make any data into source code. Indeed, the Fat Fritz binary itself embedded the net into the binary. You can embed all kinds of things (PGNs, movies, MP3s, databases) directly into a binary by making them part of the source code. Source code itself has, as one of its fundamental data types (pun intended), the integer. Do integers magically become source code in terms of copyright when they are loaded into a computer's memory or written into a computer program? Do they magically become data when they are inserted into a text file that isn't compiled? Indeed, I can trivially compile the following "data" file directly into a program:

// data.txt
3.141592653589793238

Is that source code or data? Both? Note: the digits by themselves are not copyrightable, of course. But that just supports my opinion that neural nets are not copyrightable. If they are copyrightable, then your attempt to draw a distinction between data and code does no work. You're trying to make a distinction without a difference.
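
As a rough illustration of how little it takes to turn a data file into source code, here is a minimal Python sketch. The three-float "weights" payload and the names `data_to_source` and `EMBEDDED_NET` are made up for the example; this is not how Fat Fritz or Stockfish embed their nets, just the general idea of emitting a literal and compiling it.

```python
import struct


def data_to_source(name: str, payload: bytes) -> str:
    """Return Python source text that defines `name` as a bytes literal."""
    return f"{name} = {payload!r}\n"


# Hypothetical stand-in for a weights file: three little-endian float32 values.
weights = struct.pack("<3f", 0.5, -1.25, 3.0)

# Step 1: the "data" becomes ordinary source text.
source_text = data_to_source("EMBEDDED_NET", weights)

# Step 2: compile and run that text like any other source file.
namespace = {}
exec(compile(source_text, "embedded_net.py", "exec"), namespace)

# Step 3: the program now carries the data inside itself.
recovered = struct.unpack("<3f", namespace["EMBEDDED_NET"])
```

The bytes that come out are identical to the bytes that went in, so whether they sit in a separate file or in a compiled literal is purely a packaging decision.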
syzygy
Posts: 5774
Joined: Tue Feb 28, 2012 11:56 pm

Re: Are neural nets (the weights file) copyrightable?

Post by syzygy »

syzygy wrote: Fri Feb 26, 2021 11:04 pm The FSF's FAQ contains many untruths.
The following paper makes for interesting reading and comments on the absurdity of the FSF's position:
https://courses.cs.washington.edu/cours ... rksGPL.pdf
syzygy
Posts: 5774
Joined: Tue Feb 28, 2012 11:56 pm

Re: Are neural nets (the weights file) copyrightable?

Post by syzygy »

gonzochess75 wrote: Fri Feb 26, 2021 11:13 pm
syzygy wrote: Fri Feb 26, 2021 11:04 pm Distributing a book that includes a large quotation from another book infringes the copyright on that other book.
Distributing a book that includes a reference to another book does not infringe the copyright on that other book.
Big difference.
A few last questions. Let's call the original book FOO and the other book BAR. If the copyright holder for FOO grants you a license for FOO stating that, if BAR contains a reference to FOO, you must license BAR in a compatible way... and you don't do that... do you still get to make copies of FOO and distribute FOO?
I am giving the example to explain why static vs dynamic linking makes a big difference for copyright law. It is not "irrelevant" as the FSF wants you to believe.

Btw, GPLv2 explicitly states that "derived work" is in the sense of copyright law.
GPLv3 uses different language but there doesn't seem to be the intention of a difference in substance.