Are neural nets (the weights file) copyrightable?

syzygy
Posts: 5774
Joined: Tue Feb 28, 2012 11:56 pm

Re: Are neural nets (the weights file) copyrightable?

Post by syzygy »

syzygy wrote: Fri Feb 26, 2021 9:57 pm
Michel wrote: Fri Feb 26, 2021 9:52 pm
syzygy wrote: Fri Feb 26, 2021 9:44 pm
gonzochess75 wrote: Fri Feb 26, 2021 9:14 pm
syzygy wrote: Fri Feb 26, 2021 9:06 pm... What is a "work" is defined by copyright law. SF and an NNUE net zipped together are two works, not one.
This, I think, is where we part ways; it is the fundamental disagreement we're arguing over. The GPL explicitly defines what it means by a combined work as anything not meeting the definition of mere "aggregate."
I still don't see where it does that. Condition c) does not mention "aggregate". Nor does it say that it applies to a "compilation" that includes the Program or a work based on it.

Condition c), to make sense at all, necessarily must use "work" in the sense in which it is used in copyright law, since otherwise it would be ridiculous to state that "This License will therefore apply, ..., to the whole work, and all its parts". The license on SF ("This License") cannot apply to the NNUE net if the NNUE net is not part of the same "work" as the modified SF. Zipping them together does not make them part of the same "work".

The other text you cited doesn't impose any extra conditions beyond a), b), c) and d).
As I understand it, when they say “This License will therefore apply, ..., to the whole work, and all its parts”, they mean that if this conflicts with the licence of one of the parts, you cannot distribute.
Then it should have said that the other parts have to be distributed under the GPLv3 (or a compatible license). And it should not have used "work" if it didn't mean "work".
On reconsideration, the sentence "This License will therefore apply" merely clarifies the preceding sentence:
You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged.
So the important sentence is "You must license the entire work, as a whole, under" the GPLv3 when distributing. The second sentence explains that, as a result, the GPLv3 will apply to "the whole of the work, and all its parts, regardless of how they are packaged".

Still I don't accept that the term "work" as it is used here is to be understood as going beyond the concept of "work" from copyright law.
gonzochess75
Posts: 208
Joined: Mon Dec 10, 2018 3:29 pm
Full name: Adam Treat

Re: Are neural nets (the weights file) copyrightable?

Post by gonzochess75 »

syzygy wrote: Sat Feb 27, 2021 12:10 am Still I don't accept that the term "work" as it is used here is to be understood as going beyond the concept of "work" from copyright law.
Work just means "something which can be copyrighted." There is no mention of derivative work in GPLv3 as you noted and I'd argue that was by design.

Have a look - Chapman law review - https://www.chapman.edu/LAW/_files/publ ... -chern.pdf
"Among the significant issues, GPLv3 tried to clarify the scope of the license by introducing newly defined terms and consolidated requirements.[47] GPLv2 borrowed many terms from U.S. copyright law, such as “derivative works” and “collective works”,[48] but this only led to ambiguities about how far the license should reach.[49] Furthermore, because the GPL is meant to be a global license that can apply in any country regardless of that country’s copyright laws, the GPLv3 abandoned the term “derivative work” and redefined the license to meet global standards.[50]"
And footnote 50 gives: 50 Moglen & Stallman, supra note 19; compare GPLv3, supra note 47, at § 0, with GPLv2, supra note 22, at § 0.

Derivative work is a US copyright law concept. GPLv3 does away with it and provides its own definition specifically because it is meant as an upgrade to GPLv2 in a global setting.
Last edited by gonzochess75 on Sat Feb 27, 2021 12:27 am, edited 2 times in total.
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Are neural nets (the weights file) copyrightable?

Post by Dann Corbit »

gonzochess75 wrote: Fri Feb 26, 2021 11:33 pm
Dann Corbit wrote: Fri Feb 26, 2021 10:55 pm And if the data were necessary for the program to run at all, it is still data.
If you have the program's code, you can figure out how to make your own, even if it is necessary, even if it is not provided.
Data is not code.
Data is data.
The data is not covered by the GPL, the code is covered.
This dog won't hunt. It is trivial to make any data into source code. Indeed, the Fat Fritz binary itself had the net embedded in it. You can embed all kinds of things - pgns, movies, mp3, database - directly into a binary by making them part of the source code. Source code itself has, as one of its fundamental data types (pun intended), the integer. Do integers magically become source code in terms of copyright when they are loaded into a computer's memory or written into a computer program? Do they magically become data when they are inserted into a text file that isn't compiled? Indeed, I can trivially compile the following "data" file directly into a program:

// data.txt
3.141592653589793238

Is that source code or data? Both? Note: the digits by themselves are not copyrightable, of course. But that just supports my opinion that neural nets are not copyrightable. If they are copyrightable, then your attempt to draw a distinction between data and code does no work. You're trying to make a distinction without a difference.
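To make the data.txt point concrete, here is a minimal sketch in Python (chosen only for brevity; nothing in the thread uses it). It assumes the file holds just the number, leaving out the "// data.txt" comment line, which is not Python syntax. The names DATA_TXT, as_data and as_code are made up for illustration. The same text is first parsed as data and then compiled and evaluated as a one-line program:

```python
# Hypothetical contents of data.txt, minus the comment line.
DATA_TXT = "3.141592653589793238\n"

# Read as data: parse the text into a number at run time.
as_data = float(DATA_TXT)

# Read as code: compile the very same text as a Python expression,
# then evaluate the resulting code object.
code_object = compile(DATA_TXT.strip(), "data.txt", "eval")
as_code = eval(code_object)

# Both routes yield the same floating-point value.
print(as_data == as_code)
```

This only shows that the same bytes can be consumed either way. It says nothing about how copyright law or the GPL would classify them, which is the question under dispute.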
This is the requirement for distribution to the public, read it carefully:
"But if you release the modified version to the public in some way, the GPL requires you to make the modified source code available to the program's users, under the GPL."

If you confuse 3.141592653589793238 in an external file for source code, then I feel there is no way that you can be convinced.
But I do not feel compelled to convince you.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
gonzochess75
Posts: 208
Joined: Mon Dec 10, 2018 3:29 pm
Full name: Adam Treat

Re: Are neural nets (the weights file) copyrightable?

Post by gonzochess75 »

Dann Corbit wrote: Sat Feb 27, 2021 12:17 am If you confuse 3.141592653589793238 in an external file for source code, then I feel there is no way that you can be convinced.
But I do not feel compelled to convince you.
https://www.quora.com/In-Lisp-what-is-t ... e-and-data

This is a well known theoretical computer science question. A lot of people said that Lisp proved that data is code. However, they don't know their history. The lambda calculus proves it. Of course Turing aficionados would say that Turing proved it.
Last edited by gonzochess75 on Sat Feb 27, 2021 12:32 am, edited 1 time in total.
gonzochess75
Posts: 208
Joined: Mon Dec 10, 2018 3:29 pm
Full name: Adam Treat

Re: Are neural nets (the weights file) copyrightable?

Post by gonzochess75 »

syzygy wrote: Fri Feb 26, 2021 11:50 pm Btw, GPLv2 explicitly states that "derived work" is in the sense of copyright law.
GPLv3 uses different language but there doesn't seem to be the intention of a difference in substance.
There most definitely is a distinction and it was deliberate. See here for more evidence:

https://copyleft.org/guide/comprehensiv ... ech10.html
9.2.1 Modify and the Work Based on the Program
GPLv2 included a defined term, “work based on the Program”, but also used the term “modify” and “based on” throughout the license. GPLv2’s “work based on the Program” definition made use of a legal term of art, “derivative work”, which is peculiar to USA copyright law.2 GPLv2 always sought to cover all rights governed by relevant copyright law, in the USA and elsewhere. Even though differently-labeled concepts corresponding to the derivative work are recognized in all copyright law systems, these counterpart concepts might differ to some degree in scope and breadth from the USA derivative work. GPLv3 therefore internationalizes on this issue by removing GPLv2’s references to derivative works and by providing a more globally useful definition. GPLv3 drops all reference to USA “derivative works” and returns to the base concept only: GPL covers the licensed work and all works where copyright permission from the licensed work’s copyright holder.

The new definitions returns to the common elements of copyright law. Copyright holders of works of software have the exclusive right to form new works by modification of the original — a right that may be expressed in various ways in different legal systems. GPLv3 operates to grant this right to successive generations of users (particularly through the copyleft conditions set forth in GPLv3 §5, as described later in this tutorial in its § 9.8). Here in GPLv3 §0, “modify” refers to basic copyright rights, and then this definition of “modify” is used to define “modified version of” and “work based on” as synonyms.
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Are neural nets (the weights file) copyrightable?

Post by Dann Corbit »

In this case, your medical records belong to anyone who asks, if your medical data is stored using MySQL.
After all, data is code and therefore, your medical records must be released to anyone who asks.
Your banking records as well.
Somehow, I think your interpretation is amiss.

True, data and programs are all just ones and zeros.
Same thing for music and movies. Just ones and zeros.
Somehow, I think we are able to make a distinction about these things being different.
gonzochess75
Posts: 208
Joined: Mon Dec 10, 2018 3:29 pm
Full name: Adam Treat

Re: Are neural nets (the weights file) copyrightable?

Post by gonzochess75 »

Dann Corbit wrote: Sat Feb 27, 2021 12:31 am In this case, your medical records belong to anyone who asks, if your medical data is stored using MySQL.
After all, data is code and therefore, your medical records must be released to anyone who asks.
Your banking records as well.
Somehow, I think your interpretation is amiss.

True, data and programs are all just ones and zeros.
Same thing for music and movies. Just ones and zeros.
Somehow, I think we are able to make a distinction about these things being different.
My interpretation implies none of those things. This is one hell of a strawman.
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Are neural nets (the weights file) copyrightable?

Post by Dann Corbit »

Of course it does.
You said that data is code.
Therefore, anyone who asks for the data must be given that data.
MySQL is a GPL program.
Your medical records (if stored using MySQL) are therefore modified code.
This code therefore belongs to anyone who asks.

That is, unless data is not really code and the words "source code" in the license are intended to apply to what we normally call "source code".

You do not get to pick and choose which data is code and which data is not code. If data is code, then the license applies and it must be distributed to anyone who asks.
syzygy
Posts: 5774
Joined: Tue Feb 28, 2012 11:56 pm

Re: Are neural nets (the weights file) copyrightable?

Post by syzygy »

gonzochess75 wrote: Sat Feb 27, 2021 12:12 am
syzygy wrote: Sat Feb 27, 2021 12:10 am Still I don't accept that the term "work" as it is used here is to be understood as going beyond the concept of "work" from copyright law.
Work just means "something which can be copyrighted."
And in that case the "whole work" need not go beyond a single source file, for example, or the (modified) set of source files that formed the Program.
There is no mention of derivative work in GPLv3 as you noted and I'd argue that was by design.

Have a look - Chapman law review - https://www.chapman.edu/LAW/_files/publ ... -chern.pdf
I already had this open in my browser :)
Note the criticism of the FSF's position on dynamic linking.
"Among the significant issues, GPLv3 tried to clarify the scope of the license by introducing newly defined terms and consolidated requirements.[47] GPLv2 borrowed many terms from U.S. copyright law, such as “derivative works” and “collective works”,[48] but this only led to ambiguities about how far the license should reach.[49] Furthermore, because the GPL is meant to be a global license that can apply in any country regardless of that country’s copyright laws, the GPLv3 abandoned the term “derivative work” and redefined the license to meet global standards.[50]"
And footnote 50 gives: 50 Moglen & Stallman, supra note 19; compare GPLv3, supra note 47, at § 0, with GPLv2, supra note 22, at § 0.
OK, so it was to avoid ambiguity and to make the text of the GPL more self-contained. Still, there doesn't seem to be a big change in substance in this regard.

As explained later in the paper, the part relevant to dynamic linking is this:
The “Corresponding Source” for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work.
So indeed there is an attempt to force you to release the source code of a dynamically linked (shared) library when distributing a program. Interestingly, this still does not seem to cover the case of a program linking to a GPL'd library. The program's copyright is still untouched by the dynamically linked library.

If you distribute a compiled SF separately from an NNUE net specifically designed for SF (e.g. the two in a zip file), I don't see how this would force you to distribute the NNUE net under the GPL (assuming there is copyright on the NNUE net).

Of course the starting point here would still be the FF2 NNUE net, which doesn't contain any copyrightable expression from SF.
If you start from FF2-modified SF, then there are a zillion nets that can be run in FF2-modified SF to obtain a functioning chess engine, so there is no reason to require the specific FF2 NNUE net to be included with the same license.
gonzochess75
Posts: 208
Joined: Mon Dec 10, 2018 3:29 pm
Full name: Adam Treat

Re: Are neural nets (the weights file) copyrightable?

Post by gonzochess75 »

Dann Corbit wrote: Sat Feb 27, 2021 12:43 am Of course it does.
You said that data is code.
Therefore, anyone who asks for the data must be given that data.
You are misunderstanding me.