I think it is a difficult path, no matter what.bob wrote:While I don't think my old idea of "how much is permissible" is the end-all idea, it is a start.
I have taken the approach to use the following conceptual idea:
If a function that is consistent from input to output, where a move generator is a good example since each position has a fixed number of moves that can be played according to the rules of chess, then perhaps copying that is not a deal-breaker on tournament participation. I can think of other similar pieces of code, such as EGTB probing, where a given position always gives the same answer of mate in N or conversion in N or draw.
Other pieces of code clearly do not follow that concept. An evaluation function is one example. For a given position, there are an infinite number of possible different evaluation outputs. Search also, since one can reduce, extend or prune a move. So those are clearly off-limits. What else?
SEE seems to be the one input one output type of code, with some room for flexibility (do you handle pins, either absolute or not, do you check for legality, etc.) But the options are limited, and one could define, precisely, a SEE() (no pins), or SEEP() (recognizes absolute pins only), SEEAP() which recognizes things like knight pinned on queen by a bishop...
RepetitionCheck()? Rules of chess are quite precise so that might be an ok thing to borrow, although there are quite a few ways to do it (separate hash table, or a repetition list, or something else entirely). But since it is a one input, one output, it seems to fit.
So we are left with the important parts, namely the search (normal and q-search), code that deals with extensions, reductions and pruning, and all the evaluation code. The code that handles time utilization.
What about hashing? Harder case, since this is not a single input -> single output precisely. You can get lots of information from a hash table. A search result. A suggested move. A bound. A hint to avoid doing a null move search. A positional evaluation. Flags such as this was a singular position previously, etc... So maybe hashing is not a one input one output idea, particularly when there are lots of replacement strategies and such.
I think if you apply that question to each function in a chess program, you will find some that are "constant" and some that are "unique". And you might find some one input one output functions inside a non one-input-one-output function. For example, in the evaluation function, you might have a piece of code called HasOpposition(wk, bk, stm). Opposition is well-defined and for any position of the two kings and stm, that function will return T or F. So one might borrow that safely, but not the code where it is used since how you use that could vary all over the place.
One other major point. If you choose to borrow something, which for me would only include the magic move generation stuff (which, by the way does not affect my move generator code at all, I just changed a macro in a very simple way), would be that one must at least provide a proper citation or give credit to the originator of the code that was used. And one would have to abide by the GPL if pieces of a GPL program are borrowed. There is a subtle issue of a GPL program that borrows code from a non-GPL program. That's not supposed to happen, but it does, and it certainly clouds the issue.
The "heart and soul" of a chess engine is the search and evaluation. And copying that code should be a clear no-no. For those one-input-one-output functions, I don't see a problem with them. We already have several common examples. egtb.cpp was originally released as a part of Crafty, written by Eugene Nalimov. Many have used this code. Now MB has released his bitbase stuff and several have used that. Ditto for Pradu's magic move generator idea. Ditto for my rotated bitboard stuff. And we have never seen a program excluded from competition for using those. Nor should we.
There are grey areas such as piece/square tables and others. While there are an infinite number of possible piece/square tables, there are probably fewer than that useful ones. Can someone make a case for any particular piece where there are some obvious numbers and no others make any sense at all? I can't think of one, but that doesn't mean it doesn't exist. Individual scoring numbers? Would two different programmers agree on every scoring term in their evaluation? Can one make a case that fixing one value then dictates the rest if the evaluation is going to be "sane"? Doesn't seem reasonable, but doesn't seem like "impossible" could be proven.
This issue is a very complex one, and it is not going to be a quick and dirty fix I am afraid. It will take some logical/rational discourse and thought to arrive at something that is anywhere near workable.
And now you have answered the question "what is permissible?" That is only the first step, and probably the easiest step although it represents a really significant effort itself. Next is how to validate programs to screen out the bad ones. That is the real can of worms...
For instance, let's take piece-square tables. I would argue that there is some perfect set of numbers for piece square tables. Of course, it would be very difficult to find it. But it would be based on fundamental ideas like controlling the center, controlling space, and (perhaps) things like the recent statistical analysis by Joshua Shriver.
Eventually, the best programs will converge to similar values, given enough time. Are they now all cheaters if their numbers start to agree?