Ethics in chess programming

Discussion of chess software programming and technical issues.

Moderator: Ras

syzygy
Posts: 5695
Joined: Tue Feb 28, 2012 11:56 pm

Re: Ethics in chess programming

Post by syzygy »

Uri Blass wrote: Thu Mar 10, 2022 4:21 pm I agree that virtually no engine with a rating over 2000 can be considered original work in that sense, but not because it is so clear that the author could not do it by himself; rather, because the author usually did not even try.
I have no doubt that neither Turing nor Shannon would have been able to write a strong program in their days, even if they had the powerful computers of today. There is just too much to figure out for one person.
The only reason that I did not start an engine in that way is that I am not a good programmer and I saw no example of a chess engine written in this way. But I guess a good programmer should have no problem using the idea I mention and getting something that is at least stronger than an alpha-beta engine with no pruning and no extensions, which seems a stupid idea even if you know nothing.
If you know nothing, you are extremely unlikely to come up with alpha-beta. Most will not even be able to figure out recursion.
User avatar
j.t.
Posts: 263
Joined: Wed Jun 16, 2021 2:08 am
Location: Berlin
Full name: Jost Triller

Re: Ethics in chess programming

Post by j.t. »

syzygy wrote: Fri Mar 11, 2022 11:37 pm If you know nothing, you are extremely unlikely to come up with alpha-beta. Most will not even be able to figure out recursion.
Of course with hindsight this is always hard to evaluate correctly, but I think that if someone has already managed to come up with the standard minimax algorithm, the likelihood of discovering the alpha-beta improvement is quite high if the person is motivated to make the program really good. The idea of not searching for an even worse refutation seems pretty natural to me.
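To make that concrete: in a minimal C sketch (Position, Move, and the helpers below are hypothetical, not taken from any real engine), plain negamax minimax looks like this, and the alpha-beta observation applies exactly at the marked spot.

Code: Select all

#define MAX_MOVES 256

typedef struct Position Position;  /* engine-specific board state (assumed) */
typedef int Move;                  /* move encoding left abstract */

/* hypothetical helpers */
int  evaluate(const Position *pos);             /* static evaluation */
int  generate_moves(Position *pos, Move *out);  /* fills out[], returns count */
void make_move(Position *pos, Move m);
void unmake_move(Position *pos, Move m);

/* plain negamax minimax, roughly the 1949/1950 starting point */
int minimax(Position *pos, int depth)
{
    if (depth == 0)
        return evaluate(pos);

    Move moves[MAX_MOVES];
    int n = generate_moves(pos, moves);
    int best = -1000000;                        /* stand-in for -INF */
    for (int i = 0; i < n; i++) {
        make_move(pos, moves[i]);
        int score = -minimax(pos, depth - 1);
        unmake_move(pos, moves[i]);
        if (score > best)
            best = score;
        /* alpha-beta insight: if score already refutes the best
           alternative higher up the tree, the remaining moves here
           cannot matter, and this loop could stop right now */
    }
    return best;
}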
syzygy
Posts: 5695
Joined: Tue Feb 28, 2012 11:56 pm

Re: Ethics in chess programming

Post by syzygy »

j.t. wrote: Sat Mar 12, 2022 12:35 am
syzygy wrote: Fri Mar 11, 2022 11:37 pm If you know nothing, you are extremely unlikely to come up with alpha-beta. Most will not even be able to figure out recursion.
Of course with hindsight this is always hard to evaluate correctly, but I think that if someone has already managed to come up with the standard minimax algorithm, the likelihood of discovering the alpha-beta improvement is quite high if the person is motivated to make the program really good. The idea of not searching for an even worse refutation seems pretty natural to me.
It took several years before alpha-beta was introduced. It did not occur to Shannon and Turing that one can do much better than minimax.
http://www-formal.stanford.edu/jmc/slid ... g-sli.html
John McCarthy wrote:alpha-beta pruning characterizes human play, but it wasn't noticed by early chess programmers--Turing, Shannon, Pasta and Ulam, and Bernstein. We humans are not very good at identifying the heuristics we ourselves use. Approximations to alpha-beta used by Samuel, Newell and Simon, McCarthy. Proved equivalent to minimax by Hart and Levine, independently by Brudno. Knuth gives details.
User avatar
j.t.
Posts: 263
Joined: Wed Jun 16, 2021 2:08 am
Location: Berlin
Full name: Jost Triller

Re: Ethics in chess programming

Post by j.t. »

syzygy wrote: Sat Mar 12, 2022 1:31 am It took several years before alpha-beta was introduced. It did not occur to Shannon and Turing that one can do much better than minimax.
http://www-formal.stanford.edu/jmc/slid ... g-sli.html
John McCarthy wrote: We humans are not very good at identifying the heuristics we ourselves use.
Fair point. And as I said, I have a hard time imagining myself in the position of knowing nothing about chess programming, since I basically started with it at the same time I started programming. But I wonder how intensive the research was before alpha-beta was discovered. Especially with Shannon and Turing, who probably had lots of other research projects going on, I am not sure how strong the motivation was to improve the minimax algorithm. And from the few minutes of research I just did, it seems that Shannon and others at the time focused mostly on B-type searches, because the lack of computational power made any brute-force search infeasible. So most thought-time was probably spent there. Today, even a pure minimax algorithm on modern hardware is already fast enough to be quite good against humans, so for a programmer this path would make sense when looking for optimizations. But that's mostly just speculation on my part. Do you know of a paper or article from that time in which possible optimizations of minimax were explored but alpha-beta/branch-and-bound wasn't found?
smatovic
Posts: 3234
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Ethics in chess programming

Post by smatovic »

j.t. wrote: Sat Mar 12, 2022 3:37 am
syzygy wrote: Sat Mar 12, 2022 1:31 am It took several years before alpha-beta was introduced. It did not occur to Shannon and Turing that one can do much better than minimax.
http://www-formal.stanford.edu/jmc/slid ... g-sli.html
John McCarthy wrote: We humans are not very good at identifying the heuristics we ourselves use.
Fair point. And as I said, I have a hard time imagining myself in the position of knowing nothing about chess programming, since I basically started with it at the same time I started programming. But I wonder how intensive the research was before alpha-beta was discovered. Especially with Shannon and Turing, who probably had lots of other research projects going on, I am not sure how strong the motivation was to improve the minimax algorithm. And from the few minutes of research I just did, it seems that Shannon and others at the time focused mostly on B-type searches, because the lack of computational power made any brute-force search infeasible. So most thought-time was probably spent there. Today, even a pure minimax algorithm on modern hardware is already fast enough to be quite good against humans, so for a programmer this path would make sense when looking for optimizations. But that's mostly just speculation on my part. Do you know of a paper or article from that time in which possible optimizations of minimax were explored but alpha-beta/branch-and-bound wasn't found?
AFAIK Wiener proposed a depth-limited minimax search with an evaluation function as heuristic in 1948; from CPW:
in his 1948 book Cybernetics or Control and Communication in the Animal and the Machine, he describes how a chess program could be developed using a depth-limited minimax search with an evaluation function.
https://www.chessprogramming.org/Norbert_Wiener
https://en.wikipedia.org/wiki/Cyberneti ... he_Machine

My own conclusion for a timeline is:

von Neumann, 1928: MiniMax
Wiener, 1948: depth-limited MiniMax + eval heuristic
Shannon, 1949: Type A + Type B strategy
McCarthy, 1956: AlphaBeta
Bernstein, 1956-1958: Bernstein Chess Program

We can look back into history mostly only via published papers and books; who knows what was left in some dusty drawer or the like. For example Zuse, his Plankalkül, and chess...
--
Srdja
Uri Blass
Posts: 10793
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Ethics in chess programming

Post by Uri Blass »

syzygy wrote: Fri Mar 11, 2022 11:37 pm
Uri Blass wrote: Thu Mar 10, 2022 4:21 pm I agree that virtually no engine with a rating over 2000 can be considered original work in that sense, but not because it is so clear that the author could not do it by himself; rather, because the author usually did not even try.
I have no doubt that neither Turing nor Shannon would have been able to write a strong program in their days, even if they had the powerful computers of today. There is just too much to figure out for one person.
The only reason that I did not start an engine in that way is that I am not a good programmer and I saw no example of a chess engine written in this way. But I guess a good programmer should have no problem using the idea I mention and getting something that is at least stronger than an alpha-beta engine with no pruning and no extensions, which seems a stupid idea even if you know nothing.
If you know nothing, you are extremely unlikely to come up with alpha-beta. Most will not even be able to figure out recursion.
I am not sure what Turing and Shannon could have done had they had the powerful computers of today, but trying to use ideas similar to those chess players use when they play is trivial, and it is obvious that chess players do not use minimax but some type of alpha-beta, and not alpha-beta with a fixed number of plies (they search important lines more plies forward).

It is also trivial that important lines are lines with a higher probability of being best, so moves with a higher probability of being best are searched to greater depth.

Humans do not think about the exact probabilities, but that is only because the human brain is not built for working in this way, while computers work with numbers.
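A toy sketch of what I mean, reusing the kind of hypothetical C helpers sketched earlier in the thread (the "probability" here is just the rank in the move ordering, which is of course a gross simplification):

Code: Select all

/* toy illustration: moves judged more likely to be best get more depth */
int search_weighted(Position *pos, int depth, int alpha, int beta)
{
    if (depth <= 0)
        return evaluate(pos);

    Move moves[MAX_MOVES];
    int n = generate_moves(pos, moves); /* assumed sorted, best guess first */
    for (int i = 0; i < n; i++) {
        int reduction = (i < 3) ? 0 : 1 + i / 8; /* later = less probable */
        make_move(pos, moves[i]);
        int score = -search_weighted(pos, depth - 1 - reduction, -beta, -alpha);
        unmake_move(pos, moves[i]);
        if (score >= beta)
            return score;
        if (score > alpha)
            alpha = score;
    }
    return alpha;
}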
User avatar
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Ethics in chess programming

Post by hgm »

syzygy wrote: Fri Mar 11, 2022 11:37 pm If you know nothing, you are extremely unlikely to come up with alpha-beta. Most will not even be able to figure out recursion.
I don't think so. Recursion is a general programming technique, and even many mathematical functions are defined recursively. And alpha-beta is natural to chess players, who all know that a single refutation is enough to disqualify a variation. And that is really all there is to it. Chess players consider you crazy when, during an analysis, you start discussing moves that should have been cut off (i.e. that are already refuted).

When I wrote my first chess program I was only vaguely aware of alpha-beta, and the implementation I first made was faulty because it was missing the 'deep cutoffs'. (In modern terminology: I started alpha in every node at -INF, rather than at -beta of the parent.) When I monitored what it was thinking about (by printing the current branch in the display), it became immediately obvious that something was badly wrong, because it was wasting a lot of time searching variations it had no reason to search, as the initial moves in them were already refuted.
syzygy
Posts: 5695
Joined: Tue Feb 28, 2012 11:56 pm

Re: Ethics in chess programming

Post by syzygy »

j.t. wrote: Sat Mar 12, 2022 3:37 am And from the few minutes of research I just did, it seems that Shannon and others at the time focused mostly on B-type searches, because the lack of computational power made any brute-force search infeasible. So most thought-time was probably spent there.
They had to do selective search because they did not have the alpha-beta algorithm available (and it did not naturally occur to them).
Today, even a pure minimax algorithm on modern hardware is already fast enough to be quite good against humans, so for a programmer this path would make sense when looking for optimizations. But that's mostly just speculation on my part. Do you know of a paper or article from that time in which possible optimizations of minimax were explored but alpha-beta/branch-and-bound wasn't found?
I'm not sure, but I think some early implementations may have used a simpler approach where the current value for "alpha" is used to do beta=-alpha cutoffs in the child node, but no "beta" is propagated. So search(pos, beta) searches position pos and returns immediately if a move is found that is >= beta. It keeps track of the current best move (i.e. alpha) and recursively calls search(child_pos, -alpha).
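In C-like pseudocode (same hypothetical helpers as in the sketch earlier in the thread), that scheme would look something like this:

Code: Select all

/* one bound only: the node's own best-so-far is handed down as the
   child's bound, but the bound received from the parent is not, so a
   refutation from 2 or more plies up can never cut off the child */
int search_one_bound(Position *pos, int depth, int beta)
{
    if (depth == 0)
        return evaluate(pos);

    Move moves[MAX_MOVES];
    int n = generate_moves(pos, moves);
    int best = -1000000;                 /* best so far, i.e. "alpha" */
    for (int i = 0; i < n; i++) {
        make_move(pos, moves[i]);
        int score = -search_one_bound(pos, depth - 1, -best);
        unmake_move(pos, moves[i]);
        if (score >= beta)
            return score;                /* immediate cutoff */
        if (score > best)
            best = score;
    }
    return best;
}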
syzygy
Posts: 5695
Joined: Tue Feb 28, 2012 11:56 pm

Re: Ethics in chess programming

Post by syzygy »

Uri Blass wrote: Sat Mar 12, 2022 11:00 am I am not sure what Turing and Shannon could have done had they had the powerful computers of today, but trying to use ideas similar to those chess players use when they play is trivial, and it is obvious that chess players do not use minimax but some type of alpha-beta, and not alpha-beta with a fixed number of plies (they search important lines more plies forward).

It is also trivial that important lines are lines with a higher probability of being best, so moves with a higher probability of being best are searched to greater depth.

Humans do not think about the exact probabilities, but that is only because the human brain is not built for working in this way, while computers work with numbers.
Going from such grand ideas to an actual algorithm is highly non-trivial.

If Turing and Shannon had dedicated their lives to chess programming, I don't really doubt that both of these geniuses would have come up with alpha-beta, but they would not have had enough time to come up with all the other ideas that we now know work (and to discard all the other ideas that we now know do not work).

Fact is that their papers showed no sign of any awareness of cutoffs.

Shannon accepts a branching factor of 30, points out that one move by white and one by black (so 2 ply) already gives about 10^3 possibilities, and that a full 40-move game therefore requires 10^120 variations to be calculated from the initial position (when searching to terminal nodes; of course 40 moves is far too optimistic here). He then suggests using a heuristic evaluation function in non-terminal positions, but notes that this still requires 10^9 calculations for a 3-move (6-ply) search, which he "very optimistically" estimates at taking 16 minutes.
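Spelling out the arithmetic (Shannon's own numbers; the implied speed is my back-calculation): 30 x 30 = 900, so about 10^3 positions per move pair; (10^3)^40 = 10^120 variations for a 40-move game; and 30^6 is about 7 x 10^8, i.e. roughly 10^9 positions for a 3-move (6-ply) search. His 16 minutes for those 10^9 positions works out to about 10^6 positions per second, which was indeed "very optimistic" for 1950 hardware.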

Turing describes minimax, mentions that it would be desirable to search lines deeper than other lines, but there is nothing that suggests he knew about beta cutoffs. (However, he does not focus so much on the combinatorial search explosion problem as Shannon does.)

The first real chess programs that ran on actual computers did not implement alpha-beta or some lesser variant of it. Bernstein's program, around 1956, considered 7 moves per position and examined 2500 final positions 2 full moves deep, so it implemented pure minimax (indeed 7^4 is about 50^2 = 2500). The 1958 Newell/Shaw/Simon paper mentions that searching 3 full moves deep would have taken 50 (7^2) times longer for Bernstein's program (to show the "power" of selective search: only 50 times as long, not 1000 times). This paper also describes the authors' own program, which might have implemented some form of beta cutoff, but it is not really explained (there is a discussion of a search tree example which focuses mostly on whether a position is "dead" or has to be further expanded, but which mentions that if one child node had been searched before the other, the other would not have needed to be searched because this branch was already bad enough). Kotok's 1962 paper mentions that Prof. McCarthy had proposed the alpha-beta "heuristic", which had led Kotok to implement, in the summer of 1961, an alpha-beta version of the program he had written in the spring of 1961.

Maybe most of these people were aware that sometimes you can skip a branch, but they probably did not realise the magnitude of the speed up that could be attained this way, and therefore also did not spend much thought on how to precisely pin down this trick (with two bounds, alpha and beta).
User avatar
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Ethics in chess programming

Post by hgm »

syzygy wrote: Sun Mar 13, 2022 3:51 pm I'm not sure, but I think some early implementations may have used a simpler approach where the current value for "alpha" is used to do beta=-alpha cutoffs in the child node, but no "beta" is propagated. So search(pos, beta) searches position pos and returns immediately if a move is found that is >= beta. It keeps track of the current best move (i.e. alpha) and recursively calls search(child_pos, -alpha).
That sounds like what my first implementation did. And when I started monitoring what it was thinking about, it very quickly became clear that it was thinking about variations that had no relevance in the eyes of a chess player, because one of the moves leading to them had already been refuted. Once I started analyzing why it overlooked the fact that the move was already refuted, it turned out that one should not only compare the score to that of the alternatives one ply earlier, but also to those 3 (or 5, 7, ...) plies earlier. I.e., propagate beta.
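In the notation of the one-bound sketch above, the fix is essentially a one-line change: propagate the parent's bound, so the child's window starts at (-beta, -alpha) instead of being derived from -INF.

Code: Select all

/* corrected version: both bounds are propagated, so a refutation
   3, 5, 7, ... plies up now produces the missing 'deep cutoffs' */
int alphabeta(Position *pos, int depth, int alpha, int beta)
{
    if (depth == 0)
        return evaluate(pos);

    Move moves[MAX_MOVES];
    int n = generate_moves(pos, moves);
    for (int i = 0; i < n; i++) {
        make_move(pos, moves[i]);
        int score = -alphabeta(pos, depth - 1, -beta, -alpha);  /* the fix */
        unmake_move(pos, moves[i]);
        if (score >= beta)
            return score;               /* possibly a deep cutoff now */
        if (score > alpha)
            alpha = score;
    }
    return alpha;
}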

So if Shannon did not come up with alpha-beta, it was either because he did not play chess himself, or because he did not have the computing power to search deep enough so that deep cutoffs could occur.

BTW, my first true chess program did not use a fixed-depth search, but classified the generated moves based on whether they were captures or not, and whether they were tactically connected with the previous two plies. This resulted in what we would call 'extensions by a fractional number of plies' in modern terminology. E.g. a capture would be extended by a quarter ply, while a recapture on the same square would be extended by a full ply (so that I did not need any quiescence search). So it was searching coherent tactics much deeper than random shuffling of pieces. The AI of the Interactive Diagram still works that way, except that it uses reductions rather than extensions, and treats them in LMR fashion.
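A hypothetical sketch of that scheme in modern terms, with depth counted in quarter-ply units; for brevity it only checks the tactical connection to the immediately preceding move, not the full two plies, and the predicates is_capture and is_recapture are assumed:

Code: Select all

#define PLY 4   /* depth in quarter-ply units (hypothetical) */

/* assumed predicates; a 'recapture' means a capture on the same
   square as the previous capture */
int is_capture(const Position *pos, Move m);
int is_recapture(const Position *pos, Move m, Move prev);

int search_frac(Position *pos, int depth, int alpha, int beta, Move prev)
{
    if (depth < PLY)
        return evaluate(pos);

    Move moves[MAX_MOVES];
    int n = generate_moves(pos, moves);
    for (int i = 0; i < n; i++) {
        int ext = 0;
        if (is_capture(pos, moves[i]))
            ext = 1;                     /* capture: extend 1/4 ply */
        if (is_recapture(pos, moves[i], prev))
            ext = PLY;                   /* recapture: extend a full ply */
        make_move(pos, moves[i]);
        int score = -search_frac(pos, depth - PLY + ext, -beta, -alpha, moves[i]);
        unmake_move(pos, moves[i]);
        if (score >= beta)
            return score;
        if (score > alpha)
            alpha = score;
    }
    return alpha;
}

A recapture costs no net depth, but recapture chains are bounded by the material on the board, so the recursion still terminates.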