Engine vs Engine - Plays Same Moves

Sven · Post by **Sven** » Sun Feb 12, 2017 8:21 pm

Ras wrote:
Sven Schüle wrote:But for testing your changes during development, as well as for testing in the context of rating lists like CCRL, it will not be needed.
Couldn't that involve the risk of optimising the program for the handful of different games it plays, only to find out that in a broader context, the changes were bad?

We are not talking about a "handful" of openings but about many hundreds or even thousands of different openings. These should be randomly selected once from an even larger set.
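
To illustrate "randomly selected once": a minimal Python sketch, assuming the larger set is already available as a list of opening lines or positions. The function name `sample_openings`, the pool contents, and the seed value are placeholders, not something taken from a real tool. Fixing the seed means the selection is made once and can be reproduced later, so every test run starts from the same set.

```python
import random

def sample_openings(all_openings, count, seed=2017):
    """Pick `count` distinct openings from a larger pool, reproducibly."""
    rng = random.Random(seed)          # private generator, fixed seed (assumed value)
    return rng.sample(all_openings, count)

# Hypothetical pool standing in for a large EPD/PGN collection.
pool = [f"opening_{i:05d}" for i in range(20000)]

# Select once, then reuse this same set for every test run.
test_set = sample_openings(pool, 1000)
```

Calling `sample_openings(pool, 1000)` again with the same seed returns the same list, which is what keeps results comparable between runs.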

Luis Babboni · Post by **Luis Babboni** » Sun Feb 12, 2017 10:07 pm

Post removed at user's request

Ras · Post by **Ras** » Sun Feb 12, 2017 10:18 pm

Sven Schüle wrote:We are not talking about a "handful" of openings but about many hundreds or even thousands of different openings.

From the original posting: "If they play 50 games, there will actually be about 5 different games up to about 40 moves".

Sven · Post by **Sven** » Sun Feb 12, 2017 10:41 pm

Ras wrote:
Sven Schüle wrote:We are not talking about a "handful" of openings but about many hundreds or even thousands of different openings.
From the original posting: "If they play 50 games, there will actually be about 5 different games up to about 40 moves".

How is that related to my posting you replied to? My point was that someone proposed to use an (internal) opening book for testing to avoid repetition of games, and the OP said he planned to do so, but I objected that this would contradict the goal of getting reproducible testing results, and that it would be better to play games from (a fixed set of) many different starting positions.

Luis Babboni · Post by **Luis Babboni** » Sun Feb 12, 2017 11:20 pm

Luis Babboni wrote:LOL!!!
...

Sorry, this post was a mistake. I tried to send a PM and I do not know why I ended up posting here!!

I just asked the admin to remove it because I couldn't: more than 15 minutes have passed since I posted it.

Ras · Post by **Ras** » Mon Feb 13, 2017 12:33 am

Sven Schüle wrote:How is that related to my posting you replied to?

Because what was happening in the overall thread was precisely lack of variation. Of course, choosing various defined starting positions (other than only the initial one) will also remedy this problem.

But it will run into the next problem. How to choose these starting positions? This problem is more or less equivalent to choosing the opening book because the opening book decides what starting positions are likely to arise.

Cheney · Post by **Cheney** » Tue Feb 14, 2017 1:00 pm

I got it, I think.

I have now reviewed my old notes and code to see what I was doing with my opening database a year ago: it was to create variance. I have a decent opening book and had the engine always play a new line. Apart from obtaining a file full of EPD/PGN openings, I think implementing this will help a little bit with driving the variance. What I have noted is that there are certain lines my engine seems to lose frequently.

Earlier, Sven, you wrote about having a proper testing strategy. I have tested my engine against WAC and other test positions looking for the best move, but never set up a board (like from WAC) for two engines to play.

The things that catch me here are:
(1) If the position is an already winning position - do I load it?
(2) If I load these winning positions, will this not become noise in the win/loss statistics?
(3) All positions should probably be repeated so my engine plays both white and black.

I, too, use CuteChess-cli and am testing it with an opening file (PGN or EPD) to make sure my engine plays nice with that (so far so good). Next will be to check out CPW and this forum for some various opening positions.

Thanks again for all the ideas and help!

Sven · Post by **Sven** » Tue Feb 14, 2017 1:48 pm

Cheney wrote:Earlier, Sven, you wrote about having a proper testing strategy. I have tested my engine against WAC and other test positions looking for the best move, but never set up a board (like from WAC) for two engines to play.

The things that catch me here are:
(1) If the position is an already winning position - do I load it?
(2) If I load these winning positions, will this not become noise in the win/loss statistics?
(3) All positions should probably be repeated so my engine plays both white and black.

(1) and (2): Opinions differ on that. I prefer using a set of more or less "balanced" positions. But I can't give an exact reason for it.
(3) This is crucial, even more so when using many "unbalanced" positions.
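
Point (3) can be sketched as a small schedule builder in Python. This is only an illustration under assumptions: `build_schedule`, the dictionary layout, and the engine names are made up for the example, and a tournament manager may well handle the colour swap itself.

```python
def build_schedule(positions, engine_a, engine_b):
    """Return two games per position, with the colours swapped in the second."""
    games = []
    for fen in positions:
        games.append({"fen": fen, "white": engine_a, "black": engine_b})
        games.append({"fen": fen, "white": engine_b, "black": engine_a})
    return games

# Two example starting positions in FEN (initial position, and after 1. e4).
positions = [
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq - 0 1",
]

schedule = build_schedule(positions, "MyEngine", "Opponent")
```

Because each engine gets both sides of every position, any advantage built into a position is given to both engines once, which is why this matters more the less balanced the positions are.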

One possible source for opening positions is Kirr's Chess Opening Sampler which is also referred to in section "Conducting Engine Tournaments Under Various GUIs" on Adam Hair's computer chess pages. In the latter you can also find some information about opening books.

AlvaroBegue · Post by **AlvaroBegue** » Tue Feb 14, 2017 9:03 pm

Sven Schüle wrote:(1) and (2): Opinions differ on that. I prefer using a set of more or less "balanced" positions. But I can't give an exact reason for it.

I can give you an exact reason. If a position has a probability p of being a win for white (I'll ignore draws for simplicity), the amount of information you expect to obtain by playing it (a.k.a. "entropy") is -(p*log(p)+(1-p)*log(1-p))/log(2) bits. This quantity is maximized when p=1/2.

In simple words: Running games from positions that are very imbalanced costs the same as running from balanced positions, but you learn less from their results.
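
The formula above is easy to check numerically. A minimal Python sketch (the function name `outcome_entropy_bits` is just a label for this example; draws are ignored, as in the formula):

```python
import math

def outcome_entropy_bits(p):
    """Entropy in bits of a win/loss result when White wins with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0                     # a certain result carries no information
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

balanced = outcome_entropy_bits(0.5)   # balanced position
lopsided = outcome_entropy_bits(0.9)   # position White wins 90% of the time
```

Here `balanced` is exactly 1 bit, while `lopsided` is roughly 0.47 bits, so under this simplified model a game from the 90% position yields less than half the information of a game from a balanced one.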

Cheney · Post by **Cheney** » Sat Feb 25, 2017 4:17 pm

Hey - thanks again for all the insight on this. I did follow your stance on "proper testing" and pulled some positions using the links you provided and some that were listed on CPW.

I ran the tests across three separate computers and the test results were practically identical. My engine would have the same winning or losing percentage no matter what computer the tests were run on.

There was a time when, on one computer, my engine had a very high loss rate with a set of 64 positions against one other engine. On the other two computers, the win percentage was significantly higher. So I restarted that test with those positions, and this time my engine had a higher win percentage, identical to the other two computers. I tested this a 3rd time, still identical. Not sure what happened during that period of time. I checked my code for a "how to behave under psychological pressure" function but could not find one.

Again, thanks for the help with this.
