Origin of noise in chess engines matches

mcostalba · Post by **mcostalba** » Mon Aug 10, 2009 8:07 pm

bob wrote: We have an error bar because we are taking a statistical sample from the set of all possible games. Each chess position is different, and requires a different search depth and/or evaluation to choose the right move. Programs are not perfect. So you need a sample large enough to be sure that you didn't just luck into positions you could play well, or positions where you play poorly.

IMHO it is not a question of programs being imperfect. The crucial point is that the search depth an engine reaches before it makes a move is finite (and it cannot be otherwise).

These are the two fundamental points, IMHO:

1) An engine must stop the search at a finite depth, and this depth, for a given position, may not be enough to find the best move.

There is nothing you can do about this. In some positions, for example one with a hanging queen, a depth of 1 is enough to find the best move, but in other positions you may need 20, 50 or 100 plies to find the theoretically best move, so an engine is realistically forced to stop the search before reaching the depth needed to spot it.

But this is not all: there is another unavoidable, intrinsic point.

2) The opponent, by the very fact that he searches _AFTER_ you have made your move, needs one ply LESS than you to reach your prior search depth. I don't know if that is clear, so let me put it a different way.

Suppose an engine plays a match against itself at a fixed search depth, say 20.

Suppose you have searched to depth 20 and then you make your move bm. Your move bm is the best move you could find up to depth 20; let's call the corresponding ponder move pm.

Now your opponent has to reply. If the opponent searches at fixed depth 19, then your opponent (who is running the same engine as you) will find as his best move EXACTLY pm, i.e. the move you were pondering on.

But, and this is the crucial point, your opponent can search up to depth 20, not 19, and so he _could_ find a better refutation of bm: although both engines search at fixed depth 20, the fact that your opponent searches _AFTER_ you have made the move gives him an INTRINSIC and unavoidable extra ply over you.
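
The extra-ply argument can be sketched on a toy game. Everything in this sketch is an assumption made for illustration: the game tree has a fixed branching factor, the static evaluation is a deterministic pseudo-random number derived from the move path, and the search is a plain fixed-depth negamax with no pruning, extensions or hash table. It is not how any real engine searches; it only shows that a reply searched at depth - 1 reproduces pm, while a reply searched at the full depth looks one ply further than the original search did.

```python
import hashlib

BRANCHING = 3  # hypothetical toy game: every position has 3 legal moves


def static_eval(path):
    """Deterministic pseudo-random score in [-100, 100] for the side to move."""
    digest = hashlib.sha256(bytes(path)).digest()
    return int.from_bytes(digest[:2], "big") % 201 - 100


def negamax(path, depth):
    """Plain fixed-depth negamax. Returns (score, principal variation)."""
    if depth == 0:
        return static_eval(path), []
    best_score, best_line = None, None
    for move in range(BRANCHING):
        score, line = negamax(path + (move,), depth - 1)
        score = -score  # the child's score is from the opponent's point of view
        if best_score is None or score > best_score:
            best_score, best_line = score, [move] + line
    return best_score, best_line


def extra_ply_demo(root, depth):
    """Return (pm, reply at depth - 1, reply at full depth) after bm is played."""
    _, pv = negamax(root, depth)          # first player searches to `depth`
    bm, pm = pv[0], pv[1]                 # best move and expected reply
    after_bm = root + (bm,)
    _, pv_shallow = negamax(after_bm, depth - 1)  # same horizon as first player
    _, pv_full = negamax(after_bm, depth)         # one ply beyond that horizon
    return pm, pv_shallow[0], pv_full[0]


if __name__ == "__main__":
    for start in range(BRANCHING):
        pm, shallow, full = extra_ply_demo((start,), 6)
        print(f"root {start}: pm={pm} reply@depth-1={shallow} reply@depth={full}")
```

In this toy tree the reply found at depth - 1 always equals pm, because it repeats exactly the sub-search the first player already did. The full-depth reply may or may not differ, depending on the position.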

So these two unavoidable, intrinsic points are what create what we call noise, and they are the main reason why we need thousands of games to tell which of two engines is stronger.

All the other conditions (timing issues, varying search depth, books, etc.) only exacerbate the problem, but the noise is intrinsic and would exist even without all these aggravating conditions.

I don't know if I have been clear enough, please tell me if something is not well expressed.

Uri Blass · Post by **Uri Blass** » Mon Aug 10, 2009 8:56 pm

It is known that the origin of noise is the fact that engines play chess, and not perfectly, so you give no new information.

Of course engines play different positions when they play chess and depth d of position A is not depth d of position B.

Note that engine X at depth d-1 can also win against engine X at depth d, because the engine may be lucky enough to find a better move at the smaller depth; the fact that the engine converges to a better move at higher depth does not mean that the quality of the move always improves when you increase the depth.

Again, the problem is that engines play chess.
You can avoid the noise by not playing chess, but I see nothing in your words that helps to reduce the noise.

Uri

mcostalba · Post by **mcostalba** » Mon Aug 10, 2009 10:09 pm

Uri Blass wrote: Note that engine X at depth d-1 can also win against engine X at depth d, because the engine may be lucky enough to find a better move at the smaller depth; the fact that the engine converges to a better move at higher depth does not mean that the quality of the move always improves when you increase the depth.

Yes, this is true; it is what I have formalized with the formula

e(rbm, d) - e(bm, d) > 0   is not true for any d above fd

It means that until a certain minimum disambiguation depth is reached (which, for certain positions, could be so high as to be practically unreachable by any engine at the moment, for example the starting position itself) there is no assurance that the sign above will be positive. In other words, for some positions there could very possibly exist a depth d such that:

e(rbm, d) - e(bm, d) > 0
e(rbm, d+1) - e(bm, d+1) < 0
e(rbm, d+2) - e(bm, d+2) > 0

This is the formalization of your concept, if I have understood correctly.
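
The sign condition above can be checked mechanically once per-depth scores are available. The sketch below assumes one reading of the notation: e(m, d) is the score the engine gives move m at depth d, rbm is the truly best move and bm is a competing move. The function names and the score lists are made up for illustration and do not come from any engine.

```python
def sign_flip_depths(e_rbm, e_bm):
    """Depths (1-based) where the sign of e(rbm, d) - e(bm, d) differs
    from the sign at the previous depth."""
    diffs = [a - b for a, b in zip(e_rbm, e_bm)]
    flips = []
    for i in range(1, len(diffs)):
        if (diffs[i] > 0) != (diffs[i - 1] > 0):
            flips.append(i + 1)
    return flips


def stable_from(e_rbm, e_bm):
    """Smallest depth from which the difference stays positive up to the
    last depth examined, or None if it is not positive at the last depth.
    This is only a lower bound on the true disambiguation depth, since the
    sign could still flip beyond the depths that were searched."""
    diffs = [a - b for a, b in zip(e_rbm, e_bm)]
    depth = None
    for d in range(len(diffs), 0, -1):
        if diffs[d - 1] > 0:
            depth = d
        else:
            break
    return depth


if __name__ == "__main__":
    # Hypothetical scores for depths 1..6, invented for this example.
    e_rbm = [10, 5, 20, 8, 30, 35]
    e_bm = [5, 12, 15, 14, 22, 20]
    print(sign_flip_depths(e_rbm, e_bm))
    print(stable_from(e_rbm, e_bm))
```

With the invented scores above the difference alternates in sign through depth 4 and stays positive from depth 5 on, which is the oscillating pattern the three inequalities describe.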

Unlike you, I think that a quantitative characterization of this intrinsic noise could be useful.

For example, the concept that, for a given position, an engine always stays with the best move above a minimum disambiguation depth could be translated into the experimental evidence that at very long time controls engines of similar strength tend to draw.

I am still trying to find a way to _measure_ this noise experimentally and so plot its distribution. I have some ideas on how to do it, but they are still half-baked.

bob · Post by **bob** » Tue Aug 11, 2009 11:11 pm

ernest wrote:"That is BayesElo output. +/- 4 as I said."

???
My (theoretical) formula for the Standard Deviation of the score, in an engine-engine match (N games, W wins, L losses) is
SD = Sqrt ((W+L)/4N)
(ok, there is a (N-1)/N term, rounded to 1 )

You ignored the rest of my post. I am not seeing 33% draws. The BayesElo output showed 22% I believe.

ernest · Post by **ernest** » Wed Aug 12, 2009 1:30 am

bob wrote:I am not seeing 33% draws. The BayesElo output showed 22% I believe.

Still something not right, Bob!

With 22% draws, the formula gives 3.1 Elo (at 2 SD), not 4
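
For reference, here is a minimal sketch of how such an error bar could be computed. It rests on several assumptions that are not stated in the thread: W and L are taken as game counts, the engines are treated as roughly equal in strength (so each decisive game deviates from the mean by half a point), the per-game variance (W+L)/(4N) is divided by N again to get the SD of the average score, and the score is converted to Elo with the slope of the logistic Elo curve at 50%. The number of games is not given in this thread, so the 40,000 used in the example is purely hypothetical.

```python
import math

# Slope of the logistic Elo curve at a 50% score: Elo points per unit of score.
ELO_PER_SCORE_AT_50 = 400.0 / (math.log(10) * 0.25)


def score_sd(wins, losses, games):
    """SD of the average score per game, assuming evenly matched engines."""
    per_game_variance = (wins + losses) / (4.0 * games)
    return math.sqrt(per_game_variance / games)


def elo_error_bar(draw_ratio, games, n_sd=2.0):
    """Approximate +/- Elo interval at n_sd standard deviations."""
    decisive = (1.0 - draw_ratio) * games
    sd = score_sd(decisive / 2.0, decisive / 2.0, games)
    return n_sd * sd * ELO_PER_SCORE_AT_50


if __name__ == "__main__":
    games = 40_000  # hypothetical match length, not taken from the thread
    for draw_ratio in (0.22, 0.33):
        bar = elo_error_bar(draw_ratio, games)
        print(f"draws {draw_ratio:.0%}: +/- {bar:.2f} Elo at 2 SD")
```

Under these assumptions a higher draw ratio gives a narrower error bar, since drawn games contribute no variance around a 50% score.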

bob · Post by **bob** » Wed Aug 12, 2009 2:04 am

ernest wrote:
bob wrote:I am not seeing 33% draws. The BayesElo output showed 22% I believe.
Still something not right, Bob!
With 22% draws, the formula gives 3.1 Elo (at 2 SD), not 4

I'm only quoting BayesElo. Since this is integer math, 3.1 is not possible. It might well round up for all I know which would make it dead on.
