Hello,
I was following some of the discussion about the Ivanov cheating case and I have a couple of questions for engine programmers.
In the discussions about Ivanov you often read that his moves matched the "first choice of Houdini" in the given position. However, it is rarely defined how the "first choice of Houdini" is selected. I suppose you'd get different moves more often than not if you run Houdini on a slow computer for a few seconds versus running Houdini on a fast computer for several minutes.
First question: assuming you start an engine on infinite analysis, how often would the engine on average change its mind about the best move (assuming it would stabilize on a selection in the long term)?
The discussion about Ivanov for some reason uses Houdini as the default benchmark, but he might have used any other engine or a combination of engines.
Second question: assuming you compare two different engines (for example Houdini and Stockfish) and run them for a good amount of time on decent hardware (so as to avoid the initial noise at low depth), how often would you expect the best moves of the two engines to be the same?
I tried to look at this correlation in the last Houdini/Stockfish game from TCEC, comparing the move predicted by each engine in its PV with the actual move played by the other engine, and I got a relatively low match of about 45%. I wonder if this estimate has any general validity.
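The PV match-rate measurement described above boils down to comparing, position by position, the reply one engine predicted in its PV with the move its opponent actually played. A minimal sketch with made-up moves (the move strings below are purely illustrative):

```python
def pv_match_rate(predicted_moves, actual_moves):
    """Fraction of positions where the predicted reply equals the move played.

    predicted_moves, actual_moves: parallel lists of moves in any consistent
    notation (e.g. UCI strings such as "e2e4").
    """
    if not actual_moves:
        return 0.0
    matches = sum(1 for p, a in zip(predicted_moves, actual_moves) if p == a)
    return matches / len(actual_moves)

# Toy example: 2 of 4 predicted replies match the moves actually played.
predicted = ["e2e4", "g1f3", "d2d4", "f1c4"]
played    = ["e2e4", "b1c3", "d2d4", "c2c3"]
print(pv_match_rate(predicted, played))  # 0.5
```

In a real measurement the two lists would be extracted from the game's analysis output, one entry per position.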
For all examples, assume results are averaged over a large number of positions of different natures (like all positions from a number of recent GM games, so a mix of tactical/forced positions and more strategic positions).
I wonder if anyone has calculated similar stats based on real games; otherwise, the "best guess" of engine programmers would be welcome.
Thanks.
How often does an engine change its best move?
Re: How often does an engine change its best move?
I did my own study of move frequency matches with strong programs and came up with some numbers. It's posted here somewhere, but I don't remember when or where. It was less than a year ago.
Re: How often does an engine change its best move?
There have been several studies done, and papers published. Monty and I did one, Heinz did one, and there was at least one more. We all found that if you search one ply deeper, a program will change its best move somewhere around 15% of the time. Note that this kind of test would not include tactical positions with just one best move, like a mate in N.
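The per-ply change rate the studies measured can be computed from a record of the best move reported at each iterative-deepening depth: count how many depth increments changed the best move, over all positions. A minimal sketch with hypothetical data (the moves and positions are made up for illustration):

```python
def change_rate(best_by_depth):
    """Fraction of depth increments where the reported best move changed.

    best_by_depth: one list per position, holding the best move reported
    at successive iterative-deepening depths.
    """
    changes = total = 0
    for moves in best_by_depth:
        for prev, cur in zip(moves, moves[1:]):
            total += 1
            if prev != cur:
                changes += 1
    return changes / total if total else 0.0

# Hypothetical per-depth best moves for two positions:
data = [["e2e4", "e2e4", "d2d4"],   # one change in two depth increments
        ["g1f3", "g1f3", "g1f3"]]   # no change
print(change_rate(data))  # 0.25
```

Feeding this with real per-depth analysis output over many positions would reproduce the kind of statistic the papers report.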
Re: How often does an engine change its best move?
bob wrote: We all found that if you search one ply deeper, a program will change its best move somewhere around 15% of the time.

Is this true for any depth? It would seem that the percentage would be much higher for lower search depths. I also assume the tests for the paper were done with an empty hash, starting the search from scratch, right?
jm
Re: How often does an engine change its best move?
A few months ago I made the following test, which may provide some data for your second question.

casaschi wrote: how often would you expect the best move of the two engines to be the same?

Traversing a PGN file and skipping the first 10 plies of each game, I built a set of positions. Note that this set is not filtered for tactics, and there are probably more endgame positions than anything else...
Then I evaluated this set with the following engines: Komodo 3, Ivanhoe 999949j and Fruit 2.3.1.
In a first set of 10,000 positions, the agreement rate was:

Code: Select all

depth 1 -> 3950 agreements
depth 2 -> 3089
depth 3 -> 2659
depth 4 -> 2353
depth 5 -> 2151
depth 6 -> 1976

With a second set of 10,000 positions:

depth 1 -> 4019 agreements
depth 2 -> 3123
depth 3 -> 2659
depth 4 -> 2334
depth 5 -> 2149
depth 6 -> 1973

Then a set of 100,000 positions:

depth 1 -> 39582 agreements
depth 2 -> 30849
depth 3 and up -> beyond the user's patience ;-)

At depth 1, when the recaptures (quiescence search) and static evaluation dominate the result, the rate of agreement was 39-40%.
As the search progresses to greater depths, the agreement rate falls below 20%.
Unfortunately, I cannot say what happens at greater depths, but I hope this data helps you get some insight.
Best regards.
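For reference, the counts from the first 10,000-position set convert directly into agreement rates (a quick sketch; the counts are copied from the post above):

```python
# Agreement counts from the first 10,000-position set, per search depth.
counts = {1: 3950, 2: 3089, 3: 2659, 4: 2353, 5: 2151, 6: 1976}
total = 10_000

# Convert counts to agreement rates.
rates = {depth: n / total for depth, n in counts.items()}
for depth, rate in rates.items():
    print(f"depth {depth}: {rate:.1%}")
```

This prints 39.5% at depth 1 falling to 19.8% at depth 6, consistent with the 39-40% and "below 20%" figures quoted in the post.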