lkaufman wrote: Also, is it possible to see whether the ratings of the top few single-core engines would be much different if only pairings among them were rated?
Put them into a database programme, make a subset of the PGN containing only games between the opponents you are interested in, and run bayeselo against that subset to answer your question.
The majority of the 40/4 games for Komodo 5 were sent in by me. I used cutechess-cli as the tournament manager. I also contributed ~1000 games for both Komodo 3 and Komodo 4. In those cases, I used Arena 2.0.1.
Adam Hair wrote: The majority of the 40/4 games for Komodo 5 were sent in by me. I used cutechess-cli as the tournament manager. I also contributed ~1000 games for both Komodo 3 and Komodo 4. In those cases, I used Arena 2.0.1.
Thanks. What was the hardware used for these tests? Were games adjudicated by TBs or played to 50 move rule? Can you suggest any other variable that might account for our different results?
Here are all of the relevant details that I can think of:
OS: Windows XP 64-bit
CPU: Intel QX6700 at 3.05 GHz
Time Control: 40/3'
GUI: cutechess-cli
Hash: 128 MB
EGTB: None
Starting Positions: PGN of ~17,900 positions 4 moves deep
Resign: off
Draws: game adjudicated as a draw if both engines' scores are within 50 centipawns after 250 moves. I do not remember whether cutechess applies the 50-move rule (I think it does).
Adam Hair wrote: Here are all of the relevant details that I can think of:
OS: Windows XP 64-bit
CPU: Intel QX6700 at 3.05 GHz
Time Control: 40/3'
GUI: cutechess-cli
Hash: 128 MB
EGTB: None
Starting Positions: PGN of ~17,900 positions 4 moves deep
Resign: off
Draws: game adjudicated as a draw if both engines' scores are within 50 centipawns after 250 moves. I do not remember whether cutechess applies the 50-move rule (I think it does).
Two comments:
1. I believe your cpu is pre-sse4. Since Komodo really suffers on non-sse4 machines (compared to other engines), that probably accounts for the bulk of the 20 elo. Do your other testers have sse4 machines or not?
2. We learned that it is very important for testers to use the 50 move rule. If they do not, engines may make ridiculous moves when they think the 50 move rule is about to apply. You should verify that it does use the 50 move rule and switch if it does not.
lkaufman wrote:
Two comments:
1. I believe your cpu is pre-sse4. Since Komodo really suffers on non-sse4 machines (compared to other engines), that probably accounts for the bulk of the 20 elo. Do your other testers have sse4 machines or not?
Yes, two 40/4 testers have SSE4 CPUs. And our results for Komodo 4 showed no measurable difference between non-SSE4 and SSE4. Though, if we played 20,000 games, it is possible that a statistically significant difference would be found.
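To put rough numbers on the sample-size point, here is a back-of-the-envelope sketch of the 95% error margin for a single head-to-head match between two nearly equal engines. The draw ratio is an assumed figure for illustration only, and the normal approximation used here is not how a rating list computes its error bars.

```python
import math

def elo_margin_95(n_games, draw_ratio):
    """Approximate 95% Elo margin for a match between two roughly equal engines.

    Assumes independent games and an expected score near 50%.
    """
    # Per-game standard deviation of the score (1, 0.5 or 0) when wins and
    # losses are equally likely and draws occur with probability draw_ratio.
    sigma_game = 0.5 * math.sqrt(1.0 - draw_ratio)
    # Slope of the logistic Elo curve at a 50% score: d(Elo)/d(score).
    slope = 400.0 / (math.log(10) * 0.25)
    return 1.96 * slope * sigma_game / math.sqrt(n_games)

# Assumed draw ratio of 35%, purely for illustration.
for n in (1000, 20000):
    print(n, round(elo_margin_95(n, 0.35), 1))
```

Under these assumptions the margin is roughly 17 Elo at 1,000 games and roughly 4 Elo at 20,000 games, so an effect of a few Elo could stay hidden in the smaller sample.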
lkaufman wrote:
2. We learned that it is very important for testers to use the 50 move rule. If they do not, engines may make ridiculous moves when they think the 50 move rule is about to apply. You should verify that it does use the 50 move rule and switch if it does not.
Thanks for your answers and your testing!
I have confirmed that cutechess does use the 50 move limit. I was 99% certain before; now I am 100% certain since at least 1 game was adjudicated as a draw because of the 50 move limit.
lkaufman wrote:
1. I believe your cpu is pre-sse4. Since Komodo really suffers on non-sse4 machines (compared to other engines), that probably accounts for the bulk of the 20 elo. Do your other testers have sse4 machines or not?
You keep saying this, but where is the proof? We found no difference between Komodo 4 SSE and non-SSE at 40/40, and we ran hundreds of games with each. It is incredibly wishful thinking to believe SSE is worth nearly 20 Elo. What evidence do you have of that?
lkaufman wrote:
1. I believe your cpu is pre-sse4. Since Komodo really suffers on non-sse4 machines (compared to other engines), that probably accounts for the bulk of the 20 elo. Do your other testers have sse4 machines or not?
You keep saying this, but where is the proof? We found no difference between Komodo 4 SSE and non-SSE at 40/40, and we ran hundreds of games with each. It is incredibly wishful thinking to believe SSE is worth nearly 20 Elo. What evidence do you have of that?
It's easy to measure the relative speedup from using SSE for different engines. I don't have exact figures now, and it varied a bit over different versions, but maybe we get 7% or so more than others from it. Based on the CEGT ratings for Komodo 64 bit and 32 bit (six different versions), we average 1.3 elo for each percent speedup at the 40/4 level, so I guess SSE is close to ten elo. I still have to account for the remainder.
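The arithmetic behind that estimate, written out as a small sketch. The 7% and 1.3 figures are the approximate values quoted in the post above, the 20 Elo gap is the discrepancy under discussion in this thread, and the function names are made up for this example.

```python
def sse_elo_estimate(extra_speedup_percent, elo_per_percent):
    """Elo attributed to SSE: extra speedup (in percent) times Elo per percent."""
    return extra_speedup_percent * elo_per_percent

def unexplained_elo(observed_gap_elo, explained_elo):
    """Part of the observed rating gap that the speedup does not account for."""
    return observed_gap_elo - explained_elo

sse_elo = sse_elo_estimate(7.0, 1.3)        # about 9 Elo from SSE
remainder = unexplained_elo(20.0, sse_elo)  # about 11 Elo still unexplained
print(round(sse_elo, 1), round(remainder, 1))
```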