... at the usual place:
http://www.inwoba.de
In case another interesting engine is released I might postpone this test.
Bye
Ingo
Stockfish test on 2 cores with PONDER ON is running
Moderator: Ras
-
- Posts: 1539
- Joined: Thu Mar 09, 2006 2:02 pm
-
- Posts: 3656
- Joined: Wed Mar 08, 2006 8:15 pm
- Full name: Jouni Uski
Re: Stockfish test on 2 cores with PONDER ON is running
Very interesting, because in my test Stockfish scales better with 2 CPUs than R3!
Jouni
-
- Posts: 10892
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Stockfish test on 2 cores with PONDER ON is running
Interesting.
The only program that scores more than 50% against Stockfish 2T at this point in time is Rybka3 1T, and not Rybka3 2T.
STOCK171_2T_1
Stockfish 1.7.1 JA 2T - Rybka 3 mp 2T (2942) 10.5 - 10.5 50.00% Perf=2942
Stockfish 1.7.1 JA 2T - Rybka 3 mp (2898) 9.5 - 11.5 45.24% Perf=2865
Stockfish 1.7.1 JA 2T - Naum 4.2 2T (2881) 13.5 - 7.5 64.29% Perf=2983
Stockfish 1.7.1 JA 2T - Deep Shredder 12 2T (2833) 15.0 - 6.0 71.43% Perf=2992
Stockfish 1.7.1 JA 2T - Naum 4.2 (2819) 15.5 - 5.5 73.81% Perf=2998
Stockfish 1.7.1 JA 2T - Deep Shredder 12 (2800) 13.5 - 7.5 64.29% Perf=2902
Stockfish 1.7.1 JA 2T - Komodo64 1.0 JA (2780) 16.5 - 4.5 78.57% Perf=3005
Stockfish 1.7.1 JA 2T - Zappa Mexico II 2T (2773) 14.5 - 6.5 69.05% Perf=2912
Stockfish 1.7.1 JA 2T - Zappa Mexico II (2710) 17.0 - 4.0 80.95% Perf=2961
Stockfish 1.7.1 JA 2T - Protector 1.3.2 JA (2699) 16.0 - 5.0 76.19% Perf=2901
Stockfish 1.7.1 JA 2T - Onno-1-1-1 (2684) 17.5 - 2.5 87.50% Perf=3022
Stockfish 1.7.1 JA 2T - Spark-0.3 VC(a) (2673) 18.0 - 2.0 90.00% Perf=3054
Stockfish 1.7.1 JA 2T - Deep Sjeng WC2008 (2673) 17.0 - 4.0 80.95% Perf=2924
194.0 - 77.0 71.59% Perf=2942
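The percentage and Perf columns in the list above can be checked by hand. The following is only a sketch, assuming the GUI uses the usual logistic Elo relation (opponent rating plus 400 * log10(points for / points against)); the thread does not say how the GUI actually computes Perf, and the function names are made up for illustration.

```python
import math

def score_percent(points_for, points_against):
    """Score as a percentage of all points played in the match."""
    return 100.0 * points_for / (points_for + points_against)

def performance_rating(opponent_elo, points_for, points_against):
    """Opponent rating plus the Elo difference implied by the score.

    Assumes the logistic Elo model; undefined if either side scored zero.
    """
    return opponent_elo + 400.0 * math.log10(points_for / points_against)

# Rows taken from the list above
print(round(score_percent(9.5, 11.5), 2))          # 45.24 (vs Rybka 3 mp)
print(round(performance_rating(2898, 9.5, 11.5)))  # 2865  (vs Rybka 3 mp)
print(round(performance_rating(2881, 13.5, 7.5)))  # 2983  (vs Naum 4.2 2T)
```

Under this assumption most rows are reproduced exactly, and a few come out one point off (for example 18.0 - 2.0 against 2673), possibly because the displayed opponent ratings are rounded.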
-
- Posts: 2053
- Joined: Wed Mar 08, 2006 8:30 pm
Re: Stockfish test on 2 cores with PONDER ON is running
Uri Blass wrote: The only program that has more than 50% against stockfish 2T

You are getting carried away, Uri!!!
How can you declare anything after matches with 21 games???...
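To put a number on the 21-game objection, here is a rough sketch of the uncertainty on a single match score. It assumes independent games and uses the normal approximation with the standard error sqrt(p*(1-p)/n), which is the worst case where every game is decisive; draws only make the true spread smaller. The function name is hypothetical.

```python
import math

def score_interval(points, games, z=1.96):
    """Approximate 95% interval for the true score fraction.

    Conservative: p*(1-p) is the largest possible per-game variance
    for a given mean score p (reached when there are no draws).
    """
    p = points / games
    margin = z * math.sqrt(p * (1.0 - p) / games)
    return max(0.0, p - margin), min(1.0, p + margin)

# Stockfish 1.7.1 JA 2T vs Rybka 3 mp after 21 games: 9.5 - 11.5
low, high = score_interval(9.5, 21)
print(f"{low:.3f} to {high:.3f}")
```

With these assumptions the interval runs from roughly 24% to 67%, so a 45% result after 21 games is compatible with either engine being the stronger one.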

-
- Posts: 3656
- Joined: Wed Mar 08, 2006 8:15 pm
- Full name: Jouni Uski
Re: Stockfish test on 2 cores with PONDER ON is running
Not quite at Rybka level yet, but close:
449.5 - 189.5 70.34% Perf=2932 (R3 2942)
But if an engine wins all its matches, it is the strongest even without rating calculations!
BTW Ingo, how is your testing done? Do you start all matches manually?
And how are live scores calculated?
Jouni
-
- Posts: 10892
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Stockfish test on 2 cores with PONDER ON is running
ernest wrote:
Uri Blass wrote: The only program that has more than 50% against stockfish 2T
You are getting carried away, Uri!!!
How can you declare anything after matches with 21 games???...

I can clearly show the facts
The facts are still the same after 50 games or 51 games against both programs
Stockfish 1.7.1 JA 2T - Rybka 3 mp 2T (2942) 26.0 - 24.0 52.00% Perf=2955
Stockfish 1.7.1 JA 2T - Rybka 3 mp (2898) 24.5 - 26.5 48.04% Perf=2885
-
- Posts: 1539
- Joined: Thu Mar 09, 2006 2:02 pm
Re: Stockfish test on 2 cores with PONDER ON is running
Hello
Jouni wrote: Not quite at Rybka level yet, but close:
449.5 - 189.5 70.34% Perf=2932 (R3 2942)
But if an engine wins all its matches, it is the strongest even without rating calculations!

Stockfish 1.7.1 won all SINGLE matches, even vs. Rybka 3, but WITH a proper calculation it is behind R3 ...

Jouni wrote: BTW Ingo, how is your testing done? Do you start all matches manually?
And how are live scores calculated?
Jouni

Uff, long story short ... : I set up ONE tourney and let 4 (at the moment) Quads crunch on it at the same time (for the single test I can even start 2 GUIs simultaneously on each Quad). The tourney is stored on an identically mapped drive for all the computers (even most engines are installed just once, from one computer, on that mapped drive! All the other computers/GUIs do not need an additional installation - if there is no copy protection). The trick is the Shredder Classic 4 GUI: it supports such things, including Elo calculations, which are very basic and much less sophisticated than Bayeselo. It might turn out in the end that a close Elo winner becomes a close Elo loser with a proper Bayes calculation.
The problem is the CB engines. These I have to play in the CB GUI, which is really painful, as you can only start ONE GUI (2 if you use different users) and you can only play on one computer/user - you are not able to use multiple computers for ONE tourney.
If I had to run all of that manually like in the CB GUI, I would stop testing to such an extent - it would be much too much work! Right now you can see for yourself that I start a tourney and leave it alone for some days (just checking from time to time whether an engine has crashed)
... and that was the short story!
Bye
Ingo
-
- Posts: 1539
- Joined: Thu Mar 09, 2006 2:02 pm
Re: Stockfish test on 2 cores with PONDER ON is running
Hello
Uri Blass wrote:
I can clearly show the facts
The facts are still the same after 50 games or 51 games against both programs
Stockfish 1.7.1 JA 2T - Rybka 3 mp 2T (2942) 26.0 - 24.0 52.00% Perf=2955
Stockfish 1.7.1 JA 2T - Rybka 3 mp (2898) 24.5 - 26.5 48.04% Perf=2885

Yes, but keep in mind that there is another fact: R3 and R3 2T are just 44 Elo apart under my conditions. Especially if Stockfish 1.7.1 2T is in between those two, there is a certain likelihood that it will win a short match of just 100 games vs the presumed stronger one and lose vs the lower rated version. No problem here for me ... and it is still not over!
Bye
Ingo
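The "certain likelihood" mentioned above can be roughed out with a small calculation. This is only a sketch under loud assumptions that are not from the thread: the engine sits exactly midway between the two Rybka versions (22 Elo from each), games are independent, and there are no draws. Draws would reduce the spread, so the upset chance here is probably overstated. The function names are hypothetical.

```python
from math import comb

def expected_score(elo_diff):
    """Expected score per game under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def prob_winning_match(elo_diff, games):
    """Chance of scoring more than half the points in a match.

    Exact binomial sum, assuming every game is decisive (no draws).
    """
    p = expected_score(elo_diff)
    first_winning_total = games // 2 + 1
    return sum(
        comb(games, wins) * p ** wins * (1.0 - p) ** (games - wins)
        for wins in range(first_winning_total, games + 1)
    )

print(round(expected_score(22), 3))   # 0.532
print(prob_winning_match(-22, 50))    # 22 Elo weaker, 50-game match
print(prob_winning_match(-22, 100))   # 22 Elo weaker, 100-game match
```

Under these assumptions the side that is 22 Elo weaker still wins a 50- or 100-game match somewhere around 20 to 30 percent of the time, which fits the point that short matches between engines this close say little about the ranking.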
-
- Posts: 10892
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Stockfish test on 2 cores with PONDER ON is running
Jouni wrote: Not quite at Rybka level yet, but close:
449.5 - 189.5 70.34% Perf=2932 (R3 2942)
But if an engine wins all its matches, it is the strongest even without rating calculations!
BTW Ingo, how is your testing done? Do you start all matches manually?
And how are live scores calculated?
Jouni

It is not clear after 1000 games (and the gap is only 2 Elo)
713.5 - 286.5 71.35% Perf=2940
Stockfish 1.7.1 JA 2T - Rybka 3 mp 2T (2942) 41.5 - 35.5 53.90% Perf=2969
Stockfish 1.7.1 JA 2T - Rybka 3 mp (2898) 40.5 - 36.5 52.60% Perf=2916
-
- Posts: 1539
- Joined: Thu Mar 09, 2006 2:02 pm
Re: Stockfish test on 2 cores with PONDER ON is running
Hi
Uri Blass wrote: It is not clear after 1000 games (and the gap is only 2 Elo)

These two engines are too close to become 'clear' in the ranking.
Don't look at a single Elo point. It might very well be that Bayeselo later on throws this 'guess' into another order.
Bye
Ingo