Discussion of anything and everything relating to chess playing software and machines.
Moderator: Ras
Norm Pollock
Posts: 1070 Joined: Thu Mar 09, 2006 4:15 pm
Location: Long Island, NY, USA
by Norm Pollock » Sun Sep 23, 2012 5:59 pm
Consider this to be an "anecdotal" observation, not mathematical statistics. The latest Stockfish version 231-64-popcnt appears to be significantly stronger than its predecessor 23-64-sse42.
This is a partial round-robin tournament run in Arena at 1'+1", on 2 CPUs. I left out games between the 2 Stockfish versions because I feel games between close relatives are overly slanted to draws, and would distort the comparison between them. Because the Stockfish versions only played 60 games, they finished behind the others playing 80 games.
Starting positions were randomly selected from an epd file containing 1830 of the most popular positions after 12 full moves in high level OTB games between ELO 2400+ players. Positions were also reversed.
Code:
   Engine                      Score    Ho                    Cr                    Iv                    St                    St                    S-B
1: Houdini_15a_x64             44.0/80  ····················  =0==10====1=1=101===  101====0====1010=1=1  =101001==0=1==0=0===  1=01=1=1===10==11101  1481.0
2: Critter_1.6a_64bit          42.5/80  =1==01====0=0=010===  ····················  =0==0=====10=0=1101=  ==10=1==10=1==01101=  ==1==1==0111====0=11  1419.2
3: Ivanhoe B46fb x64           38.0/80  010====1====0101=0=0  =1==1=====01=1=0010=  ····················  =0===00=10101===0==0  =0=======1==01===1==  1346.2
4: Stockfish-231-64-popcnt-ja  31.5/60  =010110==1=0==1=1===  ==01=0==01=0==10010=  =1===11=01010===1==1  ····················                        1301.2
5: Stockfish-23-64-sse42-ja    24.0/60  0=10=0=0===01==00010  ==0==0==1000====1=00  =1=======0==10===0==                        ····················  987.75
180 of 200 games played
Name of the tournament: B7
Site/ Country: MARGE-PC, United States
Level: Blitz 1/1
Hardware: Intel(R) Pentium(R) CPU G620 @ 2.60GHz with 3.9 GB Memory
Operating system: Windows 7 Home Premium Home Edition Service Pack 1 (Build 7601) 64 bit
PGN-File: C:\Program Files (x86)\Arena\Tournaments\B7.pgn
Website:
E-Mail Address:
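For anyone who wants to turn the two Stockfish scores into a rough Elo figure, here is a minimal sketch. It assumes the standard logistic Elo model and treats the three common opponents as a single pool; the function name `elo_from_score` is only illustrative and is not part of Arena.

```python
import math


def elo_from_score(points, games):
    """Elo difference implied by a score fraction (logistic Elo model)."""
    p = points / games
    if not 0.0 < p < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / p - 1.0)


# Scores taken from the cross table: both versions met the same three opponents.
new_elo = elo_from_score(31.5, 60)  # Stockfish-231-64-popcnt-ja
old_elo = elo_from_score(24.0, 60)  # Stockfish-23-64-sse42-ja

print(f"231-64-popcnt vs. pool: {new_elo:+.0f} Elo")
print(f"23-64-sse42 vs. pool:   {old_elo:+.0f} Elo")
print(f"gap between versions:   {new_elo - old_elo:.0f} Elo")
```

With only 60 games per version the uncertainty on that gap is large, so it fits the "anecdotal" label rather than a measurement.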
IanO
Posts: 498 Joined: Wed Mar 08, 2006 9:45 pm
Location: Portland, OR
by IanO » Mon Sep 24, 2012 8:59 pm
As with the 2.3 release, all of the Mac builds fail with "Illegal instruction" on a Core 2 Duo (OS X 10.6.8). A native gcc build works fine. The clang build fails (clang++ not found), but perhaps my tools are old (clang version 1.7).
Uri Blass
Posts: 10790 Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel
by Uri Blass » Tue Sep 25, 2012 6:21 am
Laskos wrote: Seems same improvement as 2.3 vs. version 2.2.2, about 26 Elo points at ultra-short time control, 2.5s + 0.04s
Code:
  Program                     Score        %      Elo    +   -   Draws
1 Stockfish 2.3.1 JA 64bit  : 2152.5/4000  53.8   3226   8   8   40.5 %
2 Stockfish 2.2.2 JA        : 1847.5/4000  46.2   3200   8   8   40.5 %
10-15 points improvement at blitz time control.
Kai
I do not know how you get a 10-15 point improvement at blitz time control.
I can see no improvement based on the IPON rating list.
After almost 1000 games,
Stockfish 2.3.1 performs 10 Elo worse than Stockfish 2.2.2.
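To put the game counts in perspective, here is a rough sketch of how wide a 95% margin is for a head-to-head Elo difference at 4000 games versus about 1000 games. It assumes independent games, a normal approximation, and the 40.5% draw ratio from the table above; a rating list with many opponents (such as IPON) is computed differently, so this is only an order-of-magnitude guide. The function name `elo_margin_95` is hypothetical.

```python
import math


def elo_margin_95(games, score=0.5, draw_ratio=0.405):
    """Approximate 95% margin (in Elo) for a match result.

    Assumes independent games and uses the delta method on the
    logistic Elo curve. `score` is the score fraction, `draw_ratio`
    the fraction of drawn games.
    """
    win = score - draw_ratio / 2.0
    loss = 1.0 - win - draw_ratio
    if win < 0.0 or loss < 0.0:
        raise ValueError("score and draw_ratio are inconsistent")
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = win + 0.25 * draw_ratio - score ** 2
    std_error = math.sqrt(variance / games)
    # Slope of the Elo curve at this score fraction.
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return 1.96 * std_error * slope


margin_4000 = elo_margin_95(4000, score=0.538)
margin_1000 = elo_margin_95(1000)

print(f"4000 games: +/- {margin_4000:.1f} Elo")
print(f"1000 games: +/- {margin_1000:.1f} Elo")
```

Under these assumptions the margin at 1000 games is about twice the margin at 4000 games, so a difference of 10 Elo after roughly 1000 games may still be within the noise.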
Modern Times
Posts: 3703 Joined: Thu Jun 07, 2012 11:02 pm
by Modern Times » Tue Sep 25, 2012 6:43 am
styx
Posts: 338 Joined: Tue Mar 13, 2012 9:59 pm
Location: Germany
by styx » Tue Sep 25, 2012 1:27 pm
Uri Blass wrote: Laskos wrote: Seems same improvement as 2.3 vs. version 2.2.2, about 26 Elo points at ultra-short time control, 2.5s + 0.04s
Code:
  Program                     Score        %      Elo    +   -   Draws
1 Stockfish 2.3.1 JA 64bit  : 2152.5/4000  53.8   3226   8   8   40.5 %
2 Stockfish 2.2.2 JA        : 1847.5/4000  46.2   3200   8   8   40.5 %
10-15 points improvement at blitz time control.
Kai
I do not know how you get a 10-15 point improvement at blitz time control.
I can see no improvement based on the IPON rating list.
After almost 1000 games,
Stockfish 2.3.1 performs 10 Elo worse than Stockfish 2.2.2.
You cannot compare different tests: they differ in time controls, opponents, hardware, and settings.
But I have to admit, I'm also a little surprised by the preliminary results at IPON, especially the "bad" results against weaker opponents.
bupalo
Posts: 82 Joined: Fri Mar 16, 2012 2:04 pm
by bupalo » Tue Sep 25, 2012 2:45 pm
It's not doing so badly. But we will see its real strength in Graham's 8-core tournament.
Graham Banks
Posts: 44026 Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ
by Graham Banks » Tue Sep 25, 2012 8:52 pm
bupalo wrote: It's not doing so badly. But we will see its real strength in Graham's 8-core tournament.
One tournament isn't anything to go by though. You really need to look at everybody's results to form a more accurate picture.
gbanksnz at gmail.com
Werner
Posts: 2967 Joined: Wed Mar 08, 2006 10:09 pm
Location: Germany
Full name: Werner Schüle
by Werner » Sun Sep 30, 2012 9:51 am
I played this game with Stockfish 2.3.1 from 22.09.2012:
[Event "40 Züge in 8 min"]
[Site "Werner"]
[Date "2012.09.26"]
[Round "10.2"]
[White "Stockfish 2.3.1 x64 1CPU"]
[Black "Deep Rybka 4.1 x64 1CPU"]
[Result "1-0"]
[PlyCount "26"]
[EventDate "2012.??.??"]
1. e4 Nf6 2. e5 Nd5 3. d4 d6 4. Nf3 dxe5 {+0.35/13 13s} 5. Nxe5 {+0.68/20 10s}
Nd7 {+0.35/14 16s (e6)} 6. Nxf7 {+1.13/20 11s (Sf3)} Kxf7 {0.00/15 18s} 7. Qh5+
{+1.13/12 0s} Ke6 {0.00/15 15s} 8. c4 {+1.25/21 15s} N5f6 {+0.22/15 11s (S7f6)}
9. d5+ {+14.74/20 6s} Kd6 {+0.22/15 14s} 10. c5+ {+15.75/23 11s} Nxc5 {+0.22/17
14s} 11. Bf4+ {+15.75/12 0s} Kd7 {+0.22/17 13s} 12. Bb5+ {+16.60/23 9s} c6
{+0.22/1 0s} 13. dxc6+ {+16.56/23 7s} bxc6 {+0.22/18 13s (Ke6)} 1-0
The GUI declared Stockfish the winner!
This does not happen with the version from 23.09.2012.
Werner
ZirconiumX
Posts: 1350 Joined: Sun Jul 17, 2011 11:14 am
Full name: Hannah Ravensloft
by ZirconiumX » Sun Sep 30, 2012 10:55 am
A static eval of the position gives +1.82 from Stockfish.
Black's king is not safe, and I think that is what is producing the large score.
Matthew:out
tu ne cede malis, sed contra audentior ito
Uri Blass
Posts: 10790 Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel
by Uri Blass » Sun Sep 30, 2012 11:19 am
Werner wrote: I played this game with Stockfish 2.3.1 from 22.09.2012:
[Event "40 Züge in 8 min"]
[Site "Werner"]
[Date "2012.09.26"]
[Round "10.2"]
[White "Stockfish 2.3.1 x64 1CPU"]
[Black "Deep Rybka 4.1 x64 1CPU"]
[Result "1-0"]
[PlyCount "26"]
[EventDate "2012.??.??"]
1. e4 Nf6 2. e5 Nd5 3. d4 d6 4. Nf3 dxe5 {+0.35/13 13s} 5. Nxe5 {+0.68/20 10s}
Nd7 {+0.35/14 16s (e6)} 6. Nxf7 {+1.13/20 11s (Sf3)} Kxf7 {0.00/15 18s} 7. Qh5+
{+1.13/12 0s} Ke6 {0.00/15 15s} 8. c4 {+1.25/21 15s} N5f6 {+0.22/15 11s (S7f6)}
9. d5+ {+14.74/20 6s} Kd6 {+0.22/15 14s} 10. c5+ {+15.75/23 11s} Nxc5 {+0.22/17
14s} 11. Bf4+ {+15.75/12 0s} Kd7 {+0.22/17 13s} 12. Bb5+ {+16.60/23 9s} c6
{+0.22/1 0s} 13. dxc6+ {+16.56/23 7s} bxc6 {+0.22/18 13s (Ke6)} 1-0
The GUI declared Stockfish the winner!
This does not happen with the version from 23.09.2012.
I think the conclusion is not to use that GUI.
I do not see that Black ever evaluated the position as lost, so there is no logical reason for a GUI to decide that Stockfish won the game.
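The claim about Black's evaluations can be checked mechanically. The sketch below pulls the score out of each `{score/depth time}` comment in the movetext quoted above and groups the scores by side. The names `MOVETEXT` and `evals_by_side` are illustrative only, the comment after 12... c6 is rejoined where the line wrap split it, and the sign convention of the scores is not stated in the thread, so only absolute values are compared.

```python
import re

# Movetext copied from the quoted game, joined onto logical lines.
MOVETEXT = (
    "1. e4 Nf6 2. e5 Nd5 3. d4 d6 4. Nf3 dxe5 {+0.35/13 13s} "
    "5. Nxe5 {+0.68/20 10s} Nd7 {+0.35/14 16s (e6)} "
    "6. Nxf7 {+1.13/20 11s (Sf3)} Kxf7 {0.00/15 18s} "
    "7. Qh5+ {+1.13/12 0s} Ke6 {0.00/15 15s} "
    "8. c4 {+1.25/21 15s} N5f6 {+0.22/15 11s (S7f6)} "
    "9. d5+ {+14.74/20 6s} Kd6 {+0.22/15 14s} "
    "10. c5+ {+15.75/23 11s} Nxc5 {+0.22/17 14s} "
    "11. Bf4+ {+15.75/12 0s} Kd7 {+0.22/17 13s} "
    "12. Bb5+ {+16.60/23 9s} c6 {+0.22/1 0s} "
    "13. dxc6+ {+16.56/23 7s} bxc6 {+0.22/18 13s (Ke6)} 1-0"
)

# A token is a comment, a move number, a game result, or a move.
TOKEN = re.compile(r"\{[^}]*\}|\d+\.|1-0|0-1|1/2-1/2|\S+")
# Inside a comment: score, search depth, seconds used.
EVAL = re.compile(r"([+-]?\d+\.\d+)/(\d+)\s+(\d+)s")
RESULTS = {"1-0", "0-1", "1/2-1/2"}


def evals_by_side(movetext):
    """Return {'white': [(move, score), ...], 'black': [...]}."""
    evals = {"white": [], "black": []}
    ply = 0
    side = None
    last_move = None
    for tok in TOKEN.findall(movetext):
        if tok.startswith("{"):
            match = EVAL.search(tok)
            if match and side is not None:
                evals[side].append((last_move, float(match.group(1))))
        elif tok in RESULTS or re.fullmatch(r"\d+\.", tok):
            continue
        else:
            ply += 1
            side = "white" if ply % 2 == 1 else "black"
            last_move = tok
    return evals


evals = evals_by_side(MOVETEXT)
black_peak = max(abs(score) for _, score in evals["black"])
white_peak = max(abs(score) for _, score in evals["white"])
print(f"largest |score| reported for Black's moves: {black_peak:.2f}")
print(f"largest |score| reported for White's moves: {white_peak:.2f}")
```

On this game the scores attached to Black's moves never leave a narrow band around zero, while the scores attached to White's moves climb past +14 from move 9, which suggests the GUI acted on White's evaluation alone.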