Perhaps. Even so, a 17 elo difference at THIS level of play I would think is akin to a 100+ elo difference at a lower level...say maybe 2100 and 2200 elo. Nothing to sneeze at, especially given how little time it has been since SF 14 came out.
Surprise: Official release version of Stockfish 14.1
Moderator: Ras
-
Cornfed
- Posts: 511
- Joined: Sun Apr 26, 2020 11:40 pm
- Full name: Brian D. Smith
Re: Surprise: Official release version of Stockfish 14.1
-
lkaufman
- Posts: 6279
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Surprise: Official release version of Stockfish 14.1
I ran a 350 game match between SF14.1 and SF14 at standard Rapid time control (15 min plus 10 sec). Score was 11 to 4 with 335 draws (3 move/six ply most popular openings). So plus seven elo for 14.1. That's pretty consistent with 17 elo at bullet chess, though the margin of error is large. It is indeed getting to be almost impossible to show elo gains with "good" openings at non-blitz time controls; the draw margin in chess is too large with good openings, and the error rates at Rapid have dropped too low.
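For anyone who wants to check the arithmetic behind the "plus seven elo" figure, here is a minimal sketch. It assumes the usual logistic Elo model and a normal approximation for the error bar; the function names are made up for illustration and are not taken from any testing framework.

```python
import math


def elo_from_score(score):
    """Elo difference implied by an average score under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins, losses, draws, z=1.96):
    """Return (elo, low, high) for a match seen from the winner's side.

    The interval uses a normal approximation on the per-game score,
    which is an assumption and only reasonable for a few hundred games.
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (
        wins * (1.0 - score) ** 2
        + losses * (0.0 - score) ** 2
        + draws * (0.5 - score) ** 2
    ) / n
    half_width = z * math.sqrt(var / n)
    return (
        elo_from_score(score),
        elo_from_score(score - half_width),
        elo_from_score(score + half_width),
    )


elo, low, high = match_elo(wins=11, losses=4, draws=335)
print(f"{elo:+.1f} elo, approx. 95% interval [{low:+.1f}, {high:+.1f}]")
```

With 11 wins, 4 losses and 335 draws the estimate lands close to +7 elo, and the approximate interval just barely includes zero, which fits the remark that the margin of error is large.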
Komodo rules!
-
rlsuth
- Posts: 322
- Joined: Wed Mar 08, 2006 9:37 pm
Re: Surprise: Official release version of Stockfish 14.1
What am I missing here? When I download this 14.1 version I get exactly the same file as 14. Same dates, same sizes, everything.
stockfish_14_x64_avx2.exe 48 966 656 2021-07-06
stockfish_14_x64_avx2.exe 48 966 656 2021-07-06
-
rlsuth
- Posts: 322
- Joined: Wed Mar 08, 2006 9:37 pm
Re: Surprise: Official release version of Stockfish 14.1
Never mind, looks like the link I was using is now pointing to the new version.
-
Jouni
- Posts: 3770
- Joined: Wed Mar 08, 2006 8:15 pm
- Full name: Jouni Uski
Re: Surprise: Official release version of Stockfish 14.1
14.1 is about 15% slower than 14 in nps but gets up to 5 plies more depth in fast games!
Jouni
-
Cornfed
- Posts: 511
- Joined: Sun Apr 26, 2020 11:40 pm
- Full name: Brian D. Smith
Re: Surprise: Official release version of Stockfish 14.1
In the end...does it matter?
I mean, in the end, does 14.1 beat 14 heads up by scoring more points over a bunch of games in traditional chess...in both LTC and STC?
That seems to be how the developers judge 'successful tweaks'.
I personally think that, with the highest level of computer chess being virtually a draw...and chess itself, well played, being drawish in nature, testers should use a different formula to judge 'success': one that takes into account that 'drawish nature' and that Black always starts at a slight disadvantage, more pronounced with 'sketchy' defenses. A scoring/testing system that is more subservient to the real nature of the game.
Something like: 1.0 pts for a win with White and 1.2 pts for a win with Black...and the draws favor Black slightly - say White gets 0 for a draw and Black gets .01...or, to be more traditional, .5 for a draw with White and .505 for a draw with Black. Something along those lines.
I just think testing for 'elo gain' to decide improvements (and frankly in engine vs engine tournaments - where you play a lot of games!) is too subject to the opening choices and 'perhaps' time controls.
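To make the proposal concrete, here is a rough sketch of the "more traditional" variant (1.0 for a White win, 1.2 for a Black win, .5 for a draw with White, .505 for a draw with Black). The function name and the result/color strings are hypothetical, and the weights are just the ones floated above, not anything a testing framework actually uses.

```python
def color_weighted_points(result, color,
                          white_win=1.0, black_win=1.2,
                          white_draw=0.5, black_draw=0.505):
    """Points for one game from one engine's point of view.

    result: "win", "draw" or "loss"; color: "white" or "black".
    All weights are illustrative defaults taken from the post above.
    """
    if result == "win":
        return white_win if color == "white" else black_win
    if result == "draw":
        return white_draw if color == "white" else black_draw
    return 0.0  # a loss scores nothing with either color


# Example: one engine's results over four games.
games = [("win", "white"), ("draw", "black"), ("draw", "white"), ("loss", "black")]
total = sum(color_weighted_points(result, color) for result, color in games)
print(f"total points: {total:.3f}")
```

One thing the sketch makes visible: under these weights a game no longer hands out exactly one point in total, so totals are only comparable between engines that played the same number of games with each color.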
-
Uri Blass
- Posts: 11129
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Surprise: Official release version of Stockfish 14.1
Cornfed wrote: ↑Sun Oct 31, 2021 4:46 pm
In the end...does it matter? [...]
I think that if we are talking about a different scoring system, then mating faster should get more points.
For example, if you win you get 1 point minus the number of moves divided by 10000, and if you lose you get the number of moves divided by 10000.
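A small sketch of that move-count idea, for illustration only. The post does not say what a draw should score, so the 0.5 used here is an assumption, and the function name is made up.

```python
def move_adjusted_points(result, moves, scale=10000, draw_points=0.5):
    """Points for one game when shorter wins are worth slightly more.

    result: "win", "draw" or "loss"; moves: length of the game in moves.
    draw_points=0.5 is an assumption; the proposal only covers wins and losses.
    """
    if result == "win":
        return 1.0 - moves / scale
    if result == "loss":
        return moves / scale
    return draw_points


# A 40-move win is worth a little more than an 80-move win.
print(move_adjusted_points("win", 40), move_adjusted_points("win", 80))
# The loser of a long game gets a small consolation.
print(move_adjusted_points("loss", 80))
```

Winner and loser still share exactly one point per decisive game, so the adjustment only shifts a tiny fraction of a point toward whoever wins faster or resists longer.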
-
Jouni
- Posts: 3770
- Joined: Wed Mar 08, 2006 8:15 pm
- Full name: Jouni Uski
Re: Surprise: Official release version of Stockfish 14.1
Yes, depth is not everything. Position from the SF forum:
[fen]4k3/8/8/8/8/8/8/3BKN2 w - - 0 1[/fen]
Analysis by Stockfish 14.1:
...
1.Bb3 Kf8 2.Ke2 Kg7 3.Ke3 Kh6 4.Kf4 Kg6 5.Ke5 Kg7 6.Ne3 Kg6 7.Bc4 Kg7 8.Kf5 Kh6 9.Bb3 Kg7 10.Kg5 Kh7 11.Kf6 Kh8 12.Nf5 Kh7 13.Bd1 Kg8
+- (3.39) Depth: 83/26 00:01:35 809mN
1.Bb3 Kf8 2.Ke2 Kg7 3.Ke3 Kh6 4.Kf4 Kg6 5.Ke5 Kg7 6.Kf5 Kh8 7.Kf6 Kh7 8.Bc2+ Kh8 9.Nd2 Kg8 10.Bb3+ Kh8 11.Ne4 Kh7 12.Ng5+ Kh8
+- (3.39) Depth: 84/26 00:01:43 879mN
1.Bb3 Kf8 2.Ke2 Kg7 3.Ke3 Kh6 4.Kf4 Kg6 5.Ke5 Kg7 6.Kf5 Kh8 7.Kf6 Kh7 8.Ne3 Kh8 9.Nf5 Kh7 10.Nd6 Kh8
+- (3.40) Depth: 85/25 00:01:54 970mN
1.Bb3 Kf8 2.Ke2 Kg7 3.Ke3 Kh6 4.Kf4 Kg6 5.Ke5 Kg7 6.Kf5 Kf8 7.Ne3 Ke7 8.Ke5 Kd8 9.Kd6 Ke8 10.Ke6 Kf8
+- (3.40) Depth: 86/27 00:02:02 1038mN
SF 14 sees a mate. 14.1 has some kind of issue.
Jouni
-
Cornfed
- Posts: 511
- Joined: Sun Apr 26, 2020 11:40 pm
- Full name: Brian D. Smith
Re: Surprise: Official release version of Stockfish 14.1
Uri Blass wrote: ↑Sun Oct 31, 2021 5:56 pm
I think that if we talk about different scoring system then mating faster should get more points.
For example if you win you get 1 points minus number of moves divided by 10000 and if you lose you get number of moves divided by 10000.
A win is a win. How 'fast' (style points) you mate should be irrelevant.
Think of it this way: your 'speed' criterion could neuter 'playing style'. Say you have a choice at move 17 between an exchange sac and a clear, totally positional continuation (think Tal vs Karpov), and either is likely to end up winning. The latter might be more 'sure' in that it offers Black less 'wiggle room' and is thus more difficult for Black to play against.
Besides, many games are 'called/adjudicated' and not actually played 'to mate'.