From Stockfish Jan 26 (abrok/stockfish)
I wonder, why did they allow that STC Elo of -6.39?
---------------------------
Author: Michael Chaly
Date: Fri Jan 26 20:55:16 2024 +0100
Timestamp: 1706298916
Do more double extensions
Parameter tweak from Black Marlin chess engine. Choose a significantly
lower value that triggers in 95% of cases, compared to the usual 84% in
standard benchmark runs.
Since the introduction by
https://github.com/official-stockfish/S ... ba36aca92e
this constant has only decreased in value over time.
2-16-17-18-21-22-25-26-52-71-75-93-140
Failed STC really fast:
https://tests.stockfishchess.org/tests/ ... 0db026df7b
LLR: -2.94 (-2.94,2.94) <0.00,2.00>
Total: 13216 W: 3242 L: 3485 D: 6489 Elo -6.39
Ptnml(0-2): 50, 1682, 3371, 1471, 34
Was reasonable at LTC:
https://tests.stockfishchess.org/tests/ ... 0db026e210
Elo: 1.18 ± 1.5 (95%) LOS: 94.3%
Total: 50000 W: 12517 L: 12347 D: 25136 Elo +1.18
Ptnml(0-2): 31, 5598, 13579, 5754, 38
nElo: 2.45 ± 3.0 (95%) PairsRatio: 1.03
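For anyone checking the numbers in the commit message: the Elo figures follow from the W/L/D totals via the usual logistic formula, and the LLR bounds of ±2.94 are what Wald's SPRT gives for 5% error rates. Below is a minimal sketch, assuming alpha = beta = 0.05 (not stated in the post, but consistent with the bounds shown). The function names are made up for illustration and are not taken from the testing framework.

```python
import math

def elo_from_wld(wins, losses, draws):
    """Logistic Elo difference implied by a win/loss/draw record."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # average points per game
    return 400.0 * math.log10(score / (1.0 - score))

def sprt_bounds(alpha, beta):
    """Wald SPRT stopping bounds for the log-likelihood ratio."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

# W/L/D totals copied from the STC and LTC results in the commit message.
stc_elo = elo_from_wld(3242, 3485, 6489)
ltc_elo = elo_from_wld(12517, 12347, 25136)
lower, upper = sprt_bounds(0.05, 0.05)  # assumed error rates

print(f"STC Elo {stc_elo:+.2f}, LTC Elo {ltc_elo:+.2f}")
print(f"LLR bounds ({lower:.2f}, {upper:.2f})")
```

This reproduces the -6.39 and +1.18 in the post. The STC run stopped once the LLR hit the lower bound, which is why it "failed really fast" after only 13216 games.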
Recent Stockfish Development Version (Jan26)
Moderator: Ras
-
ernest
- Posts: 2053
- Joined: Wed Mar 08, 2006 8:30 pm
-
Ciekce
- Posts: 197
- Joined: Sun Oct 30, 2022 5:26 pm
- Full name: Conor Anstey
Re: Recent Stockfish Development Version (Jan26)
because it gained at longer time controls
that should be fairly obvious, no?
-
CornfedForever
- Posts: 650
- Joined: Mon Jun 20, 2022 4:08 am
- Full name: Brian D. Smith
Re: Recent Stockfish Development Version (Jan26)
While I am certainly not the person to answer that....perhaps because it was so beneficial for VLTC and VVLTC?
I mean, by its very nature, STC is going to be more shallow and less precise than those two.
In any case, I've always said they are engaging in 'wishcraft' more often than they would like to admit - throwing this and that at the wall, hoping a quarter of an elo sticks, taking something out, putting something (back) in.
That's not to say it's a stupid approach of course. Ponderous for sure; but it gets them from point a to b to c...sometimes with a half a step back and sometimes, like a snail, they get lucky and stumble down the slopes a bit quicker.
-
Whiskers
- Posts: 246
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
Re: Recent Stockfish Development Version (Jan26)
ernest wrote: ↑Wed Feb 07, 2024 3:48 am From Stockfish Jan 26 (abrok/stockfish)
I wonder, why did they allow that STC Elo of -6.39?
---------------------------
Author: Michael Chaly
Date: Fri Jan 26 20:55:16 2024 +0100
Timestamp: 1706298916
Do more double extensions
Parameter tweak from Black Marlin chess engine. Choose a significantly
lower value that triggers in 95% of cases, compared to the usual 84% in
standard benchmark runs.
Since the introduction by
https://github.com/official-stockfish/S ... ba36aca92e
this constant has only decreased in value over time.
2-16-17-18-21-22-25-26-52-71-75-93-140
Failed STC really fast:
https://tests.stockfishchess.org/tests/ ... 0db026df7b
LLR: -2.94 (-2.94,2.94) <0.00,2.00>
Total: 13216 W: 3242 L: 3485 D: 6489 Elo -6.39
Ptnml(0-2): 50, 1682, 3371, 1471, 34
Was reasonable at LTC:
https://tests.stockfishchess.org/tests/ ... 0db026e210
Elo: 1.18 ± 1.5 (95%) LOS: 94.3%
Total: 50000 W: 12517 L: 12347 D: 25136 Elo +1.18
Ptnml(0-2): 31, 5598, 13579, 5754, 38
nElo: 2.45 ± 3.0 (95%) PairsRatio: 1.03
The Stockfish devs care way more about VLTC strength than STC because even the CCRL blitz time control (120" + 1.2") is considered a "very long time control" by testing standards, and pretty much all important chess engine tournaments are played at longer time controls than that (except for CCC blitz, which is "merely" LTC length, but the huge number of threads more than makes up for it).
As for home analysis, a 120" + 1.2" time control is usually not going to correspond to more than 10 seconds spent on a particular move. How long do you let Stockfish think when you're analyzing at home? Probably closer to a minute. So home analysis is also "VLTC time control".
In general VLTC is a much more useful time control to us than STC, so it makes sense to sacrifice some STC elo in the name of VLTC elo.
go and star https://github.com/Adam-Kulju/Patricia!
-
Ciekce
- Posts: 197
- Joined: Sun Oct 30, 2022 5:26 pm
- Full name: Conor Anstey
Re: Recent Stockfish Development Version (Jan26)
CornfedForever wrote: ↑Wed Feb 07, 2024 5:33 am In any case, I've always said they are engaging in 'wishcraft' more often than they would like to admit - throwing this and that at the wall, hoping a quarter of an elo sticks, taking something out, putting something (back) in.
I really do not get the need that people often seem to have, to comment on the SF development process as if they know better.
-
Jouni
- Posts: 3741
- Joined: Wed Mar 08, 2006 8:15 pm
- Full name: Jouni Uski
Re: Recent Stockfish Development Version (Jan26)
The 26.1. version was a regression in the NCM, Pohl and MCERL lists. Is testing below 180 + 1.8 soon useless?
Jouni
-
CornfedForever
- Posts: 650
- Joined: Mon Jun 20, 2022 4:08 am
- Full name: Brian D. Smith
Re: Recent Stockfish Development Version (Jan26)
Ciekce wrote: ↑Wed Feb 07, 2024 1:43 pm
CornfedForever wrote: ↑Wed Feb 07, 2024 5:33 am In any case, I've always said they are engaging in 'wishcraft' more often than they would like to admit - throwing this and that at the wall, hoping a quarter of an elo sticks, taking something out, putting something (back) in.
I really do not get the need that people often seem to have, to comment on the SF development process as if they know better.
Perhaps you should have read the REST of my comment? The OP was asking about the loss of elo at STC. elo comes...and...it goes with these developmental versions...it's in the nature of the testing. You can't argue otherwise. What's important is that the progress continues to slope up in the long run.
-
Ciekce
- Posts: 197
- Joined: Sun Oct 30, 2022 5:26 pm
- Full name: Conor Anstey
Re: Recent Stockfish Development Version (Jan26)
CornfedForever wrote: ↑Thu Feb 08, 2024 7:05 am Perhaps you should have read the REST of my comment? The OP was asking about the loss of elo at STC. elo comes...and...it goes with these developmental versions...it's in the nature of the testing. You can't argue otherwise. What's important is that the progress continues to slope up in the long run.
I did read the rest of your comment. It wasn't relevant to the point I was making, so I didn't quote it.
as you've been told an *embarrassing* amount now, you are taking noise as a regression and need to learn what an error bar is, because apparently it's not within anyone's power to teach you
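The "± 1.5" in the LTC result quoted at the top of the thread is an error bar of this kind. As a rough sketch of where it comes from, the snippet below recomputes it from the Ptnml (game-pair) counts using a plain normal approximation. That method is an assumption for illustration and the testing framework may compute its interval differently. The helper names are hypothetical.

```python
import math

def pentanomial_stats(ptnml):
    """Mean per-game score and its standard error from game-pair counts.

    ptnml[i] is the number of game pairs that scored i/2 points out of 2,
    i.e. per-game scores of 0, 0.25, 0.5, 0.75 and 1.
    """
    pairs = sum(ptnml)
    outcomes = [0.0, 0.25, 0.5, 0.75, 1.0]
    mean = sum(n * x for n, x in zip(ptnml, outcomes)) / pairs
    var = sum(n * (x - mean) ** 2 for n, x in zip(ptnml, outcomes)) / pairs
    return mean, math.sqrt(var / pairs)

def elo(score):
    """Logistic Elo difference for an average per-game score."""
    return 400.0 * math.log10(score / (1.0 - score))

ltc = [31, 5598, 13579, 5754, 38]  # Ptnml(0-2) from the LTC result
mean, stderr = pentanomial_stats(ltc)
z = 1.96  # two-sided 95% normal quantile
low, high = elo(mean - z * stderr), elo(mean + z * stderr)
print(f"Elo {elo(mean):+.2f}, 95% interval ({low:+.2f}, {high:+.2f})")
```

The interval has a half-width of roughly 1.5 Elo and reaches slightly below zero, so a measured difference of a point or so between two versions can sit entirely inside the noise.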
-
Jouni
- Posts: 3741
- Joined: Wed Mar 08, 2006 8:15 pm
- Full name: Jouni Uski
Re: Recent Stockfish Development Version (Jan26)
The 11.2. version now has a confirmed -4 Elo regression. There is a big discussion on Discord about which patch is causing it. One guess: "triple ext regresses"?
Jouni