Shashin theory
Moderator: Ras
-
- Posts: 44598
- Joined: Sun Feb 26, 2006 10:52 am
- Location: Auckland, NZ
Re: Shashin theory
Viz wrote: ↑Wed Sep 25, 2024 10:05 pm
I recall Stockfish dev vs some previous Stockfish from the start position played on fishtest.
And it had the most insane variance of any test, literally some workers showing +100 Elo and some -100 Elo, and this was stable for these workers.
The problem was that some were underclocked and some were overclocked, so some ran at 10.1+0.101 and some were running at 9.9+0.099,
and this was the sole reason why it would go from -100 to +100 Elo from the same position on a single core.
And there you show some "wow, look at this, I made a checkbox and games are completely different, must be a lot of work" - hell, no.
You can achieve the same if not a bigger result by not changing engines at all but changing time / game (as my example shows, the change doesn't even need to be big), hash, threads and other stuff.
This is constantly shown at the so-called alt-finals at navratil; this guy just replays TCEC finals on more powerful hardware but with the same nps ratio. A game pair win for Stockfish at TCEC can change to a game pair win for Leela, and a game pair win for Leela at TCEC can change to a game pair win for SF. Not even talking about such "minor" things as double wins becoming double draws and double draws becoming double wins - this also happens a lot.
And trust me, this would also happen if he used exactly the same hardware for both engines as TCEC does.
I guess I was thinking that it was similar to being able to make personalities in Chessmaster by changing the values of aspects like material values, king safety, pawn structure, passed pawn awareness, mobility, etc.
gbanksnz at gmail.com
-
- Posts: 7381
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: Shashin theory
Peter Berger wrote: ↑Wed Sep 25, 2024 8:10 pm
I had read this one, but I didn't think it necessarily amounted to much. If you look at the strange README on Shashin in Shash, where do these +-, += etc. even come from to put you into Major Petrosian or Minor Tal mode? From Stockfish would be my guess.
So I assumed ShashChess and Stockfish might be completely identical at depth=1 without this meaning anything too interesting or new.
And then maybe it changes some search parameters based on low level depths at deeper plies. This can't be too much of a deal anyway, as in my personal tests ShashChess and Stockfish behave in a very similar way when it is about chess moves chosen also at higher depths - not to forget ShashChess loses nearly no strength compared to Stockfish.
I didn't look at the source code at all as I am no programmer, so I don't expect to be able to detect anything the ICGA wouldn't have seen anyway during their two month investigation of things.
We share the same impression on the implementation of the "Shashin theory" here - my personal bet is that this is mostly bullshit - but now me, I am out, as I simply lack the knowledge to do more than "suspecting".
My opinion as well, for a simple reason: if Shashin theory were so special, it would have been the default setting.
90% of coding is debugging, the other 10% is writing bugs.
-
- Posts: 7381
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: Shashin theory
Graham Banks wrote: ↑Wed Sep 25, 2024 11:08 pm
Viz wrote: ↑Wed Sep 25, 2024 10:05 pm
I recall Stockfish dev vs some previous Stockfish from the start position played on fishtest.
And it had the most insane variance of any test, literally some workers showing +100 Elo and some -100 Elo, and this was stable for these workers.
The problem was that some were underclocked and some were overclocked, so some ran at 10.1+0.101 and some were running at 9.9+0.099,
and this was the sole reason why it would go from -100 to +100 Elo from the same position on a single core.
And there you show some "wow, look at this, I made a checkbox and games are completely different, must be a lot of work" - hell, no.
You can achieve the same if not a bigger result by not changing engines at all but changing time / game (as my example shows, the change doesn't even need to be big), hash, threads and other stuff.
This is constantly shown at the so-called alt-finals at navratil; this guy just replays TCEC finals on more powerful hardware but with the same nps ratio. A game pair win for Stockfish at TCEC can change to a game pair win for Leela, and a game pair win for Leela at TCEC can change to a game pair win for SF. Not even talking about such "minor" things as double wins becoming double draws and double draws becoming double wins - this also happens a lot.
And trust me, this would also happen if he used exactly the same hardware for both engines as TCEC does.
I guess I was thinking that it was similar to being able to make personalities in Chessmaster by changing the values of aspects like material values, king safety, pawn structure, passed pawn awareness, mobility, etc.
That technique is no longer possible with NNUE, because you only get an eval value from the NN with no strings attached. Of course you can fiddle the score with HCE code and make it a personality, a bit of the world upside down.
90% of coding is debugging, the other 10% is writing bugs.
-
- Posts: 3410
- Joined: Sat Feb 16, 2008 7:38 am
- Full name: Peter Martan
Re: Shashin theory
Why not simply compare output at single positions?
Tactical single-best-move game changers from Tal positions (with a clear advantage for the side to move) that show a better time to solution with High Tal checked are numerous and well known to all users who test with such positions now and then, both as single positions and in suites. Most of the "classical" collections contain mainly such winners, more or less difficult in terms of the hardware time needed to solve them, so the easiest way was to show better results on such suites, especially those that also contain composed studies. Regardless of how much that means for game playing, differences become visible most quickly and clearly that way.
And there are also positions with single best moves for defending out of a disadvantage for the side to move. Just one example of such a Petrosian position from corr. chess, found in two games of 2021, one of them Petrov M.-Sikorsky H., bm 19...Qd7:
Code: Select all
ShashChess 36 by A. Manzo, F. Ferraguti, K. Kiniama and Stockfish developers (see AUTHORS file)
position fen r1bq2rk/ppp5/3p1nnb/P1PPp1pp/1P2Pp2/2N2P2/3BBNPP/R2Q2RK b - - 0 19
setoption name hash value 2048
setoption name High Petrosian value true
go depth 40
...
info depth 32 seldepth 39 multipv 1 score cp -66 wdl 6 738 256 upperbound nodes 28074653 nps 1115135 hashfull 90 tbhits 0 time 25176 pv g8g7 a1a2
...
info depth 32 seldepth 44 multipv 1 score cp -61 wdl 7 768 225 lowerbound nodes 38169000 nps 1115922 hashfull 130 tbhits 0 time 34204 pv d8d7
...
info depth 32 seldepth 44 multipv 1 score cp -58 wdl 7 784 209 nodes 39897910 nps 1115120 hashfull 136 tbhits 0 time 35779 pv d8d7 a1c1 g5g4 c5d6 c7d6 c3b5 g4g3 h2g3 d7d8 f2h3 g8g7 e2f1 a7a6 b5a3 f6g8 g3f4 g6f4 d1e1 d8f6 d2f4 h6f4
...
info depth 40 seldepth 52 multipv 1 score cp -64 wdl 6 749 245 nodes 119065944 nps 1116302 hashfull 400 tbhits 0 time 106661 pv d8d7 a1c1 g5g4 c5d6 c7d6 c3b5 g4g3 h2g3 d7d8 f2h3 g8g7 c1c8 a8c8 b5a7 c8a8 a7c6 b7c6 d5c6 f6h7 b4b5 h7g5 b5b6 g5h3 g2h3 d8c8 g3g4 c8c6 g4g5 c6d7 d1f1 g6h4 a5a6 h6g5 e2b5 d7f7 b6b7
bestmove d8d7 ponder a1c1
So at depth 32, after about 34 seconds, the best move is found and then stays stable in the output.
And default:
Code: Select all
ShashChess 36 by A. Manzo, F. Ferraguti, K. Kiniama and Stockfish developers (see AUTHORS file)
position fen r1bq2rk/ppp5/3p1nnb/P1PPp1pp/1P2Pp2/2N2P2/3BBNPP/R2Q2RK b - - 0 19
setoption name hash value 2048
go depth 40
...
info depth 40 seldepth 62 multipv 1 score cp -65 wdl 6 746 248 nodes 273285222 nps 1074656 hashfull 712 tbhits 0 time 254300 pv g8g7 a1a2 c8d7 a5a6 b7a6 e2a6 g5g4 c5c6 d7c8 a6c8 d8c8 f2d3 a7a6 d1e2 g6h4 d2e1 h4g6 b4b5 a6b5 a2a8 c8a8 c3b5 h6g5 e1f2 h8h7 f2a7 h7h6 g1b1 g4f3 e2f3 a8c8 b1a1 c8g8 f3e2 g5h4 a7g1
bestmove g8g7 ponder a1a2
It's just one example of a move that is maybe not surely game changing, yet it's a clear single best move. And not even that would be the point to be discussed here: seeing a clear difference in output lines over pondering time, single-threaded on the same hardware with the same amount of hash, proves a difference in search and eval between the two settings, doesn't it?
Peter.
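For anyone who wants to check the "stable from about 34 seconds" reading mechanically, here is a minimal Python sketch that scans UCI info lines and reports the time stamp from which the first PV move no longer changes. The helper names parse_info and first_stable_time are made up for this illustration, and the pv of the last two sample lines is cut to its first two moves to keep the sample short; all other fields are copied from the High Petrosian log above.

```python
# Sample lines from the High Petrosian log (pv shortened in the last two lines).
LOG = [
    "info depth 32 seldepth 39 multipv 1 score cp -66 wdl 6 738 256 upperbound "
    "nodes 28074653 nps 1115135 hashfull 90 tbhits 0 time 25176 pv g8g7 a1a2",
    "info depth 32 seldepth 44 multipv 1 score cp -61 wdl 7 768 225 lowerbound "
    "nodes 38169000 nps 1115922 hashfull 130 tbhits 0 time 34204 pv d8d7",
    "info depth 32 seldepth 44 multipv 1 score cp -58 wdl 7 784 209 "
    "nodes 39897910 nps 1115120 hashfull 136 tbhits 0 time 35779 pv d8d7 a1c1",
    "info depth 40 seldepth 52 multipv 1 score cp -64 wdl 6 749 245 "
    "nodes 119065944 nps 1116302 hashfull 400 tbhits 0 time 106661 pv d8d7 a1c1",
]


def parse_info(line):
    """Return depth, time (ms), nodes, cp score and first PV move of an info line."""
    tokens = line.split()
    if not tokens or tokens[0] != "info":
        return None
    info = {}
    # Tokens are matched exactly, so "seldepth" and "multipv" are not confused
    # with "depth" and "pv".
    for key in ("depth", "time", "nodes"):
        if key in tokens:
            info[key] = int(tokens[tokens.index(key) + 1])
    if "score" in tokens:
        i = tokens.index("score")
        if tokens[i + 1] == "cp":
            info["cp"] = int(tokens[i + 2])
    if "pv" in tokens:
        info["best"] = tokens[tokens.index("pv") + 1]
    return info


def first_stable_time(lines, move):
    """Time (ms) from which `move` stays the first PV move until the end, else None."""
    stable_since = None
    for line in lines:
        info = parse_info(line)
        if info is None or "best" not in info:
            continue
        if info["best"] == move:
            if stable_since is None:
                stable_since = info.get("time")
        else:
            stable_since = None  # the engine switched away, start over
    return stable_since


if __name__ == "__main__":
    print(first_stable_time(LOG, "d8d7"))
```

Fed with the complete output of both runs, the same function would give a time to solution per setting, which is the number being compared here. It only looks at multipv 1 style output and ignores whether a line is an upperbound or lowerbound report.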
-
- Posts: 694
- Joined: Sun Nov 08, 2015 11:10 pm
- Full name: Bojun Guo
Re: Shashin theory
Nobody said there is no difference; people are just saying there is less than 3% difference between ShashChess and Stockfish regardless of settings, while there is more than 30% difference between Stockfish's own versions under the same conditions.
Not to mention that across various other engines, such a ratio is still consistent. So whatever difference there is, it is pretty insignificant. Even the code that makes such differences isn't really original; it is taken from Crystal. I suspect using just Crystal would solve those positions even better, and what does any of that have anything to do with Shashin theory?
-
- Posts: 693
- Joined: Sun Aug 04, 2013 1:19 pm
Re: Shashin theory
noobpwnftw wrote: ↑Thu Sep 26, 2024 1:10 am
Nobody said there is no difference; people are just saying there is less than 3% difference between ShashChess and Stockfish regardless of settings, while there is more than 30% difference between Stockfish's own versions under the same conditions.
Not to mention that across various other engines, such a ratio is still consistent. So whatever difference there is, it is pretty insignificant. Even the code that makes such differences isn't really original; it is taken from Crystal. I suspect using just Crystal would solve those positions even better, and what does any of that have anything to do with Shashin theory?
Try the Top Chess Engines Testsuite 2024 v2.
https://www.mediafire.com/file/cypaz2t0 ... 2.pgn/file
Stockfish 16.1 (20%) 23/115
Stockfish 17 (44%) 51/115
ShashChess 35 High Tal + MultiPV=4 + MCTS ON + MCTSThreads = 2 (80%) 93/115
It's very clear that ShashChess is much better suited to finding something when you have a position on the board which has something.
If you have a position on the board which has nothing, then Stockfish will have the more precise evaluation in +0.01 steps.
People must decide what they prefer to use.
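A rough way to see how much a suite of 115 positions can say on its own is to put a Wilson score interval around each solve rate. This is only a sketch under stated assumptions: every position is treated as an independent trial of equal weight, z = 1.96 stands for roughly 95% confidence, and the function name wilson_interval is just for illustration. It says nothing about whether the engine settings behind the three results are comparable.

```python
import math


def wilson_interval(solved, total, z=1.96):
    """Wilson score interval for a solve rate of solved/total positions."""
    p = solved / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Solved counts as posted above, out of 115 positions each.
RESULTS = {
    "Stockfish 16.1": 23,
    "Stockfish 17": 51,
    "ShashChess 35 High Tal + MultiPV=4 + MCTS": 93,
}

if __name__ == "__main__":
    for name, solved in RESULTS.items():
        low, high = wilson_interval(solved, 115)
        print(f"{name}: {solved}/115, interval {100 * low:.1f}% to {100 * high:.1f}%")
```

With so few positions each interval spans well over ten percentage points, so only large gaps between engines stand out from the sampling noise.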
-
- Posts: 694
- Joined: Sun Nov 08, 2015 11:10 pm
- Full name: Bojun Guo
Re: Shashin theory
Not only is it a sample of just 115, but your definition of "suitable" is also entirely unfounded: even the very basics like threads and MPV are inconsistent in your result, and whether whatever move they may find is right for you is entirely subjective. Even with that, why not compare it with Crystal, where the relevant code was yoinked from?
None of that addresses the issue that the difference is not of any significance, though, and again, what does it have anything to do with Shashin theory?
-
- Posts: 4638
- Joined: Tue Apr 03, 2012 4:28 pm
- Location: Midi-Pyrénées
- Full name: Christopher Whittington
Re: Shashin theory
Powerful take on Shashin to add to the discussion ...
-
- Posts: 3410
- Joined: Sat Feb 16, 2008 7:38 am
- Full name: Peter Martan
Re: Shashin theory
noobpwnftw wrote: ↑Thu Sep 26, 2024 1:10 am
Nobody said there is no difference; people are just saying there is less than 3% difference between ShashChess and Stockfish regardless of settings, while there is more than 30% difference between Stockfish's own versions under the same conditions.
That depends on the conditions, and according to those you have to relate "differences" in performance, or similarities, to the test-specific error bars. Each and every test has its inherent statistical confidence, depending on the openings (if it's about game playing), the hardware TC and the pool of engines running.
noobpwnftw wrote: ↑Thu Sep 26, 2024 1:10 am
Not to mention that across various other engines, such a ratio is still consistent. So whatever difference there is, it is pretty insignificant. Even the code that makes such differences isn't really original; it is taken from Crystal. I suspect using just Crystal would solve those positions even better, and what does any of that have anything to do with Shashin theory?
The ratios are as consistent as the tests are. That doesn't mean they are transitive to each other at all. You may have consistent tests that are statistically significant on their own but cannot be compared at all to other ones that are also consistent and significant. E.g. you may have perfectly consistent tests of positional testing "only", with exactly determined error bars of their own, and still (of course) cannot compare them in any way to game playing tests with certain openings, hardware TC and engine pool. You cannot even compare game playing tests "only" with one another if the openings, hardware TCs and/or engine pools differ too much from each other.
That all said, now to your question about
< what does any of that have anything to with Shashin theory? >
We are (I am) here and now talking about Shashin theory as Andrea Manzo translated it into Stockfish code and made it usable with UCI parameters, as its core is shown on the ShashChess GitHub site with the table of correlations between evals of positions and the Tal to Petrosian classification.
What I wanted to show with the single position above was the difference in output per time with and without the Petrosian option checked, for a position that its eval classifies as a Petrosian position.
To see some more difference with some more positions, I have 1111 of those here, which are to a very large part Tal positions, and ran them as a suite with a TC of 1"/pos. in MEA with 4 threads of a 16x3.5GHz CPU and 32Mb hash: 2 settings of ShashChess together with SF dev. (240917) as well as SF17 and SF16.1, ShashChess once with High Tal false (default) and once with High Tal true. The error bar for this one suite, engine pool and hardware TC is about 1.5% of Total Rate. In the Hash column, for Lc0 the NN cache replaces the hash of the A-B engines, and for this one engine a 3070ti Nvidia GPU is used too.
Code: Select all
EPD  : 1111.epd
Time : 1000 ms
                                                     Max  Total  Time  Hash
   Engine                Score  Found   Pos   ELO  Score   Rate    ms    Mb  Cpu
 1 ShashChess36HTon      29051    980  1111  3568  36615  79.3%  1000    32    4
 2 Crystal240503         28659    961  1111  3523  36615  78.3%  1000    32    4
 3 Stockfish0917         28253    971  1111  3474  36615  77.2%  1000    32    4
 4 ShashChess36          28011    961  1111  3442  36615  76.5%  1000    32    4
 5 Stockfish17           27754    959  1111  3411  36615  75.8%  1000    32    4
 6 SF16.1                27639    945  1111  3379  36815  75.1%  1000    32    4
 7 Lc0v0.31.1-6147500PT  27056    939  1111  3325  36615  73.9%  1000   100    2
 8 Dragon3.3             24568    898  1111  3001  36815  66.7%  1000    32    4
Created with MEA by Ferdinand Mosca
https://drive.google.com/file/d/1kVmfmK ... sp=sharing
, regards
Peter.
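To make the Total Rate column easier to check, here is a small Python sketch that recomputes it as Score divided by Max Score from the MEA table above, and prints the gap between the two ShashChess settings next to the stated error bar of about 1.5%. Reading Total Rate as Score / Max Score is my assumption about how MEA derives it (the posted numbers agree with it), and the names ROWS and total_rate are only for this illustration.

```python
# (engine, score, max score, posted Total Rate in %) copied from the MEA table.
ROWS = [
    ("ShashChess36HTon", 29051, 36615, 79.3),
    ("Crystal240503", 28659, 36615, 78.3),
    ("Stockfish0917", 28253, 36615, 77.2),
    ("ShashChess36", 28011, 36615, 76.5),
    ("Stockfish17", 27754, 36615, 75.8),
    ("SF16.1", 27639, 36815, 75.1),
    ("Lc0v0.31.1-6147500PT", 27056, 36615, 73.9),
    ("Dragon3.3", 24568, 36815, 66.7),
]

ERROR_BAR = 1.5  # percent of Total Rate, as stated for this suite and setup


def total_rate(score, max_score):
    """Total Rate in percent, assumed to be Score / Max Score."""
    return 100.0 * score / max_score


if __name__ == "__main__":
    rates = {name: total_rate(score, max_score) for name, score, max_score, _ in ROWS}
    for name, rate in rates.items():
        print(f"{name:<22}{rate:6.2f}%")
    gap = rates["ShashChess36HTon"] - rates["ShashChess36"]
    print(f"High Tal on vs default: {gap:.2f} points (stated error bar about {ERROR_BAR})")
```

How a gap between two engines should be weighed against a per-engine error bar is left open here; the sketch only puts the two numbers side by side.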
-
- Posts: 3410
- Joined: Sat Feb 16, 2008 7:38 am
- Full name: Peter Martan
Re: Shashin theory
Edit- time over.
And in the listing of the suites the 1111 positions were taken from, I forgot to mention STS (Strategic Test Suite): about 590 positions from it are used too, but "only" those that don't have too many multiple solutions, so that they fit the single-best-move solutions of the other sources. The suite can therefore also be used in GUIs instead of MEA; multiple solutions, listed as more than one move in bm syntax, are then simply adjudicated equally as solved. Only in MEA, of course, does the points-per-solution principle work.
< what does any of that have anything to with Shashin theory? > in my posting above should read <anything to do with> instead; I mistyped it, leaving out one <do>, yet in the direct quote of the posting I answered it was correct anyhow.
Peter.