Well, you don't really disagree with the statement, either ("no sign so far").
I'd hope they'd get stronger in the endgame & middlegame before they get stronger in the opening.
The non-deterministic behavior of Stockfish is caused by a lot of factors: thread handling by Windows and its own actions during the games (the main factors), fluctuations in processor timings, heat effects on processor parameters, etc. Even cosmic rays have an effect on the actual results. The sensitivity of AB engines to disturbances is enhanced if there are many selectable moves with similar evaluations.
Uri Blass wrote: ↑Sat Jul 13, 2019 6:00 am ...
In the match between stockfish and lc0 without opening book it was usually stockfish who chose a different move in the opening and not lc0 so it seems that alpha beta engines do not tend to repeat the same move again and again.
Stockfish is not even deterministic on the first move and may choose 1.e4 or 1.d4 with the same number of nodes, based on luck, when it uses more than one core.
If you let Stockfish play 1000 games against itself with the same time control and no book, you will probably not find even 2 identical games.
I would like TCEC to repeat a tournament like the following without opening book.
https://cd.tcecbeta.club/archive.html?s ... l=1&game=1
Stockfish played e4 in game 1 and d4 in game 3, and you will not find 2 identical games; the guilty program is mainly Stockfish.
Note that maybe some tool to make stockfish more deterministic can help to increase stockfish's playing strength and it should have some preference between specific moves.
Maybe some small bonus for moves that are statistically good moves can help (so it will generally have an opinion that Ne4-d6 for white is 0.04 better than Ne4-d2, and the scores of lines that start with the root move Ne4-d6 will get a bonus of 0.04 pawns relative to lines that start with Ne4-d2).
Of course you need a good table of bonuses for possible moves that increase the playing strength and I do not think that a table only based on the move is the best idea and you may consider other factors for example if the position is opening or endgame.
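To make the bonus idea concrete, here is a minimal sketch of how such a table could break ties at the root. Everything in it is hypothetical: the function name, the phase labels and the 0.04 entry are illustrative only, and nothing here is taken from Stockfish's code.

```python
from typing import Dict, Tuple

def pick_root_move(scores: Dict[str, float],
                   bonus: Dict[Tuple[str, str], float],
                   phase: str) -> str:
    """Return the root move with the highest (search score + bonus), in pawns.

    `bonus` is a hypothetical table keyed by (game phase, move).
    Moves are visited in sorted order, so exact ties are resolved
    the same way on every run.
    """
    def adjusted(move: str) -> float:
        return scores[move] + bonus.get((phase, move), 0.0)
    return max(sorted(scores), key=adjusted)

# Two root moves that the search scores identically.
scores = {"Ne4-d6": 0.00, "Ne4-d2": 0.00}
# Illustrative bonus: prefer Ne4-d6 by 0.04 pawns in the middlegame.
bonus = {("middlegame", "Ne4-d6"): 0.04}
best = pick_root_move(scores, bonus, "middlegame")
```

With equal search scores the bonus decides the move, but a search score that is better by more than 0.04 still wins, so the table only acts as a tie-breaker between near-equal moves.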
If you can build an AB engine without PSQT, do it.
Zenmastur wrote: ↑Sat Jul 13, 2019 1:43 pm ...
The PSQT in AB engines is part of the problem. Eliminating its influence on opening play would be a boon if you ask me. My opinion of the “built-in” book of NN engines is that the engine would be better off if it were removed, by not allowing the NN to actually take part in the opening and using a MAB book in its place. That way the NN can concentrate its limited abilities on the middle game, i.e., no training in the opening. Let the book routine take care of that aspect of the game.
I think a correct self-play test to determine the Elo gain from development needs a different kind of start positions than those needed for making an opening book.
Zenmastur wrote: ↑Sat Jul 13, 2019 1:43 pm ...
As I already pointed out, this time can be hidden in the testing framework for AB engines. In the case where it wasn't desirable to use test games to produce the book, it could still be constructed using very fast games like 1+0.1 seconds, or even faster if a node count in the search is used instead of a time constraint. Test games on the SF testing framework can proceed at 1000 games per minute. With the same resources (e.g. cores * time), perhaps 2,000 to 5,000 games per minute could be played. While it would be somewhat time consuming, when the framework is empty this would be a good use for its available resources.
There is no need to create a new book just because a new version of the program has come out. In fact, you can make this type of book using MC playouts without any engine, just the book routine and the MC routines. It will, of course, converge to good solutions much faster and be much smaller in size if an engine is involved in its creation. Even if the engine uses only 100 milliseconds or less per move it will still produce a good book, and the time required to get the book up and running would be considerably shortened.
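For readers unfamiliar with a MAB (multi-armed bandit) book, here is a rough sketch of one book node choosing among moves. The post does not say which bandit rule is meant, so UCB1 is my own assumption, and `BookNode`, `simulate` and the win probabilities are made up for illustration. The random playout stands in for a fast engine game or an MC playout.

```python
import math
import random

class BookNode:
    """One book position: per-move visit counts and accumulated results."""

    def __init__(self, moves):
        self.visits = {m: 0 for m in moves}
        self.total = {m: 0.0 for m in moves}

    def select(self, c=1.4):
        # Try every move once before applying the UCB1 formula.
        for move, n in self.visits.items():
            if n == 0:
                return move
        total_n = sum(self.visits.values())

        def ucb(move):
            mean = self.total[move] / self.visits[move]
            explore = c * math.sqrt(math.log(total_n) / self.visits[move])
            return mean + explore

        return max(self.visits, key=ucb)

    def update(self, move, result):
        # result: 1.0 win, 0.5 draw, 0.0 loss from the mover's point of view.
        self.visits[move] += 1
        self.total[move] += result

def simulate(node, win_prob, games, seed=0):
    """Stand-in for real games: each move wins with a made-up probability."""
    rng = random.Random(seed)
    for _ in range(games):
        move = node.select()
        result = 1.0 if rng.random() < win_prob[move] else 0.0
        node.update(move, result)
    return node

node = simulate(BookNode(["e4", "d4", "a4"]),
                {"e4": 0.55, "d4": 0.50, "a4": 0.20}, games=2000)
```

The point of the bandit rule is that clearly bad moves stop being sampled quickly while close alternatives keep getting games, which is why such a book does not need rebuilding for every engine version.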
Yes, but later I did a multiPV=5 and Bd7 gave a non-zero value.
zullil wrote: ↑Sat Jul 13, 2019 3:34 pm
Thanks for the analysis. Does your output represent Stockfish with MultiPV=3 to depth 50? Is it possible that 14...Bd7 also evaluates as 0.00? That seems uncertain from the content you've posted.
Zenmastur wrote: ↑Fri Jul 12, 2019 10:38 pm
It looks like 14... Bd7 wasn't the best move.
It's been so long since I posted on this forum I forgot how to post a board!
( [Stockfish 010719 64 BMI2] 50:+0.00 14...Bg5 15.h4 Be3+ 16.Kh2 f6 17.Nf3 Bxc1 18.Rxc1 Qb6 19.Rb1 Kh8 20.Qe2 Bd7 21.Bh3 Qc7 22.Rg1 Ne7 23.Nd2 b6 24.Rbf1 h6 25.Bg4 Rg8 26.Bh5 Raf8 27.b3 Kh7 28.a4 Qc6 29.Nc4 Bc8 30.Qf3 Qc7 31.Nd2 Bd7 32.Ra1 Qb7 33.Nc4 Qc6 34.Bg4 Qc7 35.Nd2 Rc8 36.Bh5 Rcf8 37.Qd1 Qc6 38.g4 g6 39.Bxg6+ Nxg6 40.fxg6+ Rxg6 ) ( [Stockfish 010719 64 BMI2] 50:+0.00 14...a5 15.Nf3 a4 16.g4 f6 17.Rf2 g5 18.h4 h6 19.Bf1 Kf7 20.Rh2 Bd7 21.Bd2 Qc7 22.Qc1 Ke8 23.Rb1 Kd8 24.Be1 Kc8 25.a3 b6 26.h5 Bd6 27.c4 Na5 28.Nd2 Nb3 29.Qd1 Na5 30.Qc2 Bc6 31.Qc1 Nb3 32.Qc2 ) ( [Stockfish 010719 64 BMI2] 50:+0.00 14...b5 15.a4 bxa4 16.Rxa4 a5 17.h4 Ba6 18.Ra1 Qc7 19.Kh2 Rfb8 20.Nc4 Qd8 21.Rf3 Qc7 )
Regards,
Zenmastur
OK, so Stockfish considers 14...Bd7 worse than at least three other moves. Still, I can't imagine it's bad enough to cost the game, i.e., the correct evaluation of the position after 14...Bd7 is still 0.00 (=DRAW). On the other hand, take a look at 21...Bb4, if you get a chance. That appears to be a very significant error. Perhaps sufficient to convert DRAW into LOSS.
Zenmastur wrote: ↑Sat Jul 13, 2019 9:27 pm
Yes, but later I did a multiPV=5 and Bd7 gave a non-zero value.
zullil wrote: ↑Sat Jul 13, 2019 3:34 pm
Thanks for the analysis. Does your output represent Stockfish with MultiPV=3 to depth 50? Is it possible that 14...Bd7 also evaluates as 0.00? That seems uncertain from the content you've posted.
Zenmastur wrote: ↑Fri Jul 12, 2019 10:38 pm
It looks like 14... Bd7 wasn't the best move.
It's been so long since I posted on this forum I forgot how to post a board!
( [Stockfish 010719 64 BMI2] 50:+0.00 14...Bg5 15.h4 Be3+ 16.Kh2 f6 17.Nf3 Bxc1 18.Rxc1 Qb6 19.Rb1 Kh8 20.Qe2 Bd7 21.Bh3 Qc7 22.Rg1 Ne7 23.Nd2 b6 24.Rbf1 h6 25.Bg4 Rg8 26.Bh5 Raf8 27.b3 Kh7 28.a4 Qc6 29.Nc4 Bc8 30.Qf3 Qc7 31.Nd2 Bd7 32.Ra1 Qb7 33.Nc4 Qc6 34.Bg4 Qc7 35.Nd2 Rc8 36.Bh5 Rcf8 37.Qd1 Qc6 38.g4 g6 39.Bxg6+ Nxg6 40.fxg6+ Rxg6 ) ( [Stockfish 010719 64 BMI2] 50:+0.00 14...a5 15.Nf3 a4 16.g4 f6 17.Rf2 g5 18.h4 h6 19.Bf1 Kf7 20.Rh2 Bd7 21.Bd2 Qc7 22.Qc1 Ke8 23.Rb1 Kd8 24.Be1 Kc8 25.a3 b6 26.h5 Bd6 27.c4 Na5 28.Nd2 Nb3 29.Qd1 Na5 30.Qc2 Bc6 31.Qc1 Nb3 32.Qc2 ) ( [Stockfish 010719 64 BMI2] 50:+0.00 14...b5 15.a4 bxa4 16.Rxa4 a5 17.h4 Ba6 18.Ra1 Qc7 19.Kh2 Rfb8 20.Nc4 Qd8 21.Rf3 Qc7 )
Regards,
Zenmastur
The thing is that after the game is played I can go offline and analyze what happened. I can "emulate" people's hardware by depth (and everything matches, so these people are really reaching those nodes per second and depth). I don't think I could have done better than them with better hardware. I can only conclude that if I had much stronger hardware (at least jumping from 2000 kn/s to 6000 kn/s) I'd only play around 30 Elo better. This can only mean that when optimal lines are played by both sides, Stockfish really sucks at using the extra resources, and you get this instead of the 50 Elo per doubling perceived with generic openings.
Zenmastur wrote: ↑Sat Jul 13, 2019 8:57 am
I'd go as far as saying "under most circumstances." It's unbelievable that I'm performing only 30 Elo worse than people with fast Intel i7s. I think this is another problem hidden by generic book testing, where sub-optimal positions are reached and in those the extra resources are used effectively, but when the weak side plays into optimal positions the extra resources don't make any difference.
I wouldn't read too much into your performance against other players. There are way too many factors other than raw hardware speed that could affect their performance and yours. First, they might be idiots. Second, they may have lied about their hardware. Third, you may just be better at analyzing efficiently with an engine. It could be almost anything. I don't even try to guess what my opponents are doing or the hardware they are using. I do make note of the number of ongoing games they have and their previous performance. Other than that I just play my “A” game and call it good.
My claim: Stockfish played the error much earlier than it thinks. Several alternatives suggested by Stockfish would have also lost it the game (say, MultiPV shows 0.00 for 3 alternative moves, but one of them loses, how do you decide what to play? And if Stockfish is one move away from randomly losing the game, should it really show 0.00 on the previous move if we don't know it'll play the losing one in the next? Does 33% chance of losing the game in the next move look like 0.00?) This happens on a regular basis.
I seem to have found the error in the first game but haven't as yet looked at the second.
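The "33% chance of losing" question can be put into numbers. This is only a sketch of the post's hypothetical: it assumes the engine picks uniformly at random among the three moves shown as 0.00, and that the two sound moves really are draws. The Elo conversion uses the standard logistic formula; the function name is mine.

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Game results of the three moves that MultiPV shows as 0.00:
# two really draw (0.5 points each), one actually loses (0 points).
outcomes = [0.5, 0.5, 0.0]

# Uniform random choice among the tied moves (an assumption of the example).
expected = sum(outcomes) / len(outcomes)   # 1/3 of a point
elo_gap = elo_from_score(expected)         # about -120 Elo
```

Under these assumptions the position is worth a third of a point to the side to move, not the half point that a 0.00 evaluation suggests.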
As I stated, AB engines are basically made for middle games, so they need a good opening book and a good endgame database for near-"perfect" play. NN engines work well during the opening phase because they inherently have an "opening book", but they also need a good endgame database.
Ovyron wrote: ↑Sun Jul 14, 2019 9:20 am ...
My claim: Stockfish played the error much earlier than it thinks. Several alternatives suggested by Stockfish would have also lost it the game (say, MultiPV shows 0.00 for 3 alternative moves, but one of them loses, how do you decide what to play? And if Stockfish is one move away from randomly losing the game, should it really show 0.00 on the previous move if we don't know it'll play the losing one in the next? Does 33% chance of losing the game in the next move look like 0.00?) This happens on a regular basis.
I'm doing it, but right now it's not clear if upgrading to a faster CPU would be better than just cutting my games to a third and spending three times the time in my remaining games. At least it seems I'd be better off than Stockfish using the resources of a 3 times faster CPU badly.
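As a quick check on the numbers in this exchange: going from 2000 kn/s to 6000 kn/s is a factor of 3, which is about 1.58 doublings. The sketch below uses only the figures quoted in the thread (50 Elo per doubling, roughly 30 Elo observed); the function name is mine, and treating Elo gain as linear in the number of doublings is a simplifying assumption.

```python
import math

def expected_elo_gain(speed_ratio: float, elo_per_doubling: float) -> float:
    """Elo gain if every doubling of speed is worth a fixed amount (assumed)."""
    return math.log2(speed_ratio) * elo_per_doubling

ratio = 6000 / 2000                         # 3x faster hardware
doublings = math.log2(ratio)                # about 1.58
predicted = expected_elo_gain(ratio, 50.0)  # about 79 Elo at 50 Elo/doubling
implied = 30.0 / doublings                  # about 19 Elo/doubling observed
```

So the observed 30 Elo is well under half of what the 50-Elo-per-doubling rule would predict for the same speedup. The same arithmetic applies to spending three times as long per move on the slower machine.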
The reason I selected this node was that it was the first node where I thought the game strayed from “best play”. As in many games, one side's position slowly deteriorates due to inaccurate moves. In many cases the individual moves taken by themselves aren't enough to cause a loss. The cumulative effect of several inaccuracies is what causes the position to become lost. Sometimes it's very difficult to determine which move is the one that changes the “true” evaluation from a draw to a loss. E.g., after 21. … Bb4 22. Bc1 there are many replies possible (a6, a5, Bc3, Qd6, Qc7, Rf7, and Kh8, for example). I did a deep analysis on one of these variations (Rf7) and +1.72 was as high as the score got. The position was such that it still looked contested. So, maybe Bb4 isn't the losing move. Settling that would take more time than I have, so I left it there. I try to avoid the whole issue by backing up to the first detectable inaccuracy if I can. So I did look at 21. … Bb4 in depth, and yes, the position is bad after that move.
zullil wrote: ↑Sat Jul 13, 2019 9:57 pm
OK, so Stockfish considers 14...Bd7 worse than at least three other moves. Still, I can't imagine it's bad enough to cost the game, i.e., the correct evaluation of the position after 14...Bd7 is still 0.00 (=DRAW). On the other hand, take a look at 21...Bb4, if you get a chance. That appears to be a very significant error. Perhaps sufficient to convert DRAW into LOSS.
3 & 4 is a nice compromise: something is better than nothing, and you're not spending a mini fortune on something you know you're going to replace. Your wife seems very level-headed, I would listen to her.
Zenmastur wrote: ↑Tue Jul 16, 2019 3:51 am
The reason I selected this node was that it was the first node where I thought the game strayed from “best play”. As in many games, one side's position slowly deteriorates due to inaccurate moves. In many cases the individual moves taken by themselves aren't enough to cause a loss. The cumulative effect of several inaccuracies is what causes the position to become lost. Sometimes it's very difficult to determine which move is the one that changes the “true” evaluation from a draw to a loss. E.g., after 21. … Bb4 22. Bc1 there are many replies possible (a6, a5, Bc3, Qd6, Qc7, Rf7, and Kh8, for example). I did a deep analysis on one of these variations (Rf7) and +1.72 was as high as the score got. The position was such that it still looked contested. So, maybe Bb4 isn't the losing move. Settling that would take more time than I have, so I left it there. I try to avoid the whole issue by backing up to the first detectable inaccuracy if I can. So I did look at 21. … Bb4 in depth, and yes, the position is bad after that move.
zullil wrote: ↑Sat Jul 13, 2019 9:57 pm
OK, so Stockfish considers 14...Bd7 worse than at least three other moves. Still, I can't imagine it's bad enough to cost the game, i.e., the correct evaluation of the position after 14...Bd7 is still 0.00 (=DRAW). On the other hand, take a look at 21...Bb4, if you get a chance. That appears to be a very significant error. Perhaps sufficient to convert DRAW into LOSS.
In the meantime my computer developed problems. It was down for over a day. I diagnosed it as either a motherboard VRM problem or a power supply problem. I ordered a new power supply, which will be here tomorrow. In the meantime my machine is crippled. So, for at least a few days I won't be doing any deep analysis.
If that fixes the problem then I'll wait to build a new machine. If it doesn't fix the problem then I have several options.
1. Do nothing and SUFFER for the rest of my life.
2. Wait for high core count CPUs and suffer in the meantime.
3. Build a Ryzen 5 3600 machine as an interim machine until high core count CPUs are available.
4. If 3., then build a high core count machine and give my wife the Ryzen 5 3600 machine.
5. Build a Ryzen 9 3900X as an interim machine.
6. If 5., build a high core count machine when available and give the R9 3900X to my wife.
7. Build a Ryzen 9 3900X and keep as my main machine.
8. Wait for high core count CPUs and suffer in the meantime.
My wife offered to take ANY interim system I build, including a Ryzen 3900X, as her new system. Then I would be free to build a 3950X or Threadripper system when they become available. I'm not sure how much use she'll get from the new system, as she doesn't use her current desktop system and plans on buying a new laptop at the end of September in any case. It would be nice to have a 3900X system and a 3000 series Threadripper system in the house, but I'm not sure it's worth the extra money. I don't really use her computer now and I don't see that changing much in the future. So, this option may sound good, but in the end I think the interim system will be a waste.
I'm not much on suffering, so options 1 and 2 don't sound too appealing. If I do option 7 I won't be very happy, as it's not the machine I really want, plus my wife keeps warning against this option.
So what do you think I should do if the power supply doesn't fix my problem?
Regards
Zenmastur