Hi,
Sometimes when I'm analysing a position with Stockfish, the evaluation can vary quite significantly while stepping into the PV. Of course, I don't expect the eval to stay the same since, after all, Stockfish may then be searching the subposition to a different depth, etc. However, my question is: do some engines tend to have a more consistent/stable evaluation than others when taking one step into the PV?
cheers
Gordon
Engine evaluation consistency/stability
-
- Posts: 194
- Joined: Thu Aug 06, 2009 8:04 pm
- Location: UK
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Engine evaluation consistency/stability
gordonr wrote: ↑Wed Oct 09, 2019 1:15 pm
Hi,
Sometimes when I'm analysing a position with Stockfish, the evaluation can vary quite significantly while stepping into the PV. Of course, I don't expect the eval to stay the same since, after all, Stockfish may then be searching the subposition to a different depth, etc. However, my question is: do some engines tend to have a more consistent/stable evaluation than others when taking one step into the PV?
cheers
Gordon
I did a test some months ago. Search for EARS, or "engine analysis reliability score". I think I did a 6-ply comparison. The source code is available, so you can run some experiments with the latest engines.
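For anyone who wants to put a number on how much an evaluation moves along the PV, here is a minimal sketch of one possible measure. It is not the EARS method and not taken from any engine; `PvDrift` and `pvDrift` are hypothetical names. It assumes you have already collected the engine's score in centipawns at the root and after each PV move, and that each score is reported from the side to move, as UCI engines normally do.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical summary of how far the evaluation drifts along a PV.
struct PvDrift {
    int maxDeviationCp;  // largest |eval_i - eval_0| after normalisation
    double meanStepCp;   // average |eval_i - eval_(i-1)| between PV steps
};

// evalsCp[i] is the score in centipawns after i PV moves have been played,
// from the point of view of the side to move in that position (assumption).
// Scores at odd plies are negated so that every value is seen from the
// root side's point of view before they are compared.
PvDrift pvDrift(const std::vector<int>& evalsCp) {
    PvDrift d{0, 0.0};
    if (evalsCp.size() < 2)
        return d;  // nothing to compare against

    std::vector<int> rootPov(evalsCp.size());
    for (std::size_t i = 0; i < evalsCp.size(); ++i)
        rootPov[i] = (i % 2 == 0) ? evalsCp[i] : -evalsCp[i];

    long long stepSum = 0;
    for (std::size_t i = 1; i < rootPov.size(); ++i) {
        d.maxDeviationCp = std::max(d.maxDeviationCp,
                                    std::abs(rootPov[i] - rootPov[0]));
        stepSum += std::abs(rootPov[i] - rootPov[i - 1]);
    }
    d.meanStepCp = static_cast<double>(stepSum) / (rootPov.size() - 1);
    return d;
}
```

Feeding it the scores from a fixed-depth or fixed-time search at each PV position gives two figures per engine that can be compared across engines. How the scores are collected (depth, time, hash cleared or not) will affect the result, so those settings should be kept the same for every engine.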
-
- Posts: 4367
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Engine evaluation consistency/stability
I prefer Houdini for most analysis because its eval doesn't bounce around so much.
-
- Posts: 3293
- Joined: Wed Mar 08, 2006 8:15 pm
Re: Engine evaluation consistency/stability
In TCEC, SF has many games (as White and as Black) with a 0.00 evaluation for the entire game. Some of these games are over 100 moves long. Will chess soon be solved?
Jouni
-
- Posts: 473
- Joined: Thu Dec 27, 2007 9:34 pm
Re: Engine evaluation consistency/stability
I believe that chess is still very far from being solved. Even in the year 4020 (we are now in the year 2020), chess would still not be completely solved.
You see, chess is so complex that chess engines still have many weaknesses in their understanding and knowledge of the game compared to humans.
-
- Posts: 194
- Joined: Thu Aug 06, 2009 8:04 pm
- Location: UK
Re: Engine evaluation consistency/stability
Thanks everyone for their help. Ferdy, I found your excellent post. Very interesting and useful indeed.
http://talkchess.com/forum3/viewtopic.p ... rs#p792684
-
- Posts: 12541
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Engine evaluation consistency/stability
gordonr wrote: ↑Wed Oct 09, 2019 1:15 pm
Hi,
Sometimes when I'm analysing a position with Stockfish, the evaluation can vary quite significantly while stepping into the PV. Of course, I don't expect the eval to stay the same since, after all, Stockfish may then be searching the subposition to a different depth, etc. However, my question is: do some engines tend to have a more consistent/stable evaluation than others when taking one step into the PV?
cheers
Gordon
You can fix the Stockfish sewing machine with this simple thing:
In ucioption.cpp (set the option to false from the GUI):
Code:
o["Show Fail High and Fail Low"] << Option(true);
Code:
bool bSewingMachine = Options["Show Fail High and Fail Low"];
Code:
// When failing high/low give some update (without cluttering
// the UI) before a re-search.
if (   mainThread
    && multiPV == 1
    && (bestValue <= alpha || bestValue >= beta)
    && Time.elapsed() > 3000
    && bSewingMachine)
    sync_cout << UCI::pv(rootPos, rootDepth, alpha, beta) << sync_endl;
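To make it easier to see what the extra option changes, the guard above can be restated as a stand-alone predicate. This is only a sketch for illustration: `SearchState` and `reportsFailHighLow` are hypothetical names, not part of Stockfish, and the fields simply mirror the variables used in the snippet.

```cpp
// Illustrative stand-in for the values the guard reads (hypothetical type).
struct SearchState {
    bool mainThread;      // only the main thread prints to the GUI
    int multiPV;          // number of PV lines requested
    int bestValue;        // score returned by the last search iteration
    int alpha, beta;      // current aspiration window bounds
    long long elapsedMs;  // time spent on this search so far
};

// True when an intermediate fail-high/fail-low line would be printed.
// With showFailHighLow set to false the line is never printed, so the GUI
// only sees scores that landed inside the aspiration window.
bool reportsFailHighLow(const SearchState& s, bool showFailHighLow) {
    const bool outsideWindow = s.bestValue <= s.alpha || s.bestValue >= s.beta;
    return s.mainThread
        && s.multiPV == 1
        && outsideWindow
        && s.elapsedMs > 3000
        && showFailHighLow;
}
```

For example, a state with `bestValue = 50` and a window of `(10, 40)` after 5000 ms is reported when the option is true and suppressed when it is false. A score inside the window is never reported by this guard either way.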
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: Engine evaluation consistency/stability
Chess isn't close at all to being solved. People have claimed that they can produce perfect chess moves on the fly, but if this were true, they could make an opening book in which their moves were played up to the point where an unassisted engine could draw the game from there at bullet chess. That this hasn't been done, and that bullet chess is still fine draw-wise, means those people still have to work hard to produce "perfect chess", and the only reason they haven't lost yet is that they haven't played enough games.
Your beliefs create your reality, so be careful what you wish for.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Engine evaluation consistency/stability
Hmmm, no word about Leela?
It is stable both in time (search) and along the PV, as long as there are not a lot of tactics. Compared to the AB engines I know, it is much more stable. And for tactics, AB engines are complementary to Leela: they find the tactic and stick to it.
-
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: Engine evaluation consistency/stability
Or one big tactic. But you never know, the analysis will become inconsistent once she sees it, so it's not stable.