lkaufman wrote:
1. How does Ivanhoe achieve such a huge lead in NPS over Stockfish?
2. What is Ivanhoe giving up to achieve this speed? It can't just be simpler eval, as the speed gap is too large. But Ivanhoe would be totally crushed by Stockfish on an equal-nodes basis. Why?

jdart wrote:
A few comments:
1. As you probably know, NPS across programs is not highly correlated with playing strength. A very simple program can have very high NPS and not be a good chessplayer. To take another example, Crafty has quite high NPS relative to other programs, but is not nearly equal to IvanHoe in strength.
2. Chess programs are complex beasts, and all the pieces typically interact. For example, programs with different board representations will behave quite differently. Pruning and evaluation decisions also affect NPS quite a bit. Many programs do eval at every node because it is used in pruning decisions. If you don't do this, or do lazy eval, you will be faster, but not necessarily stronger.
3. On the x86 platform, cache and memory usage have a very large effect on speed, especially for multithreaded programs. A program can be optimized at a local level by the compiler, but optimizing memory usage usually has to be done by design.
4. I haven't done this, but running the programs you are comparing under a good profiler (such as VTune) will answer some of your questions about the causes of the speed differences.
--Jon

lkaufman wrote:
Perhaps we'll try the profiler, but it won't answer the basic questions:
1. Is the reported NPS gap real or artificial?
2. If real, what accounts for SF being so much stronger than Ivanhoe on an equal-nodes (rather than equal-time) basis?
If the profiler shows a much higher percentage of time spent in eval in SF, that could help explain #2. But if it shows no major differences in percentages, then what?

bob wrote:
NPS is simply total nodes searched divided by the time taken to search all of them. Or, in simpler terms, the time taken to search a single typical node, multiplied by the number of such nodes searched. There are lots of things one can do to change NPS. For example, use eval at all interior nodes. I typically see the q-search taking 80-90% of the total nodes searched, so adding eval to the interior nodes will slow things down by about 10%. What if that gives you better move ordering, so that your tree is now 25% deeper on average? You are 10% slower, but getting to the same depth 25% faster. A net win.
Comparing NPS is not a particularly interesting number, any more than comparing the RPM readings on two different vehicles tells you anything at all about how fast they are going relative to each other, how much fuel they are consuming, or how much noise they are making.

lkaufman wrote:
Don't you have to do eval on the last four main-search plies in order to do futility pruning? If so, then not doing eval at interior nodes should only save about 1%. Anyway, the two programs being compared both do eval at interior nodes, as this is needed for pruning decisions.
I agree that comparing NPS in general is rather pointless, but these are open-source programs, so in theory the differences should be fully explainable. I am pointing to it because the program that does more NPS is much stronger at hyperspeed chess but not at regular chess. The question is whether the first observation explains the second.
Crafty is too unlike any of the top programs for comparisons to be relevant. One big difference is movecount pruning, which you correctly credited all the way back to Greenblatt in MacHack but which for some reason you don't use in Crafty. It took forty years to recognize that he was right!

I don't think what Greenblatt described was anywhere near what we are talking about here. Greenblatt used static criteria to toss out (forward prune) moves even at the root, not dynamic search information such as history counters to order the moves on the fly and throw away the last few...
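The futility pruning lkaufman refers to, which needs a static eval on the last few main-search plies, can be sketched minimally as follows. The margin values, the four-ply window, and the name `futilePrune` are illustrative assumptions, not code from Stockfish, Ivanhoe, or Crafty; the point is only that the test consumes a static evaluation of the node.

```cpp
// Hypothetical depth-indexed futility margins in centipawns; real engines tune these.
constexpr int kFutilityMargin[5] = {0, 120, 220, 320, 420};

// Skip a quiet move near the leaves when the static eval plus a generous margin
// still cannot reach alpha, so searching the move is very unlikely to matter.
// 'depth' is the remaining main-search depth, 'staticEval' the node's evaluation.
bool futilePrune(int depth, int staticEval, int alpha, bool inCheck, bool isCapture) {
    if (inCheck || isCapture) return false;    // tactical nodes are exempt
    if (depth < 1 || depth > 4) return false;  // only the last few main-search plies
    return staticEval + kFutilityMargin[depth] <= alpha;
}
```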
Comparative nodes per second
Moderators: hgm, Rebel, chrisw
Re: Comparative nodes per second (early results)
Code:
    Rank Name              Elo  +  -  Games Score  Oppo Draws
       2 Crafty-23.5R22-1 2662  4  4  30000   64%  2548   22%
       3 Crafty-23.5R22-2 2661  4  4  30000   64%  2548   22%
       7 Crafty-23.5R20-2 2658  4  4  30000   63%  2548   22%
       8 Crafty-23.5-2    2657  4  4  30000   63%  2548   22%
      10 Crafty-23.4-2    2641  4  4  30000   62%  2548   21%
      11 Crafty-23.4-1    2639  4  4  30000   62%  2548   21%
      12 Crafty-23.5R23   2636  7  7   7406   61%  2548   21%
Re: Comparative nodes per second
lkaufman wrote:
Don't you have to do eval on the last four main-search plies in order to do futility pruning?

jdart wrote:
IIRC Crafty does futility based on material score only.

lkaufman wrote:
What about for deciding whether to try null move or not? I don't see how Crafty could avoid calling the eval in the main search.

bob wrote:
I always do null-move. I've tried all the alternatives, and the only exception is that I use the hash-table trick: entries with a draft too shallow to terminate the search might still have draft sufficient to tell me that a null-move search is hopeless.

Why would you want to waste time on null move if you are (let's say) a piece behind beta?
--Jon
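The hash-table trick bob describes above can be sketched as follows. The names (`TTEntry`, the bound constants, `nullMoveHopeless`) and the reduction `R = 3` are illustrative assumptions, not Crafty's actual code: the idea is that an entry too shallow to cut off the full-depth search may still bound the shallower null-move search.

```cpp
#include <optional>

enum Bound { BOUND_UPPER, BOUND_LOWER, BOUND_EXACT };

struct TTEntry {
    int score;
    int draft;   // depth the stored search was done to
    Bound bound;
};

// A null move at 'depth' searches to roughly depth - R - 1. If the table already
// holds an upper bound below beta at at least that reduced draft, the null-move
// search cannot fail high, so trying it would be wasted work.
bool nullMoveHopeless(const std::optional<TTEntry>& tt, int depth, int beta, int R = 3) {
    if (!tt) return false;
    const int nullDraft = depth - R - 1;
    return tt->bound == BOUND_UPPER && tt->draft >= nullDraft && tt->score < beta;
}
```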
Re: Comparative nodes per second
lkaufman wrote:
What about for deciding whether to try null move or not? I don't see how Crafty could avoid calling the eval in the main search.

jdart wrote:
No eval test is done in Crafty as a condition for null move. Arasan does not do this either. The efficacy of this is questionable.
--Jon

Don wrote:
For us there is no question. We have tried many permutations of the general idea, from always doing the null-move test regardless of the evaluation to restricting it to certain node types, etc.
What has always worked best is to do the null-move pruning test only when the current evaluation is already >= beta. A lot of details in each program could change this formula, of course - but for us it's not a close enough call to even be ambiguous.
What does Arasan do? Do you always do the null move test?

I decided to give this a cluster test, since that is one option I do not recall having tried in real games. Initially, with a few positional tests, it actually looks like a reasonable idea (I just added && eval >= beta to the end of the rest of the null-move qualifiers, such as not being in check, the hash table not saying null is futile, etc., so that the eval call is not done on every node, just the ones that pass the earlier filters). It reduced the size of the tree and slowed the NPS, but with a net gain on just a few test positions. I am now running a cluster test to see if it is a real gain in terms of Elo...
More in a couple of hours...
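The ordering bob describes, putting the eval call after the cheap qualifiers so it only runs on nodes that pass them, can be sketched like this. The predicate names and depth cutoff are illustrative assumptions, not Crafty's actual conditions; `evalFn` stands in for the engine's (expensive) evaluation.

```cpp
#include <functional>

struct NodeState {
    bool inCheck;
    bool hashSaysNullFutile;  // the hash-table trick discussed earlier in the thread
    int depth;
};

// Decide whether to try a null move. The static eval is the most expensive
// qualifier, so it is tested last: nodes rejected by a cheap filter never pay
// for an eval call at all.
bool tryNullMove(const NodeState& n, int beta, const std::function<int()>& evalFn) {
    if (n.inCheck) return false;             // cheap test first
    if (n.hashSaysNullFutile) return false;  // cheap test
    if (n.depth < 2) return false;           // cheap test
    return evalFn() >= beta;                 // expensive eval only if we get here
}
```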
Re: Comparative nodes per second
bob wrote:
I don't think what Greenblatt described was anywhere near what we are talking about here. Greenblatt used static criteria to toss out (forward prune) moves even at the root, not dynamic search information such as history counters to order the moves on the fly and throw away the last few...

Well, he looked at a constant number of moves on the last ply, a bigger number on the previous one, etc., right? That is exactly what all the top programs do (with scoring conditions in some programs). Of course the criteria for selecting those moves are different now than forty-five years ago, but the principle is the same. If Greenblatt did this at the root, it was only because the computers then couldn't reach enough depth to justify going full width at the root. I don't think there is any real difference; Greenblatt deserves the credit for the idea.
Re: Comparative nodes per second
lkaufman wrote:
Well, he looked at a constant number of moves on the last ply, a bigger number on the previous one, etc., right? That is exactly what all the top programs do (with scoring conditions in some programs). Of course the criteria for selecting those moves are different now than forty-five years ago, but the principle is the same. If Greenblatt did this at the root, it was only because the computers then couldn't reach enough depth to justify going full width at the root. I don't think there is any real difference; Greenblatt deserves the credit for the idea.

What he did in tournaments was a tapered search: 15 moves at the root, ordered by plausibility; the rest were dumped and never searched. The next two plies got 9 moves, and the last two got 7, in the version of his code I once had and played around with on an old DEC PDP-10. No dynamic information at all. At the root, generate all moves, order them by the plausibility scores, then chop off all but the first 15. If you use history with LMR or LMP, the search is providing feedback as the iterations increase, helping to recognize the moves that are least likely to be useful versus those that are most likely to help...
There's a huge leap from static forward pruning to dynamic forward pruning, where the search guides the pruning as it unfolds... It is much easier and more accurate to do an iteration, then on the next iteration use information from the last to help choose what to discard, rather than doing it with absolutely no searching of any kind, just pure static criteria.
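The tapered scheme bob describes can be sketched as follows. The width table encodes his 15/9/9/7/7 numbers; the behavior beyond ply four (staying at 7) and the names are assumptions for illustration, not Greenblatt's actual code.

```cpp
#include <algorithm>
#include <vector>

struct ScoredMove { int move; int plausibility; };

// Static per-ply width limits: 15 at the root, 9 on the next two plies,
// 7 on the two after; deeper plies held at 7 (an assumption here).
int widthLimit(int plyFromRoot) {
    static const int limits[5] = {15, 9, 9, 7, 7};
    return plyFromRoot < 5 ? limits[plyFromRoot] : 7;
}

// Order all generated moves by a static plausibility score, then chop the list
// to the ply's fixed width. The discarded moves are never searched, and no
// dynamic (search-derived) information is ever consulted.
void taperMoves(std::vector<ScoredMove>& moves, int plyFromRoot) {
    std::sort(moves.begin(), moves.end(),
              [](const ScoredMove& a, const ScoredMove& b) {
                  return a.plausibility > b.plausibility;  // most plausible first
              });
    const size_t limit = static_cast<size_t>(widthLimit(plyFromRoot));
    if (moves.size() > limit) moves.resize(limit);
}
```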
Re: Comparative nodes per second
bob wrote:
I decided to give this a cluster test, since that is one option I do not recall having tried in real games. Initially, with a few positional tests, it actually looks like a reasonable idea. It reduced the size of the tree and slowed the NPS, but with a net gain on just a few test positions. I am now running a cluster test to see if it is a real gain in terms of Elo...

Early results are so-so: +2 Elo after 15,000 games, with an error bar of +/-5. I suspect it is one of those "no improvement" ideas for me at present, but I will let the test run twice to confirm...
Re: Comparative nodes per second
bob wrote:
I decided to give this a cluster test, since that is one option I do not recall having tried in real games. Initially, with a few positional tests, it actually looks like a reasonable idea (I just added && eval >= beta to the end of the rest of the null-move qualifiers, so that the eval call is not done on every node, just the ones that pass the earlier filters). It reduced the size of the tree and slowed the NPS, but with a net gain on just a few test positions. I am now running a cluster test to see if it is a real gain in terms of Elo...

I tried this in EXchess a few weeks ago, and it was a small net win. One thing I realized is that it should not slow down the program very much, because I would need to do an eval anyway if the null move drops straight into qsearch. So if eval >= beta and I do the null move, I keep the score and just use it (with the sign flipped) when qsearch is called, rather than repeating the eval.
- Dan
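Dan's score-reuse observation can be sketched as follows. The stand-in `evaluate` and the way the cached score is passed down are illustrative assumptions, not EXchess's actual code: the point is that after a null move the position is unchanged but the side to move flips, so the parent's eval can serve as the qsearch stand-pat score with its sign negated.

```cpp
#include <optional>

// Stand-in evaluation: a white-point-of-view score, negated for black to move.
// A real engine's eval is of course far more elaborate.
int evaluate(int materialWhitePOV, bool whiteToMove) {
    return whiteToMove ? materialWhitePOV : -materialWhitePOV;
}

// The qsearch stand-pat score. When the parent node already evaluated this exact
// position for its eval >= beta null-move test, reuse that score with the sign
// flipped (the null move changed only the side to move) instead of re-evaluating.
int qsearchStandPat(int materialWhitePOV, bool whiteToMove,
                    std::optional<int> cachedParentEval) {
    if (cachedParentEval)
        return -*cachedParentEval;  // same position, opposite point of view
    return evaluate(materialWhitePOV, whiteToMove);
}
```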
Re: Comparative nodes per second
dchoman wrote:
I tried this in EXchess a few weeks ago, and it was a small net win. One thing I realized is that it should not slow down the program very much, because I would need to do an eval anyway if the null move drops straight into qsearch. So if eval >= beta and I do the null move, I keep the score and just use it (with the sign flipped) when qsearch is called, rather than repeating the eval.
- Dan

The first run completed and showed "0 Elo improvement", but with the +/-4 error bar that is not so conclusive. I am running another 30K games to refine the accuracy somewhat...
If it sticks at 0 gain, I am done. If it shows some promise, I might try to optimize a bit. But I have tried so many things with null-move in the past that this was almost certainly one of them, and there's generally a reason something like that is not in my code: it simply didn't work for me. I am going to look at this a bit further to see if there are any other potential things to try, again...
Re: Comparative nodes per second
bob wrote:
The first run completed and showed "0 Elo improvement", but with the +/-4 error bar that is not so conclusive. I am running another 30K games to refine the accuracy somewhat...
If it sticks at 0 gain, I am done. If it shows some promise, I might try to optimize a bit. But I have tried so many things with null-move in the past that this was almost certainly one of them, and there's generally a reason something like that is not in my code: it simply didn't work for me. I am going to look at this a bit further to see if there are any other potential things to try, again...

Probably for you, the cost of calling the eval offsets the benefit. For us (and all the top engines) the eval must be called for various other prunings anyway, so the benefit is free.