Comparative nodes per second

rvida · Post by **rvida** » Wed Apr 11, 2012 1:28 am

lkaufman wrote:
rvida wrote:
Ivanhoe updates nodes in do_move(). But because the legality check is done only after do_move(), there are some illegal moves included in Ivanhoe's node count.
I think this issue accounts for just a few percent. Am I mistaken?

Yes, since their evasion move generator returns only legal moves, this increases nodecount only by some 2-3% at most.
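
To make the counting issue concrete, here is a minimal sketch of the two counting schemes. It is not Ivanhoe's code: the `Counters` class, the `search_moves` helper and the 40-move position are hypothetical, and the `is_legal` flag stands in for the real "is my king in check after the move?" test.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    in_do_move: int = 0   # incremented for every move made, legal or not
    legal_only: int = 0   # incremented only after the legality check passes


def search_moves(moves, counters):
    """Walk a list of (name, is_legal) pseudo-legal moves and update both counters."""
    for _name, is_legal in moves:
        counters.in_do_move += 1      # node counted inside do_move()
        if not is_legal:
            continue                  # move would be undone here; it was already counted
        counters.legal_only += 1      # node counted only once the move is known to be legal


def inflation(counters):
    """Relative overcount of the do_move() scheme versus counting legal moves only."""
    return counters.in_do_move / counters.legal_only - 1.0


# Hypothetical position: 40 pseudo-legal moves, one of which leaves the king in check.
moves = [(f"m{i}", True) for i in range(39)] + [("illegal", False)]
c = Counters()
search_moves(moves, c)
print(c.in_do_move, c.legal_only, f"{inflation(c):.1%}")
```

With one illegal move in forty, the do_move() count is a few percent higher than the legal-only count, which is the order of magnitude quoted above. The real figure depends on how often pseudo-legal moves turn out to be illegal in actual search trees.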

Daniel Shawul · Post by **Daniel Shawul** » Wed Apr 11, 2012 1:41 am

lkaufman wrote: Comparing the two strongest open-source engines, Stockfish and Ivanhoe (recent versions), I notice that although they are virtually identical in playing strength (except at bullet levels), the reported nodes per second in Ivanhoe is higher by a ratio of about 5 to 3. I am well aware that node counts can differ depending on details of how they are counted (whether to count illegal moves, for example), but I don't believe that such details would amount to more than a few percent in the given instance. A ratio of 5 to 3 is HUGE, far more than could plausibly be attributed to differences in the complexity of the eval function (in the given instance at least). Obviously, since the programs are of equal strength, this huge disparity must have a high price or must somehow be misleading. Both programs are highly selective, though details differ, so I doubt this is the answer.

So, two questions:

1. How does Ivanhoe achieve such a huge lead in NPS over Stockfish?
2. What is Ivanhoe giving up to achieve this speed? It can't just be simpler eval, as the speed gap is too large. But Ivanhoe would be totally crushed by Stockfish on an equal-nodes basis. Why?

Why do you underestimate the effect of time spent on eval? That is where most of an engine's time (say 70%) is spent. Also the way the nodes are counted and the types of pruning (and the margins used) matter a lot. Optimization is overrated, so that can't be the answer. The compiler takes you farther than you could imagine. Houdini is somewhere in between NPS-wise but it kicks both... You just can't say anything based on NPS unless the engines do everything the same.

lkaufman · Post by **lkaufman** » Wed Apr 11, 2012 6:52 am

rvida wrote:
lkaufman wrote:
rvida wrote:
Ivanhoe updates nodes in do_move(). But because the legality check is done only after do_move(), there are some illegal moves included in Ivanhoe's node count.
I think this issue accounts for just a few percent. Am I mistaken?
Yes, since their evasion move generator returns only legal moves, this increases nodecount only by some 2-3% at most.

So what is your opinion on the question, is the 5-3 ratio in NPS real to a significant degree or an artifact of details in how nodes are counted? If it is real, why do you think SF is just as strong?
Regarding Critter, it sits roughly midway between SF and Ivanhoe in NPS. Is the substantially higher NPS in Ivanhoe (compared to Critter, roughly 4 to 3) real or misleading? If real, is Critter stronger despite this due to superior eval, or to some other reason?

lkaufman · Post by **lkaufman** » Wed Apr 11, 2012 7:05 am

Daniel Shawul wrote: Why do you underestimate the effect of time spent on eval? That is where most of an engine's time (say 70%) is spent. Also the way the nodes are counted and the types of pruning (and the margins used) matter a lot. Optimization is overrated, so that can't be the answer. The compiler takes you farther than you could imagine. Houdini is somewhere in between NPS-wise but it kicks both... You just can't say anything based on NPS unless the engines do everything the same.

I don't know where you got that 70% figure, but it is certainly not typical of top engines. I think the typical figure is somewhere around 40%. At one time it was 50% in Komodo and we were certainly the highest of top programs in that regard. Anyway Stockfish is "built for speed" and it's hard to imagine that it would spend more than 10% more of total time in eval than Ivanhoe, whereas I'm looking to explain a 5 to 3 ratio. I think in some ways SF may spend less time in eval than Ivanhoe, because SF does movecount based pruning without needing a score whereas Ivanhoe does so with score considerations. But obviously if the speed difference is real and not an artifact of node counting, Ivanhoe must be spending much less time in many areas, and eval is almost surely one of them. But does it do so by having a simpler eval or by trying harder not to call eval at all? I know that lazy eval is a factor, since SF doesn't use it, but the fact that SF found it useless itself suggests that SF eval cannot be too bloated.

BubbaTough · Post by **BubbaTough** » Wed Apr 11, 2012 7:28 am

lkaufman wrote: Anyway Stockfish is "built for speed" and it's hard to imagine that it would spend more than 10% more of total time in eval

Both statements here sound very odd to me. What do you base them on?

-Sam

mcostalba · Post by **mcostalba** » Wed Apr 11, 2012 7:58 am

BubbaTough wrote:
lkaufman wrote: Anyway Stockfish is "built for speed" and it's hard to imagine that it would spend more than 10% more of total time in eval
Both statements here sound very odd to me. What do you base them on?

-Sam

Yes, they are incorrect. SF spends about 40% of its time in evaluation. When I say that SF is well tuned speed-wise, I mean that each function has been carefully profiled, so it is difficult to find a function that can be further optimized without changing its functionality, but this doesn't mean that SF skips evaluation or that evaluation is lean. Actually SF doesn't do lazy eval and calls eval in every node (when not backed up by TT).

Daniel Shawul · Post by **Daniel Shawul** » Wed Apr 11, 2012 2:27 pm

lkaufman wrote:
Daniel Shawul wrote: Why do you underestimate the effect of time spent on eval? That is where most of an engine's time (say 70%) is spent. Also the way the nodes are counted and the types of pruning (and the margins used) matter a lot. Optimization is overrated, so that can't be the answer. The compiler takes you farther than you could imagine. Houdini is somewhere in between NPS-wise but it kicks both... You just can't say anything based on NPS unless the engines do everything the same.
I don't know where you got that 70% figure, but it is certainly not typical of top engines. I think the typical figure is somewhere around 40%. At one time it was 50% in Komodo and we were certainly the highest of top programs in that regard. Anyway Stockfish is "built for speed" and it's hard to imagine that it would spend more than 10% more of total time in eval than Ivanhoe, whereas I'm looking to explain a 5 to 3 ratio. I think in some ways SF may spend less time in eval than Ivanhoe, because SF does movecount based pruning without needing a score whereas Ivanhoe does so with score considerations. But obviously if the speed difference is real and not an artifact of node counting, Ivanhoe must be spending much less time in many areas, and eval is almost surely one of them. But does it do so by having a simpler eval or by trying harder not to call eval at all? I know that lazy eval is a factor, since SF doesn't use it, but the fact that SF found it useless itself suggests that SF eval cannot be too bloated.

I used to have 70% or more of the time spent in eval before I switched to bitboards and popcnt. Still, 50% is not something that should be overlooked. You have mentioned all the possible cases that could cause this difference BUT optimization, which made me think that you think that is the cause. I guarantee you that if both programs do the same thing you won't see much of a difference, since the compiler optimizes well. The algorithms used (prunings), OTOH, can affect NPS a lot. For example, in my case 90% of my nodes are in qsearch, so I would look carefully there if I wanted some speedups. Other factors: size of evaluation, SEE vs MVV, captures and check move generation, etc. Adding internal node evaluation may decrease NPS a little, but I don't expect it to affect me much. Or it could all be due to differences in counting, who knows.

lkaufman · Post by **lkaufman** » Wed Apr 11, 2012 3:04 pm

mcostalba wrote:
BubbaTough wrote:
lkaufman wrote: Anyway Stockfish is "built for speed" and it's hard to imagine that it would spend more than 10% more of total time in eval
Both statements here sound very odd to me. What do you base them on?

-Sam
Yes, they are incorrect. SF spends about 40% of its time in evaluation. When I say that SF is well tuned speed-wise, I mean that each function has been carefully profiled, so it is difficult to find a function that can be further optimized without changing its functionality, but this doesn't mean that SF skips evaluation or that evaluation is lean. Actually SF doesn't do lazy eval and calls eval in every node (when not backed up by TT).

I didn't say or imply that SF spent only 10% of time in eval; in fact your 40% figure is just what I thought it was. I said that I doubted that the difference between SF and Ivanhoe in this number would be more than 10 percentage points; in other words, taking your 40% figure, I doubt that Ivanhoe is below 30%. Am I wrong? Is it your opinion that the 5 to 3 ratio is primarily due to node-counting differences, to more costly eval in SF, or to other factors? I think this issue may relate to the mystery of why Ivanhoe is much stronger than SF at bullet chess but not at normal blitz or slower. Any comment on that phenomenon?
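
The arithmetic behind the 40% versus 30% question can be written out explicitly. The sketch below is only a back-of-the-envelope model: it assumes both engines count nodes the same way, takes the 40% eval share and the 5 to 3 NPS ratio from the thread, and treats the 30% share for Ivanhoe as a hypothetical input. The function names are made up for illustration.

```python
def per_node_costs(sf_eval_share, iv_eval_share, nps_ratio):
    """Split per-node time into eval and non-eval parts, in units of SF's time per node.

    sf_eval_share: fraction of SF's time spent in eval (0.40 in the thread)
    iv_eval_share: assumed fraction of Ivanhoe's time spent in eval
    nps_ratio:     Ivanhoe NPS divided by SF NPS (5/3 in the thread)
    """
    sf_total = 1.0
    iv_total = sf_total / nps_ratio   # the faster engine spends less time per node
    return {
        "sf_eval": sf_eval_share * sf_total,
        "sf_other": (1.0 - sf_eval_share) * sf_total,
        "iv_eval": iv_eval_share * iv_total,
        "iv_other": (1.0 - iv_eval_share) * iv_total,
    }


def max_speedup_from_eval_alone(sf_eval_share):
    """NPS gain if eval cost dropped to zero and everything else stayed the same."""
    return 1.0 / (1.0 - sf_eval_share)


costs = per_node_costs(0.40, 0.30, 5 / 3)
for name, value in costs.items():
    print(f"{name}: {value:.2f}")
print(f"eval-free speedup: {max_speedup_from_eval_alone(0.40):.2f}")
```

Under these assumptions, an engine identical to SF except for a completely free eval would gain at most a factor of 1/0.6, about 1.67, which is the whole 5 to 3 ratio. If Ivanhoe really spent 30% of its own time in eval, its eval would cost 0.18 and everything else 0.42 of an SF node, against 0.40 and 0.60 for SF, so the non-eval work per node would also have to be about 30% cheaper.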

jdart · Post by **jdart** » Wed Apr 11, 2012 3:59 pm

lkaufman wrote: 1. How does Ivanhoe achieve such a huge lead in NPS over Stockfish?
2. What is Ivanhoe giving up to achieve this speed? It can't just be simpler eval, as the speed gap is too large. But Ivanhoe would be totally crushed by Stockfish on an equal-nodes basis. Why?

a few comments:

1. As you probably know, NPS across programs is not highly correlated to playing strength. A very simple program can have very high NPS and not be a good chess player. To take another example, Crafty has quite high NPS relative to other programs, but is not nearly equal to IvanHoe in strength.

2. Chess programs are complex beasts, and all the pieces typically interact. So for example programs with different board representations will behave quite differently. Also pruning and evaluation decisions affect NPS quite a bit. Many programs do eval at every node because it is used in pruning decisions. If you don't do this, or do lazy eval, you will be faster, but not necessarily stronger.

3. On the x86 platform, cache and memory usage have a very large effect on speed, especially for multithreaded programs. A program can be optimized at a local level by the compiler, but optimizing memory usage usually has to be done by design.

4. I haven't done this but running the programs you are comparing under a good profiler (such as VTune) will answer some of your questions about causes of speed differences.

--Jon

BubbaTough · Post by **BubbaTough** » Wed Apr 11, 2012 4:25 pm

jdart wrote:
1. As you probably know, NPS across programs is not highly correlated to playing strength. A very simple program can have very high NPS and not be a good chess player. To take another example, Crafty has quite high NPS relative to other programs, but is not nearly equal to IvanHoe in strength.

2. Chess programs are complex beasts, and all the pieces typically interact. So for example programs with different board representations will behave quite differently. Also pruning and evaluation decisions affect NPS quite a bit. Many programs do eval at every node because it is used in pruning decisions. If you don't do this, or do lazy eval, you will be faster, but not necessarily stronger.

--Jon

Exactly. #2 leads to #1. And #1 makes the question less interesting. Ivan and Stockfish are great examples of this. If memory serves, Ivan is designed to avoid eval whenever possible, and Stockfish is designed to take advantage of it when possible. The result is Ivan goes fast, and Stockfish slow. Both approaches have advantages and disadvantages, and comparing NPS across programs with such different approaches is not all that informative in my opinion.
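
For readers unfamiliar with the term, lazy eval in its generic textbook form looks roughly like the sketch below. It is not taken from Ivanhoe's or Stockfish's source; `lazy_evaluate`, the material scores and the margin of 200 are all assumptions chosen for illustration.

```python
def lazy_evaluate(material, full_eval, alpha, beta, margin):
    """Return (score, used_full_eval).

    material:  cheap material-only score for the side to move
    full_eval: callable computing the expensive positional terms
    margin:    assumed bound on how far the positional terms can move the score
    """
    if material + margin <= alpha or material - margin >= beta:
        # Even the largest positional correction cannot bring the score
        # back inside the (alpha, beta) window, so skip the expensive part.
        return material, False
    return material + full_eval(), True


def positional_terms():
    """Stand-in for the expensive part of the evaluation (hypothetical value)."""
    return 35


# Far above beta: the cheap score is returned and positional_terms() is never called.
print(lazy_evaluate(900, positional_terms, alpha=-50, beta=50, margin=200))
# Close to the window: the full evaluation is needed.
print(lazy_evaluate(20, positional_terms, alpha=-50, beta=50, margin=200))
```

An engine that takes the early exit often does less work per node and reports a higher NPS, but the score it returns is only as reliable as the margin, which is one reason a faster engine is not automatically a stronger one.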

-Sam
