Not differentiating root and non-root / pondering approach

Discussion of chess software programming and technical issues.

Moderator: Ras

BubbaTough
Posts: 1154
Joined: Fri Jun 23, 2006 5:18 am

Re: Not differentiating root and non-root / pondering approa

Post by BubbaTough »

The power of the instant move should not be underestimated, in that it not only saves you time but often robs your opponent of time. I think your approach is completely reasonable as a starting point because it is easy to implement. But it is highly unlikely to be better than more normal approaches.

-Sam
Harald
Posts: 318
Joined: Thu Mar 09, 2006 1:07 am

Re: Not differentiating root and non-root / pondering approa

Post by Harald »

bob wrote:
Harald wrote:
cyberfish wrote:3) pondering. Instead of guessing opponent's move and thinking on the resulting position like just about every engine under the sun does, my engine just "think as the opponent", filling up the hash table.

I decided to do it this way because I noticed that, for "quiet" positions, ponder hits are quite rare, and the whole effort would be wasted if the opponent doesn't make the predicted move (or 2), and for "violent" positions (recaptures for example), it doesn't really matter anyways.

What are the real world differences between these two approaches in terms of strength?

Many thanks
I do it this way also. But I am always told that is not good.
Though nobody tries to explain in depth why. See this thread:
http://www.talkchess.com/forum/viewtopi ... 34&t=24870

Harald
It is obvious why it is bad. Let's take two cases:

(1) you predict the wrong move. Neither approach will help here. Your previous search had the wrong expected move from your opponent, so that is no good. The approach being discussed finds a move that is different from what the opponent played, so that is no good. In this case, it is a "tie".

(2) you predict the correct move. If you use traditional pondering, you chose the second move from the PV, played it, and did a normal N-ply search after this move and when the opponent moves, you have a move ready, searched to the normal depth, very quickly after he moves. If you use your approach, you do a normal N ply search to find the best move by your opponent. If he makes this move, you can play the second move in the PV, but it was only searched to depth N-1, which is a loss of one full ply.

Which is better? A normal program will correctly predict opponent moves well over 50% of the time. The last time I posted analysis here it was probably closer to 75%. So with traditional pondering, 75% of the time you get an almost instant move and save time, or you search an extra ply or so if you want to use the time you could save. With the discussed approach, 75% of the time you lose a ply over the other method. Which do you think is better???
Ok, next time I write a ponder function I will use the classic approach.
With your experience and the better engine you may be right.

But just for the fun of the discussion:
I don't predict _one_ move and neither (1) nor (2) applies.
Instead I do an inspection of all opponent moves and all my own answers.
Throughout the search the time spent on each move is different and self-adjusting to its importance.
If there is an easy-to-guess opp-move I will search it for about 90% of the
time. There is not much difference from the 100% classic method.
The search tree and hash entries will be nearly the same.
That is not losing a whole ply later.
But I also have some answers in the hash table and a small jump start if
the opponent makes a different move.
If more than one opp-move is a good answer I will
distribute my search time equally over all of them. Maybe three good opp-moves,
each searched 30% of the time. After three such moves in the game I am 'right' three
times and save 30% of the time. You would be right once and wrong twice.
My approach seems more natural, more like how I would try to think
as a human player.

Sorry, I have no statistical empirical data for all this. I like the smooth
self-adjusting heuristic of my method and may be too blind to understand
your proof. It's a feeling that there is some conditional probability in it
and I don't see it in your description.

Harald
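The trade-off argued above (classic single-move pondering vs. Harald's distributed scheme) can be sketched as a toy expected-value calculation. This is a sketch only, using the illustrative numbers from the posts (Bob's ~75% hit rate, Harald's three candidates at 30% each), not measured engine data:

```python
# Toy expected-value comparison of the two pondering schemes, using the
# illustrative numbers from this thread. These are the posters' round
# numbers, not measurements.

def expected_saving_classic(hit_rate: float) -> float:
    """Classic pondering: on a ponder hit the whole ponder time is
    useful; on a miss none of it is."""
    return hit_rate

def expected_saving_distributed(coverage: float, share_per_move: float) -> float:
    """Harald's scheme: if the played move is among the pondered
    candidates (probability `coverage`), only the share of ponder time
    spent on that one candidate was on target."""
    return coverage * share_per_move

# Bob's ~75% prediction rate vs. three candidates at 30% each that
# (optimistically) cover every plausible reply:
print(expected_saving_classic(0.75))          # -> 0.75
print(expected_saving_distributed(1.0, 0.30)) # -> 0.3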
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: Not differentiating root and non-root / pondering approa

Post by Michael Sherwin »

Harald wrote:
bob wrote:
Harald wrote:
cyberfish wrote:3) pondering. Instead of guessing opponent's move and thinking on the resulting position like just about every engine under the sun does, my engine just "think as the opponent", filling up the hash table.

I decided to do it this way because I noticed that, for "quiet" positions, ponder hits are quite rare, and the whole effort would be wasted if the opponent doesn't make the predicted move (or 2), and for "violent" positions (recaptures for example), it doesn't really matter anyways.

What are the real world differences between these two approaches in terms of strength?

Many thanks
I do it this way also. But I am always told that is not good.
Though nobody tries to explain in depth why. See this thread:
http://www.talkchess.com/forum/viewtopi ... 34&t=24870

Harald
It is obvious why it is bad. Let's take two cases:

(1) you predict the wrong move. Neither approach will help here. Your previous search had the wrong expected move from your opponent, so that is no good. The approach being discussed finds a move that is different from what the opponent played, so that is no good. In this case, it is a "tie".

(2) you predict the correct move. If you use traditional pondering, you chose the second move from the PV, played it, and did a normal N-ply search after this move and when the opponent moves, you have a move ready, searched to the normal depth, very quickly after he moves. If you use your approach, you do a normal N ply search to find the best move by your opponent. If he makes this move, you can play the second move in the PV, but it was only searched to depth N-1, which is a loss of one full ply.

Which is better? A normal program will correctly predict opponent moves well over 50% of the time. The last time I posted analysis here it was probably closer to 75%. So with traditional pondering, 75% of the time you get an almost instant move and save time, or you search an extra ply or so if you want to use the time you could save. With the discussed approach, 75% of the time you lose a ply over the other method. Which do you think is better???
Ok, next time I write a ponder function I will use the classic approach.
With your experience and the better engine you may be right.

But just for the fun of the discussion:
I don't predict _one_ move and neither (1) nor (2) applies.
Instead I do an inspection of all opponent moves and all my own answers.
Throughout the search the time spent on each move is different and self-adjusting to its importance.
If there is an easy-to-guess opp-move I will search it for about 90% of the
time. There is not much difference from the 100% classic method.
The search tree and hash entries will be nearly the same.
That is not losing a whole ply later.
But I also have some answers in the hash table and a small jump start if
the opponent makes a different move.
If more than one opp-move is a good answer I will
distribute my search time equally over all of them. Maybe three good opp-moves,
each searched 30% of the time. After three such moves in the game I am 'right' three
times and save 30% of the time. You would be right once and wrong twice.
My approach seems more natural, more like how I would try to think
as a human player.

Sorry, I have no statistical empirical data for all this. I like the smooth
self-adjusting heuristic of my method and may be too blind to understand
your proof. It's a feeling that there is some conditional probability in it
and I don't see it in your description.

Harald
My next engine is going to be named Invisible. :lol:
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Not differentiating root and non-root / pondering approa

Post by Sven »

bob wrote:(2) you predict the correct move. If you use traditional pondering, you chose the second move from the PV, played it, and did a normal N-ply search after this move and when the opponent moves, you have a move ready, searched to the normal depth, very quickly after he moves. If you use your approach, you do a normal N ply search to find the best move by your opponent. If he makes this move, you can play the second move in the PV, but it was only searched to depth N-1, which is a loss of one full ply.

Which is better? A normal program will correctly predict opponent moves well over 50% of the time. The last time I posted analysis here it was probably closer to 75%. So with traditional pondering, 75% of the time you get an almost instant move and save time, or you search an extra ply or so if you want to use the time you could save. With the discussed approach, 75% of the time you lose a ply over the other method. Which do you think is better???
I'm not quite sure why you say that the discussed method (always? I know you didn't say that ...) loses one full ply. I think it is highly dependent on some engine characteristics.

First example (which confirms your "one ply" statement):

Assume a branching factor (bf) of 2.0. Assume that only 50% of overall time is spent for the "best" move. Assume an N-ply search with the discussed method visits T nodes until the opponent makes his move. Assume further that you always have these T nodes, i.e. a constant NPS. The predicted move M (from your previous PV) takes 50/100 * T = T/2 nodes, and so do all other root moves together. Further assumption is that the search always keeps M as best move (which is not realistic, of course).

With the given bf, only the iteration N of the discussed method takes T/4 nodes for M while only the iteration N-1 took T/8 nodes, and so on, which is in total roughly T/2 nodes for M through all iterations. (lim(1/4 + 1/8 + 1/16 + ...) = 1/2). So iteration N+1 would search bf * (number of nodes only for iteration N) = 2.0 * T/4 = T/2 nodes, which is exactly what gets available when not searching all the other root moves. So the traditional method indeed saves one ply *with the assumptions above*.

Second example (which does not confirm the "one ply" statement):

Assume bf = 2.5 and time spent for best move = 60%, everything else as above. Predicted move M takes 60/100 * T = 0.6 * T nodes, all other moves together take 0.4 * T nodes.

For the given bf, since lim(1 + 0.4 + 0.16 + 0.064 + ...) = 5/3, only the iteration N of the discussed method takes 3/5 * 0.6 * T = 0.36 * T nodes while only the iteration N-1 took 0.144 * T nodes, and so on (total = 0.6 * T). Iteration N+1 would search 2.5 * 0.36 * T = 0.9 * T nodes. But only 0.4 * T "additional" nodes are available for iteration N+1, so it is like 4/9 of a ply that is saved, which might mean that the ponder search is still within the N+1 ply search of the move M when the opponent moves, so there would be no way yet to compare M to other moves at depth N+1 (which is not really a saving IMO).

The second example shows, if my calculation is correct, that a lower branching factor of an engine as well as only a slightly higher percentage of time spent for the "best" move may reduce the additional capacity that is available for a deeper search of the predicted move.

Much theory, of course a practical test could show other results.

Bob, can you confirm at least my calculation, or is there some severe error in it?

Sven
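Sven's two worked examples follow one closed-form pattern. As a sketch, under the same idealized assumptions as his post (constant NPS, geometric iteration costs, the best move always staying best — none of which hold exactly in a real engine):

```python
def extra_depth_fraction(bf: float, p: float) -> float:
    """Fraction of one extra ply that method A (traditional pondering)
    can complete with the nodes method B spends on other root moves.

    Following Sven's derivation: with branching factor `bf`, the last
    iteration on move M costs (1 - 1/bf) of the p*T nodes spent on M in
    total, so iteration N+1 costs bf * (1 - 1/bf) * p * T
    = (bf - 1) * p * T nodes, against a spare budget of (1 - p) * T.
    """
    cost_next_iteration = (bf - 1.0) * p  # in units of T
    spare_budget = 1.0 - p                # in units of T
    return spare_budget / cost_next_iteration

# Sven's first example: bf = 2.0, 50% of time on the best move -> 1 full ply.
print(extra_depth_fraction(2.0, 0.5))           # -> 1.0
# Second example: bf = 2.5, 60% on the best move -> 4/9 of a ply.
print(round(extra_depth_fraction(2.5, 0.6), 4)) # -> 0.4444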
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Not differentiating root and non-root / pondering approa

Post by bob »

Sven Schüle wrote:
bob wrote:(2) you predict the correct move. If you use traditional pondering, you chose the second move from the PV, played it, and did a normal N-ply search after this move and when the opponent moves, you have a move ready, searched to the normal depth, very quickly after he moves. If you use your approach, you do a normal N ply search to find the best move by your opponent. If he makes this move, you can play the second move in the PV, but it was only searched to depth N-1, which is a loss of one full ply.

Which is better? A normal program will correctly predict opponent moves well over 50% of the time. The last time I posted analysis here it was probably closer to 75%. So with traditional pondering, 75% of the time you get an almost instant move and save time, or you search an extra ply or so if you want to use the time you could save. With the discussed approach, 75% of the time you lose a ply over the other method. Which do you think is better???
I'm not quite sure why you say that the discussed method (always? I know you didn't say that ...) loses one full ply. I think it is highly dependent on some engine characteristics.
It is simpler than that. If you search from your own perspective, after playing a single predicted move, you can reach depth N in a given amount of time. If you search from your opponent's perspective instead, you can reach a depth of N in that same amount of time, but the first ply doesn't count, that is your opponent's move. _your_ PV is one ply less than normal.


First example (which confirms your "one ply" statement):

Assume a branching factor (bf) of 2.0. Assume that only 50% of overall time is spent for the "best" move. Assume an N-ply search with the discussed method visits T nodes until the opponent makes his move. Assume further that you always have these T nodes, i.e. a constant NPS. The predicted move M (from your previous PV) takes 50/100 * T = T/2 nodes, and so do all other root moves together. Further assumption is that the search always keeps M as best move (which is not realistic, of course).

With the given bf, only the iteration N of the discussed method takes T/4 nodes for M while only the iteration N-1 took T/8 nodes, and so on, which is in total roughly T/2 nodes for M through all iterations. (lim(1/4 + 1/8 + 1/16 + ...) = 1/2). So iteration N+1 would search bf * (number of nodes only for iteration N) = 2.0 * T/4 = T/2 nodes, which is exactly what gets available when not searching all the other root moves. So the traditional method indeed saves one ply *with the assumptions above*.

Second example (which does not confirm the "one ply" statement):

Assume bf = 2.5 and time spent for best move = 60%, everything else as above. Predicted move M takes 60/100 * T = 0.6 * T nodes, all other moves together take 0.4 * T nodes.

For the given bf, since lim(1 + 0.4 + 0.16 + 0.064 + ...) = 5/3, only the iteration N of the discussed method takes 3/5 * 0.6 * T = 0.36 * T nodes while only the iteration N-1 took 0.144 * T nodes, and so on (total = 0.6 * T). Iteration N+1 would search 2.5 * 0.36 * T = 0.9 * T nodes. But only 0.4 * T "additional" nodes are available for iteration N+1, so it is like 4/9 of a ply that is saved, which might mean that the ponder search is still within the N+1 ply search of the move M when the opponent moves, so there would be no way yet to compare M to other moves at depth N+1 (which is not really a saving IMO).

The second example shows, if my calculation is correct, that a lower branching factor of an engine as well as only a slightly higher percentage of time spent for the "best" move may reduce the additional capacity that is available for a deeper search of the predicted move.

Much theory, of course a practical test could show other results.

Bob, can you confirm at least my calculation, or is there some severe error in it?

Sven
Seems reasonable although I did not think about it very long. I didn't look carefully because the answer is a lot simpler than that based on the pondering method being discussed as I understood it...
Harald
Posts: 318
Joined: Thu Mar 09, 2006 1:07 am

Re: Not differentiating root and non-root / pondering approa

Post by Harald »

bob wrote:
jwes wrote:
bob wrote:
cyberfish wrote:For my engine, I think I am doing something different from most engines, and I wonder if there are any drawbacks.

1) I use the transposition table to keep track of PV (the PV printing routine just follows the hash move at each position to get the PV).

That I think can only affect thinking output and not strength, so this is a minor issue.

2) I don't differentiate between root and non-root nodes. When it's the engine's turn to move, it just calls negamax on the root position, and then return the hash move from the transposition table when negamax returns (if not interrupted). My code is carefully written so that the "root" negamax call will always store a hash move in the TT. It also allows clean implementation of aspiration window. Unlike most (actually every I have seen) engines, it doesn't generate a list of legal moves, and call negamax on each resulting position.

It seems to work quite well for my engine, but does it have any negative impact that I didn't see?

3) pondering. Instead of guessing opponent's move and thinking on the resulting position like just about every engine under the sun does, my engine just "think as the opponent", filling up the hash table.

I decided to do it this way because I noticed that, for "quiet" positions, ponder hits are quite rare, and the whole effort would be wasted if the opponent doesn't make the predicted move (or 2), and for "violent" positions (recaptures for example), it doesn't really matter anyways.

What are the real world differences between these two approaches in terms of strength?

Many thanks
You are throwing away a lot of processing time. You should average at _least_ 50% on ponder hits over the course of a game, correctly predicting the opponent's move thru your PV. Searching from the opponent's perspective simply costs you one ply that you would be getting if you ponder the best move you found as most do...
In another thread recently, you stated that most of the time in a search is spent on the PV, so the loss of time caused by searching from the opponent's perspective would not be that great. It could be worth a test just to see how many ELO difference it makes.
You are losing a whole ply, I would think it easy to see why. Because you are searching for the best move for the opponent, rather than yourself, that ply is important. What it means is that for whatever percent of the time you correctly predict, you will be searching 1 ply less than when you do not predict correct. one ply is measurable in terms of lost Elo...
No, that is not true. We are not losing 1 ply.
At the moment when the opponent makes his move you may have searched
2000000 nodes. In the case of a right prediction we may have spent
1800000 nodes for that move also, that is the same 1800000 nodes and entries
in the hash table. At least the most important high-depth entries.
Just 200000 nodes are wasted on other opponent moves.
To complete the ply after resuming the search in our thinking time
we are just 200000 nodes behind.
In the case of a misprediction you have to start at zero and we are 200000
nodes ahead.
Do not forget that after the opponent moved we just continue searching
in our allocated time as long as we want. The time stamps of a finished ply are
just milestones at certain node counts in the whole search time.

Harald
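Harald's bookkeeping can be written out as a small accounting sketch. The round numbers are his illustration (2,000,000 ponder nodes, 90% on the one likely reply), and treating every stored hash entry as reusable is an optimistic simplification:

```python
# Toy accounting of Harald's example: how many ponder nodes remain
# useful once the opponent actually moves. Round numbers from the post,
# not measurements.

PONDER_NODES = 2_000_000
SHARE_ON_PREDICTED = 0.9

def useful_ponder_nodes(method: str, ponder_hit: bool) -> int:
    """Ponder nodes whose hash-table entries stay relevant after the
    opponent moves (optimistic: assumes every entry is reusable)."""
    on_predicted = int(PONDER_NODES * SHARE_ON_PREDICTED)  # 1,800,000
    on_others = PONDER_NODES - on_predicted                # 200,000
    if method == "classic":        # all effort behind the predicted reply
        return PONDER_NODES if ponder_hit else 0
    if method == "opponent_view":  # effort spread over all replies
        return on_predicted if ponder_hit else on_others
    raise ValueError(f"unknown method: {method}")

for method in ("classic", "opponent_view"):
    for hit in (True, False):
        print(method, hit, useful_ponder_nodes(method, hit))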
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Not differentiating root and non-root / pondering approa

Post by Sven »

bob wrote:It is simpler than that. If you search from your own perspective, after playing a single predicted move, you can reach depth N in a given amount of time. If you search from your opponent's perspective instead, you can reach a depth of N in that same amount of time, but the first ply doesn't count, that is your opponent's move. _your_ PV is one ply less than normal.
I think this is wrong. The problem is your "N". I will try to improve my description.

1) Method A = traditional, Method B = "the discussed one".

2) Constant number of T nodes available for both methods.

3) Method B searches from the root and gets to ply N.

4) Method B spends a portion P of all nodes for the "best" move M which would be the predicted move in method A. So it spends P*T nodes for searching M to a remaining depth N-1 and (1-P)*T nodes for other root moves.

5) Method A only searches the subtree of M. After P*T nodes it has searched the same subtree of M as method B, so it has searched to the same depth N-1. (Bob, at this point I think you make the error, it sounds like you assume depth N here!)

6) Now method A has (1-P)*T nodes left until pondering stops since it does not search the other opponent moves. The key question now arises: how much deeper can A go now compared to B? My answer is that this depends both on P and on the branching factor.

6a) With P=0.5 and bf=2.0, A can get a full ply deeper since it has T/2 nodes left and the next ply would take these T/2 nodes. (Branching factor calculation, see my previous post.)

6b) With P=0.9 (Harald Lüßen uses this number), A has only T/10 nodes left, so you can easily see that this will never take A one ply deeper. Even with bf=1.0 (impossible), the next full ply would require 0.9*T nodes.

I think this proves that the advantage of method A (in case of correct prediction) depends on both P and bf, where both a higher P and a higher bf make it worse for A.
Seems reasonable although I did not think about it very long. I didn't look carefully because the answer is a lot simpler than that based on the pondering method being discussed as I understood it...
You can only prove me wrong after thinking about my post carefully, not by just saying that you're right ;-)

Sven
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Not differentiating root and non-root / pondering approa

Post by Sven »

Sven Schüle wrote:6b) With P=0.9 (Harald Lüßen uses this number), A has only T/10 nodes left, so you can easily see that this will never take A one ply deeper. Even with bf=1.0 (impossible), the next full ply would require 0.9*T nodes.
To be more precise, the number of required nodes for the next ply would be something like 0.9*T/(N-1), since with the hypothetical bf=1.0 each iteration would take the same number of nodes. This is getting nonsensical, so I should have chosen a better example, like bf=1.5 (still hypothetical):

lim(0.3+0.2+0.1333+0.0889+...) = 0.9, so the next full ply would take 1.5*0.3*T = 0.45*T nodes which is far more than the available 0.1*T.

And no current engine has bf=1.5 AFAIK.

Sven
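Sven's bf=1.5 series can be checked numerically. A sketch under his assumptions (iteration costs shrinking geometrically by 1/bf going backwards, the last iteration costing 0.3*T):

```python
# Numerical check of Sven's bf = 1.5 example: the per-iteration costs
# sum to the 0.9*T spent on the predicted move, while the *next*
# iteration alone would cost 0.45*T against only 0.1*T available.

def iteration_costs(bf: float, last: float, iterations: int) -> list:
    """Cost of the last `iterations` iterations, oldest first, when each
    iteration costs `bf` times the previous one and the newest costs
    `last` (in units of T)."""
    return [last / bf**k for k in reversed(range(iterations))]

costs = iteration_costs(1.5, 0.3, 60)      # 60 terms approximate the limit
print(round(sum(costs), 6))                # -> 0.9 (= 0.3 / (1 - 1/1.5))
print(round(1.5 * costs[-1], 6))           # -> 0.45, far more than the 0.1*T left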
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Not differentiating root and non-root / pondering approa

Post by bob »

Harald wrote:
bob wrote:
jwes wrote:
bob wrote:
cyberfish wrote:For my engine, I think I am doing something different from most engines, and I wonder if there are any drawbacks.

1) I use the transposition table to keep track of PV (the PV printing routine just follows the hash move at each position to get the PV).

That I think can only affect thinking output and not strength, so this is a minor issue.

2) I don't differentiate between root and non-root nodes. When it's the engine's turn to move, it just calls negamax on the root position, and then return the hash move from the transposition table when negamax returns (if not interrupted). My code is carefully written so that the "root" negamax call will always store a hash move in the TT. It also allows clean implementation of aspiration window. Unlike most (actually every I have seen) engines, it doesn't generate a list of legal moves, and call negamax on each resulting position.

It seems to work quite well for my engine, but does it have any negative impact that I didn't see?

3) pondering. Instead of guessing opponent's move and thinking on the resulting position like just about every engine under the sun does, my engine just "think as the opponent", filling up the hash table.

I decided to do it this way because I noticed that, for "quiet" positions, ponder hits are quite rare, and the whole effort would be wasted if the opponent doesn't make the predicted move (or 2), and for "violent" positions (recaptures for example), it doesn't really matter anyways.

What are the real world differences between these two approaches in terms of strength?

Many thanks
You are throwing away a lot of processing time. You should average at _least_ 50% on ponder hits over the course of a game, correctly predicting the opponent's move thru your PV. Searching from the opponent's perspective simply costs you one ply that you would be getting if you ponder the best move you found as most do...
In another thread recently, you stated that most of the time in a search is spent on the PV, so the loss of time caused by searching from the opponent's perspective would not be that great. It could be worth a test just to see how many ELO difference it makes.
You are losing a whole ply, I would think it easy to see why. Because you are searching for the best move for the opponent, rather than yourself, that ply is important. What it means is that for whatever percent of the time you correctly predict, you will be searching 1 ply less than when you do not predict correct. one ply is measurable in terms of lost Elo...
No, that is not true. We are not losing 1 ply.
At the moment when the opponent makes his move you may have searched
2000000 nodes. In the case of a right prediction we may have spent
1800000 nodes for that move also, that is the same 1800000 nodes and entries
in the hash table. At least the most important high-depth entries.
Just 200000 nodes are wasted on other opponent moves.
To complete the ply after resuming the search in our thinking time
we are just 200000 nodes behind.
In the case of a misprediction you have to start at zero and we are 200000
nodes ahead.
Do not forget that after the opponent moved we just continue searching
in our allocated time as long as we want. The time stamps of a finished ply are
just milestones at certain node counts in the whole search time.

Harald
I'll make this simple.

Case one, the Crafty mode of pondering. Crafty makes a move and immediately assumes you will make the second move in the PV. It starts to ponder. After 180 seconds it has searched 20 plies deep and you make a move. I can choose to make a move right now (instantly) and accept a 20 ply search after 180 seconds of effort, or I can decide to continue searching, where an additional ply is going to double the total time and I can make a 21 ply move after 360 seconds.

Case two, the mode being discussed. You make a move, and then "flip sides" to begin searching. After 180 seconds you have again reached a depth of 20. But that is 20 plies for the opponent, _not_ for you. When the opponent moves after 180 seconds, if you move instantly you have a 20 ply search from his perspective, but only 19 plies from your perspective completed.

You _do_ lose a ply. It is also trivial to prove that if you predict correctly > 50% of the time, the way I do it is the _only_ way to do it that makes any sense at all from a time utilization perspective...

Your analysis is flawed, because it is based on the same kind of math I use for the normal pondering mode. You believe most of the effort is on the first move you search. And you are correct. But you ignore that most of the effort spent on _that_ move is spent on various other moves lower in the tree, _not_ necessarily the "best" moves. The cost is more pronounced than you believe.
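Bob's two cases reduce to a one-line depth comparison. A minimal sketch of that argument (assuming a correct prediction and his example of a 20-ply ponder search):

```python
# Sketch of Bob's Case one vs. Case two: the same ponder search reaches
# "depth 20", but whose depth is it? Depth numbers are his example.

def depth_for_me(mode: str, reached_depth: int) -> int:
    """Search depth usable from our own perspective when the opponent
    moves and the prediction was correct."""
    if mode == "classic":
        # We pondered the position *after* the predicted move, so every
        # one of `reached_depth` plies is from our side.
        return reached_depth
    if mode == "opponent_view":
        # The first ply of the ponder search is the opponent's move, so
        # one ply of the PV is not ours.
        return reached_depth - 1
    raise ValueError(f"unknown mode: {mode}")

print(depth_for_me("classic", 20))        # -> 20, playable instantly
print(depth_for_me("opponent_view", 20))  # -> 19, one ply short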
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Not differentiating root and non-root / pondering approa

Post by bob »

Sven Schüle wrote:
bob wrote:It is simpler than that. If you search from your own perspective, after playing a single predicted move, you can reach depth N in a given amount of time. If you search from your opponent's perspective instead, you can reach a depth of N in that same amount of time, but the first ply doesn't count, that is your opponent's move. _your_ PV is one ply less than normal.
I think this is wrong. The problem is your "N". I try to improve my description.

1) Method A = traditional, Method B = "the discussed one".

2) Constant number of T nodes available for both methods.

3) Method B searches from the root and gets to ply N.

4) Method B spends a portion P of all nodes for the "best" move M which would be the predicted move in method A. So it spends P*T nodes for searching M to a remaining depth N-1 and (1-P)*T nodes for other root moves.

5) Method A only searches the subtree of M. After P*T nodes it has searched the same subtree of M as method B, so it has searched to the same depth N-1. (Bob, at this point I think you make the error, it sounds like you assume depth N here!)
No, what I am considering is that you have a set of N root moves, and the normal method of pondering focuses all effort on the "best" move of that set, while the discussed method focuses effort on _all_ moves in that set. you can't possibly search all the moves to the same depth I can search just one of the moves to, which was my point. Hence "losing one ply"...

6) Now method A has (1-P)*T nodes left until pondering stops since it does not search the other opponent moves. The key question now arises: how much deeper can A go now compared to B? My answer is that this depends both on P and on the branching factor.
No argument there.


6a) With P=0.5 and bf=2.0, A can get a full ply deeper since it has T/2 nodes left and the next ply would take these T/2 nodes. (Branching factor calculation, see my previous post.)

6b) With P=0.9 (Harald Lüßen uses this number), A has only T/10 nodes left, so you can easily see that this will never take A one ply deeper. Even with bf=1.0 (impossible), the next full ply would require 0.9*T nodes.
Again, it depends on the numbers. But another important fact. Starting the next iteration (or trying to go one ply deeper) has a chance to tell you something you would have missed had you not tried, namely that the best move might fail low, and that happens _very_ quickly since one refutation at ply 2 is all that is needed.

I think this proves that the advantage of method A (in case of correct prediction) depends on both P and bf, where both a higher P and a higher bf make it worse for A.
Seems reasonable although I did not think about it very long. I didn't look carefully because the answer is a lot simpler than that based on the pondering method being discussed as I understood it...
You can only prove me wrong after thinking about my post carefully, not by just saying that you're right ;-)

Sven
The simple point is this: Which would you prefer:

(a) do the most efficient search you possibly can?

(b) waste a significant part of your search effort when pondering?

The discussed method is clearly (b). Even if it is only a 10% penalty assuming 90% of effort on the first move, who would _not_ choose to speed up their program by 10%? That's a significant speedup. Not all trees have an EBF of 2.0 either, it can go way up. I'd bet that your "ponder search" will change its mind several times in complex positions, where the traditional pondering approach is using the result of a deep previous search to find the best move to ponder, saving even more.

The bottom line is that the current approach is better. How much better we could argue about for weeks. But that's not the point. It is better, and IMHO "better is always better"... I'll take 10% any day of the week, and twice on Sunday. I'll take 1% in fact... And since there is _no_ case where the proposed method is better in light of current pondering accuracy, I don't consider it very reasonable.
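Bob's closing point can be summarized as an expected-efficiency sketch, combining the thread's illustrative numbers (a ~75% hit rate and 90% of opponent-view effort on the predicted reply). The "miss salvage" term is an assumption favorable to the discussed method:

```python
# Expected fraction of ponder effort that ends up useful, under the
# simplified model from this thread. All numbers are the thread's
# illustrations, not measurements.

def classic_efficiency(hit_rate: float) -> float:
    # Hit: all ponder effort was on the played line. Miss: all wasted.
    return hit_rate

def opponent_view_efficiency(hit_rate: float, share_on_predicted: float) -> float:
    # Hit: only the share spent on the predicted reply was on target.
    # Miss: at best the remainder, spent on other replies, still helps
    # (an assumption generous to this method).
    miss_salvage = 1.0 - share_on_predicted
    return hit_rate * share_on_predicted + (1.0 - hit_rate) * miss_salvage

print(round(classic_efficiency(0.75), 3))             # -> 0.75
print(round(opponent_view_efficiency(0.75, 0.9), 3))  # -> 0.7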