An idea for a combination of stockfish and lc0

Uri Blass · Post by **Uri Blass** » Tue Sep 25, 2018 8:54 am

I suggest the following idea.

The program starts by running lc0 for p1% of the target time, where p1 is a parameter, and takes the best move that lc0 suggests.

Next, the program gives that move a bias of p2 centipawns (p2 is also a parameter).

Then the program runs Stockfish for the remaining (100-p1)% of the target time, with the bias applied.

For example, if the bias is 0.4 pawns and lc0's move is 14.Ra1-c1, then Stockfish needs to find a move that is more than 0.4 pawns better than Ra1-c1 in order not to choose Ra1-c1.
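
A minimal sketch of this rule in Python. The function names `split_time_ms` and `pick_move` are hypothetical, moves are written in UCI notation (`a1c1` for Ra1-c1), and the scores are assumed to be Stockfish's evaluations in centipawns from the side to move's point of view. How those per-move scores are obtained is left open here.

```python
from __future__ import annotations


def split_time_ms(target_ms: int, p1: float) -> tuple[int, int]:
    """Split the target time: p1% for lc0, the remainder for Stockfish."""
    lc0_ms = round(target_ms * p1 / 100)
    return lc0_ms, target_ms - lc0_ms


def pick_move(sf_scores_cp: dict[str, int], lc0_move: str, p2_cp: int) -> str:
    """Keep lc0's move unless Stockfish rates another move more than p2_cp higher.

    Assumes sf_scores_cp contains an entry for lc0_move.
    """
    best_move = max(sf_scores_cp, key=lambda move: sf_scores_cp[move])
    margin = sf_scores_cp[best_move] - sf_scores_cp[lc0_move]
    if best_move != lc0_move and margin > p2_cp:
        return best_move
    return lc0_move


if __name__ == "__main__":
    # With a 0.4 pawn bias, a move that is only 0.35 better does not displace Ra1-c1.
    print(pick_move({"a1c1": 20, "d2d4": 55}, "a1c1", 40))
    # A move that is 0.45 better does.
    print(pick_move({"a1c1": 20, "d2d4": 65}, "a1c1", 40))
```

The values for p1 and p2 would have to be tuned by testing; nothing in the sketch suggests good defaults.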

I wonder whether this idea, with the right parameters, can be significantly stronger than both Stockfish and lc0.

What is your opinion?

chrisw · Post by **chrisw** » Tue Sep 25, 2018 11:09 am

Well, it won't correct lc0 over-optimism.
It will correct lc0 pessimism. But pessimism doesn't seem to be an lc0 problem.
It won't correct those recurring situations where lc0 goes into a 0.0 (according to SF) draw but thinks she is +3.

It would be very simple (I am not offering) to write a GUI which fires up SF and LC0 in analyse mode (there's no reason why they can't both run at the same time) and chooses a move after time t, according to your algorithm. It's just a simple subset of a UCI program that pretends to be an engine (it doesn't even have to understand the rules) and calls up two other UCI programs. Then you can run it for N games against a real Stockfish and test the result.
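
As a rough illustration of how little such a wrapper needs to understand, here is a Python sketch of the only parsing it would require: reading `info` and `bestmove` lines from an engine's UCI output. The helper names `parse_info` and `parse_bestmove` are hypothetical, and the sketch deliberately ignores `score mate` lines, which a real wrapper would have to handle.

```python
from __future__ import annotations

from typing import Optional


def parse_info(line: str) -> Optional[tuple[str, int]]:
    """Return (first PV move, score in centipawns) from a UCI 'info' line.

    Returns None for lines without a centipawn score or a PV
    (including 'score mate' lines, which this sketch skips).
    """
    tokens = line.split()
    if not tokens or tokens[0] != "info":
        return None
    if "score" not in tokens or "pv" not in tokens:
        return None
    s = tokens.index("score")
    p = tokens.index("pv")
    if s + 2 >= len(tokens) or tokens[s + 1] != "cp" or p + 1 >= len(tokens):
        return None
    return tokens[p + 1], int(tokens[s + 2])


def parse_bestmove(line: str) -> Optional[str]:
    """Return the move from a UCI 'bestmove' line, or None for other lines."""
    tokens = line.split()
    if len(tokens) >= 2 and tokens[0] == "bestmove":
        return tokens[1]
    return None
```

Moves are passed around as opaque strings, so the wrapper never needs a move generator or any knowledge of the rules.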

Michael Sherwin · Post by **Michael Sherwin** » Tue Sep 25, 2018 1:32 pm

Chris W, how much work would it take to turn this example into a working GUI?

chrisw · Post by **chrisw** » Tue Sep 25, 2018 1:46 pm

No idea. The video gives me a headache after 15 seconds, too much flashing, too much music.

The idea I described above only requires a simple text-only GUI (sorry, I meant UI). It takes as long as it takes to integrate three UCI loops: one to say hello to Arena or whatever, one to SF and one to LC0.
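
A possible shape for the first of those three loops, the one facing Arena, sketched in Python. The function name `route` and the engine name string are hypothetical. The sketch only covers the handshake and forwarding; `go` is left out because that is where the time split and the arbitration between the two engines would live.

```python
from __future__ import annotations


def route(command: str) -> tuple[list[str], list[str]]:
    """Map one GUI command to (replies to the GUI, lines to forward to both engines)."""
    cmd = command.strip()
    if cmd == "uci":
        # Identify ourselves as an engine; the name is a made-up placeholder.
        return ["id name SF-LC0 combo (hypothetical)", "uciok"], []
    if cmd == "isready":
        # Simplification: a real wrapper should wait for both engines' readyok first.
        return ["readyok"], []
    if cmd == "ucinewgame" or cmd.startswith("position"):
        # Forward verbatim; the wrapper does not need to understand the position.
        return [], [cmd]
    return [], []
```

The other two loops would write the forwarded lines to the SF and LC0 processes and read their output back.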

Uri Blass · Post by **Uri Blass** » Tue Sep 25, 2018 2:24 pm

I think that my idea can correct lc0's tactical blunders in positions where Stockfish can easily see the right move.

Examples

[d]2r2rk1/4nppp/2b1pq2/1p6/2pRP2P/2P2BB1/3Q1PP1/R5K1 b - h3 0 21 am Ng6

[d]1rr3k1/5pp1/3Pq2p/p1P1p3/1n4P1/B6R/4QPP1/4R1K1 w - - 3 39 bm Rc1 am Qxe5

chrisw · Post by **chrisw** » Tue Sep 25, 2018 3:19 pm

If I remember correctly, LC0 played Ng6 thinking it was OK when it wasn't. That's an over-optimistic mistake. Why would your algorithm correct that? Ah, I may have misinterpreted your algorithm ...

I assumed you meant (with made-up evals, just for clarity):
LC0 move = Ng6, eval 2.5
If SF finds a move XYZ with eval > (2.5 + 0.4 bias), then play XYZ, else play Ng6.

But did you actually mean:
LC0 move = Ng6, eval 2.5
SF eval of Ng6 = 1.1 (for example), and then if SF finds a move XYZ with eval > (1.1 + 0.4), play XYZ, else play Ng6.
So you have to run two searches for SF: one to evaluate Ng6 and one to see if there's an XYZ that improves on Ng6 by 0.4?

??

Uri Blass · Post by **Uri Blass** » Tue Sep 25, 2018 7:51 pm

I meant something similar to what you wrote in the last lines, but I do not need two Stockfish searches.

I actually meant: lc0's best move is Ng6 (its eval is not important).
SF starts its search with a bias of 0.4 for Ng6. That means it starts with Ng6 as the best move and changes its mind only if it finds a move that is at least 0.4 pawns better.

There are 2 possible cases during Stockfish's search.
1) SF's best move is Ng6, with a score of, for example, 1.1 pawns in its own favor.
SF needs another move to score at least 1.5 pawns in its favor to change its mind.
2) SF's best move is not Ng6, with a score of, say, 0.5 pawns in its own favor.
SF may change its mind back to Ng6 if it finds that Ng6 scores at least 0.1 pawns in its favor (0.5 - 0.4).
Of course, in this case SF may also change its mind to another move that is not Ng6, and then there is no need to add or subtract 0.4.
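
Both cases reduce to one comparison if the bias is simply added to the score of lc0's move before comparing. A small Python sketch of that reading, using the numbers above in centipawns; the names `biased_cp` and `should_switch` are hypothetical, and this models only the decision, not how Stockfish's search would be modified to apply it.

```python
def biased_cp(move: str, score_cp: int, lc0_move: str, bias_cp: int) -> int:
    """Score used for comparison: lc0's move gets the bias added, other moves do not."""
    return score_cp + bias_cp if move == lc0_move else score_cp


def should_switch(current_move: str, current_cp: int,
                  new_move: str, new_cp: int,
                  lc0_move: str, bias_cp: int) -> bool:
    """True if the search should change its mind from current_move to new_move."""
    return (biased_cp(new_move, new_cp, lc0_move, bias_cp)
            >= biased_cp(current_move, current_cp, lc0_move, bias_cp))


if __name__ == "__main__":
    # Case 1: Ng6 is best at 1.1; a rival needs at least 1.5 to take over.
    print(should_switch("Ng6", 110, "XYZ", 150, "Ng6", 40))
    # Case 2: XYZ is best at 0.5; Ng6 needs at least 0.1 to take over again.
    print(should_switch("XYZ", 50, "Ng6", 10, "Ng6", 40))
```

When neither move is lc0's move, the bias drops out and the comparison is the ordinary one.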

chrisw · Post by **chrisw** » Tue Sep 25, 2018 8:08 pm

Okay. Understood.
Downside is that LCZero is being de-LCZeroed. Any move LC0 finds that she thinks is spectacularly positionally interesting won't get played unless SF sees at least some of it too.

Anyway, it isn't too complex to hack a UI to do this and run lots of tests. Possibly somebody already has. I'm assuming you can send bias commands to SF via the command line.
