On Opening books in 2015

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

jdart
Posts: 4429
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: On Opening books in 2015

Post by jdart »

Selecting moves from published PGNs is problematic.

Even GMs make mistakes. And sometimes moves suddenly fall out of favor, sometimes for good reasons (a refutation is found).

I think to build a reasonable book from PGNs you need to weight win percentage very highly. If you have a large number of games and move A is scoring 55% and move B is scoring 45%, that is a significant difference, and you should strongly prefer move A. Maybe also you should tilt the book towards recently played moves. But all this breaks down when few samples are available. Then PGN statistics are unreliable, and if you lack other information such as previous search results you are relatively blind.
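A hedged sketch of the weighting Jon describes: win percentage dominates when games are plentiful, the score is shrunk toward 50% when samples are scarce, and recent moves get a small tilt. The shrinkage constant and recency bonus below are illustrative choices of mine, not anything from Arasan:

```python
from dataclasses import dataclass

@dataclass
class MoveStats:
    san: str        # move in algebraic notation
    games: int      # number of games with this move in the PGN collection
    win_pct: float  # score for the side to move, 0.0 .. 1.0
    last_year: int  # most recent year the move was played

def book_score(m: MoveStats, prior_games: int = 50, current_year: int = 2015) -> float:
    """Weight win percentage heavily, but shrink it toward 50% when the
    sample is small, and add a small bonus for recently played moves."""
    shrunk = (m.win_pct * m.games + 0.5 * prior_games) / (m.games + prior_games)
    recency = max(0.0, 0.02 - 0.002 * (current_year - m.last_year))
    return shrunk + recency

# Move A scoring 55% and move B scoring 45% over many games: prefer A.
a = MoveStats("Nf3", games=1000, win_pct=0.55, last_year=2014)
b = MoveStats("g3",  games=1000, win_pct=0.45, last_year=2014)
assert book_score(a) > book_score(b)

# A 100% score over only 3 games barely moves the needle.
c = MoveStats("h4", games=3, win_pct=1.0, last_year=2014)
assert 0.5 < book_score(c) < 0.6
```

The point of the shrinkage term is exactly the breakdown Jon mentions: with few samples the statistics carry little weight on their own, and without other information (prior search results) the score collapses to near-neutral.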

--Jon
Dann Corbit
Posts: 12856
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: On Opening books in 2015

Post by Dann Corbit »

jdart wrote:Selecting moves from published PGNs is problematic.

Even GMs make mistakes. And sometimes moves suddenly fall out of favor, sometimes for good reasons (a refutation is found).

I think to build a reasonable book from PGNs you need to weight win percentage very highly. If you have a large number of games and move A is scoring 55% and move B is scoring 45%, that is a significant difference, and you should strongly prefer move A. Maybe also you should tilt the book towards recently played moves. But all this breaks down when few samples are available. Then PGN statistics are unreliable, and if you lack other information such as previous search results you are relatively blind.

--Jon
I favor the following approach:
1. Collect statistics for each position you intend to store in your book.
2. Analyze every position in the book for at least one hour.
3. For each position, if the analyzed move and the statistical move agree then that position is completed {for now *}.
4. If the analyzed move and the statistical suggestion for a move disagree, then you need a depth threshold to overcome N games. For instance, you might decide that for 30 games played from a position, at least 37 plies are needed, and for 100 games played at least 40 plies are needed, to prefer the computer analysis. If a very large number of games has been played (e.g. 500 or more), then the position should be reanalyzed to a high depth with several different engines. This is especially the case if the win/loss/draw ratio is favorable for the statistical move and the centipawn evaluation (ce) is unfavorable for the computer-analyzed move. It almost always means that the computer has not analyzed deeply enough yet (though once in a while the computer finds a refutation of the most commonly played move).

Also, force the statistically suggested best move and analyze the resulting position.
It may also be helpful to do a 24-hour analysis in multi-PV mode.

If the book is intended to be used against any book other than itself, then all interior nodes must be analyzed and not just the book exit points.

* If the move in agreement performs poorly in practice, it must be re-analyzed. For that reason, it is a good idea to store book play statistics for the position.
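The agree/disagree rule in steps 3 and 4 boils down to a single comparison. The depth schedule below just interpolates Dann's example numbers (30 games needing 37 plies, 100 needing 40) and is only a guess at the shape of his actual table:

```python
import math

def depth_needed(n_games: int) -> int:
    """Plies of engine analysis required before the engine's move may
    override the statistically best move.  Interpolates the example
    figures: 30 games -> 37 plies, 100 games -> 40 plies."""
    if n_games <= 0:
        return 0
    # roughly +3 plies per (100/30)-fold increase in game count past 30
    return max(0, round(37 + 3 * math.log(n_games / 30, 100 / 30)))

def choose_move(stat_move: str, engine_move: str,
                n_games: int, analysis_plies: int,
                recheck_threshold: int = 500):
    """Steps 3 and 4: agreement settles the position (for now); otherwise
    the engine move wins only if analyzed deeply enough to outweigh the
    game count, and heavily played positions are flagged for re-analysis."""
    if stat_move == engine_move:
        return stat_move, "settled (for now)"
    if n_games >= recheck_threshold:
        return stat_move, "re-analyze at high depth with several engines"
    if analysis_plies >= depth_needed(n_games):
        return engine_move, "analysis deep enough to override statistics"
    return stat_move, "statistics stand; analysis too shallow"
```

For example, with 30 games behind the statistical move, a 40-ply analysis is allowed to override it, while a 30-ply analysis is not.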
User avatar
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: On Opening books in 2015

Post by Laskos »

lucasart wrote:
Peter Berger wrote: Simplified claim: Opening books are pretty useless these days. E.g. Stockfish don't need no book at slow time controls.
I almost agree with you. However, to measure the book effect per se, one needs to depollute the measure from the time bias.
...
Generally, the larger the book, the crappier it is.
These are complete misconceptions from both of you. Are those just your opinions, or do you have at least some empirical data? Two weeks ago I played 30 games at 40'/game on pretty strong hardware; the result was +5 =25 -0 for Stockfish with a good, large book against Stockfish with no book. I am curious what miracles would happen at a 3-times-longer TC like 120'/game. 2 plies deeper, the engine realizes it constantly blunders many openings?

A typical win for book Stockfish:

[Event "40min 4cores"]
[Site "?"]
[Date "2015.03.07"]
[Round "7.1"]
[White "Stockfish 020315 64 BMI2 Book"]
[Black "Stockfish 020315 64 BMI2"]
[Result "1-0"]
[ECO "C11"]
[Annotator "0.00;0.23"]
[PlyCount "163"]
[EventDate "2015.03.07"]
[EventType "tourn"]
[Source "Kai"]
[TimeControl "2400"]

{Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz 3598 MHz W=36.7 plies; 8,989kN/s; PlaychessNightmare.ctg B=34.0 plies; 8,035kN/s}
1. e4 {B 0} e6 {0.23/28 64} 2. d4 {B 0} d5 {0.15/30 49} 3. Nc3 {B 0} Nf6 {0.12/29 40} 4. e5 {B 0} Nfd7 {0.09/31 42} 5. f4 {B 0} c5 {0.02/31 46} 6. Nf3 {B 0} Nc6 {0.00/30 36} 7. Be3 {B 0} Qb6 {0.00/32 26} 8. Na4 {B 0} Qa5+ {0.00/33 23} 9. Nc3 {B 0} Qb6 {0.00/36 31} 10. Na4 {B 0} Qa5+ {0.00/41 34} 11. c3 {B 0} cxd4 {0.00/32 38} 12. b4 {B 0} Nxb4 {0.00/33 23} 13. cxb4 {B 0} Bxb4+ {0.39/30 42} 14. Bd2 {B 0} Bxd2+ {0.32/31 43} 15. Nxd2 {B 0} b6 {0.29/34 41} 16. Bd3 {B 0} Ba6 {0.36/33 59} 17. Nb2 {B 0} Nc5 {0.53/33 41} 18. Bxa6 {B 0} Qxa6 {0.55/32 68} 19. Qe2 {B 0} Rc8 {0.57/32 34} 20. Qxa6 {B 0} Nxa6 {0.50/32 38} 21. Rb1 {B 0} O-O {0.58/33 90} 22. Nf3 {0.65/35 55} Nb4 {0.65/34 37}

...

1-0

At moves 6 to 12, the non-book Stockfish thinks it has gained ground as Black, giving a 0.00 eval. But at move 22, exiting the book, it admits it blundered the opening, giving 0.65 to the book Stockfish, which has just left the book. In 4 of the 5 wins by the book Stockfish, the games went along these lines. Then there are book-exit positions which Stockfish considers equal (close to 0.00) while they are not, judging by outcome statistics. The burden of proving your naive, misleading statements and LTC miracles (2 plies deeper) is entirely on you. I claim that maybe 5-10 years are needed before top engines play the openings reasonably without books. It's not about 2 more plies from the same Stockfish, your miraculous LTC; it's about 20+ more plies, the horizon, the lack of outcome predictability, and the unreliability of the PV in the openings.
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: On Opening books in 2015

Post by Zenmastur »

lucasart wrote:
Peter Berger wrote: Simplified claim: Opening books are pretty useless these days. E.g. Stockfish don't need no book at slow time controls.
I almost agree with you. However, to measure the book effect per se, one needs to depollute the measure from the time bias.

For example, if you play 8 book moves, your time is base+8*inc when you start the game at move 8. Without a book, instead, you start from move 1, and after 8 moves your time on the clock is significantly reduced.

my claim: Most of the book value is simply the time bias.
I partially agree with the last statement, but only if we are considering antiquated book designs. This is because current books are antiquated: pretty static, used without any thought by the end user and without any search by the engine. When a human plays a tournament game, most good players at some point, while still in their opening book, begin to think about the book moves they are about to play. They don't blindly play the moves just because they are in their opening repertoire. The fact that engines blindly play book moves is a fault of the programmer, not the book.

If we are talking about more modern designs, then I have to disagree completely! I think you can add 30-60 Elo (or more) to any engine with proper book design. And I'm not talking about adding line "x" to the book to improve the engine's performance. I'm talking about a book that is designed to add value to the engine, as opposed to some afterthought bolted on because the engine can't play the openings well without it, but you were too lazy to do the proper groundwork or have any real understanding or insight into the problem.

Many positions in the opening can't be "solved" by a search with current software and hardware. This is true even if huge amounts of time are used. So the idea that an engine can search its way out of the opening is ridiculous.

[quote=""lucasart"] Generally, the larger the book, the crappier it is. A book is crappy when it contains crappy moves, even < 1%. It only takes one crappy move to lose a game... And with most book being built from GM databases, they are mostly crappy...[/quote]

Just because a move isn't "best" isn't a good reason to exclude it from the book. Books made for humans contain all kinds of moves. They will purposely include moves that are blunders, usually of the non-obvious kind, but not always. The (usually non-obvious) line of play meant to deal with such a blunder saves the end user time and increases their game score. So why should it be any different for a computer book? Am I missing something here?

I'm not sure how it came to pass that computer opening books are played blindly, as if they were gospel, or why some people think that only "good" moves have value in a book and that any "bad" move should be removed. I guess this is what happens when a group of people blindly follows someone else's design decisions instead of thinking for themselves.

Bad moves have as much value as the good ones: properly used, they can save time and increase game scores. As for moves incorrectly marked "good" in the book when they are actually "bad", there is a trivial fix. Before a book move is played for the first time, assuming we have no other knowledge about it, the search-depth field is consulted; if the depth is zero, or less than what a "reasonable" search under the current time control would allow, the move has to be searched before it's played. After the search, the depth, the number of nodes searched, and the evaluation are stored in the book. If the move is subsequently played, the book is updated as it normally would be. Problem solved!
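A minimal sketch of that "trivial fix": the entry fields and the `search_position` stub below are placeholders of mine, not any real engine's book format or API:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BookEntry:
    move: str
    depth: int = 0                 # deepest verification search so far, in plies
    nodes: int = 0                 # nodes searched at that depth
    eval_cp: Optional[int] = None  # stored evaluation, in centipawns
    plays: int = 0                 # times this move has actually been played

def search_position(fen: str, move: str, depth: int):
    """Stand-in for a real engine search; returns (nodes, eval_cp)."""
    return 1_000_000 * depth, 0

def book_move_to_play(fen: str, entry: BookEntry, game_depth: int) -> str:
    """Only play a book move once it has been verified at least as deeply
    as a normal in-game search would reach; store the result in the book."""
    if entry.depth < game_depth:   # depth zero, or below a "reasonable" search
        entry.nodes, entry.eval_cp = search_position(fen, entry.move, game_depth)
        entry.depth = game_depth
    entry.plays += 1               # the usual book update on playing the move
    return entry.move
```

Once a move has been verified, subsequent visits skip the search and just update the play statistics, which is what makes the check cheap in practice.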

If I were a purist, I would tell you that before any move can be played from the book, even if it's been played before, its maximum search depth should be checked to see whether the current time control allows increasing it. Any time the search depth can be increased, it should be. If the move returned by the search leads to a position that isn't in the book, the new position is added, even if its parent node doesn't have the right to add leaf nodes. When this happens, it's time to decide whether the move should be played. If you're NOT in a tournament, then IMO it should be; this will cause the program to exit the known book early. The only program I know of that will ignore the book move and play a non-book move is Crafty. Once a move is played, it is treated like any other move in the book. No special attention is needed if the book routine is properly thought out and written.

[quote=""lucasart"]Some book are built properly, with every single position verified by search. But most of them are just a dump of PGN crap from GM games...

For me the only utility of a book is to ensure non determinism (not play the same games all the time). But even for that it's useless. A simple opening selection (EPD file) is enough.[/quote]

While searching book moves and storing some information about the search in the position record is a good idea, I don't think it's practical to perform deep searches on every position in the book. Some positions are more important than others because they are seen many times more often, and therefore a better line of play found there would affect many more games. The more often a position is seen, the more important it becomes and the more likely it is to warrant special attention by way of an extended search. For the vast majority of positions this is overkill.
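One way to operationalize "the more often a position is seen, the more attention it deserves" is to split a fixed analysis budget in proportion to visit counts. The proportional rule and the one-minute floor here are illustrative assumptions, not a scheme from the thread:

```python
def allocate_analysis_time(visit_counts: dict, total_hours: float,
                           floor_minutes: float = 1.0) -> dict:
    """Split a fixed analysis budget (in minutes) across book positions in
    proportion to how often each one is reached, with a small floor so that
    rarely seen positions still get at least a sanity check."""
    total_visits = sum(visit_counts.values())
    budget = {}
    for pos, visits in visit_counts.items():
        share = total_hours * 60 * visits / total_visits
        budget[pos] = max(floor_minutes, share)
    return budget

minutes = allocate_analysis_time({"popular line": 900, "rare line": 100},
                                 total_hours=10)
assert minutes["popular line"] == 540.0  # 90% of the 600-minute budget
assert minutes["rare line"] == 60.0
```

A position seen nine times as often gets nine times the analysis time, which matches the argument that deep searches on rare positions are mostly overkill.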

I guess your referring to GM games as "crap" is an indication of how little thought you have put into designing a book. Most Othello programs have books that are a lot more sophisticated than those found in chess engines or GUIs.
mvk wrote:Agree, a search is necessary, and the main advantage is time. Secondary is to avoid bad moves just beyond your search horizon. But there is only so much you can search in advance. It should be more than you can search during the game (modulo the time advantage). Given the increase in computer power, this gives an upper bound on the useful size of a book. Too big and it will be shallow near the leaves, and you make the mistakes there. Or too big and the verifications are deep, but by the time your tree is ready the tournament machine is a lot faster thanks to semiconductor manufacturing improvements.


Let's look at it this way: if the position isn't in the book, you will have to do a search in any case. So there is NO disadvantage to having a large book, provided you verify that the book move you are about to play has been searched to at least the depth it would be searched to if it weren't a book move. In fact, a large book that can be extended ad hoc is a great advantage over smaller, more conventional "static" books, because search information can be stored for positions that weren't in the book when it was created. So I think that bigger, more extensible, and more flexible is better in every respect, assuming the size doesn't become burdensome and the access time stays relatively quick. With modern hardware I don't think either of these will be a problem.
jdart wrote:Selecting moves from published PGNs is problematic.
Only if your program blindly makes any move it finds in its book. A book routine is only as good as you make it. If you want to cut corners by blindly copying someone else's book design without so much as looking at the design trade-offs that person made, then you deserve to have a worthless book in your program.
jdart wrote:Even GMs make mistakes. And sometimes moves suddenly fall out of favor, sometimes for good reasons (a refutation is found).


With a properly designed book routine (i.e. something that ISN'T a copy of a 30 year old design) none of this would present even the slightest problem.
jdart wrote:I think to build a reasonable book from PGNs you need to weight win percentage very highly. If you have a large number of games and move A is scoring 55% and move B is scoring 45%, that is a significant difference, and you should strongly prefer move A. Maybe also you should tilt the book towards recently played moves. But all this breaks down when few samples are available. Then PGN statistics are unreliable, and if you lack other information such as previous search results you are relatively blind.
I'm not sure what to think about the last two statements. So what if you don't have 5,000 games, or 500, or some other reasonable number, from a position to rely on! You're complaining that GMs play poorly, and now you're complaining that you have positions with too few games. I have a question for you: do you use GM games to make your books, or computer games, or just analysis? Because I think you're just complaining to have something to do!
Dann Corbit wrote: I favor the following approach:
1. Collect statistics for each position you intend to store in your book.
I think everyone does this.
Dann Corbit wrote: 2. Analyze every position in the book for at least one hour.
Get real! You're not going to get very far if you try to pre-analyze every position for an hour. Unless, of course, you have VAST amounts of computer hardware, or such a small book as to make it nearly useless.
Dann Corbit wrote:3. For each position, if the analyzed move and the statistical move agree then that position is completed {for now *}.
4. If the analyzed move and the statistical suggestion for a move disagree, then you need a depth threshold to overcome N games. For instance, you might decide that for 30 games played from a position, at least 37 plies are needed, and for 100 games played at least 40 plies are needed, to prefer the computer analysis. If a very large number of games has been played (e.g. 500 or more), then the position should be reanalyzed to a high depth with several different engines. This is especially the case if the win/loss/draw ratio is favorable for the statistical move and the centipawn evaluation (ce) is unfavorable for the computer-analyzed move. It almost always means that the computer has not analyzed deeply enough yet (though once in a while the computer finds a refutation of the most commonly played move).
Good luck with this type of approach. It seems unnecessarily complicated and relies on an unproven theory that would likely be disproved were anyone to spend time analyzing it. It's wishful thinking to believe there is a simple relationship between the number of games in a database and a move's reliability, compared to the reliability of a computer search to depth "n".
Dann Corbit wrote:* If the move in agreement performs poorly in practice, it must be re-analyzed. For that reason, it is a good idea to store book play statistics for the position.
It seems to me that if other moves from the same position don't have this problem, i.e. they perform well, wouldn't it be easier to select one of them and save yourself a bunch of work? If no other move from that position performs well, there's a good chance the position itself is bad, and it's time to back up two plies and see if that solves the problem. If you do this repeatedly and arrive at the root still having problems, it's a good indication that your engine is the problem! :lol:

Regards,

Zen
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: On Opening books in 2015

Post by Zenmastur »

Laskos wrote:
lucasart wrote:
Peter Berger wrote: Simplified claim: Opening books are pretty useless these days. E.g. Stockfish don't need no book at slow time controls.
I almost agree with you. However, to measure the book effect per se, one needs to depollute the measure from the time bias.
...
Generally, the larger the book, the crappier it is.
These are complete misconceptions from both of you. Are those just your opinions, or do you have at least some empirical data? Two weeks ago I played 30 games at 40'/game on pretty strong hardware; the result was +5 =25 -0 for Stockfish with a good, large book against Stockfish with no book. I am curious what miracles would happen at a 3-times-longer TC like 120'/game. 2 plies deeper, the engine realizes it constantly blunders many openings?

A typical win for book Stockfish:

[Event "40min 4cores"]
[Site "?"]
[Date "2015.03.07"]
[Round "7.1"]
[White "Stockfish 020315 64 BMI2 Book"]
[Black "Stockfish 020315 64 BMI2"]
[Result "1-0"]
[ECO "C11"]
[Annotator "0.00;0.23"]
[PlyCount "163"]
[EventDate "2015.03.07"]
[EventType "tourn"]
[Source "Kai"]
[TimeControl "2400"]

{Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz 3598 MHz W=36.7 plies; 8,989kN/s; PlaychessNightmare.ctg B=34.0 plies; 8,035kN/s}
1. e4 {B 0} e6 {0.23/28 64} 2. d4 {B 0} d5 {0.15/30 49} 3. Nc3 {B 0} Nf6 {0.12/29 40} 4. e5 {B 0} Nfd7 {0.09/31 42} 5. f4 {B 0} c5 {0.02/31 46} 6. Nf3 {B 0} Nc6 {0.00/30 36} 7. Be3 {B 0} Qb6 {0.00/32 26} 8. Na4 {B 0} Qa5+ {0.00/33 23} 9. Nc3 {B 0} Qb6 {0.00/36 31} 10. Na4 {B 0} Qa5+ {0.00/41 34} 11. c3 {B 0} cxd4 {0.00/32 38} 12. b4 {B 0} Nxb4 {0.00/33 23} 13. cxb4 {B 0} Bxb4+ {0.39/30 42} 14. Bd2 {B 0} Bxd2+ {0.32/31 43} 15. Nxd2 {B 0} b6 {0.29/34 41} 16. Bd3 {B 0} Ba6 {0.36/33 59} 17. Nb2 {B 0} Nc5 {0.53/33 41} 18. Bxa6 {B 0} Qxa6 {0.55/32 68} 19. Qe2 {B 0} Rc8 {0.57/32 34} 20. Qxa6 {B 0} Nxa6 {0.50/32 38} 21. Rb1 {B 0} O-O {0.58/33 90} 22. Nf3 {0.65/35 55} Nb4 {0.65/34 37}

...

1-0

At moves 6 to 12, the non-book Stockfish thinks it has gained ground as Black, giving a 0.00 eval. But at move 22, exiting the book, it admits it blundered the opening, giving 0.65 to the book Stockfish, which has just left the book. In 4 of the 5 wins by the book Stockfish, the games went along these lines. Then there are book-exit positions which Stockfish considers equal (close to 0.00) while they are not, judging by outcome statistics. The burden of proving your naive, misleading statements and LTC miracles (2 plies deeper) is entirely on you. I claim that maybe 5-10 years are needed before top engines play the openings reasonably without books. It's not about 2 more plies from the same Stockfish, your miraculous LTC; it's about 20+ more plies, the horizon, the lack of outcome predictability, and the unreliability of the PV in the openings.

Hmmm...

The voice of reason has a very pleasant ring! :)

Regards,

Zen
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
Dann Corbit
Posts: 12856
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: On Opening books in 2015

Post by Dann Corbit »

Zenmastur wrote:
lucasart wrote:
Peter Berger wrote: Simplified claim: Opening books are pretty useless these days. E.g. Stockfish don't need no book at slow time controls.
I almost agree with you. However, to measure the book effect per se, one needs to depollute the measure from the time bias.

For example, if you play 8 book moves, your time is base+8*inc when you start the game at move 8. Without a book, instead, you start from move 1, and after 8 moves your time on the clock is significantly reduced.

my claim: Most of the book value is simply the time bias.
I partially agree with the last statement, but only if we are considering antiquated book designs. This is because current books are antiquated: pretty static, used without any thought by the end user and without any search by the engine. When a human plays a tournament game, most good players at some point, while still in their opening book, begin to think about the book moves they are about to play. They don't blindly play the moves just because they are in their opening repertoire. The fact that engines blindly play book moves is a fault of the programmer, not the book.

If we are talking about more modern designs, then I have to disagree completely! I think you can add 30-60 Elo (or more) to any engine with proper book design. And I'm not talking about adding line "x" to the book to improve the engine's performance. I'm talking about a book that is designed to add value to the engine, as opposed to some afterthought bolted on because the engine can't play the openings well without it, but you were too lazy to do the proper groundwork or have any real understanding or insight into the problem.

Many positions in the opening can't be "solved" by a search with current software and hardware. This is true even if huge amounts of time are used. So the idea that an engine can search its way out of the opening is ridiculous.

[quote=""lucasart"] Generally, the larger the book, the crappier it is. A book is crappy when it contains crappy moves, even < 1%. It only takes one crappy move to lose a game... And with most book being built from GM databases, they are mostly crappy...
Just because a move isn't "best" isn't a good reason to exclude it from the book. Books made for humans contain all kinds of moves. They will purposely include moves that are blunders, usually of the non-obvious kind, but not always. The (usually non-obvious) line of play meant to deal with such a blunder saves the end user time and increases their game score. So why should it be any different for a computer book? Am I missing something here?

I'm not sure how it came to pass that computer opening books are played blindly, as if they were gospel, or why some people think that only "good" moves have value in a book and that any "bad" move should be removed. I guess this is what happens when a group of people blindly follows someone else's design decisions instead of thinking for themselves.

Bad moves have as much value as the good ones: properly used, they can save time and increase game scores. As for moves incorrectly marked "good" in the book when they are actually "bad", there is a trivial fix. Before a book move is played for the first time, assuming we have no other knowledge about it, the search-depth field is consulted; if the depth is zero, or less than what a "reasonable" search under the current time control would allow, the move has to be searched before it's played. After the search, the depth, the number of nodes searched, and the evaluation are stored in the book. If the move is subsequently played, the book is updated as it normally would be. Problem solved!

If I were a purist, I would tell you that before any move can be played from the book, even if it's been played before, its maximum search depth should be checked to see whether the current time control allows increasing it. Any time the search depth can be increased, it should be. If the move returned by the search leads to a position that isn't in the book, the new position is added, even if its parent node doesn't have the right to add leaf nodes. When this happens, it's time to decide whether the move should be played. If you're NOT in a tournament, then IMO it should be; this will cause the program to exit the known book early. The only program I know of that will ignore the book move and play a non-book move is Crafty. Once a move is played, it is treated like any other move in the book. No special attention is needed if the book routine is properly thought out and written.

[quote=""lucasart"]Some book are built properly, with every single position verified by search. But most of them are just a dump of PGN crap from GM games...

For me the only utility of a book is to ensure non determinism (not play the same games all the time). But even for that it's useless. A simple opening selection (EPD file) is enough.[/quote]

While searching book moves and storing some information about the search in the position record is a good idea, I don't think it's practical to perform deep searches on every position in the book. Some positions are more important than others because they are seen many times more often, and therefore a better line of play found there would affect many more games. The more often a position is seen, the more important it becomes and the more likely it is to warrant special attention by way of an extended search. For the vast majority of positions this is overkill.

I guess your referring to GM games as "crap" is an indication of how little thought you have put into designing a book. Most Othello programs have books that are a lot more sophisticated than those found in chess engines or GUIs.
mvk wrote:Agree, a search is necessary, and the main advantage is time. Secondary is to avoid bad moves just beyond your search horizon. But there is only so much you can search in advance. It should be more than you can search during the game (modulo the time advantage). Given the increase in computer power, this gives an upper bound on the useful size of a book. Too big and it will be shallow near the leaves, and you make the mistakes there. Or too big and the verifications are deep, but by the time your tree is ready the tournament machine is a lot faster thanks to semiconductor manufacturing improvements.


Let's look at it this way: if the position isn't in the book, you will have to do a search in any case. So there is NO disadvantage to having a large book, provided you verify that the book move you are about to play has been searched to at least the depth it would be searched to if it weren't a book move. In fact, a large book that can be extended ad hoc is a great advantage over smaller, more conventional "static" books, because search information can be stored for positions that weren't in the book when it was created. So I think that bigger, more extensible, and more flexible is better in every respect, assuming the size doesn't become burdensome and the access time stays relatively quick. With modern hardware I don't think either of these will be a problem.
jdart wrote:Selecting moves from published PGNs is problematic.
Only if your program blindly makes any move it finds in its book. A book routine is only as good as you make it. If you want to cut corners by blindly copying someone else's book design without so much as looking at the design trade-offs that person made, then you deserve to have a worthless book in your program.
jdart wrote:Even GMs make mistakes. And sometimes moves suddenly fall out of favor, sometimes for good reasons (a refutation is found).


With a properly designed book routine (i.e. something that ISN'T a copy of a 30 year old design) none of this would present even the slightest problem.
jdart wrote:I think to build a reasonable book from PGNs you need to weight win percentage very highly. If you have a large number of games and move A is scoring 55% and move B is scoring 45%, that is a significant difference, and you should strongly prefer move A. Maybe also you should tilt the book towards recently played moves. But all this breaks down when few samples are available. Then PGN statistics are unreliable, and if you lack other information such as previous search results you are relatively blind.
I'm not sure what to think about the last two statements. So what if you don't have 5,000 games, or 500, or some other reasonable number, from a position to rely on! You're complaining that GMs play poorly, and now you're complaining that you have positions with too few games. I have a question for you: do you use GM games to make your books, or computer games, or just analysis? Because I think you're just complaining to have something to do!
Dann Corbit wrote: I favor the following approach:
1. Collect statistics for each position you intend to store in your book.
I think everyone does this.
Dann Corbit wrote: 2. Analyze every position in the book for at least one hour.
Get real! You're not going to get very far if you try to pre-analyze every position for an hour. Unless, of course, you have VAST amounts of computer hardware, or such a small book as to make it nearly useless.
[/quote]
I have such hardware.
And a few thousand distinct positions is not useless.
Of course, after finding the best lines, it is good to harden the book by mini-maxing and then re-analyzing the PV.
Dann Corbit wrote:3. For each position, if the analyzed move and the statistical move agree then that position is completed {for now *}.
4. If the analyzed move and the statistical suggestion for a move disagree, then you need a depth threshold to overcome N games. For instance, you might decide that for 30 games played from a position, at least 37 plies are needed, and for 100 games played at least 40 plies are needed, to prefer the computer analysis. If a very large number of games has been played (e.g. 500 or more), then the position should be reanalyzed to a high depth with several different engines. This is especially the case if the win/loss/draw ratio is favorable for the statistical move and the centipawn evaluation (ce) is unfavorable for the computer-analyzed move. It almost always means that the computer has not analyzed deeply enough yet (though once in a while the computer finds a refutation of the most commonly played move).
Good luck with this type of approach. It seems unnecessarily complicated and relies on an unproven theory that would likely be disproved were anyone to spend time analyzing it. It's wishful thinking to believe there is a simple relationship between the number of games in a database and a move's reliability, compared to the reliability of a computer search to depth "n".
Rather than complicated it is utterly simple. In fact, it can easily be automated so that human intervention is rarely needed in the process.
It is also obvious why it works.
And if it turns out that some particular count of plies is not enough, just add a few plies.
Eventually, hardware will be strong enough to do 50 plies or better with this sort of approach.
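As a rough sketch of how that automation might look, here is the disagreement rule in Python, using the two example anchors from point 4 (30 games -> 37 plies, 100 games -> 40 plies). The linear interpolation between them and the 500-game cutoff behaviour are my own reading of the description, not Dann's exact scheme.

```python
def required_depth(games):
    """Minimum engine search depth (plies) needed to override the
    statistically preferred move in a position seen in `games` games.
    Anchors per the post: 30 games -> 37 plies, 100 -> 40 plies;
    linearly interpolated in between (an assumption of this sketch)."""
    if games <= 30:
        return 37
    if games >= 100:
        return 40
    return 37 + round(3 * (games - 30) / 70)

def trust_engine_move(games, analysed_depth):
    """True if the engine analysis is deep enough to override the
    statistics; heavily played positions (500+ games) are instead
    sent back for multi-engine re-analysis."""
    if games >= 500:
        return False
    return analysed_depth >= required_depth(games)
```

With a rule like this, the book builder only escalates to a human (or to deeper multi-engine analysis) when statistics and search genuinely conflict.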
Dann Corbit wrote:* If the move in agreement performs poorly in practice, it must be re-analyzed. For that reason, it is a good idea to store book play statistics for the position.
It seems to me that if other moves from the same position don't have this problem, i.e. they perform well, then wouldn't it be easier to select one of them and save yourself a bunch of work? If no other move from that position performs well then there's a good chance that the position is bad, and it's time to back up two plies and see if that solves the problem. If you do this repeatedly and arrive at the root and you're still having problems then it's a good indication that your engine is the problem! :lol:
How do we know if other moves perform well if we do not analyze them?
I also guess that the engine is never the problem. If you use Stockfish or Komodo or Houdini, one hour gives a pretty good analysis of a book move on a 3.4 GHz machine with a dozen cores. If you have a dozen such machines at your disposal, you can do 12 machines * availability hours/machine in a day. Suppose you get 14 hours per machine: that gives 14 * 12 = 168 positions per day, about 60,000 in a year. And the best thing is that you can be pretty ignorant about chess and it will still work out well.
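Spelling out that throughput arithmetic (all figures are Zen's: 12 machines, 14 usable hours each per day, one position per machine-hour):

```python
machines = 12
hours_per_machine_per_day = 14   # "suppose that you get 14 hours per machine"
positions_per_hour = 1           # one hour of analysis per position

positions_per_day = machines * hours_per_machine_per_day * positions_per_hour
positions_per_year = positions_per_day * 365

print(positions_per_day, positions_per_year)
```

That comes to 168 positions per day and 61,320 per year, which matches the "about 60,000 in a year" figure.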
Regards,

Zen
lucasart
Posts: 3243
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: On Opening books in 2015

Post by lucasart »

Laskos wrote:
lucasart wrote:
Peter Berger wrote: Simplified claim: Opening books are pretty useless these days. E.g. Stockfish don't need no book at slow time controls.
I almost agree with you. However, to measure the book effect per se, one needs to remove the time bias from the measurement.
...
Generally, the larger the book, the crappier it is.
These are complete misconceptions from both of you. Are those just your opinions? Do you have at least some empirical data? Two weeks ago I played 30 games at 40'/game on pretty strong hardware; the result was +5 =25 -0 for the Stockfish with a good, large book against Stockfish with no book. I am curious what miracles would happen at a 3 times longer TC like 120'/game? Two plies deeper, the engine realizes it constantly blunders many openings?

A typical win for book Stockfish:

[Event "40min 4cores"]
[Site "?"]
[Date "2015.03.07"]
[Round "7.1"]
[White "Stockfish 020315 64 BMI2 Book"]
[Black "Stockfish 020315 64 BMI2"]
[Result "1-0"]
[ECO "C11"]
[Annotator "0.00;0.23"]
[PlyCount "163"]
[EventDate "2015.03.07"]
[EventType "tourn"]
[Source "Kai"]
[TimeControl "2400"]

{Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz 3598 MHz W=36.7 plies; 8,989kN/s; PlaychessNightmare.ctg B=34.0 plies; 8,035kN/s}
1. e4 {B 0} e6 {0.23/28 64} 2. d4 {B 0} d5 {0.15/30 49} 3. Nc3 {B 0} Nf6 {0.12/29 40} 4. e5 {B 0} Nfd7 {0.09/31 42} 5. f4 {B 0} c5 {0.02/31 46} 6. Nf3 {B 0} Nc6 {0.00/30 36} 7. Be3 {B 0} Qb6 {0.00/32 26} 8. Na4 {B 0} Qa5+ {0.00/33 23} 9. Nc3 {B 0} Qb6 {0.00/36 31} 10. Na4 {B 0} Qa5+ {0.00/41 34} 11. c3 {B 0} cxd4 {0.00/32 38} 12. b4 {B 0} Nxb4 {0.00/33 23} 13. cxb4 {B 0} Bxb4+ {0.39/30 42} 14. Bd2 {B 0} Bxd2+ {0.32/31 43} 15. Nxd2 {B 0} b6 {0.29/34 41} 16. Bd3 {B 0} Ba6 {0.36/33 59} 17. Nb2 {B 0} Nc5 {0.53/33 41} 18. Bxa6 {B 0} Qxa6 {0.55/32 68} 19. Qe2 {B 0} Rc8 {0.57/32 34} 20. Qxa6 {B 0} Nxa6 {0.50/32 38} 21. Rb1 {B 0} O-O {0.58/33 90} 22. Nf3 {0.65/35 55} Nb4 {0.65/34 37}

...

1-0

At moves 6 to 12, the non-book Stockfish thinks it has gained ground as Black, giving a 0.00 eval. But at move 22, exiting the book, it admits it blundered the opening, giving 0.65 to the book Stockfish, which has just exited the book. In 4 of the 5 wins by the book Stockfish, the games went along these lines. Then there are book-exit positions which Stockfish considers equal (close to 0.00) while they are not, usually according to outcome statistics. The burden of proving your naive, misleading statements and LTC miracles (2 plies deeper) is entirely on you. I claim that maybe 5-10 years are needed before top engines play openings reasonably and don't need books. It's not about 2 more plies of the same Stockfish, your miraculous LTC. It's about 20+ more plies, the horizon, the lack of outcome predictability, and the unreliability of the PV in the openings.
There's a huge time bias in your test. You didn't bother to read what I said about time bias? Do it with a fixed time per move, and you will see a difference. Besides, you can't conclude much from such a small sample.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: On Opening books in 2015

Post by Laskos »

lucasart wrote: There's a huge time bias in your test. You didn't bother to read what I said about time bias? Do it with a fixed time per move, and you will see a difference. Besides, you can't conclude much from such a small sample.
That I did two weeks ago: the 40'/game match. Your slow-time-control claim is akin to claiming "engine X is the strongest at infinite time control". Sure, that cannot be falsified; the sample is always 0.

I guess there are so many standard C/C++ guys with such misconceptions because of the nature of their usual programming. Using C/C++ imperatively lures them into thinking that a book is a collection of sequences of best one-movers. Imperative programming, which is nearly the totality of chess programming, is basically a series of orders (imperatives) given to a computer. So the chess openings according to Stockfish must be the best Stockfish eval in a series of FENs, one after another, with a year of thinking on each position. Guess what: you will end up with a weak book after a bazillion years. From time to time it's better to change the paradigm, maybe relax into declarative languages, which describe a problem rather than define a solution.

Now, to address a bit your time bias concern:

The exit from the book was already lost for the non-book Stockfish. No time squeeze there. I checked the moves chosen at 40'/game by the non-book Stockfish for inaccuracies of at least 6 cp at 3'/move (LTC) within the opening book moves. Stockfish at LTC found only one inaccuracy:
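The filter Kai describes, re-evaluating each book-line move at long time control and flagging any move whose eval falls at least 6 cp short of the best alternative, can be sketched like this. The data shape (move, played-move eval, best-alternative eval, both in centipawns from the mover's point of view) and the numbers are my own illustration:

```python
def flag_inaccuracies(moves, threshold_cp=6):
    """moves: list of (san, played_eval_cp, best_eval_cp) tuples,
    both evals from the mover's point of view at the LTC re-analysis.
    Returns the moves whose played eval falls at least threshold_cp
    short of the best alternative."""
    return [san for san, played, best in moves
            if best - played >= threshold_cp]

# Hypothetical evals: the second move falls 20 cp short and is flagged.
flagged = flag_inaccuracies([("12...Nxb4", 10, 12), ("16...Ba6", 0, 20)])
```

In Kai's actual run this filter fired exactly once, on 16...Ba6, as the game below shows.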

[Event "40min 4cores"]
[Site "?"]
[Date "2015.03.07"]
[Round "7.1"]
[White "Stockfish 020315 64 BMI2 Book"]
[Black "Stockfish 020315 64 BMI2"]
[Result "1-0"]
[ECO "C11"]
[Annotator "0.00;0.23"]
[PlyCount "44"]
[EventDate "2015.03.07"]
[Source "Kai"]
[TimeControl "2400"]

{Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz 3598 MHz W=36.7 plies; 8,989kN/s;
PlaychessNightmare.ctg B=34.0 plies; 8,035kN/s} 1. e4 {B 0 last book move} e6 {0.23/28 64} 2. d4 {B 0} d5 {0.15/30 49} 3. Nc3 {B 0} Nf6 {0.12/29 40} 4. e5 {B 0} Nfd7 {0.09/31 42} 5. f4 {B 0} c5 {0.02/31 46} 6. Nf3 {B 0} Nc6 {0.00/30 36} 7. Be3 {B 0} Qb6 {0.00/32 26} 8. Na4 {B 0} Qa5+ {0.00/33 23} 9. Nc3 {B 0} Qb6 {0.00/36 31} 10. Na4 {B 0} Qa5+ {0.00/41 34} 11. c3 {B 0} cxd4 {0.00/32 38} 12. b4 {B 0} Nxb4 {0.00/33 23} 13. cxb4 {B 0} Bxb4+ {0.39/30 42} 14. Bd2 {B 0} Bxd2+ {0.32/31 43} 15. Nxd2 {B 0} b6 {0.29/34 41} 16. Bd3 {B 0}

16... Ba6 {0.36/33 59}

({0.59 Stockfish 020315 64 BMI2:} 16... Nc5 17. Nxc5 bxc5 18. O-O Bd7 19. Rb1 g6 20. Rb7 Rc8 21. Qe2 Bc6 22. Bb5 O-O 23. Bxc6 Rxc6 24. Nb3 Qa3 25. f5 exf5 26. Rd7 Re8 27. Rxd5 c4 28. Nxd4 Rc5 29. Rd7 Rexe5 30. Rd8+ Kg7 31. Qf2 Qe3 32. Qxe3 Rxe3 33. g4 fxg4 34. Rd7 h5 35. Rfxf7+ Kh6 36. Rd8 Ra5 37. Kf2 {0.44/36})

17. Nb2 {B 0} Nc5 {0.53/33 41} 18. Bxa6 {B 0} Qxa6 {0.55/32 68} 19. Qe2 {B 0} Rc8 {0.57/32 34} 20. Qxa6 {B 0} Nxa6 {0.50/32 38} 21. Rb1 {B 0} O-O {0.58/33 90} 22. Nf3 {0.65/35 55} Nb4 {0.65/34 37} 1-0

Then I set two non-book Stockfishes, at 2 hours/40 moves (LTC), to play out the game from the position just before the SOLE inaccuracy:

[Event "120 min 40 moves"]
[Site "?"]
[Date "2015.03.17"]
[Round "1"]
[White "Stockfish 020315 64 BMI2"]
[Black "Stockfish 020315 64 BMI2 Same"]
[Result "1-0"]
[Annotator "0.72;0.58"]
[SetUp "1"]
[FEN "r1b1k2r/p4ppp/1p2p3/q1npP3/N2p1P2/3B4/P2N2PP/R2QK2R w KQkq - 0 17"]
[PlyCount "71"]
[EventDate "2015.03.17"]
[EventType "tourn"]
[Source "Kai"]

{Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz 3598 MHz W=45.1 plies; 8,236kN/s B=45.5 plies; 8,318kN/s} 17. Nxc5 {0.72/34 121} bxc5 {0.58/32 121} 18. O-O {0.48/35 163} Bd7 {0.63/34 167} 19. Rb1 {0.77/34 156} O-O {0.87/35 328 (g6)} 20. Nb3 {1.27/38 476} Qa3 {0.97/37 190} 21. Qc2 {1.38/39 180 (Rf3)} c4 {1.43/39 130} 22. Bxh7+ {1.32/39 125} Kh8 {1.43/1 0} 23. Rf3 {1.19/40 130} cxb3 {1.45/39 116} 24. Rfxb3 {1.46/39 115} d3 {1.28/39 123} 25. Rxd3 {1.33/41 179} Qa4 {1.57/42 147} 26. Rh3 {1.45/40 223} Qxc2 {1.38/41 123} 27. Bxc2+ {1.69/40 150} Kg8 {1.38/1 0} 28. Bh7+ {1.81/40 214} Kh8 {1.38/1 0} 29. Rb7 {1.84/39 164} Rfd8 {1.69/41 246} 30. f5 {1.88/42 208} Rac8 {1.80/43 654} 31. f6 {1.96/43 179} Rb8 {1.82/42 122} 32. Rxa7 {2.62/43 428} Ra8 {2.19/41 142} 33. Rc7 {2.72/43 162} Rac8 {2.57/40 170} 34. Bd3+ {2.87/41 499 (Rxc8)} Kg8 {2.57/1 0} 35. Rxc8 {2.74/42 150} Rxc8 {2.68/41 200} 36. Rh7 {3.00/42 251} gxf6 {2.71/39 176} 37. exf6 {2.97/42 262} Rf8 {2.82/40 201} 38. Rg7+ {3.01/40 221} Kh8 {2.82/1 0} 39. Kf2 {3.02/40 95} e5 {3.03/37 479} 40. Rg5 {3.19/42 162} e4 {3.25/40 132} 41. Rxd5 {3.31/42 107} exd3 {3.35/44 134} 42. Rxd7 {3.40/41 125} Kh7 {3.46/43 138} 43. a4 {3.86/44 105 (Rxd3)} Ra8 {3.63/49 146} 44. Rxf7+ {4.55/56 112} Kg6 {4.03/50 150} 45. Rd7 {4.56/63 637} d2 {4.29/53 150 (Rxa4)} 46. f7 {5.39/55 84 (Ke3)} Kg7 {5.96/53 165 (d1R)} 47. Ke2 {8.50/54 397 (Rxd2)} Rxa4 {6.14/59 219} 48. h3 {10.07/61 79 (g3)} Ra2 {6.14/59 170 (Re4+)} 49. Rxd2 {10.07/64 74} Ra1 {16.70/69 322 (Ra3)} 50. Kf3 {10.63/66 74 (Rd3)} Rh1 {18.05/70 159 (Ra6)} 51. Kg4 {10.70/67 49 (Rd8)} Ra1 {42.01/74 258} 52. h4 {13.06/52 67 (f8Q+)} 1-0

No time bias here, and the book position BEFORE Stockfish's 40'/game inaccuracy is a book loss. A position with a Stockfish eval not far from 0.29 at LTC, following a line which Stockfish considers accurate at LTC, leads to an almost sure book loss. Again, books are not sequences of good one-movers at LTC.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: On Opening books in 2015

Post by Milos »

Laskos wrote:Two weeks ago I played 30 games at 40'/game on pretty strong hardware, the result was +5 =25 -0 for the Stockfish with a good, large book against Stockfish no book. I am curious what miracles would happen at 3 times longer TC like 120'/game? 2 plies deeper the engine realizes it constantly blunders many openings?

A typical win for book Stockfish:

[Event "40min 4cores"]
[Site "?"]
[Date "2015.03.07"]
[Round "7.1"]
[White "Stockfish 020315 64 BMI2 Book"]
[Black "Stockfish 020315 64 BMI2"]
[Result "1-0"]
[ECO "C11"]
[Annotator "0.00;0.23"]
[PlyCount "163"]
[EventDate "2015.03.07"]
[EventType "tourn"]
[Source "Kai"]
[TimeControl "2400"]

{Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz 3598 MHz W=36.7 plies; 8,989kN/s; PlaychessNightmare.ctg B=34.0 plies; 8,035kN/s}
1. e4 {B 0} e6 {0.23/28 64} 2. d4 {B 0} d5 {0.15/30 49} 3. Nc3 {B 0} Nf6 {0.12/29 40} 4. e5 {B 0} Nfd7 {0.09/31 42} 5. f4 {B 0} c5 {0.02/31 46} 6. Nf3 {B 0} Nc6 {0.00/30 36} 7. Be3 {B 0} Qb6 {0.00/32 26} 8. Na4 {B 0} Qa5+ {0.00/33 23} 9. Nc3 {B 0} Qb6 {0.00/36 31} 10. Na4 {B 0} Qa5+ {0.00/41 34} 11. c3 {B 0} cxd4 {0.00/32 38} 12. b4 {B 0} Nxb4 {0.00/33 23} 13. cxb4 {B 0} Bxb4+ {0.39/30 42} 14. Bd2 {B 0} Bxd2+ {0.32/31 43} 15. Nxd2 {B 0} b6 {0.29/34 41} 16. Bd3 {B 0} Ba6 {0.36/33 59} 17. Nb2 {B 0} Nc5 {0.53/33 41} 18. Bxa6 {B 0} Qxa6 {0.55/32 68} 19. Qe2 {B 0} Rc8 {0.57/32 34} 20. Qxa6 {B 0} Nxa6 {0.50/32 38} 21. Rb1 {B 0} O-O {0.58/33 90} 22. Nf3 {0.65/35 55} Nb4 {0.65/34 37}

...

1-0
+5/=25/-0 is a 58 Elo difference with 50 Elo error bars.
The engine with the book got out of book after move 21. At that point the engine without the book had already spent 38% of its time, which is equivalent to the engine with the book having 61% more time, worth at least 40 Elo (in a typical case, as you said that was a typical win ;)).
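Both figures can be reproduced with standard formulas (a quick sanity computation, not part of the original post; it uses the logistic Elo model):

```python
from math import log10

# Match score: +5 =25 -0 over 30 games
score = (5 + 25 * 0.5) / 30             # 0.5833...
elo = 400 * log10(score / (1 - score))  # ~58 Elo

# Time argument: if the bookless engine has spent 38% of its budget
# when the book engine leaves book, the book engine has roughly
# 1/0.62 - 1, i.e. ~61% more time remaining.
extra_time = 1 / (1 - 0.38) - 1

# Laskos's later counter-figure of 30% spent gives ~43% by the same formula.
extra_time_alt = 1 / (1 - 0.30) - 1
```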

So as usual your "theories and experiments" are useless. :lol:

P.S. Don't pretend now that you gave the engine without the book more time, because you didn't (it's clear from the time spent per move ;)).
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: On Opening books in 2015

Post by Laskos »

Milos wrote:
Laskos wrote:Two weeks ago I played 30 games at 40'/game on pretty strong hardware, the result was +5 =25 -0 for the Stockfish with a good, large book against Stockfish no book. I am curious what miracles would happen at 3 times longer TC like 120'/game? 2 plies deeper the engine realizes it constantly blunders many openings?

A typical win for book Stockfish:

[Event "40min 4cores"]
[Site "?"]
[Date "2015.03.07"]
[Round "7.1"]
[White "Stockfish 020315 64 BMI2 Book"]
[Black "Stockfish 020315 64 BMI2"]
[Result "1-0"]
[ECO "C11"]
[Annotator "0.00;0.23"]
[PlyCount "163"]
[EventDate "2015.03.07"]
[EventType "tourn"]
[Source "Kai"]
[TimeControl "2400"]

{Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz 3598 MHz W=36.7 plies; 8,989kN/s; PlaychessNightmare.ctg B=34.0 plies; 8,035kN/s}
1. e4 {B 0} e6 {0.23/28 64} 2. d4 {B 0} d5 {0.15/30 49} 3. Nc3 {B 0} Nf6 {0.12/29 40} 4. e5 {B 0} Nfd7 {0.09/31 42} 5. f4 {B 0} c5 {0.02/31 46} 6. Nf3 {B 0} Nc6 {0.00/30 36} 7. Be3 {B 0} Qb6 {0.00/32 26} 8. Na4 {B 0} Qa5+ {0.00/33 23} 9. Nc3 {B 0} Qb6 {0.00/36 31} 10. Na4 {B 0} Qa5+ {0.00/41 34} 11. c3 {B 0} cxd4 {0.00/32 38} 12. b4 {B 0} Nxb4 {0.00/33 23} 13. cxb4 {B 0} Bxb4+ {0.39/30 42} 14. Bd2 {B 0} Bxd2+ {0.32/31 43} 15. Nxd2 {B 0} b6 {0.29/34 41} 16. Bd3 {B 0} Ba6 {0.36/33 59} 17. Nb2 {B 0} Nc5 {0.53/33 41} 18. Bxa6 {B 0} Qxa6 {0.55/32 68} 19. Qe2 {B 0} Rc8 {0.57/32 34} 20. Qxa6 {B 0} Nxa6 {0.50/32 38} 21. Rb1 {B 0} O-O {0.58/33 90} 22. Nf3 {0.65/35 55} Nb4 {0.65/34 37}

...

1-0
+5/=25/-0 is a 58 Elo difference with 50 Elo error bars.
The engine with the book got out of book after move 21. At that point the engine without the book had already spent 38% of its time, which is equivalent to the engine with the book having 61% more time, worth at least 40 Elo (in a typical case, as you said that was a typical win ;)).

So as usual your "theories and experiments" are useless. :lol:

P.S. Don't pretend now that you gave the engine without the book more time, because you didn't (it's clear from the time spent per move ;)).
I gave 40'/game, so no, I didn't give more time to one engine, and the numbers you gave are pretty close anyway: maybe 30% and 43% instead of 38% and 61%. In any case, as I mentioned, 4 of the 5 SF (without book) losses were book losses (as was shown in one case at LTC with equal time and no books), and on average the book engine exited the book with a 20-30 cp advantage, adjusted for color, over the non-book engine. All your blabbering misses the point, as always happens with your half-educated comments. Do some homework before commenting.