The influence of books on test results.
Moderator: Ras
-
- Posts: 6258
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Of all the factors that can influence test results, such as time limit, increment vs. repeating controls, ponder, hardware, etc., the one we are currently most interested in is the effect of opening books/testsuites. Our own distributed tester uses a five-move book, rather shorter than that used by most testers. Since it shows a 16 Elo lead for Komodo 5 over Houdini 1.5 (after over 11k games) which is not shown by the testing agencies, and since the only result on this forum showing Komodo 5 beating Houdini 2 in a long match used a four-move book, we decided to make a new testbook that is more typical of books normally used in tests - it averages six moves, but some popular lines are much longer than this. Based on hyper-fast testing with the new book, our performance drops by 12 Elo playing against Critter (the closest opponent at hyperspeed levels) after 6700 games. So assuming this would also be true at the normal blitz levels used in the distributed test, this would appear to account for most of the discrepancy between our own test results and the others.
Has anyone else run long tests to compare the effect of different opening books on test results? The tests would have to be several thousand games long, but can be at very fast levels.
Probably we will modify our tester to use this or a similar new book, so that future results will be better predicted by it. My conclusion is that Komodo is better than other top programs at playing the early opening, but the longer the book line supplied, the less valuable this asset becomes. Perhaps switching to a more normal book for testing will gradually help Komodo as different features are tuned using this new book.
I never considered the opening book to be much of a factor in test results (assuming colors are switched for each book position tested), but I am gradually becoming a believer.
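As a rough aid for judging figures like the 12 Elo drop after 6700 games mentioned in the opening post, here is a minimal sketch of how a match score converts to an Elo difference and how wide the statistical margin is for a given number of games. It assumes the standard logistic Elo model and independent games; the function names and the 30% draw ratio are illustrative assumptions, not values taken from the tester discussed in this thread.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert a score fraction (0 < score < 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin_95(n_games: int, draw_ratio: float, score: float = 0.5) -> float:
    """Approximate 95% error margin (in Elo) for a match of n_games.

    draw_ratio is an ASSUMED fraction of drawn games; games are treated as independent.
    """
    win = score - draw_ratio / 2.0       # win fraction implied by score and draws
    loss = 1.0 - win - draw_ratio        # remaining fraction is losses
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (win * (1.0 - score) ** 2
                + draw_ratio * (0.5 - score) ** 2
                + loss * (0.0 - score) ** 2)
    std_err = math.sqrt(variance / n_games)
    low = elo_from_score(score - 1.96 * std_err)
    high = elo_from_score(score + 1.96 * std_err)
    return (high - low) / 2.0


# Illustrative only: margins for the game counts mentioned in the post,
# under the assumed 30% draw ratio.
for games in (6700, 11000):
    print(games, "games: about +/-", round(elo_margin_95(games, 0.30), 1), "Elo")
```

Under these assumptions the margin for 6700 games comes out at roughly 7 Elo for a single match. Comparing two separate runs (old book versus new book) widens the uncertainty further, since both measurements carry their own error.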
-
- Posts: 1600
- Joined: Mon Feb 21, 2011 9:48 am
Re: The influence of books on test results.
Which system did you use: a ChessBase book, or a book for the Shredder GUI?
How many moves do you want? 4, 5, 6, 8?
How many different openings do you need?
It can be prepared if you give me the data.
I am preparing one of 8 moves in the Fritz GUI that always ends with a Black move,
the idea being that the White side always starts thinking first.
-
- Posts: 1033
- Joined: Sat Feb 04, 2012 10:03 pm
Re: The influence of books on test results.
lkaufman wrote:
Of all the factors that can influence test results, such as time limit, increment vs. repeating controls, ponder, hardware, etc., the one we are currently most interested in is the effect of opening books/testsuites. Our own distributed tester uses a five-move book, rather shorter than that used by most testers. Since it shows a 16 Elo lead for Komodo 5 over Houdini 1.5 (after over 11k games) which is not shown by the testing agencies, and since the only result on this forum showing Komodo 5 beating Houdini 2 in a long match used a four-move book, we decided to make a new testbook that is more typical of books normally used in tests - it averages six moves, but some popular lines are much longer than this. Based on hyper-fast testing with the new book, our performance drops by 12 Elo playing against Critter (the closest opponent at hyperspeed levels) after 6700 games. So assuming this would also be true at the normal blitz levels used in the distributed test, this would appear to account for most of the discrepancy between our own test results and the others.
Has anyone else run long tests to compare the effect of different opening books on test results? The tests would have to be several thousand games long, but can be at very fast levels.
Probably we will modify our tester to use this or a similar new book, so that future results will be better predicted by it. My conclusion is that Komodo is better than other top programs at playing the early opening, but the longer the book line supplied, the less valuable this asset becomes. Perhaps switching to a more normal book for testing will gradually help Komodo as different features are tuned using this new book.
I never considered the opening book to be much of a factor in test results (assuming colors are switched for each book position tested), but I am gradually becoming a believer.
Yes, I think books have a big influence on ratings, not only through the number of plies but also through the variety of openings.
You can also add "tablebases" to the factors. I run 1000 games Komodo versus Houdini without TB, with TB 3, with TB 4 and with TB 5, you will see some difference. If I run a tournament with some engines without, some with Robbobases, some with Nalimov etc., it will also change a lot.
I would even say that, for me, future "official" chess engine tournaments should use books limited in plies and limited tablebases.
Cheers,
Gab
-
- Posts: 6258
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: The influence of books on test results.
velmarin wrote:
Which system did you use: a ChessBase book, or a book for the Shredder GUI?
How many moves do you want? 4, 5, 6, 8?
How many different openings do you need?
It can be prepared if you give me the data.
I am preparing one of 8 moves in the Fritz GUI that always ends with a Black move,
the idea being that the White side always starts thinking first.
We have always made our own test books, because we need at least 10,000 positions (to run 20,000 games). I don't know of any publicly available test books like this; please tell me if there are any. I think CCRL and CEGT seem to use books averaging about 8 moves per side. All positions should be ones that have occurred a reasonable number of times in master play, so we can be pretty sure that White has just a fairly normal advantage. Which publicly available books come closest to meeting this description?
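The arithmetic behind "10,000 positions (to run 20,000 games)" follows from playing every book position twice with colors switched, as mentioned earlier in the thread. A minimal sketch of such a schedule is below; the engine names and position identifiers are placeholders, and this is not the actual code of any tester discussed here.

```python
from typing import Iterator, List, Tuple


def color_reversed_schedule(
    positions: List[str], engine_a: str, engine_b: str
) -> Iterator[Tuple[str, str, str]]:
    """Yield (position, white, black) so that each position is played once per color."""
    for pos in positions:
        yield pos, engine_a, engine_b   # engine_a has White
        yield pos, engine_b, engine_a   # same position, colors switched


# Placeholder identifiers standing in for real book positions.
positions = [f"position_{i}" for i in range(10_000)]
games = list(color_reversed_schedule(positions, "EngineA", "EngineB"))
print(len(games), "games from", len(positions), "positions")
```

Pairing the games this way means any advantage built into a book position is handed to each engine once.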
-
- Posts: 6258
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: The influence of books on test results.
gleperlier wrote:
Yes, I think books have a big influence on ratings, not only through the number of plies but also through the variety of openings.
You can also add "tablebases" to the factors. I run 1000 games Komodo versus Houdini without TB, with TB 3, with TB 4 and with TB 5, you will see some difference. If I run a tournament with some engines without, some with Robbobases, some with Nalimov etc., it will also change a lot.
I would even say that, for me, future "official" chess engine tournaments should use books limited in plies and limited tablebases.
Cheers,
Gab
I thought it was not at all clear that TBs of any sort help ratings, but please tell me what your findings were. Did you find that Houdini did progressively better with more TBs, or were the results pretty much random? Same question for other engines too.
-
- Posts: 1033
- Joined: Sat Feb 04, 2012 10:03 pm
Re: The influence of books on test results.
lkaufman wrote:
gleperlier wrote:
Yes, I think books have a big influence on ratings, not only through the number of plies but also through the variety of openings.
You can also add "tablebases" to the factors. I run 1000 games Komodo versus Houdini without TB, with TB 3, with TB 4 and with TB 5, you will see some difference. If I run a tournament with some engines without, some with Robbobases, some with Nalimov etc., it will also change a lot.
I would even say that, for me, future "official" chess engine tournaments should use books limited in plies and limited tablebases.
Cheers,
Gab
I thought it was not at all clear that TBs of any sort help ratings, but please tell me what your findings were. Did you find that Houdini did progressively better with more TBs, or were the results pretty much random? Same question for other engines too.
I always found this quite obvious in my games, but I have to run some tests to confirm. For example, my engines could have lost 100 Elo points on Playchess without TBs.
Will keep you updated with my tests.
Cheers,
Gab
-
- Posts: 3018
- Joined: Thu Mar 09, 2006 11:58 am
- Location: Antalya/Turkey
Re: The influence of books on test results.
lkaufman wrote:
velmarin wrote:
Which system did you use: a ChessBase book, or a book for the Shredder GUI?
How many moves do you want? 4, 5, 6, 8?
How many different openings do you need?
It can be prepared if you give me the data.
I am preparing one of 8 moves in the Fritz GUI that always ends with a Black move,
the idea being that the White side always starts thinking first.
We have always made our own test books, because we need at least 10,000 positions (to run 20,000 games). I don't know of any publicly available test books like this; please tell me if there are any. I think CCRL and CEGT seem to use books averaging about 8 moves per side. All positions should be ones that have occurred a reasonable number of times in master play, so we can be pretty sure that White has just a fairly normal advantage. Which publicly available books come closest to meeting this description?
Dear Larry,
Just my two cents on this issue:
I would not suggest using a large opening book in which the engines play from 10,000 positions,
because I am afraid that the engines' Elo performance will suffer from such a variety of openings (there are many holes in such a huge database).
Best,
Sedat
-
- Posts: 10892
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: The influence of books on test results.
gleperlier wrote:
lkaufman wrote:
gleperlier wrote:
Yes, I think books have a big influence on ratings, not only through the number of plies but also through the variety of openings.
You can also add "tablebases" to the factors. I run 1000 games Komodo versus Houdini without TB, with TB 3, with TB 4 and with TB 5, you will see some difference. If I run a tournament with some engines without, some with Robbobases, some with Nalimov etc., it will also change a lot.
I would even say that, for me, future "official" chess engine tournaments should use books limited in plies and limited tablebases.
Cheers,
Gab
I thought it was not at all clear that TBs of any sort help ratings, but please tell me what your findings were. Did you find that Houdini did progressively better with more TBs, or were the results pretty much random? Same question for other engines too.
I always found this quite obvious in my games, but I have to run some tests to confirm. For example, my engines could have lost 100 Elo points on Playchess without TBs.
Will keep you updated with my tests.
Cheers,
Gab
100 Elo points?
You cannot be serious.
I always thought that tablebases have almost no influence on playing strength, based on what I have read (Stockfish does not support tablebases precisely because the authors found no way to gain Elo from using them).
Ratings on Playchess may be unstable, and it is probably possible to lose or gain 100 Elo with no change in the program.
-
- Posts: 1033
- Joined: Sat Feb 04, 2012 10:03 pm
Re: The influence of books on test results.
Uri Blass wrote:
100 Elo points?
You cannot be serious.
I always thought that tablebases have almost no influence on playing strength, based on what I have read (Stockfish does not support tablebases precisely because the authors found no way to gain Elo from using them).
Ratings on Playchess may be unstable, and it is probably possible to lose or gain 100 Elo with no change in the program.
Maybe yes, maybe not. That's only my feeling. Ask all these guys with 6- to 12-core engines, who have only one thing in mind, 'Elo', to remove their tablebases... you will see that none will. I guess there is a reason and they have tested it.
But again, I don't want to argue, it's only a feeling.
Gab