The influence of books on test results.

lkaufman
Posts: 6258
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

The influence of books on test results.

Post by lkaufman »

Of all the factors that can influence test results, such as time limit, increment vs. repeating controls, ponder, hardware, etc., the one we are currently most interested in is the effect of opening books/test suites. Our own distributed tester uses a five-move book, rather shorter than that used by most testers. Since it shows a 16 Elo lead for Komodo 5 over Houdini 1.5 (after over 11k games), which is not shown by the testing agencies, and since the only result on this forum showing Komodo 5 beating Houdini 2 in a long match used a four-move book, we decided to make a new test book that is more typical of the books normally used in tests: it averages six moves, but some popular lines are much longer than this. Based on hyper-fast testing, our performance with the new book drops by 12 Elo playing against Critter (the closest opponent at hyperspeed levels) after 6700 games. So assuming this would also be true at the normal blitz levels used in the distributed test, this would appear to account for most of the discrepancy between our own test results and the others.
Has anyone else run long tests to compare the effect of different opening books on test results? The tests would have to be several thousand games long, but can be at very fast levels.
Probably we will modify our tester to use this or a similar new book, so that future results will be better predicted by it. My conclusion is that Komodo is better than other top programs at playing the early opening, but the longer the book line supplied, the less valuable this asset becomes. Perhaps switching to a more normal book for testing will gradually help Komodo as different features are tuned using this new book.
I never considered the opening book to be much of a factor in test results (assuming colors are switched for each book position tested), but I am gradually becoming a believer.
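As a rough check on how much weight figures such as 16 Elo after 11k games or 12 Elo after 6700 games can carry, here is a minimal Python sketch of the usual conversion from match score to Elo and of an approximate 95% error margin. The function names are illustrative, the 40% draw ratio is an assumption rather than a figure from these tests, and games are treated as independent, which is only an approximation when each book position is played twice with colors reversed.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin_95(games, draw_ratio, score=0.5):
    """Approximate 95% error margin, in Elo, for a match of `games` games.

    draw_ratio is the fraction of drawn games (an assumed input here).
    Games are treated as independent, which ignores the pairing of
    reversed-color games from the same book position.
    """
    # Per-game variance of the result (win=1, draw=0.5, loss=0).
    variance = score * (1.0 - score) - draw_ratio / 4.0
    margin = 1.96 * math.sqrt(variance / games)
    # Convert the score interval to Elo and take half its width.
    return (elo_from_score(score + margin) - elo_from_score(score - margin)) / 2.0

if __name__ == "__main__":
    # Hypothetical draw ratio of 40%; the game counts are the ones quoted above.
    for games in (6700, 11000):
        print(games, "games: +/-", round(elo_margin_95(games, 0.4), 1), "Elo")
```

Under these assumptions the margin for a single run of this length comes out at roughly 5 to 7 Elo. A difference between two such runs (old book versus new book) carries a somewhat larger margin than either run alone.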
velmarin
Posts: 1600
Joined: Mon Feb 21, 2011 9:48 am

Re: The influence of books on test results.

Post by velmarin »

Which system did you use: a ChessBase book, a Shredder GUI book?
Which one?

How many moves do you want? 4, 5, 6, 8?

How many different openings do you need?

It can be prepared if you give me the details.

I am preparing one of 8 moves in the Fritz GUI; it always ends with a Black move,
the idea being that the White side always starts thinking first.
gleperlier
Posts: 1033
Joined: Sat Feb 04, 2012 10:03 pm

Re: The influence of books on test results.

Post by gleperlier »

lkaufman wrote: Of all the factors that can influence test results, such as time limit, increment vs. repeating controls, ponder, hardware, etc., the one we are currently most interested in is the effect of opening books/test suites. Our own distributed tester uses a five-move book, rather shorter than that used by most testers. Since it shows a 16 Elo lead for Komodo 5 over Houdini 1.5 (after over 11k games), which is not shown by the testing agencies, and since the only result on this forum showing Komodo 5 beating Houdini 2 in a long match used a four-move book, we decided to make a new test book that is more typical of the books normally used in tests: it averages six moves, but some popular lines are much longer than this. Based on hyper-fast testing, our performance with the new book drops by 12 Elo playing against Critter (the closest opponent at hyperspeed levels) after 6700 games. So assuming this would also be true at the normal blitz levels used in the distributed test, this would appear to account for most of the discrepancy between our own test results and the others.
Has anyone else run long tests to compare the effect of different opening books on test results? The tests would have to be several thousand games long, but can be at very fast levels.
Probably we will modify our tester to use this or a similar new book, so that future results will be better predicted by it. My conclusion is that Komodo is better than other top programs at playing the early opening, but the longer the book line supplied, the less valuable this asset becomes. Perhaps switching to a more normal book for testing will gradually help Komodo as different features are tuned using this new book.
I never considered the opening book to be much of a factor in test results (assuming colors are switched for each book position tested), but I am gradually becoming a believer.
Yes, I think the book has a big influence on ratings, not only through the number of plies but also through the variety of openings.
You can also add tablebases to the list of factors. If I run 1000 games of Komodo versus Houdini without TBs, with TB 3, with TB 4 and with TB 5, you will see some difference. If I run a tournament with some engines without TBs, some with Robbobases, some with Nalimov, etc., that will also change a lot.

I would even say that, for me, future "official" chess engine tournaments should use books limited in ply and limited tablebases.

Cheers,

Gab
lkaufman
Posts: 6258
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: The influence of books on test results.

Post by lkaufman »

velmarin wrote: Which system did you use: a ChessBase book, a Shredder GUI book?
Which one?

How many moves do you want? 4, 5, 6, 8?

How many different openings do you need?

It can be prepared if you give me the details.

I am preparing one of 8 moves in the Fritz GUI; it always ends with a Black move,
the idea being that the White side always starts thinking first.
We have always made our own test books, because we need at least 10,000 positions (to run 20,000 games). I don't know any publicly available test books like this, please tell me if there are any. I think CCRL and CEGT seem to use books averaging about 8 moves per side. All positions should be ones that have occurred a reasonable number of times in master play, so we can be pretty sure that White has just a fairly normal advantage. Which publicly available books come closest to meeting this description?
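The relation between 10,000 positions and 20,000 games follows from playing every book position twice with colors switched. A minimal Python sketch of such a schedule, with hypothetical names and placeholder position labels rather than anything taken from the actual tester:

```python
def paired_schedule(positions, engine_a, engine_b):
    """Return (position, white, black) tuples, two games per book position."""
    games = []
    for position in positions:
        # Each engine gets the White side of the position exactly once.
        games.append((position, engine_a, engine_b))
        games.append((position, engine_b, engine_a))
    return games

if __name__ == "__main__":
    # Placeholder labels stand in for real opening positions.
    book = ["position %d" % i for i in range(10000)]
    schedule = paired_schedule(book, "Komodo", "Critter")
    print(len(schedule))  # 20000
```

Pairing the games this way means any built-in advantage of a position is given to both engines once.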
lkaufman
Posts: 6258
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: The influence of books on test results.

Post by lkaufman »

gleperlier wrote: Yes, I think the book has a big influence on ratings, not only through the number of plies but also through the variety of openings.
You can also add tablebases to the list of factors. If I run 1000 games of Komodo versus Houdini without TBs, with TB 3, with TB 4 and with TB 5, you will see some difference. If I run a tournament with some engines without TBs, some with Robbobases, some with Nalimov, etc., that will also change a lot.

I would even say that, for me, future "official" chess engine tournaments should use books limited in ply and limited tablebases.

Cheers,

Gab
I thought it was not at all clear that TBs of any sort help ratings, but please tell me what your findings were. Did you find that Houdini did progressively better with more TBs, or were the results pretty much random? Same question for other engines too.
gleperlier
Posts: 1033
Joined: Sat Feb 04, 2012 10:03 pm

Re: The influence of books on test results.

Post by gleperlier »

lkaufman wrote:
gleperlier wrote: Yes, I think the book has a big influence on ratings, not only through the number of plies but also through the variety of openings.
You can also add tablebases to the list of factors. If I run 1000 games of Komodo versus Houdini without TBs, with TB 3, with TB 4 and with TB 5, you will see some difference. If I run a tournament with some engines without TBs, some with Robbobases, some with Nalimov, etc., that will also change a lot.

I would even say that, for me, future "official" chess engine tournaments should use books limited in ply and limited tablebases.

Cheers,

Gab

I thought it was not at all clear that TBs of any sort help ratings, but please tell me what your findings were. Did you find that Houdini did progressively better with more TBs, or were the results pretty much random? Same question for other engines too.
I have always found this quite obvious in my games, but I have to run some tests to confirm it. For example, my engines could have lost 100 Elo points on Playchess without TBs.

Will keep you updated with my tests.

Cheers,

Gab
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: The influence of books on test results.

Post by Sedat Canbaz »

lkaufman wrote:
velmarin wrote: Which system did you use: a ChessBase book, a Shredder GUI book?
Which one?

How many moves do you want? 4, 5, 6, 8?

How many different openings do you need?

It can be prepared if you give me the details.

I am preparing one of 8 moves in the Fritz GUI; it always ends with a Black move,
the idea being that the White side always starts thinking first.
We have always made our own test books, because we need at least 10,000 positions (to run 20,000 games). I don't know any publicly available test books like this, please tell me if there are any. I think CCRL and CEGT seem to use books averaging about 8 moves per side. All positions should be ones that have occurred a reasonable number of times in master play, so we can be pretty sure that White has just a fairly normal advantage. Which publicly available books come closest to meeting this description?
Dear Larry,

Just my two cents on this issue:

I don't recommend using a large opening book in which the engines play from 10,000 positions,

because I am afraid that the engines' Elo performance will suffer from such a variety of openings (there are many holes in such a huge database).

Best,
Sedat
velmarin
Posts: 1600
Joined: Mon Feb 21, 2011 9:48 am

Re: The influence of books on test results.

Post by velmarin »

Larry, I have a book idea for Fritz here:

http://www.talkchess.com/forum/viewtopi ... 473#475473
Uri Blass
Posts: 10892
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: The influence of books on test results.

Post by Uri Blass »

gleperlier wrote:
lkaufman wrote:
gleperlier wrote: Yes, I think the book has a big influence on ratings, not only through the number of plies but also through the variety of openings.
You can also add tablebases to the list of factors. If I run 1000 games of Komodo versus Houdini without TBs, with TB 3, with TB 4 and with TB 5, you will see some difference. If I run a tournament with some engines without TBs, some with Robbobases, some with Nalimov, etc., that will also change a lot.

I would even say that, for me, future "official" chess engine tournaments should use books limited in ply and limited tablebases.

Cheers,

Gab

I thought it was not at all clear that TBs of any sort help ratings, but please tell me what your findings were. Did you find that Houdini did progressively better with more TBs, or were the results pretty much random? Same question for other engines too.
I have always found this quite obvious in my games, but I have to run some tests to confirm it. For example, my engines could have lost 100 Elo points on Playchess without TBs.

Will keep you updated with my tests.

Cheers,

Gab
100 Elo points?

You cannot be serious.
I always thought that tablebases have almost no influence on playing strength, based on what I have read (Stockfish does not support tablebases precisely because the authors found no way to gain Elo from using them).

Ratings on Playchess may be unstable, and it is probably possible to lose or gain 100 Elo with no change in the program.
gleperlier
Posts: 1033
Joined: Sat Feb 04, 2012 10:03 pm

Re: The influence of books on test results.

Post by gleperlier »

Uri Blass wrote: 100 Elo points?

You cannot be serious.
I always thought that tablebases have almost no influence on playing strength, based on what I have read (Stockfish does not support tablebases precisely because the authors found no way to gain Elo from using them).

Ratings on Playchess may be unstable, and it is probably possible to lose or gain 100 Elo with no change in the program.
Maybe yes, maybe not. That's only my feeling. Ask all these guys running engines on 6 to 12 cores, who have only one thing in mind, Elo, to remove their tablebases... you will see that none of them will. I guess there is a reason and they have tested it.

But again, I don't want to argue, it's only a feeling.

Gab