The influence of books on test results.

lkaufman · Post by **lkaufman** » Tue Jul 24, 2012 5:31 pm

Of all the factors that can influence test results, such as time limit, increment vs. repeating controls, ponder, hardware, etc., the one we are currently most interested in is the effect of opening books/testsuites. Our own distributed tester uses a five move book, rather shorter than that used by most testers. Since it shows a sixteen elo lead for Komodo 5 over Houdini 1.5 (after over 11k games) which is not shown by the testing agencies, and since the only result on this forum showing Komodo 5 beating Houdini 2 in a long match used a four move book, we decided to make a new testbook that is more typical of books normally used in tests - it averages six moves, but some popular lines are much longer than this. Based on hyper-fast testing, our performance drops by 12 Elo playing against Critter (the closest opponent at hyperspeed levels) after 6700 games. So assuming this would also be true at the normal blitz levels used in the distributed test, this would appear to account for most of the discrepancy between our own test results and the others.
Has anyone else run long tests to compare the effect of different opening books on test results? The tests would have to be several thousand games long, but can be at very fast levels.
Probably we will modify our tester to use this or a similar new book, so that future results will be better predicted by it. My conclusion is that Komodo is better than other top programs at playing the early opening, but the longer the book line supplied, the less valuable this asset becomes. Perhaps switching to a more normal book for testing will gradually help Komodo as different features are tuned using this new book.
I never considered the opening book to be much of a factor in test results (assuming colors are switched for each book position tested), but I am gradually becoming a believer.
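
As a rough aid for judging differences such as the 12 and 16 Elo figures above, here is a minimal Python sketch that converts a win/draw/loss record into an Elo difference with an approximate 95% interval. It assumes the standard logistic Elo formula and a normal approximation for the mean score. The function names and the sample record are made up for illustration and are not results from any tester mentioned in this thread.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert a mean score (strictly between 0 and 1) to an Elo difference."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high) for a match record.

    The interval uses a normal approximation for the mean score,
    with z = 1.96 corresponding to roughly 95% coverage.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    std_error = math.sqrt(variance / n)
    return (
        elo_from_score(score),
        elo_from_score(score - z * std_error),
        elo_from_score(score + z * std_error),
    )


if __name__ == "__main__":
    # Hypothetical 6700-game record, chosen only to illustrate the calculation.
    elo, low, high = elo_with_margin(wins=2100, draws=2680, losses=1920)
    print(f"Elo difference: {elo:+.1f} (approx. 95% interval {low:+.1f} to {high:+.1f})")
```

For a record of this size the interval still spans several Elo on each side, which is why comparisons between books need thousands of games per book.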

velmarin · Post by **velmarin** » Tue Jul 24, 2012 6:41 pm

Which system do you use: a ChessBase book, or a book for the Shredder GUI?

How many moves do you want: 4, 5, 6, or 8?

How many different openings do you need?

It can be prepared if you give me the data.

I am preparing an 8-move book in the Fritz GUI. Every line ends with a Black move, the idea being that White is always the first side to start thinking.

gleperlier · Post by **gleperlier** » Tue Jul 24, 2012 7:15 pm

lkaufman wrote:Of all the factors that can influence test results, such as time limit, increment vs. repeating controls, ponder, hardware, etc., the one we are currently most interested in is the effect of opening books/testsuites. Our own distributed tester uses a five move book, rather shorter than that used by most testers. Since it shows a sixteen elo lead for Komodo 5 over Houdini 1.5 (after over 11k games) which is not shown by the testing agencies, and since the only result on this forum showing Komodo 5 beating Houdini 2 in a long match used a four move book, we decided to make a new testbook that is more typical of books normally used in tests - it averages six moves, but some popular lines are much longer than this. Based on hyper-fast testing, our performance drops by 12 Elo playing against Critter (the closest opponent at hyperspeed levels) after 6700 games. So assuming this would also be true at the normal blitz levels used in the distributed test, this would appear to account for most of the discrepancy between our own test results and the others.
Has anyone else run long tests to compare the effect of different opening books on test results? The tests would have to be several thousand games long, but can be at very fast levels.
Probably we will modify our tester to use this or a similar new book, so that future results will be better predicted by it. My conclusion is that Komodo is better than other top programs at playing the early opening, but the longer the book line supplied, the less valuable this asset becomes. Perhaps switching to a more normal book for testing will gradually help Komodo as different features are tuned using this new book.
I never considered the opening book to be much of a factor in test results (assuming colors are switched for each book position tested), but I am gradually becoming a believer.

Yes, I think books have a big influence on ratings, not only through the number of plies but also through the variety of openings.
You can also add tablebases to the list of factors. If I run 1000 games of Komodo versus Houdini without TBs, then with 3-, 4- and 5-piece TBs, you will see some difference. If I run a tournament with some engines without TBs, some with Robbobases, some with Nalimov, etc., it will also change a lot.

I would even say that, for me, future "official" chess engine tournaments should use books limited in ply and limited tablebases.

Cheers,

Gab

lkaufman · Post by **lkaufman** » Tue Jul 24, 2012 8:08 pm

velmarin wrote: Which system do you use: a ChessBase book, or a book for the Shredder GUI?

How many moves do you want: 4, 5, 6, or 8?

How many different openings do you need?

It can be prepared if you give me the data.

I am preparing an 8-move book in the Fritz GUI. Every line ends with a Black move, the idea being that White is always the first side to start thinking.

We have always made our own test books, because we need at least 10,000 positions (to run 20,000 games). I don't know any publicly available test books like this, please tell me if there are any. I think CCRL and CEGT seem to use books averaging about 8 moves per side. All positions should be ones that have occurred a reasonable number of times in master play, so we can be pretty sure that White has just a fairly normal advantage. Which publicly available books come closest to meeting this description?
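
The arithmetic above (10,000 positions for 20,000 games) follows from playing each book position twice with colors reversed. A minimal sketch of such a schedule, using hypothetical names and placeholder position labels rather than any real book format or our actual tester:

```python
from typing import Iterable, Iterator, NamedTuple


class Game(NamedTuple):
    opening: str  # placeholder label for a book position
    white: str
    black: str


def paired_schedule(openings: Iterable[str], engine_a: str, engine_b: str) -> Iterator[Game]:
    """Yield two games per opening, with the engines swapping colors."""
    for opening in openings:
        yield Game(opening, white=engine_a, black=engine_b)
        yield Game(opening, white=engine_b, black=engine_a)


if __name__ == "__main__":
    # Hypothetical book of 10,000 labeled positions.
    book = [f"position-{i}" for i in range(10_000)]
    games = list(paired_schedule(book, "EngineA", "EngineB"))
    print(len(games), "games scheduled")
```

Pairing the games this way means any advantage a position gives to one color is handed to both engines once.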

lkaufman · Post by **lkaufman** » Tue Jul 24, 2012 8:11 pm

gleperlier wrote:
Yes, I think books have a big influence on ratings, not only through the number of plies but also through the variety of openings.
You can also add tablebases to the list of factors. If I run 1000 games of Komodo versus Houdini without TBs, then with 3-, 4- and 5-piece TBs, you will see some difference. If I run a tournament with some engines without TBs, some with Robbobases, some with Nalimov, etc., it will also change a lot.

I would even say that, for me, future "official" chess engine tournaments should use books limited in ply and limited tablebases.

Cheers,

Gab

I thought it was not at all clear that TBs of any sort help ratings, but please tell me what your findings were. Did you find that Houdini did progressively better with more TBs, or were the results pretty much random? Same question for other engines too.

gleperlier · Post by **gleperlier** » Tue Jul 24, 2012 8:19 pm

lkaufman wrote:
gleperlier wrote:
Yes, I think books have a big influence on ratings, not only through the number of plies but also through the variety of openings.
You can also add tablebases to the list of factors. If I run 1000 games of Komodo versus Houdini without TBs, then with 3-, 4- and 5-piece TBs, you will see some difference. If I run a tournament with some engines without TBs, some with Robbobases, some with Nalimov, etc., it will also change a lot.

I would even say that, for me, future "official" chess engine tournaments should use books limited in ply and limited tablebases.

Cheers,

Gab

I thought it was not at all clear that TBs of any sort help ratings, but please tell me what your findings were. Did you find that Houdini did progressively better with more TBs, or were the results pretty much random? Same question for other engines too.

I have always found this quite obvious in my games, but I have to run some tests to confirm it. For example, my engines could have lost 100 Elo points on Playchess without TBs.

Will keep you updated with my tests.

Cheers,

Gab

Sedat Canbaz · Post by **Sedat Canbaz** » Tue Jul 24, 2012 8:24 pm

lkaufman wrote:
velmarin wrote: Which system do you use: a ChessBase book, or a book for the Shredder GUI?

How many moves do you want: 4, 5, 6, or 8?

How many different openings do you need?

It can be prepared if you give me the data.

I am preparing an 8-move book in the Fritz GUI. Every line ends with a Black move, the idea being that White is always the first side to start thinking.

We have always made our own test books, because we need at least 10,000 positions (to run 20,000 games). I don't know any publicly available test books like this, please tell me if there are any. I think CCRL and CEGT seem to use books averaging about 8 moves per side. All positions should be ones that have occurred a reasonable number of times in master play, so we can be pretty sure that White has just a fairly normal advantage. Which publicly available books come closest to meeting this description?

Dear Larry,

Just my two cents on this issue:

I would not recommend using a large opening book in which the engines play from 10,000 positions,

because I am afraid that the engines' Elo performance will suffer from such a variety of openings (there are many holes in such a huge database).

Best,
Sedat

velmarin · Post by **velmarin** » Tue Jul 24, 2012 8:34 pm

Larry, here I have a book idea for Fritz:

http://www.talkchess.com/forum/viewtopi ... 473#475473

Uri Blass · Post by **Uri Blass** » Tue Jul 24, 2012 11:22 pm

gleperlier wrote:
lkaufman wrote:
gleperlier wrote:
Yes, I think books have a big influence on ratings, not only through the number of plies but also through the variety of openings.
You can also add tablebases to the list of factors. If I run 1000 games of Komodo versus Houdini without TBs, then with 3-, 4- and 5-piece TBs, you will see some difference. If I run a tournament with some engines without TBs, some with Robbobases, some with Nalimov, etc., it will also change a lot.

I would even say that, for me, future "official" chess engine tournaments should use books limited in ply and limited tablebases.

Cheers,

Gab

I thought it was not at all clear that TBs of any sort help ratings, but please tell me what your findings were. Did you find that Houdini did progressively better with more TBs, or were the results pretty much random? Same question for other engines too.

I have always found this quite obvious in my games, but I have to run some tests to confirm it. For example, my engines could have lost 100 Elo points on Playchess without TBs.

Will keep you updated with my tests.

Cheers,

Gab

100 Elo points?

You cannot be serious.
Based on what I have read, I always thought that tablebases have almost no influence on playing strength (Stockfish does not support tablebases precisely because the authors found no way to gain Elo from using them).

Ratings on Playchess may be unstable, and it is probably possible to lose or gain 100 Elo with no change in the program.

gleperlier · Post by **gleperlier** » Tue Jul 24, 2012 11:28 pm

Uri Blass wrote: 100 Elo points?

You cannot be serious.
Based on what I have read, I always thought that tablebases have almost no influence on playing strength (Stockfish does not support tablebases precisely because the authors found no way to gain Elo from using them).

Ratings on Playchess may be unstable, and it is probably possible to lose or gain 100 Elo with no change in the program.

Maybe, maybe not. That's only my feeling. Ask all those guys running engines on 6 to 12 cores, who have only one thing in mind ('Elo'), to remove their tablebases... you will see that none of them will. I guess there is a reason and they have tested it.

But again, I don't want to argue, it's only a feeling.

Gab
