What happens using egbb

Ferdy
Posts: 4840
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: What happens using egbb

Post by Ferdy »

Laskos wrote:
Ferdy wrote:
Yes, it seems a Houdini problem. Is the ELO benefit from egbb measurable in Deuterium?
I have started testing the latest released Deuterium with egbb at a TC of 40 moves/60 sec, using different start positions, each played with colors reversed.

(1) Positions with mixed pieces from the late middlegame, but with more than 5 men. Target: 1600 games.
Bayeselo:

Code: Select all

Rank Name                             Elo     Diff     +     -      Games  Score    Oppo.   Draws     Win          W-L-D 
   1 Deuterium-113-egbb              7.85     0.00   7.06   7.06     1091  52.25%   -7.85  57.93%  23.28%       254-205-632
   2 Deuterium-v13.1.31.113-64bit   -7.85   -15.69   7.06   7.06     1091  47.75%    7.85  57.93%  18.79%       205-254-632
SPRT:

Code: Select all

Engine: Deuterium-113-egbb
SPRT: elo0 = -1.5, elo1 = +4.5, a = +0.05, b = +0.05
LLR = +1.25278 (-2.94444, +2.94444)
T = +1091, W = +254, L = +205, D = +632, WNet = +49
Next up are the following, with positions generated from Ed's protools:
(2) Rook and pawn endings, but with more than 5 men. Target: 200 games.
(3) Pawn endings, but with more than 5 men. Target: 200 games.
(4) Queen and pawn endings, but with more than 5 men. Target: 200 games.
(5) Bishop, knight and pawn endings, but with more than 5 men. Target: 200 games.
Thanks, so it seems that there is a real Elo benefit from egbb in late middlegames. Your error margins are 1 SD, right? 15 Elo points is a lot; even if it shrinks to 5 Elo points with more games, it is highly significant. It would be even better to wait until the SPRT stops, to have clear uncertainties.
Those are the 95% confidence margins that bayeselo reports by default.
Update:

Code: Select all

SPRT: elo0 = -1.5, elo1 = +4.5, a = +0.05, b = +0.05
LLR = +1.42909 (-2.94444, +2.94444)
T = +1448, W = +305, L = +250, D = +893, WNet = +55
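For anyone who wants to cross-check the margins discussed above, here is a minimal sketch that turns a W-L-D record into an Elo difference with a normal-approximation margin. It is not bayeselo's actual model (bayeselo also fits a draw parameter), and the function name and the z = 1.96 default are assumptions made for illustration.

```python
import math

def elo_diff_with_margin(wins, losses, draws, z=1.96):
    """Elo difference and z-sigma margin from a W-L-D record (normal approximation)."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0 points) around the mean score.
    var = (wins * (1 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    se = math.sqrt(var / n)
    # Logistic Elo curve: score = 1 / (1 + 10^(-elo/400)).
    elo = -400 * math.log10(1 / score - 1)
    # Propagate the score error through the curve (delta method).
    slope = 400 / (math.log(10) * score * (1 - score))
    return elo, z * slope * se

elo, margin = elo_diff_with_margin(254, 205, 632)
print(f"{elo:.1f} +/- {margin:.1f}")
```

Fed the 254-205-632 record, this should land close to the 15.69 Diff in the bayeselo table, with a 95% margin of roughly 13 Elo on the difference between the two engines.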
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: What happens using egbb

Post by Daniel Shawul »

I never tested how much Elo I get, so I wouldn't know. Endgames rarely happen in engine games, but for analysis and endgame testing I know they work well. Here I am just glad someone produced data showing it is absurd to say they hurt performance; hopefully that stops the FUD. But I don't dwell on it, because even now, after Houdini has been shown to be at fault (thanks to you), the questions go on. Well, at least we have made progress since the beginning of this thread: there is no mistaking who is at fault. It is unbelievable how much badmouthing egbbs received from the Rybka forum link due to Houdini's screwed-up implementation.
Michel
Posts: 2273
Joined: Mon Sep 29, 2008 1:50 am

Re: What happens using egbb

Post by Michel »

I am not partnering with anyone. Everybody is free to add support for my tablebase format.
I am kind of curious. Does Houdini reimplement your GPL probing code? Or did you offer the probing code to Houdini under a different license?
Ferdy
Posts: 4840
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: What happens using egbb

Post by Ferdy »

I stopped the test, as the result is probably already reliable for these specific test positions. I had set the rook and pawn ending to play 1000 games instead of the planned 200.
(1) Mixed pieces
Bayeselo:

Code: Select all

Rank Name                             Elo     Diff     +     -      Games  Score    Oppo.   Draws     Win          W-L-D 
   1 Deuterium-113-egbb              7.18     0.00   5.37   5.36     1600  52.06%   -7.18  62.62%  20.75%       332-266-1002
   2 Deuterium-v13.1.31.113-64bit   -7.18   -14.36   5.36   5.37     1600  47.94%    7.18  62.62%  16.62%       266-332-1002
(2) Rook and pawn ending
Bayeselo:

Code: Select all

Rank Name                             Elo     Diff     +     -      Games  Score    Oppo.   Draws     Win          W-L-D 
   1 Deuterium-113-egbb             12.48     0.00   5.58   5.79      562  53.74%  -12.48  81.85%  12.81%        72-30-460
   2 Deuterium-v13.1.31.113-64bit  -12.48   -24.96   5.79   5.58      562  46.26%   12.48  81.85%   5.34%        30-72-460
Overall bayeselo:

Code: Select all

Rank Name                             Elo     Diff     +     -      Games  Score    Oppo.   Draws     Win          W-L-D 
   1 Deuterium-113-egbb              8.60     0.00   4.37   4.56     2162  52.50%   -8.60  67.62%  18.69%       404-296-1462
   2 Deuterium-v13.1.31.113-64bit   -8.60   -17.20   4.56   4.37     2162  47.50%    8.60  67.62%  13.69%       296-404-1462
Ordo:

Code: Select all

   # PLAYER                          : RATING  ERROR   POINTS  PLAYED    (%)
   1 Deuterium-113-egbb              :    8.8    5.3   1135.0    2162   52.5%
   2 Deuterium-v13.1.31.113-64bit    :   -8.8    5.3   1027.0    2162   47.5%
SPRT:

Code: Select all

Engine: Deuterium-113-egbb
SPRT: elo0 = -1.5, elo1 = +4.5, a = +0.05, b = +0.05
LLR = +2.98664 (-2.94444, +2.94444)
T = +2162, W = +404, L = +296, D = +1462, WNet = +108
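The (-2.94444, +2.94444) interval in the SPRT output matches Wald's usual stopping bounds if a and b are read as the type I and type II error rates. That reading is an assumption, as are the function names below. A minimal sketch of the stopping rule, applied to the LLR values reported in this thread:

```python
import math

def sprt_bounds(alpha, beta):
    """Wald's approximate SPRT stopping bounds on the log-likelihood ratio."""
    lower = math.log(beta / (1 - alpha))
    upper = math.log((1 - beta) / alpha)
    return lower, upper

def sprt_decision(llr, alpha=0.05, beta=0.05):
    """Return the test status for a given LLR: accept H1, accept H0, or continue."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr >= upper:
        return "accept H1"
    if llr <= lower:
        return "accept H0"
    return "continue"

# LLR values reported after 1091, 1448 and 2162 games.
for llr in (1.25278, 1.42909, 2.98664):
    print(llr, sprt_decision(llr))
```

Under this reading, the first two LLR values are still inside the bounds, while +2.98664 crosses the upper bound, so the test ends in favor of elo1 = +4.5.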
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: What happens using egbb

Post by Daniel Shawul »

Thanks a lot for the tests! The most important one is indeed the KRPkr bitbase. Many other bitbase sets used to include the 4-men plus the compulsory KRPkr from the 5-men for this reason. I have always maintained that if the engine reaches endgames often enough, bitbases will help. Often when people say they don't help, I interpret it as the engine they used having a style that wins or loses in the middlegame. Anyway, it is unfair to say bitbases don't help, and I don't mean only egbbs. Most of the time the reason is that they are not probed (used) well enough.

One can try to code some rules for 4 men, but for 5 men it is a waste of time. In fact, what I realized when I tried to use prediction by heuristics such as capture search/neural nets à la Knightdreamer's way is that bitbases like KRPkr store mostly the exceptions! So I lost hope in that after the abysmal result I got with prediction. Also, why worry about implementing a ton of rules for prediction or in one's engine when you can just probe a bitbase and forget about it? It is a smart choice IMO, but convincing others is a tough job.

Cheers
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: What happens using egbb

Post by Laskos »

Wow, impressive results using Scorpio egbb.
I tested Shredder 12 with the "all345_fast" Shredderbases at 15''+0.15'' TC, and found that Shredder doesn't use them optimally, a bit like Houdini, but not as badly.
The starting positions are 3-, 4- and 5-men wins for White, and I used Shredder EGBB against Shredder Nalimov TB, which solves all of them perfectly. The result is

Code: Select all

    Program                            Score     %      Elo   

  1 Shredder Nalimov               : 336.5/600  56.1     21
  2 Shredder EGBB                  : 263.5/600  43.9    -21
In about 20% of cases, Shredder with Shredderbases fails to convert a won position. That makes me all the more impressed by your results with Scorpio egbb. The implementation of egbb probing seems to be very important.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: What happens using egbb

Post by Daniel Shawul »

Kai, Shredder needs the Nalimov TBs to make progress, so you should test Shredder_Nalimov+egbb vs Shredder_Nalimov. For Scorpio egbbs, as I mentioned, having a separate DLL means I can provide the engine author with a modified score instead of a bare WDL value. That score takes into consideration material, distance from root, pawn closeness to promotion, distance of pieces to the opponent's king, special rules for the difficult KBNk, etc. All those heuristics have to be combined in a certain weighted manner so that you make progress most of the time. The most difficult part is combining them with distance from root. Usually you have MATE-1, MATE-2 to indicate mate in 1 or 2, but with Scorpio egbbs that 1, 2 is replaced with 40 and 80 and then combined with the previous heuristic scores to make progress. You don't need bulky DTM/DTZ/DTC tables for Scorpio egbbs, because this heuristic helps to make progress. When they were first released, we worked for months to polish this, getting feedback on failed positions. After that I rarely had reports that Scorpio failed to win a won position. Others, like Shredder, would rather use DTM/DTZ tables to make progress, but Scorpio egbb users handle it with WDL alone by making the engine do some work. At least for 5 men it has worked well. There are questions for 6 men, but even that has been used by Diep, so I will continue using this approach.
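To make the scoring idea above concrete, here is a minimal sketch. It is not Scorpio's actual probing code: WIN_SCORE, the function name and the single `heuristic` argument (standing in for the weighted material, pawn-advance and king-distance terms) are hypothetical. Only the 40-per-ply step is taken from the description.

```python
WIN_SCORE = 5000   # hypothetical base score for a bitbase win, well below MATE
PLY_PENALTY = 40   # step per ply from root: 40, 80, ... instead of 1, 2, ...

def progress_score(wdl, ply_from_root, heuristic):
    """Turn a WDL probe result into a search score that rewards progress.

    wdl: +1 win, 0 draw, -1 loss, from the side to move's point of view.
    heuristic: hypothetical weighted sum of progress terms, larger when
    the winning side is closer to converting.
    """
    if wdl == 0:
        return 0
    magnitude = WIN_SCORE - PLY_PENALTY * ply_from_root + heuristic
    # The winning side prefers short lines with progress; the losing side
    # sees the mirrored score, so it prefers to delay.
    return magnitude if wdl > 0 else -magnitude

print(progress_score(1, 1, 0), progress_score(1, 2, 0))
```

The point of the larger ply step is that a won position found closer to the root, or one with more progress, outranks another won position, so the search has a gradient to follow even though the bitbase itself only says win, draw or loss.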

Daniel
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: What happens using egbb

Post by Laskos »

Daniel Shawul wrote:
Kai, Shredder needs the Nalimov TBs to make progress, so you should test Shredder_Nalimov+egbb vs Shredder_Nalimov.
Ah, sorry, so it's harder to measure, as I cannot use 3-, 4- and 5-men won starting positions to check the egbb implementation: in both cases they will give perfect results. I would probably need thousands of games from late-middlegame positions to check the benefit, but I am too lazy to do that.

Daniel Shawul wrote:
For Scorpio egbbs, as I mentioned, having a separate DLL means I can provide the engine author with a modified score instead of a bare WDL value. That score takes into consideration material, distance from root, pawn closeness to promotion, distance of pieces to the opponent's king, special rules for the difficult KBNk, etc. All those heuristics have to be combined in a certain weighted manner so that you make progress most of the time. The most difficult part is combining them with distance from root. Usually you have MATE-1, MATE-2 to indicate mate in 1 or 2, but with Scorpio egbbs that 1, 2 is replaced with 40 and 80 and then combined with the previous heuristic scores to make progress. You don't need bulky DTM/DTZ/DTC tables for Scorpio egbbs, because this heuristic helps to make progress. When they were first released, we worked for months to polish this, getting feedback on failed positions. After that I rarely had reports that Scorpio failed to win a won position. Others, like Shredder, would rather use DTM/DTZ tables to make progress, but Scorpio egbb users handle it with WDL alone by making the engine do some work. At least for 5 men it has worked well. There are questions for 6 men, but even that has been used by Diep, so I will continue using this approach.

Daniel
I pretty much got this from your previous replies. I was surprised that the implementation of Scorpio egbb probing in Houdini is screwed up.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: What happens using egbb

Post by Laskos »

I managed to get a surprisingly fast conclusive result for Shredder 12 using endgame bases: Shredder 12 + Nalimov + egbb against Shredder 12 without any bases. Standard 8-move opening positions, TC 15''+0.15'', with LOS 99.9% as the stopping rule.

Code: Select all

Games Completed = 3315 of 30000 (Avg game length = 43.826 sec)
Settings = Gauntlet/16MB/15000ms+150ms/M 700000cp for 1000 moves, D 150000 moves/PGN:C:\LittleBlitzer\swcr.pgn(5120)
Time = 39180 sec elapsed, 315392 sec remaining
 1.  Shredder EGBB            	1723.5/3315	881-749-1685  	(L: m=749 t=0 i=0 a=0)	(D: r=1344 i=283 f=50 s=8 a=0)	(tpm=312.2 d=10.25 nps=765881)
 2.  Shredder                 	1591.5/3315	749-881-1685  	(L: m=881 t=0 i=0 a=0)	(D: r=1344 i=283 f=50 s=8 a=0)	(tpm=328.0 d=11.79 nps=892042)
The NPS is about 14% lower using bases (765881 vs. 892042 in the output above), but the net benefit is evident:

Code: Select all

    Program                            Score       %    Elo    +   -    Draws

  1 Shredder EGBB                  : 1723.5/3315  52.0    7    8   8   50.8 %
  2 Shredder                       : 1591.5/3315  48.0   -7    8   8   50.8 %
 
That is a benefit of 14 +/- 8 (2 SD) Elo points for Shredder using Nalimov + egbb, with LOS 99.95%. This is the first time I have gotten a conclusive result using endgame bases, and the benefit is pretty substantial. Ferdy got a benefit with Scorpio egbb too, and all this amounts to disproving the skeptics about the Elo benefit of endgame bases (especially egbb). Maybe I will manage a match of Shredder + Nalimov + egbb vs. Shredder + Nalimov.
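The LOS figure can be reproduced from the win and loss counts alone, since draws do not affect it under the usual normal approximation. A minimal sketch (the function name is an assumption, and this is the common textbook formula rather than necessarily what any particular tool computes):

```python
import math

def los(wins, losses):
    """Likelihood of superiority from decisive games only (normal approximation)."""
    return 0.5 * (1 + math.erf((wins - losses) / math.sqrt(2 * (wins + losses))))

# Shredder with bases vs. Shredder without: 881 wins, 749 losses.
print(f"{100 * los(881, 749):.2f}%")
```

For the 881-749 record this comes out just under the 99.95% quoted, which is why the 99.9% stopping rule was triggered.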
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: What happens using egbb

Post by Daniel Shawul »

Laskos wrote:
That is a benefit of 14 +/- 8 (2 SD) Elo points for Shredder using Nalimov + egbb, with LOS 99.95%. This is the first time I have gotten a conclusive result using endgame bases, and the benefit is pretty substantial. Ferdy got a benefit with Scorpio egbb too, and all this amounts to disproving the skeptics about the Elo benefit of endgame bases (especially egbb). Maybe I will manage a match of Shredder + Nalimov + egbb vs. Shredder + Nalimov.
Exactly. At least bitbases should help some, since they are loaded in RAM. IMO it is the knowledge that matters when you are playing KRPkr and the like, because you have to search deep to see the promotion of pawns. Even KPk was a must some time ago. I think the fact that we hear so repeatedly here that EGTBs don't help much, which implies the same for EGBBs, made even me a skeptic, as you can tell from my initial reaction to Ferdy's result, sorry :( I guess we should all just do the test and see the result instead of speculating or, in some cases, blowing hot air.