Battle of the Goths 2012 (live broadcast)


Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Battle of the Goths 2012 (live broadcast)

Post by Evert »

Yikes!
I'm running a gauntlet with earlier versions of Sjaak and it seems there was a regression between 399 and 437 (for Gothic anyway); at least, 399 measures at +130 Elo against Fairy-Max, compared to 437 (which still measures as +30 in my test, and slightly below 467).

I'll try to track it further, but in the meantime, for a more interesting match, it might be better to replace Sjaak with the earlier 399 (EDIT: although the commit message for revision 400 says "fix a crash that occurs in Win32"...).
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Battle of the Goths 2012 (live broadcast)

Post by mar »

Btw. have you tried Gothic Vortex against Bihasa or Joker?
George Tsavdaris
Posts: 1627
Joined: Thu Mar 09, 2006 12:35 pm

Re: Battle of the Goths 2012 (live broadcast)

Post by George Tsavdaris »

mar wrote:Btw. have you tried Gothic Vortex against Bihasa or Joker?
Yes against Bihasa.

First 2 games:
Gothic Vortex 2.2.5 - Bihasa 2.0 2-0
Next 16 games:
Gothic Vortex 2.2.5 - Bihasa 2.0 0-16 !!!!
Next one draw.
Then 2 wins for Bihasa and I stopped it....

Against Joker I have:
Gothic Vortex 2.2.5 - Joker80 13-7

I have to play games manually with Vortex so it's difficult to have more games....
After his son's birth they've asked him:
"Is it a boy or girl?"
YES! He replied.....
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Battle of the Goths 2012 (live broadcast)

Post by mar »

George Tsavdaris wrote:
mar wrote:Btw. have you tried Gothic Vortex against Bihasa or Joker?
Yes against Bihasa.

First 2 games:
Gothic Vortex 2.2.5 - Bihasa 2.0 2-0
Next 16 games:
Gothic Vortex 2.2.5 - Bihasa 2.0 0-16 !!!!
Next one draw.
Then 2 wins for Bihasa and I stopped it....

Against Joker I have:
Gothic Vortex 2.2.5 - Joker80 13-7

I have to play games manually with Vortex so it's difficult to have more games....
Impressive! Looks like Bihasa is the strongest 10x8 program in the world :) Thanks George!
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Battle of the Goths 2012 (live broadcast)

Post by hgm »

Evert wrote:Yikes!
I'm running a gauntlet with earlier versions of Sjaak and it seems there was a regression between 399 and 437 (for gothic anyway);
Well, it seems clear that there must be something wrong with this version. It even lost to ArcBishop now...

I am a bit queasy w.r.t. replacing Sjaak by a version that is prone to crashing, without any testing. I am using WinBoard's internal tournament manager now, and I am not completely sure if a crashing engine will somehow spoil the settings for later games (I did make an engine crash non-fatal, but it switches WinBoard back to -ncp mode, which is the logical thing to do outside a tourney). It would be good to test that, of course, but this does not seem the right occasion for doing so!

I will try to run some quick tests with 399 on the other core; I hope this will be conclusive before the next game Sjaak has to play.

The WB internal tourney manager allows me to play a dirty trick, btw, which I am using now. By already specifying a result for the games that were played in the qualifier, I can instruct WB to play a normal round-robin. It will then automatically skip those games. And I already copied them to the PGN for the second division. All it required was editing the tourney file after the tourney started, replacing the result string by

-results "* **  **      **"
which should skip the game pairs 1b, 2a and 3b (which in the tourney schedule were the ones already played). The first * is the game WB was already playing, but now it will simply think the other games are being played by some other WB instance, and skip them!
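
To make the layout of such a result string easier to read, here is a minimal Python sketch. It assumes, as the description above suggests, one character per scheduled game, with `*` marking a game already claimed by some WinBoard instance and a space marking a game still to be played. The helper names `claimed_games` and `free_games` are made up for this illustration and are not part of WinBoard, and the exact spacing of the string may not have survived copying.

```python
def claimed_games(results: str) -> list[int]:
    """Return 1-based numbers of games marked '*' (claimed or in progress)."""
    return [i + 1 for i, ch in enumerate(results) if ch == "*"]


def free_games(results: str) -> list[int]:
    """Return 1-based numbers of games still marked as unplayed (a space)."""
    return [i + 1 for i, ch in enumerate(results) if ch == " "]


if __name__ == "__main__":
    # The string from the post, built piecewise so the spacing is explicit
    # (assumption: 16 games, with runs of 1, 2 and 6 spaces as quoted).
    results = "* **" + " " * 2 + "**" + " " * 6 + "**"
    print("claimed:", claimed_games(results))
    print("free:   ", free_games(results))
```

Under these assumptions, every game listed as claimed would be skipped by the instance running the tourney, and only the free ones would be scheduled.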
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Battle of the Goths 2012 (live broadcast)

Post by Evert »

hgm wrote: I am a bit queasy w.r.t. replacing Sjaak by a version that is prone to crashing, without any testing. I am using WinBoard's internal tournament manager now, and I am not completely sure if a crashing engine will somehow spoil the settings for later games (I did make an engine crash non-fatal, but it switches WinBoard back to -ncp mode, which is the logical thing to do outside a tourney). It would be good to test that, of course, but this does not seem the right occasion for doing so!
Absolutely not.
I can easily deploy an intermediate version if there's a problem with 399, but the problem is I probably will not get round to that before tonight.

I will be revising my testing method though, because I would have liked to spot a problem like this (much) earlier... I normally test using normal chess games, and then do verification matches using other variants (typically Spartan, Gothic and sometimes Makruk; XiangQi I usually test separately). Somehow this slipped through.
George Tsavdaris
Posts: 1627
Joined: Thu Mar 09, 2006 12:35 pm

Re: Battle of the Goths 2012 (live broadcast)

Post by George Tsavdaris »

hgm wrote:After the Battle of the Goths has finished I will do a gauntlet with Heretic 0.2 versus all playoff participants, to get ratings. There is a slight chance I will also be able to add the commercial program Gothic Vortex to that. (If Ed Trice sends me the sources, and I manage to convert it to WB protocol.)
I guess this will not be your only problem. Gothic Vortex in its own GUI doesn't have an "X moves per Y minutes" time setting, and I guess this is not supported internally by the Vortex engine either.
It only has a time-per-move setting.

So you will probably have to do what I'm doing and use a comparable time-per-move setting that adds up to the same 40/X time control that the other engines use.
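
The arithmetic behind that conversion is simple enough to sketch in Python. The function name `seconds_per_move` and the 40-moves-in-10-minutes figure are only illustrative assumptions; the actual X is not stated here.

```python
def seconds_per_move(moves: int, minutes: float) -> float:
    """Fixed time per move that uses up `minutes` over exactly `moves` moves."""
    if moves < 1:
        raise ValueError("moves must be at least 1")
    return minutes * 60.0 / moves


if __name__ == "__main__":
    # Hypothetical example: a 40/10 control corresponds to 15 s per move.
    print(seconds_per_move(40, 10))
```

A fixed setting like this cannot save time on easy moves to spend on hard ones, so it is only roughly comparable to a real 40/X control.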
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Battle of the Goths 2012 (live broadcast)

Post by hgm »

The tourney is played in incremental TC, mostly because TSCP Gothic also does not support N moves per M minutes. The increment is very small, so that engines not implementing an increment (such as Smirf) are not too disadvantaged, but get it as a pleasant surprise. (Which, in case of Smirf, really helps to prevent time losses at fast TC.)

But when I have the sources I probably could find a trick to manipulate the time per move Vortex uses during the game, based on how much time WB says it still has, and knowledge of how many moves WB expects it to play in this time.
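
A rough sketch of what such a trick might look like, in Python. This is not Vortex or WinBoard code: the function name `move_budget`, the `safety` factor and the idea of adding the increment to each move's share are assumptions made purely for illustration.

```python
def move_budget(time_left_s: float, moves_to_go: int,
                increment_s: float = 0.0, safety: float = 0.9) -> float:
    """Per-move time derived from the remaining clock time.

    time_left_s -- time the GUI reports as remaining, in seconds
    moves_to_go -- moves the engine is expected to play in that time
    increment_s -- increment added after each move (0 if none)
    safety      -- fraction of the computed share actually used
    """
    if moves_to_go < 1:
        raise ValueError("moves_to_go must be at least 1")
    share = time_left_s / moves_to_go + increment_s
    # Never plan to use more than what is on the clock right now.
    return max(0.0, min(share * safety, time_left_s))


if __name__ == "__main__":
    # Hypothetical numbers: 60 s left for 20 moves, 1 s increment.
    print(move_budget(60.0, 20, increment_s=1.0))
```

The engine's fixed time-per-move setting would then be overwritten with this value before each search, so that an engine which only understands "time per move" still tracks the clock.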
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Battle of the Goths 2012 (live broadcast)

Post by Evert »

Ok, I think I have isolated the revision where the regression occurs. I'm not entirely sure yet, since I'm only 50 games into the test match and I still need to run the verification (the single revision before) after that, but indications are good:

Rank Name             Elo    +    - games score oppo. draws 
   1 Sjaak 399       2179   10   10  4000   66%  2049   10% 
   2 Sjaak 417       2143   22   22   834   61%  2049    9% 
   3 Sjaak 467       2099   10   10  4000   56%  2049    9% 
   4 Sjaak 437       2090   10   10  4000   55%  2049    8% 
   5 Fairy-Max 4.8O  2049    6    6 12888   41%  2124    9% 
   6 Sjaak 422       2041   86   86    54   49%  2049    6% 
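
For a rough sense of how the score column relates to a rating gap, the plain logistic Elo formula can be sketched in Python as below. This is an assumption-laden approximation: the ratings in the table presumably come from a rating tool with its own draw model and prior, so the numbers will not match exactly. The function name `elo_diff` is made up for this example.

```python
import math


def elo_diff(score: float) -> float:
    """Rating difference implied by a score fraction (logistic Elo model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # Score fractions taken from the table above.
    for name, score in [("Sjaak 399", 0.66), ("Sjaak 437", 0.55)]:
        print(f"{name}: {elo_diff(score):+.0f} Elo vs. opposition")
```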
I know exactly what I changed in that revision, and reversing it will be easy (I removed an unsafe form of futility pruning which would assume that no equalising capture existed without testing that assumption). The only downside is that I also remember clearly why I made that change: it can backfire spectacularly.

EDIT: that does look like it accounts for a large chunk of the regression, but not all of it. There is one other non-cosmetic change in between, which is reducing checking moves with bad SEE. So this idea may not pay off in Gothic chess (possibly because sacrificing a minor to get an attack against the king is more likely to succeed with three heavy pieces). Interesting.
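
To illustrate the difference between the unsafe and the checked form of futility pruning mentioned above, here is a minimal Python sketch. It is not Sjaak's code: the function names, the centipawn values and the use of "value of the most valuable capturable piece" as the test are all assumptions chosen to show the idea.

```python
def can_prune_unsafe(static_eval: int, alpha: int, margin: int) -> bool:
    """Prune when eval plus margin cannot reach alpha.

    Unsafe: silently assumes no capture could make up the deficit.
    """
    return static_eval + margin <= alpha


def can_prune_checked(static_eval: int, alpha: int, margin: int,
                      capturable_values: list[int]) -> bool:
    """Same test, but first add the best material gain a capture could bring."""
    best_gain = max(capturable_values, default=0)
    return static_eval + margin + best_gain <= alpha


if __name__ == "__main__":
    # Hypothetical position: 300 cp below alpha, but a queen (900 cp) hangs.
    print(can_prune_unsafe(-300, 0, 200))          # prunes the node
    print(can_prune_checked(-300, 0, 200, [900]))  # keeps searching
```

The unsafe version prunes more and therefore searches deeper in the same time, which may be why it measures stronger, but it is blind to exactly the kind of tactical shot that makes it backfire.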
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Battle of the Goths 2012 (live broadcast)

Post by Evert »

Ok, I've now reversed a number of apparently harmful (for Gothic chess) changes in Sjaak and I'm running a verification match at the moment. It currently looks like this version is indeed at least as strong as the earlier version and much stronger than the latest release.
If I'm reasonably confident that this is indeed so I'll release it in a couple of hours. For what it's worth at this point.