Yikes!
I'm running a gauntlet with earlier versions of Sjaak and it seems there was a regression between 399 and 437 (for gothic anyway); at least 399 measures at +130 elo against Fairy-Max compared to 437 (which still measures as +30 in my test, and slightly below 467).
I'll try to track it down further, but in the meantime, for a more interesting match, it might be better to replace Sjaak with the earlier 399 (EDIT: although the commit message for revision 400 says "fix a crash that occurs in Win32"...).
Battle of the Goths 2012 (live broadcast)
Moderators: hgm, Rebel, chrisw
-
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
-
- Posts: 2567
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Battle of the Goths 2012 (live broadcast)
Btw. have you tried Gothic Vortex against Bihasa or Joker?
-
- Posts: 1627
- Joined: Thu Mar 09, 2006 12:35 pm
Re: Battle of the Goths 2012 (live broadcast)
mar wrote: Btw. have you tried Gothic Vortex against Bihasa or Joker?
Yes, against Bihasa.
First 2 games:
Gothic Vortex 2.2.5 - Bihasa 2.0 2-0
Next 16 games:
Gothic Vortex 2.2.5 - Bihasa 2.0 0-16 !!!!
Next one draw.
Then 2 wins of Bihasa and i stopped it....
Against Joker i have:
Gothic Vortex 2.2.5 - Joker80 13-7
I have to play games manually with Vortex so it's difficult to have more games....
After his son's birth they've asked him:
"Is it a boy or girl?"
YES! He replied.....
-
- Posts: 2567
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Battle of the Goths 2012 (live broadcast)
George Tsavdaris wrote:
mar wrote: Btw. have you tried Gothic Vortex against Bihasa or Joker?
Yes against Bihasa.
First 2 games:
Gothic Vortex 2.2.5 - Bihasa 2.0 2-0
Next 16 games:
Gothic Vortex 2.2.5 - Bihasa 2.0 0-16 !!!!
Next one draw.
Then 2 wins of Bihasa and i stopped it....
Against Joker i have:
Gothic Vortex 2.2.5 - Joker80 13-7
I have to play games manually with Vortex so it's difficult to have more games....
Impressive! Looks like Bihasa is the strongest 10x8 program in the world. Thanks George!
-
- Posts: 27842
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Battle of the Goths 2012 (live broadcast)
Evert wrote: Yikes!
I'm running a gauntlet with earlier versions of Sjaak and it seems there was a regression between 399 and 437 (for gothic anyway);
Well, it seems clear that there must be something wrong with this version. It even lost to ArcBishop now...
I am a bit queasy w.r.t. replacing Sjaak by a version that is prone to crashing, without any testing. I am using WinBoard's internal tournament manager now, and I am not completely sure if a crashing engine will somehow spoil the settings for later games (I did make an engine crash non-fatal, but it switches WinBoard back to -ncp mode, which is the logical thing to do outside a tourney). It would be good to test that, of course, but this does not seem the right occasion for doing so!
I will try to run some quick tests with 399 on the other core; I hope this will be conclusive before the next game Sjaak has to play.
The WB internal tourney manager allows me to play a dirty trick, btw, which I am using now. By already specifying a result for the games that were played in the qualifier, I can instruct WB to play a normal round-robin. It will then automatically skip those games. And I already copied them to the PGN for the second division. All it required was editing the tourney file after the tourney started, replacing the result string by
Code: Select all
-results "* ** ** **"
-
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Battle of the Goths 2012 (live broadcast)
hgm wrote: I am a bit queasy w.r.t. replacing Sjaak by a version that is prone to crashing, without any testing. I am using WinBoard's internal tournament manager now, and I am not completely sure if a crashing engine will somehow spoil the settings for later games (I did make an engine crash non-fatal, but it switches WinBoard back to -ncp mode, which is the logical thing to do outside a tourney). It would be good to test that, of course, but this does not seem the right occasion for doing so!
Absolutely not.
I can easily deploy an intermediate version if there's a problem with 399, but the problem is I probably will not get round to that before tonight.
I will be revising my testing method though, because I would have liked to spot a problem like this (much) earlier... I normally test using normal chess games, and then do verification matches using other variants (typically Spartan, gothic and sometimes Makruk; XiangQi I usually test separately). Somehow this slipped through.
-
- Posts: 1627
- Joined: Thu Mar 09, 2006 12:35 pm
Re: Battle of the Goths 2012 (live broadcast)
hgm wrote: After the Battle of the Goths has finished I will do a gauntlet with Heretic 0.2 versus all playoff participants, to get ratings. There is a slight chance I will also be able to add the commercial program Gothic Vortex to that. (If Ed Trice sends me the sources, and I manage to convert it to WB protocol.)
I guess this will not be your only problem. Gothic Vortex in its own GUI doesn't have an "X moves per Y minutes" time setting and i guess this is not supported internally by the Vortex engine either.
It only has time per move setting.
So you will probably have to do what i'm doing and use a comparable time per move setting that adds up to the same 40/X time control that other engines use.
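To make the arithmetic behind that conversion concrete, here is a minimal sketch. The function name and the 40/5 example are only illustrative assumptions, not settings taken from Vortex or from the tourney:

```python
def seconds_per_move(moves: int, minutes: float) -> float:
    """Fixed time per move that spends a 'moves in minutes' budget evenly.

    Hypothetical helper: it simply divides the session time by the
    number of moves, which is the comparison described above.
    """
    if moves <= 0:
        raise ValueError("moves must be positive")
    return minutes * 60.0 / moves


# Example: a 40 moves in 5 minutes session spread evenly over the moves.
print(seconds_per_move(40, 5))
```

An engine with a real 40/X control can of course spend its time unevenly, so a flat per-move setting is only a rough equivalent.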
-
- Posts: 27842
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Battle of the Goths 2012 (live broadcast)
The tourney is played in incremental TC, mostly because TSCP Gothic also does not support N moves per M minutes. The increment is very small, so that engines not implementing an increment (such as Smirf) are not too disadvantaged, but get it as a pleasant surprise. (Which, in case of Smirf, really helps to prevent time losses at fast TC.)
But when I have the sources I probably could find a trick to manipulate the time per move Vortex uses during the game, based on how much time WB says it still has, and knowledge of how many moves WB expects it to play in this time.
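A rough sketch of what such a trick could look like, assuming the adapter knows the remaining clock time and how many moves that time has to cover. All names, the safety margin and the way the increment is added are my own assumptions, not anything from the Vortex sources or from WinBoard:

```python
def time_for_next_move(time_left_s: float,
                       moves_to_go: int,
                       increment_s: float = 0.0,
                       safety_s: float = 0.5) -> float:
    """Per-move time setting derived from the remaining clock.

    time_left_s  remaining time reported by the GUI, in seconds
    moves_to_go  moves that still have to be played on that time (an estimate)
    increment_s  time added after each move, if any
    safety_s     hypothetical reserve kept back for communication lag
    """
    if moves_to_go <= 0:
        raise ValueError("moves_to_go must be positive")
    usable = max(time_left_s - safety_s, 0.0)
    budget = usable / moves_to_go + increment_s
    # Never ask for more time than is actually left on the clock.
    return min(budget, usable)


# Recompute the setting before every move as the clock runs down.
print(time_for_next_move(60.5, 30))
print(time_for_next_move(3.0, 1, increment_s=0.1))
```

Recomputing the value before each move means an engine that only understands "time per move" still tracks the real clock, at the cost of never saving up time for critical positions.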
-
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Battle of the Goths 2012 (live broadcast)
Ok, I think I have isolated the revision where the regression occurs. Not entirely sure yet, since I'm only 50 games into the test match and I still need to run the verification (the single revision before) after that, but indications are good:
Code: Select all
Rank Name            Elo    +    - games score oppo. draws
   1 Sjaak 399      2179   10   10  4000   66%  2049   10%
   2 Sjaak 417      2143   22   22   834   61%  2049    9%
   3 Sjaak 467      2099   10   10  4000   56%  2049    9%
   4 Sjaak 437      2090   10   10  4000   55%  2049    8%
   5 Fairy-Max 4.8O 2049    6    6 12888   41%  2124    9%
   6 Sjaak 422      2041   86   86    54   49%  2049    6%
I know exactly what I changed in that revision, and reversing it will be easy (I removed an unsafe form of futility pruning which would assume that no equalising capture existed without testing that assumption). The only downside is that I also remember clearly why I made that change: it can backfire spectacularly.
EDIT: that does look like it accounts for a large chunk of the regression, but not all of it. There is one other non-cosmetic change in between, which is reducing checking moves with bad SEE. So this idea may not pay off in Gothic chess (possibly because sacrificing a minor to get an attack against the king is more likely to succeed with three heavy pieces). Interesting.
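As a rough cross-check on the score column in the table, the plain logistic Elo formula turns a score fraction into a rating difference. This is only a sketch: the rating tool behind the table presumably uses a more elaborate model (draws, all opponents at once), which is my assumption, so the numbers will not match its Elo column exactly.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Rating difference implied by a score fraction (standard logistic Elo model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# Score fractions taken from the table (66% and 55% against Fairy-Max).
for name, score in [("Sjaak 399", 0.66), ("Sjaak 437", 0.55)]:
    print(name, round(elo_diff_from_score(score)))
```

This gives roughly 115 and 35 Elo over the opponent, in the same ballpark as the 130 and 41 point gaps to Fairy-Max in the table.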
-
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Battle of the Goths 2012 (live broadcast)
Ok, I've now reversed a number of apparently harmful (for Gothic chess) changes in Sjaak and I'm running a verification match at the moment. It currently looks like this version is indeed at least as strong as the earlier version and much stronger than the latest release.
If I'm reasonably confident that this is indeed so I'll release it in a couple of hours. For what it's worth at this point.