hgm wrote: It is clear the big strength gap is between Lima / Shokidoki and CrazyWa / Sjaak / Nebiyu / TJshogi. The latter four are pretty close, though. So a full round-robin with 16 games/pairing would also be very interesting, perhaps even more so than the stage II round robin. (Which was a bit of a massacre.) So out of curiosity: could you still do the remaining 5 matches between CrazyWa / Sjaak / Nebiyu and TJshogi?
I can play this later, as I needed more games in the standard TC rating list.
1st Mini Shogi Computer Association Championships 2017
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Stage II finished
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Stage II finished
hgm wrote: It is clear the big strength gap is between Lima / Shokidoki and CrazyWa / Sjaak / Nebiyu / TJshogi. The latter four are pretty close, though. So a full round-robin with 16 games/pairing would also be very interesting, perhaps even more so than the stage II round robin. (Which was a bit of a massacre.) So out of curiosity: could you still do the remaining 5 matches between CrazyWa / Sjaak / Nebiyu and TJshogi?
Ferdy wrote: I can play this later, as I needed more games in the standard TC rating list.
I only ran the CrazyWa gauntlet; I will not run a Nebiyu gauntlet, as it does not really comply with important rules.
Here is the result.
Code:
Results from file msca-t8-std.pgn:
No. Name                       1     2     3    Score
------------------------------------------------------
1   CrazyWa 1.0.2 32bit      xxxx   9.0  10.0   19.0
2   TJshogi 5x5 0.19 32bit    7.0  xxxx          7.0
3   Sjaak II v1.4.1 64bit     6.0        xxxx    6.0
Total Games: 32
White Wins: 15 (46.9%)
Black Wins: 17 (53.1%)
Draws: 0 (0.0%)
Code:
Rank  Engine                   Games   Pts   Pct   rep  tif  per  crash
1     CrazyWa 1.0.2 32bit         32  19.0  59.4     0    0    0      0
2     TJshogi 5x5 0.19 32bit      16   7.0  43.8     0    0    0      0
3     Sjaak II v1.4.1 64bit       16   6.0  37.5     2    0    0      0
:: Rule infraction ::
rep : Repetition
tif : Time forfeit
per : Perpetual
crash : Program exited unexpectedly
games : 32
tc : 3600+10
file : msca-t8-std.pgn
This is now listed as MSCA tournament nr 8. The std rating list is not yet updated with this result; the games can be downloaded from the link below.
https://sites.google.com/view/minishogi ... ournaments
-
- Posts: 27869
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Stage II finished
Thank you. It is a pity that Evert did not release the version of Sjaak that fixed this repetition problem (by allowing the result for repetitions to be defined in a color-dependent way) before the tourney. TJshogi seems to be completely unaware that repetitions lose for sente, and even makes them when none of the moves are captures.
Sometimes going for a perpetual or a repetition as sente can be a valid choice, as an alternative for being mated. (Many Xiangqi engines do that.) But that was definitely not the case here.
Anyway, I am happy that CrazyWa is so stable and rule-compliant.
-
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Stage II finished
hgm wrote: Thank you. It is a pity that Evert did not release the version of Sjaak that fixed this repetition problem (by allowing the result for repetitions to be defined in a color-dependent way) before the tourney.
Indeed. I wanted to squeeze out a few more improvements, but this proved to be harder than I had thought and I ran out of time (king safety is hard). I should have a bit of time over the next few weeks though.
Anyway, it's not as if it plays illegal games, in a sense it's just blind to certain types of mate-in-one. I didn't check what fixing that did for playing strength though. I suppose I could run a match and find out.
-
- Posts: 27869
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Stage II finished
Indeed, it is questionable how much recognizing the repetition as a sente loss will actually help. It is very difficult to break out of this 'impasse'. It is quite possible that the engine would just paint itself into a corner, first making the most reasonable escapes unreachable, so that it finally has to make an outright losing move.
-
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Stage I is completed
hgm wrote: The only changes to the evaluation compared to Crazyhouse were halving the weight of King Safety, (because material balance is much more important in mini-Shogi than in larger drop variants)
Evert wrote: Interesting. I tried something like that two weeks ago, and it worked quite well for mini-Shogi and regular Shogi, and even for Crazyhouse. It blew up spectacularly for Tori Shogi.
Actually, I should probably test that again: the Tori Shogi results are unreliable because the piece values it used were based on estimates, which are apparently quite poor. Improving those gains a few hundred Elo points in self-play...
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Stage II finished
hgm wrote: Indeed, it is questionable how much recognizing the repetition as a sente loss will actually help. It is very difficult to break out of this 'impasse'. It is quite possible that the engine would just paint itself into a corner, first making the most reasonable escapes unreachable, so that it finally has to make an outright losing move.
Talking about non-perpetual-check repeats, there are 3 cases that I can think of.
1. An engine that does not know at all the rule that a repeat is a loss for sente.
2. An engine that knows the repeat rule but chooses to repeat as a lesser evil than getting mated. It considers being mated worse than a 3rd repeat, a 3rd repeat worse than a 2nd repeat, and a 2nd repeat worse than a 1st repeat.
3. An engine that knows the repeat rule but considers a 1st repeat, 2nd repeat and 3rd repeat the same as being mated.
There is an advantage to a type 2 engine: it could exploit an opponent that does not know the repeat rule.
Lima is of type 3. The moment a 1st repeat is found in the search, I immediately return a mated score.
But I think type 2 is smart.
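The difference between type 2 and type 3 can be sketched as a scoring function. This is only an illustration of the ordering described above, not Lima's or any other engine's actual code; `MATE`, `REPEAT_BASE`, the step of 1000 and the function name are all hypothetical.

```python
MATE = 100_000         # hypothetical mate score; each engine picks its own constant
REPEAT_BASE = 50_000   # hypothetical penalty, large but far milder than any mate score


def sente_repeat_score(engine_type: int, repeat_no: int, ply: int) -> int:
    """Score from sente's point of view for reaching the repeat_no-th repeat at this ply."""
    mated = -(MATE - ply)  # being mated closer to the root scores lower
    if engine_type == 3:
        # Type 3: every repeat is treated exactly like being mated here.
        return mated
    if engine_type == 2:
        # Type 2: 1st repeat > 2nd repeat > 3rd repeat > being mated.
        return -(REPEAT_BASE + 1000 * repeat_no)
    raise ValueError("only types 2 and 3 score repeats")
```

With these assumed constants a type 2 engine still prefers any repeat to any mate, while a type 3 engine cannot tell them apart.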
-
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Stage II finished
Ferdy wrote: 1. An engine that does not know at all the rule that a repeat is a loss for sente.
2. An engine that knows the repeat rule but chooses to repeat as a lesser evil than getting mated. It considers being mated worse than a 3rd repeat, a 3rd repeat worse than a 2nd repeat, and a 2nd repeat worse than a 1st repeat.
3. An engine that knows the repeat rule but considers a 1st repeat, 2nd repeat and 3rd repeat the same as being mated.
There is an advantage to a type 2 engine: it could exploit an opponent that does not know the repeat rule.
I'm not convinced.
If you're playing Black, then repeating is never bad for you (since White will lose on the 4th one). If you're White, it's never good for you.
SjaakII considers repeating illegal (worse than mate), so it will prefer mate over repeating illegally. It allows for repetitions against the game history though (but not the search history).
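One way to read that last distinction, as a minimal sketch: a position that repeats one already on the current search path is rejected, while a position that only repeats one from the game played so far is let through. This is an interpretation of the description above, not SjaakII's source; the function name, the return labels and the use of plain integer position keys are assumptions.

```python
def classify_repeat(key: int, game_history: list[int], search_path: list[int]) -> str:
    """Classify a position key against the search path and the game history."""
    if key in search_path:
        # Repeats a position reached earlier in this search line:
        # treated as illegal, i.e. worse than being mated.
        return "illegal"
    if key in game_history:
        # Repeats a position from the actual game only: allowed.
        return "allowed"
    return "new"
```

A real engine would compare hash keys and would normally stop scanning at the last irreversible move; both are left out here.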
-
- Posts: 27869
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Stage II finished
Ferdy wrote: Talking about non-perpetual-check repeats, there are 3 cases that I can think of.
1. An engine that does not know at all the rule that a repeat is a loss for sente.
2. An engine that knows the repeat rule but chooses to repeat as a lesser evil than getting mated. It considers being mated worse than a 3rd repeat, a 3rd repeat worse than a 2nd repeat, and a 2nd repeat worse than a 1st repeat.
3. An engine that knows the repeat rule but considers a 1st repeat, 2nd repeat and 3rd repeat the same as being mated.
There is also a 'type 0' that fails to recognize a repetition as such. Some of the observed rule infractions can be blamed on that. In particular this seems to happen with repetition cycles that include captures.
For engines that make no distinction between 1st and later repeats, type 2 makes sense, because they will be allowed many extra 'surprise repeats' before they actually lose. A repeat loop takes at least 4 ply, so at the first repeat you will still get 12 more ply before you actually lose. So if the purpose is to last as long as possible (which is silly in itself, but usually what engines do in the face of being mated), a first repeat is preferable to being mated in 5. Of course it would pay to switch to the mated-in-5 line when the 4th repeat approaches, but this is usually a moot point. It is an issue at the same level as whether it is better to sacrifice your Queen in a spite check to delay a mate by 1 move. Almost every Chess engine would do that, and thus could be considered buggy.
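The ply count above can be checked with a few lines. This is just the arithmetic from the post, assuming the loss falls on the 4th repeat, the minimum loop length of 4 ply, and that the side being mated is the one to move (so mated-in-n lasts 2n ply); the function names are made up for the illustration.

```python
def plies_until_repeat_loss(cycle_len: int = 4, repeats_so_far: int = 1,
                            losing_repeat: int = 4) -> int:
    """Plies still available before the losing repeat, one full loop per repeat."""
    return (losing_repeat - repeats_so_far) * cycle_len


def plies_until_mated(n: int) -> int:
    """Plies the game lasts when the side to move is mated in n moves (assumed: 2 per move)."""
    return 2 * n
```

Under these assumptions the first repeat buys 12 ply against 10 ply for mated-in-5, which is why an engine that only tries to last as long as possible would pick the repeat.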
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
On UEC cup
hgm wrote: Anyway, it seems we will get a very interesting UEC Cup, this year.
Yesterday I sent my application to enter Lima in the UEC Cup; I just indicated the link to Lima v4.0 in the email.
Are you going to create a new minishogi winboard pack?