Time-Odds tournament

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

User avatar
Ovyron
Posts: 4562
Joined: Tue Jul 03, 2007 4:30 am

Re: Time-Odds tournament

Post by Ovyron »

hgm wrote:I have not spent a great deal of time on Fairy-Max, (it was never intended as an engine for serious competition, I made it only for determining piece values of exotic pieces, in very fast games)
How did that came up? Did accurate values for the exotic pieces were found?
Tony

Re: Time-Odds tournament

Post by Tony »

hgm wrote:
Uri Blass wrote:Very strange performance of fairymax.
The results suggest that playing strength of fairy-max is not monotonic function of time.

13. Faiiry-Max / 3 45% 20.5 / 46 (1065.0, 343.3)
15. Fairy-Max 4.8 t 40% 18.5 / 46 (1069.0, 293.8)
19. Fairy-Max /24 30% 14.0 / 46 (1078.0, 198.5)
21. Fairy-Max / 9 23% 10.5 / 46 (1085.0, 188.5)

Edit:It is also possible that there is some problem with the experiment
and one program was slowed down during part of the games.

It is also possible that it is because of luck but I think that the results suggest to check if there was some problem.
Indeed the behavior of Fairy-Max is very suspicious. I know that Fairy-Max is buggy: it frequently hangs, and even was responsible for spoiling earlier tournaments by soaking up CPU time by many hanging processes. But now I use a kill list that kills any Fairy-Max processes after every game. So I believe the current tournament to be free of such corruption.

Unfortunately I have never been able to find a position where Fairy-Max reproducibly crashes. If I take positions where it was hanging on, and rerun them, it always finds a move without problems. It is supposed to be nearly the same as micro-Max, and micro-Max never hangs. Although sometimes it thinks extremely long, like 5 min on one move in a 2-min game. But in thos cases a normal search is going on, and eventually it always comes with a move. But the memory foortprint of a hanging Fairy-Max process goes to nearly zero, so it cannot be doing normal search (as it would then access most of the hash table).

Anyway, an explanation of the non-montonic behavior might be that there is a fixed probability per unit of time for Fairy-Max crashing. That would mean that longer time controls are counter-productive, as at some point the probability for a crash (and forfeiting on time) would approach 100%. Only games played fast enough that there is a significant probability to finish them without a crash could ever be won.

I have not spent a great deal of time on Fairy-Max, (it was never intended as an engine for serious competition, I made it only for determining piece values of exotic pieces, in very fast games), but I really should put it under the microscope to get rid of this very annoying bug. The frequency with which it happens, and the irreproducibility, suggests that hash collissions are somehow involved.

But it could be that
I would suggest an overflow. In my experience they are the most common if longer runs give (relative) more bugs.

Tony
User avatar
hgm
Posts: 28418
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Time-Odds tournament

Post by hgm »

Ovyron wrote: How did that came up? Did accurate values for the exotic pieces were found?
Because I got distracted by the 'Gothic scene' and started to convert Joker to 10x8 play, this project suffered. I have been doing a lot of measurements on Archbishop and Chancellor, (see the Gothic-chess forum), using Joker80 rather than Fairy-Max. (This because Joker80 is much stronger, and can thus give the same quality games at horter time control. But it can only do A and C, it is not generally configurable for any fairy piece , such as Fairy-Max is.)

The data-set generated by Joker80 for 10x8 chess has raised some minor questions as to the validity of the method used. (E.g. was the time-control really good enough for the results to be meaningful.) The value of the Knight seems to come out suspiciously low compared to the Bishop, and this might conceivably be an artifact of not having good end-game evaluation, trading down to a Knight+Pawns ending when a passer is out of reach of the Knight to stop it (and not having enough depth for the search to se this). If you have a Bishop in stead you don't fall so easily in this trap. Perhaps I should extend the remaining depth by 50% when the last slider disappears from the board (just like I should double the remaining depth when conversion to a Pawn ending takes place).

Anyway, because of these question marks I did some measurements of N vs B at longer time control (40/2' and 40/5', in stead of 40/1'), and indeed the Knight value seems to go up significantly (but not spectacularly). Before I believe the results I want to compare with 8x8 chess where the N vs B value is accurately known, to see if I have the same problem there at 40/1'. But normal Joker doesn't support setboard correctly yet, and I had no time to fix that yet. So in part it is waiting for that.

Another surprise was the high Archbishop value: it finds A and C are about equal. Which is surprising, as A=B+N and C=R+N, and clearly R >> B. A possible explanation would be that the B value is much suppressed by its color-boundedness, a thing from which A does not suffer. So I did switch back to Fairy-Max, to test the value of a piece that moves as Bishop + some one-step orthogonal non-captures. E.g. a piece that moves like B+P. The non-capture forward step in itself is probably not very useful, but on the Bishop itbreaks the color-boundedness. Preliminary tests suggest that such a 'naughty bishop' is not really worth much more than a normal one: Two naughty bishops with a backward step non-capture added to B (replacing the normal B in the opening position) only beat a normal B pair by ~54%, which is 1/3 of the Pawn-odds score. So it seems that the extra step on the Bishop is only worth about 0.17 pawn, and that breaking the color-boundedness is not a big deal. So it seems that the large A value (or, equivalently, the low Bishop value) cannot be explained this way. perhaps the major factor is mating potential: B does not have it, while R does, but both A and C have it.
User avatar
hgm
Posts: 28418
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Time-Odds tournament

Post by hgm »

Uri Blass wrote:Very strange performance of fairymax.
The results suggest that playing strength of fairy-max is not monotonic function of time.
It seems I finally was able to identify the illusive bug that made Fairy-Max hang so frequently:

The direct cause of the hang was infinite regression of the search, due to check extension on every ply. Tony was right after all that an overflow was involved: in this version, for debugging purposes, I added a stack that contains all moves of the current branch. That array went out of bounds and probably overflowed into other data structures, causing an infinite loop what in reality should have been a stack overflow.

Now in Chess perpetual sequences of checks are not possible (although it is occasionally possible to evade a check with a move that checks). But micro-Max / Fairy-Max are determining the in-check status from the returned score of the null move (they have pseudo-legal move generators, and also the null move can be pseudo-legal). If there was a perpetual check, where some of the moves of the checking side were erroneuosly thinking this side was in check too, the infinite regression would occur. And such mistaken in-check detections can occur due to a key collission in the hash-table lookup for the position after null move, if it happens to collide with an in-check position.

Now key collissions should be pretty rare, (especially since the problem only occurs if _all_ moves of the genuinely checking side in the repeat cycle are mistakingly identified as check evasions), but there was a genuine bug in Fairy-Max that made them less rare. In micro-max sideToMove and e.p. rights are worked into the key as

index = (hashKey + sideToMove * epSquare) & 0xFFFFF;

where epSquare has a value outside the board if no e.p. rights exist. This value is non-zero, as zero is a valid square number, so the side to move always matters. (As zero encodes for a8, there is never any e.p. capture on this square.) Now in micro-Max sideToMove is 8 for white and 16 for black. But in Fairy-Max I changed this to 0 for white and 16 for black, as in the piece encoding I needed an extra bit for having more different piece types (so the color of the piece became piece&16, rather than piece&24).

Now if sideToMove can be zero, the above indexing doesn't work anymore, since with white to move you lose the information of the e.p. rights from the key. This affected the in-check detection, because exposing the Rook to capture directly after castling is actually considered e.p. capture of the King. So castling over an attacked square is actually considered moving into (e.p.) check. Illegal positions where the King can be captured (e.p. or otherwise) are stored into the hash table like any other position. But they are only illegal by virtue of the e.p. rights for capturing the King; if the Rook would be attacked later, they would be perfectly legal.

It was the mix up of these two types of positions in the hash table (Rook next to a castled King attacked directly after castling, or only later) that would lead to a large probability for positions being falsely flagged as in-check positions, while any move out of them (also genuine checks) would not suffer a similar collision, and thus be considered legal check evasions (and hence be extended).

I fixed it by simply adding something to the sideToMove in the index calculation:

index = (hashKey + (sideToMove+128) * epSquare) & 0xFFFFF;

This should make it just as unlikely that key collisons exist that cause false in-check status as in micro-Max (where I never observed this to be a problem).

The fixed Fairy-Max is now on my website for download.