Time loss in the IPON

Engin
Posts: 1001
Joined: Mon Jan 05, 2009 7:40 pm
Location: Germany
Full name: Engin Üstün

Re: Time loss in the IPON

Post by Engin »

OK !

+1 :)
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Time loss in the IPON

Post by IWB »

Hi
Engin wrote:...and if this exceeds the 0.7%?

you say you will stop it.
Why especially 0.7%? But yes, if it is too high (which means it will significantly change the rating) I will not include the engine and will report this (as happened recently with Stockfish 2.2 and before with others). If an author then tells me that it is a feature, like you described it, I will leave it in. So be it, I don't mind!



Bye
Ingo
hgm
Posts: 28387
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Time loss in the IPON

Post by hgm »

The main problem is not that an engine gets punished Elo-wise for time losses, but that their opponents get unjustly rewarded. This corrupts the entire rating list, because it invalidates the rating model on which the analysis is based.

All standard rating models assume the score percentage tails off very rapidly when you are much weaker than the opponent. But against an opponent that throws games, there will always be a fair chance you score points, no matter how weak you are. But the rating extractor will consider it very strong evidence the engine played a world-class game, and will up its rating accordingly.
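The effect described above can be put into numbers with a minimal sketch. It assumes the plain logistic Elo formula (BayesElo's actual model also has draw and first-move parameters), and the 800-Elo gap and 5% forfeit rate are illustrative values, not figures from this thread. All function names are hypothetical.

```python
import math

def expected_score(elo_diff):
    """Logistic Elo expectation for a player rated elo_diff above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def implied_elo_diff(score):
    """Inverse of expected_score: rating gap implied by a score fraction."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def score_with_forfeits(elo_diff, forfeit_rate):
    """Score of a player whose opponent forfeits a fixed fraction of games.

    Forfeited games count as wins; the rest follow the normal expectation.
    """
    return forfeit_rate + (1.0 - forfeit_rate) * expected_score(elo_diff)

# Illustrative numbers (assumptions): 800 Elo weaker, opponent forfeits 5%.
clean = expected_score(-800)                 # roughly 0.01
observed = score_with_forfeits(-800, 0.05)   # roughly 0.06
apparent_gap = implied_elo_diff(observed)    # roughly -480 instead of -800
print(clean, observed, apparent_gap)
```

Under these assumptions the forfeit rate puts a floor under the weak engine's score, so a rating fit that knows nothing about forfeits would place it several hundred Elo too high relative to that opponent.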

With standard programs like BayesElo, I think it would work much better to calculate 'raw' ratings based on games without any irregularities, and keep separate statistics for the probability to lose on time (or through bugs), and then just correct the raw ratings with a penalty (like 7 Elo for each percent of irregularly lost games). That won't corrupt the ratings of those programs that got the win for free.
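A minimal sketch of that correction, assuming the raw ratings have already been computed elsewhere (for example by running a rating tool on a PGN file with the irregular games removed). The class, field names and sample numbers are hypothetical; only the 7 Elo per percent figure comes from the post.

```python
from dataclasses import dataclass

ELO_PER_PERCENT = 7.0  # penalty suggested in the post; roughly valid near a 50% score

@dataclass
class EngineStats:
    name: str
    raw_elo: float          # rating from regular games only (computed elsewhere)
    games: int              # all games played, regular and irregular
    irregular_losses: int   # time losses, crashes and similar

def corrected_elo(stats):
    """Subtract a fixed penalty per percent of irregularly lost games."""
    if stats.games == 0:
        return stats.raw_elo
    percent_irregular = 100.0 * stats.irregular_losses / stats.games
    return stats.raw_elo - ELO_PER_PERCENT * percent_irregular

# Made-up example: 7 irregular losses in 1000 games is 0.7%, so about 4.9 Elo.
example = EngineStats("ExampleEngine", raw_elo=2800.0, games=1000, irregular_losses=7)
print(corrected_elo(example))
```

The penalty touches only the offending engine's own number; the opponents' ratings are computed from the clean games and stay as they are.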
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Time loss in the IPON

Post by IWB »

Hello H.G.

With all respect, but that is theoretical nitpicking!

I have 0.26% time losses overall, distributed over many engines. The worst engine has a 0.7% loss-on-time rate, which is about 4 Elo (in the best case, meaning all those games would otherwise have been won, which is unlikely). These 0.7% (or 66 games) are again distributed over many engines ...
I don't see any statistical problem here which is not very, very well within any error bar I can possibly reach with my resources (and my will).
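As a rough check of that figure, here is a sketch using the plain logistic Elo formula. It assumes an engine that would score 50% without forfeits; the function names and the 50% baseline are assumptions made for illustration.

```python
import math

def implied_elo_diff(score):
    """Rating gap implied by a score fraction under the logistic Elo formula."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_cost_of_forfeits(base_score, forfeit_rate, score_if_played):
    """Elo lost when a fraction of games is forfeited.

    score_if_played is the average score the forfeited games would have
    brought had they been played out (1.0 means all of them were wins).
    """
    observed = base_score - forfeit_rate * score_if_played
    return implied_elo_diff(base_score) - implied_elo_diff(observed)

# 0.7% forfeits for an engine that would otherwise score 50% (assumed baseline).
worst_case = elo_cost_of_forfeits(0.5, 0.007, 1.0)   # every forfeit was a won game
typical = elo_cost_of_forfeits(0.5, 0.007, 0.5)      # forfeits were average games
print(round(worst_case, 1), round(typical, 1))
```

Under these assumptions the worst case comes out a little under 5 Elo and the average case at about half of that, which is the same order of magnitude as the estimate in the post.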

Additionally, you might be right that BayesElo considers such a loss a world-class game for the winning engine, but it IS lost by another engine. In ANY tourney I know AND according to the chess rules, a loss on time is a loss!

Bye
Ingo
MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: Time loss in the IPON

Post by MM »

I can't see the IPON chess list anymore; it is blank. I tried scrolling, but it did not help. I have IE (I mean I can't see the Stockfish results).

Thanks

Regards
MM
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Time loss in the IPON

Post by Sven »

hgm wrote:The main problem is not that an engine gets punished Elo-wise for time losses, but that their opponents get unjustly rewarded. This corrupts the entire rating list, because it invalidates the rating model on which the analysis is based.

All standard rating models assume the score percentage tails off very rapidly when you are much weaker than the opponent. But against an opponent that throws games, there will always be a fair chance you score points, no matter how weak you are. But the rating extractor will consider it very strong evidence the engine played a world-class game, and will up its rating accordingly.

With standard programs like BayesElo, I think it would work much better to calculate 'raw' ratings based on games without any irregularities, and keep separate statistics for the probability to lose on time (or through bugs), and then just correct the raw ratings with a penalty (like 7 Elo for each percent of irregularly lost games). That won't corrupt the ratings of those programs that got the win for free.
For the reasons you describe, I like my approach of removing time-loss games from the PGN better than assigning some artificial penalty. A time-loss game has almost zero information relevant for rating. Assigning a penalty is just a wild guess IMO, and can produce much less reliable results (but not significantly less "rating corruption" in the sense above) than not evaluating the game.
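A minimal sketch of such a filter, using only the standard library. It assumes each game starts with an [Event ...] tag and that time losses are marked either by the standard PGN tag [Termination "time forfeit"] or by a comment containing "on time"; GUIs differ in how they record this, so the pattern is an assumption to adapt, and the function names are hypothetical.

```python
import re

# Assumption: a time loss shows up as the standard PGN Termination tag value
# "time forfeit" or as a brace comment such as {White loses on time}.
TIME_LOSS = re.compile(
    r'\[Termination\s+"time forfeit"\]|\{[^}]*on time[^}]*\}',
    re.IGNORECASE,
)

def split_games(pgn_text):
    """Split PGN text into games, each starting at an [Event ...] tag line."""
    parts = re.split(r'(?m)^(?=\[Event\s)', pgn_text)
    return [p for p in parts if p.strip()]

def drop_time_losses(pgn_text):
    """Return the PGN text with all games flagged as time losses removed."""
    kept = [g for g in split_games(pgn_text) if not TIME_LOSS.search(g)]
    return "".join(kept)
```

The filtered text can then be fed to the rating tool as usual. Counting the removed games per engine keeps the time-loss rate visible even though those games no longer affect the ratings.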

Generally speaking, the same "rating corruption" also applies to human games. Nevertheless it would not make sense to exclude time-loss games of human players from rating, since that might encourage some strange people to let their flag fall in a hopeless position just to avoid losing rating points, and/or to steal their opponent's rating gain.

Maybe my approach is not perfectly suited to rating lists; I only propose it for other testing activities.

Sven
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Time loss in the IPON

Post by IWB »

FIXED

Recently the transmission (Autocopy) has been crashing sometimes. I have no idea why!

Bye
Ingo
hgm
Posts: 28387
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Time loss in the IPON

Post by hgm »

Well, if it does happen that infrequently, it is not even worth considering. I am of course used to testing engines that lose on time far more frequently (by crashing).
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Time loss in the IPON

Post by IWB »

hgm wrote:Well, if it does happen that infrequently, it is not even worth considering. I am of course used to testing engines that lose on time far more frequently (by crashing).
I mentioned in the post you replied to how often it happened ... and don't forget, I am testing at 5 + 3. The + 3 makes it really hard to lose on time. Actually, I guess I have fewer problems than with the sudden time controls which are called repeating ...

Nonetheless, having read Sven Schüle's post as well, I have to agree that for the purpose of engine development removing time losses might be the way to go ... for rating and ranking it is not needed (as long as someone keeps an eye on it and makes some reasonable decisions).

Bye
Ingo
Last edited by IWB on Sun Jan 08, 2012 12:11 am, edited 1 time in total.
hgm
Posts: 28387
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Time loss in the IPON

Post by hgm »

Sven Schüle wrote:Nevertheless it would not make sense to exclude time-loss games of human players from rating, since that might encourage some strange people to let their flag fall in a hopeless position just to avoid losing rating points, and/or to steal their opponent's rating gain.
Why would that only apply to humans? It seems a cheap way for engines to gain a few rating points too. When you are losing, miss the time control by 0.1 sec, and hope for a replay...

The best treatment would be a replay from the lost position.

Another point worth noting is that time losses are not always the engine's fault. Sometimes the OS just doesn't give any time to the engine at a critical moment. I remember receiving a 'bug report' from WBEC once, where Joker lost a bullet game after 10 moves. But the winboard.debug file showed that the 10 moves had been played in 10 sec, and that the search for the 11th move had deepened to only 3 ply in the remaining 50 sec. The 3-ply thinking output arrived just before the 60-sec TC, and in that output Joker (which reports CPU time) said it had been thinking about the move for only a centisecond. So it was clear that it simply was not getting any CPU, and there is nothing an engine can do about that.
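The diagnosis rests on comparing elapsed wall-clock time with the CPU time the process actually received. A minimal sketch of that comparison follows; it is an illustration only, not Joker's code. The sleep stands in for a process that is not being scheduled, the 0.5 threshold is an arbitrary assumption, and the function names are hypothetical.

```python
import time

def measure(task):
    """Run task and return (wall_seconds, cpu_seconds) consumed by this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu

def looks_starved(wall, cpu, min_ratio=0.5):
    """True if the process got less than min_ratio of the elapsed time as CPU.

    Meant for a single-threaded search; process_time sums CPU time over all
    threads, so the ratio can exceed 1 for a multi-threaded process.
    """
    return cpu < min_ratio * wall

# Sleeping stands in for being descheduled: wall time passes, CPU time does not.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"wall={wall:.3f}s cpu={cpu:.3f}s starved={looks_starved(wall, cpu)}")
```

Logging both clocks next to each move makes it possible to tell afterwards whether a time loss came from the search itself or from the machine being busy with something else.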
Last edited by hgm on Sun Jan 08, 2012 12:17 am, edited 1 time in total.