Unstable results with little blitzer

Discussion of chess software programming and technical issues.

Moderator: Ras

jacobbl
Posts: 80
Joined: Wed Feb 17, 2010 3:57 pm

Unstable results with little blitzer

Post by jacobbl »

I have some problems with the stability of my results when using LittleBlitzer. I run 10,000 games divided among 5 opponents (40 moves in 8 sec). When I tested the same version twice, I once got a score of 44.4% and the next time a score of 37.8%. This is way too big a difference considering the number of games. I have had this problem before with LittleBlitzer, but when I test with Arena my results are always within the expected error bar. Does anyone have a suggestion as to what I might be doing wrong during testing? I test using opening books, and as far as I can see there are not many identical games. Is there a tool for removing identical games from a PGN file?
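
On the question of removing identical games from a PGN file, a minimal sketch in Python could look like the following. It is not an existing tool: the function names are made up for illustration, it assumes every game starts with an `[Event ` tag, and it treats two games as identical when their move text (including the result token) matches after comments and move numbers are stripped.

```python
import re

def split_games(pgn_text):
    """Split PGN text into games, assuming each game starts at an '[Event ' tag."""
    parts = re.split(r"(?m)^(?=\[Event )", pgn_text)
    return [p.strip() for p in parts if p.strip()]

def movetext_key(game):
    """Return the move text of one game without tags, comments or move numbers."""
    lines = [ln for ln in game.splitlines() if not ln.startswith("[")]
    text = " ".join(lines)
    text = re.sub(r"\{[^}]*\}", " ", text)  # drop {comments}, e.g. engine scores
    text = re.sub(r"\d+\.+", " ", text)     # drop move numbers such as "12." or "12..."
    return " ".join(text.split())           # normalise whitespace

def remove_duplicates(pgn_text):
    """Keep only the first occurrence of each distinct move sequence."""
    seen = set()
    unique = []
    for game in split_games(pgn_text):
        key = movetext_key(game)
        if key not in seen:
            seen.add(key)
            unique.append(game)
    return unique
```

Comparing the number of games before and after gives a quick count of how many duplicates the opening book actually produced.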

Regards
Jacob
Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: Unstable results with little blitzer

Post by Edmund »

Just some notes from my own experience.
Any losses on time?
Are there any processes running in the background?
Are the stats (nps, average time per move, average depth) similar?
jacobbl
Posts: 80
Joined: Wed Feb 17, 2010 3:57 pm

Re: Unstable results with little blitzer

Post by jacobbl »

At most 20 time losses, mostly by the opponents.
No other processes running.
I am not quite sure of the statistics, but I think they are the same.

Regards
Jacob
Kempelen
Posts: 620
Joined: Fri Feb 08, 2008 10:44 am
Location: Madrid - Spain

Re: Unstable results with little blitzer

Post by Kempelen »

I had an experience similar to yours. I realized the problem was that I have two programming environments at home and they had different compiler versions; although similar, one of them produced better code. I suspect this is not your case, but just in case...
Fermin Serrano
Author of 'Rodin' engine
http://sites.google.com/site/clonfsp/
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Unstable results with little blitzer

Post by Sven »

I could imagine that it is related to different handling of opening books. Apart from the general remark that many people do not use opening books but fixed sets of starting positions when testing with a large number of fast TC games (but this will not be your current main problem since you report almost stable results under Arena), I would expect that there is at least one of the involved engines that uses the opening book differently somehow under LittleBlitzer vs. Arena. Are you sure that the opening book is always the same in both cases from each engine's viewpoint?

Sven
lucasart
Posts: 3243
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Unstable results with little blitzer

Post by lucasart »

Use cutechess-cli!
F. Bluemers
Posts: 880
Joined: Thu Mar 09, 2006 11:21 pm
Location: Nederland

Re: Unstable results with little blitzer

Post by F. Bluemers »

I don't know LB too well, but maybe one or more of the engines in your test is sending more UCI info output than it can handle well.
Arena has a filter for UCI spamming.

You could try hgm's package for testing the GUI.
It could indicate lag problems:
http://rybkaforum.net/cgi-bin/rybkaforu ... ?tid=24367
jacobbl
Posts: 80
Joined: Wed Feb 17, 2010 3:57 pm

Re: Unstable results with little blitzer

Post by jacobbl »

After testing some more, the different versions play as expected relative to each other, but they all score about 6% lower than the same version did last week. The difference is especially large for one engine (eeyore), but the results for the other engines also don't match last week's results within a reasonable error bar. I will do some more testing, and I might try cutechess as well. Can you play 10 threads in parallel with cutechess?
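
To put a number on "within a reasonable error bar", here is a rough sketch in Python using the two scores from the first post (44.4% and 37.8%). It assumes each run consisted of 10,000 independent games and that there were no draws; draws only reduce the variance, so the real gap in standard errors would if anything be larger. The function names are illustrative only.

```python
import math

def score_std_error(score, games):
    """Upper bound on the standard error of a match score (assumes no draws)."""
    return math.sqrt(score * (1.0 - score) / games)

def z_between_runs(score_a, score_b, games_a, games_b):
    """Difference between two run scores, in units of its standard error."""
    se_diff = math.hypot(
        score_std_error(score_a, games_a),
        score_std_error(score_b, games_b),
    )
    return abs(score_a - score_b) / se_diff

# Assumption: both runs had 10,000 games each.
z = z_between_runs(0.444, 0.378, 10_000, 10_000)
print(f"difference = {z:.1f} standard errors")
```

Under these assumptions the two runs are more than nine standard errors apart, which is very unlikely to be statistical noise and suggests that something in the test conditions changed between runs.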

Thanks for all help.

Regards
Jacob
Rebel
Posts: 7514
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Unstable results with little blitzer

Post by Rebel »

Not sure if this addresses the trouble you are facing, but have you done a reliability test on your testing environment? A good test is a self-play match at fixed depth. Its output should be 100% identical: an exact 50% result, identical games, scores and even node counts.
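
One possible way to automate the "identical games" part of such a check is sketched below in Python. It assumes the move text of every game from two fixed-depth runs has already been extracted into two lists of strings in playing order; the function name is hypothetical and not part of any GUI.

```python
def first_mismatch(run_a, run_b):
    """Return the index of the first game that differs between two runs.

    run_a and run_b are lists of move-text strings, one per game, in the
    order the games were played. Returns None if the runs are identical.
    """
    for index, (game_a, game_b) in enumerate(zip(run_a, run_b)):
        if game_a != game_b:
            return index
    if len(run_a) != len(run_b):
        # One run has extra games after the common prefix.
        return min(len(run_a), len(run_b))
    return None
```

If this returns anything other than `None` for two fixed-depth runs from the same starting positions, something in the setup is non-deterministic and worth investigating before trusting timed results.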
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Unstable results with little blitzer

Post by Sven »

Rebel wrote: Not sure if this addresses the trouble you are facing, but have you done a reliability test on your testing environment? A good test is a self-play match at fixed depth. Its output should be 100% identical: an exact 50% result, identical games, scores and even node counts.
Unfortunately this won't happen when using an opening book, at least not with the attribute "exact" ...