I have some problems with the stability of my results when using LittleBlitzer. I run 10,000 games divided among 5 opponents (40 moves in 8 sec). When I tested the same version twice, I got a score of 44.4% the first time and 37.8% the next. This is far too big a difference considering the number of games. I have had this problem before with LittleBlitzer, but when I test with Arena my results are always within the expected error bar. Does anyone have a suggestion as to what I might be doing wrong during testing? I test using opening books, and as far as I can see there are not many identical games. Is there a tool for removing identical games from a PGN file?
Regards
Jacob
Unstable results with little blitzer
Moderator: Ras
-
Edmund
- Posts: 670
- Joined: Mon Dec 03, 2007 3:01 pm
- Location: Barcelona, Spain
Re: Unstable results with little blitzer
Just some notes from my own experience.
Any losses on time?
Are there any processes running in the background?
Are the stats (nps, average time per move, average depth) similar?
-
jacobbl
- Posts: 80
- Joined: Wed Feb 17, 2010 3:57 pm
Re: Unstable results with little blitzer
Maximum 20 time losses, mostly the opponents.
No other processes running.
I am not quite sure of the statistics, but I think they are the same.
Regards
Jacob
-
Kempelen
- Posts: 620
- Joined: Fri Feb 08, 2008 10:44 am
- Location: Madrid - Spain
Re: Unstable results with little blitzer
I had an experience similar to yours. I realized the problem was that I have two programming environments at home and they had different compiler versions; although similar, one of them produced better code. I suspect this is not your case, but just in case...
-
Sven
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Unstable results with little blitzer
I could imagine that it is related to different handling of opening books. Apart from the general remark that many people do not use opening books but fixed sets of starting positions when testing with a large number of fast TC games (but this will not be your current main problem since you report almost stable results under Arena), I would expect that at least one of the involved engines uses the opening book differently somehow under LittleBlitzer vs. Arena. Are you sure that the opening book is always the same in both cases from each engine's viewpoint?
Sven
-
lucasart
- Posts: 3243
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: Unstable results with little blitzer
Use cutechess-cli!
-
F. Bluemers
- Posts: 880
- Joined: Thu Mar 09, 2006 11:21 pm
- Location: Nederland
Re: Unstable results with little blitzer
I don't know LB too well, but maybe one or more of the engines in your test is sending more UCI info output than it can handle well.
Arena has a filter for uci spamming.
You could try hgm's package for testing gui.
It could indicate lag problems
http://rybkaforum.net/cgi-bin/rybkaforu ... ?tid=24367
-
jacobbl
- Posts: 80
- Joined: Wed Feb 17, 2010 3:57 pm
Re: Unstable results with little blitzer
After testing some more, the different versions play as expected relative to each other, but they all score about 6% lower than the same version did last week. The difference is especially large for one engine (eeyore), but the other engines don't match last week's results within a reasonable error bar either. I will do some more testing, and I might try cutechess as well. Can you play 10 threads in parallel with cutechess?
Thanks for all help.
Regards
Jacob
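To put a number on "reasonable error bar", here is a rough sketch of the standard error of a match score. It assumes the games are statistically independent (many duplicate games would break this) and takes the draw ratio as an input, since the thread does not report it; the value 0.3 in the usage note is purely hypothetical. The function names are invented for this illustration.

```python
import math

def score_std_error(games, score, draw_ratio):
    """Standard error of the mean score (0..1) over `games` independent games.

    Each game yields 1 (win), 0.5 (draw) or 0 (loss); score = wins + draws/2.
    """
    win_ratio = score - draw_ratio / 2
    # Per-game variance: E[x^2] - E[x]^2, with x^2 = 1 for a win, 0.25 for a draw.
    variance = win_ratio + 0.25 * draw_ratio - score ** 2
    return math.sqrt(variance / games)

def sigmas_apart(score_a, score_b, games, draw_ratio):
    """How many standard errors separate two runs of `games` games each."""
    se_a = score_std_error(games, score_a, draw_ratio)
    se_b = score_std_error(games, score_b, draw_ratio)
    return abs(score_a - score_b) / math.sqrt(se_a ** 2 + se_b ** 2)
```

For example, sigmas_apart(0.444, 0.378, 10000, 0.3) compares the two runs from the first post. Even in the widest case of no draws at all, the standard error of a single 10,000-game run stays below half a percentage point, so under these assumptions a shift of about 6% cannot be explained by sampling noise alone.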
-
Rebel
- Posts: 7514
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: Unstable results with little blitzer
Not sure if this addresses the trouble you are facing, but have you done a reliability test on your testing environment? A good test is to run a self-play match at fixed depth. Its output should be 100% identical: an exact 50% result, identical games, scores and even nodes.
-
Sven
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Unstable results with little blitzer
Unfortunately this won't happen when using an opening book, at least not with the attribute "exact" ...