GenoM wrote:Perhaps you never will...but let me try.mcostalba wrote:Marco, you sorry? Why? I can't understand... You're sharing all your ideas (if not code) with all commercial authors, right? That's more than enough for not feeling any remorse for that what you're doing.Houdini wrote:
Just my 2 cents.
Let's say 'A' makes an income from developing and selling commercial chess programs. He works hard to do this and so has a right to profit from his efforts. (Not to mention that a large percentage of his profits is eaten into by pirates.)
Now enter Mr 'B', the philanthropist. He develops and publishes free chess software simply out of love for the game of chess, which is indeed commendable. Now the unforeseen happens! The free program becomes stronger than the commercial program. People stop buying A's commercial program... a big loss for A!
Now, if B were actually selling his program, he would indeed have nothing to apologize for, following the principle of 'All is fair in love and war' or 'Survival of the fittest'!
But since B is gaining no material benefit from his superior program and is causing a loss to 'A' to boot, for no good financial reason, he feels morally obliged to apologize to 'A'.
THIS is what Marco feels, and that is why he is apologizing to Robert Houdart!
18 days from SF4 release and about ~30+ ELO gain!
Moderator: Ras
-
- Posts: 1339
- Joined: Fri Nov 02, 2012 9:43 am
- Location: New Delhi, India
Re: 18 days from SF4 release and about ~30+ ELO gain!
i7 5960X @ 4.1 Ghz, 64 GB G.Skill RipJaws RAM, Twin Asus ROG Strix OC 11 GB Geforce 2080 Tis
-
- Posts: 4669
- Joined: Sun Mar 12, 2006 2:40 am
- Full name: Eelco de Groot
Re: 18 days from SF4 release and about ~30+ ELO gain!
Masta wrote:Yeah...seems that SF will run over other engines like a damn TRUCK!
18 days from release date of SF4 and almost +30 ELO gain. -> http://95.47.140.100/tests/view/522bcb1 ... 2ee68dc04a
Have a nice day yo false magicians. Your days are counted..
lkaufman wrote:Since I found this hard to believe, I ran a similar test myself (SF Sept. 8 vs SF4). While the details differ slightly (book, exact time limit, hardware) the test was quite similar. My result showed a gain of just 11.5 elo. The difference is too large to attribute to sample error. Any other theories?
gladius wrote:What were your testing conditions (time control, threads, # of games)? I'm assuming it's 11.5 elo +- some error bar
SF4 release version has a few changes that can influence self tests, the TT is not cleared between games, and Idle threads sleep is set to false, but that only affects matches with threads > 1. For this reason, our regression tests are performed against the non-release version.
Otherwise, I'm not really sure to be honest.

The "30 Elo" is probably a bit too much, because the latest regression test shows about 26 Elo, although some new patches are in this test. The time control is maybe different, but that should not have such a big impact on Larry's test. The 8moves_GM.pgn from Adam Hair is public; Adam has given a link recently in this forum again, and the experiences with it have been good, at least better than with variety.bin as a testing book... In case Larry uses an actual book instead of fixed openings, not clearing the TT between games would introduce some noise, but I believe that is not done in the non-release Stockfish 4? If there is still this big discrepancy that Larry measures after a switch to the non-release SF 4 and using the same openings, it must be something in the testing conditions that differs too much between the framework and Larry's tests?
Eelco
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
-
- Posts: 10885
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: 18 days from SF4 release and about ~30+ ELO gain!
Masta wrote:Yeah...seems that SF will run over other engines like a damn TRUCK!
18 days from release date of SF4 and almost +30 ELO gain. -> http://95.47.140.100/tests/view/522bcb1 ... 2ee68dc04a
Have a nice day yo false magicians. Your days are counted..
lkaufman wrote:Since I found this hard to believe, I ran a similar test myself (SF Sept. 8 vs SF4). While the details differ slightly (book, exact time limit, hardware) the test was quite similar. My result showed a gain of just 11.5 elo. The difference is too large to attribute to sample error. Any other theories?
gladius wrote:What were your testing conditions (time control, threads, # of games)? I'm assuming it's 11.5 elo +- some error bar
SF4 release version has a few changes that can influence self tests, the TT is not cleared between games, and Idle threads sleep is set to false, but that only affects matches with threads > 1. For this reason, our regression tests are performed against the non-release version.
Otherwise, I'm not really sure to be honest.
Eelco de Groot wrote:The "30 Elo" is probably a bit too much, because the latest regression test shows about 26 Elo, although some new patches are in this test. The time control is maybe different, but that should not have such a big impact on Larry's test. The 8moves_GM.pgn from Adam Hair is public; Adam has given a link recently in this forum again, and the experiences with it have been good, at least better than with variety.bin as a testing book... In case Larry uses an actual book instead of fixed openings, not clearing the TT between games would introduce some noise, but I believe that is not done in the non-release Stockfish 4? If there is still this big discrepancy that Larry measures after a switch to the non-release SF 4 and using the same openings, it must be something in the testing conditions that differs too much between the framework and Larry's tests?
Eelco

The latest regression test includes some patches that are probably a regression, so let's wait for the next regression test.
-
- Posts: 6258
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: 18 days from SF4 release and about ~30+ ELO gain!
Masta wrote:Yeah...seems that SF will run over other engines like a damn TRUCK!
18 days from release date of SF4 and almost +30 ELO gain. -> http://95.47.140.100/tests/view/522bcb1 ... 2ee68dc04a
Have a nice day yo false magicians. Your days are counted..
lkaufman wrote:Since I found this hard to believe, I ran a similar test myself (SF Sept. 8 vs SF4). While the details differ slightly (book, exact time limit, hardware) the test was quite similar. My result showed a gain of just 11.5 elo. The difference is too large to attribute to sample error. Any other theories?
gladius wrote:What were your testing conditions (time control, threads, # of games)? I'm assuming it's 11.5 elo +- some error bar
SF4 release version has a few changes that can influence self tests, the TT is not cleared between games, and Idle threads sleep is set to false, but that only affects matches with threads > 1. For this reason, our regression tests are performed against the non-release version.
Otherwise, I'm not really sure to be honest.

Time limit was 2' + 1.2" for about half the games and 30" + .3" on the other half (on faster hardware), so on average about like yours. Number of games was something like 6 or 7 thousand (I forget exact number and I don't have it handy right now), so error bar was somewhere around 4 elo I think. Does clearing TT make a measurable difference in these direct matches? Any other settings or factors that could explain the discrepancy? I used default settings for both versions.
-
- Posts: 6258
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: 18 days from SF4 release and about ~30+ ELO gain!
Eelco de Groot wrote:The "30 Elo" is probably a bit too much, because the latest regression test shows about 26 Elo, although some new patches are in this test. The time control is maybe different, but that should not have such a big impact on Larry's test. The 8moves_GM.pgn from Adam Hair is public; Adam has given a link recently in this forum again, and the experiences with it have been good, at least better than with variety.bin as a testing book... In case Larry uses an actual book instead of fixed openings, not clearing the TT between games would introduce some noise, but I believe that is not done in the non-release Stockfish 4? If there is still this big discrepancy that Larry measures after a switch to the non-release SF 4 and using the same openings, it must be something in the testing conditions that differs too much between the framework and Larry's tests?
Eelco
Uri Blass wrote:The latest regression test includes some patches that are probably a regression, so let's wait for the next regression test.

How many positions are in the Adam Hair book you mention? Since the test shows 20,000 games, it should have at least 10,000 positions to avoid possible duplicated games; is it that big? I use our own set of over 35,000 opening positions, enough for 70k games. If the book used in the regression test was much smaller than 10k, this might mean that the true error margin was much larger than the reported one.
-
- Posts: 2122
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: 19 days from SF 4 release and about ~30 Elo gain!
Hello Larry:
lkaufman wrote:Number of games was something like 6 or 7 thousand (I forget exact number and I don't have it handy right now), so error bar was somewhere around 4 elo I think.

Sure? I think that this error bar of circa ± 4 Elo for 6000 or 7000 games corresponds to a one-sigma confidence level, that is, a ~ 68.27% confidence level. Since we are accustomed to a 95% confidence level (~ 1.96 sigma), and an Elo gap of 11.5 Elo translates into a score of 51.7%-48.3% (near 50%-50%), the error bars for 95% confidence are (to a first approximation) 1.96*(± 4), that is, around ± 8 Elo (from ± 7 to ± 9, because the original ± 4 could be ± 3.6 or ± 4.4 Elo). Please confirm my thought. Thanks in advance.
Regards from Spain.
Ajedrecista.
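The arithmetic in this post can be checked with a short Python sketch. It assumes the standard logistic Elo model and a normal approximation (1.96 sigma for 95%); whether Larry's ± 4 really was a one-sigma figure is Ajedrecista's assumption, not something the thread confirms. The function names are made up for illustration.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def scale_error_bar(one_sigma: float, z: float = 1.96) -> float:
    """Scale a one-sigma error bar to z sigma (1.96 is roughly 95%)."""
    return z * one_sigma


# An 11.5 Elo gap is a score of roughly 51.7% vs 48.3%.
print(f"score for +11.5 Elo: {expected_score(11.5):.1%}")

# The three candidate one-sigma values mentioned in the post.
for one_sigma in (3.6, 4.0, 4.4):
    print(f"± {one_sigma} (1 sigma) -> ± {scale_error_bar(one_sigma):.1f} (95%)")
```

Under these assumptions the 95% bars come out between roughly ± 7 and ± 9 Elo, as stated in the post.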
-
- Posts: 568
- Joined: Tue Dec 12, 2006 10:10 am
- Full name: Gary Linscott
Re: 18 days from SF4 release and about ~30+ ELO gain!
Masta wrote:Yeah...seems that SF will run over other engines like a damn TRUCK!
18 days from release date of SF4 and almost +30 ELO gain. -> http://95.47.140.100/tests/view/522bcb1 ... 2ee68dc04a
Have a nice day yo false magicians. Your days are counted..
lkaufman wrote:Since I found this hard to believe, I ran a similar test myself (SF Sept. 8 vs SF4). While the details differ slightly (book, exact time limit, hardware) the test was quite similar. My result showed a gain of just 11.5 elo. The difference is too large to attribute to sample error. Any other theories?
gladius wrote:What were your testing conditions (time control, threads, # of games)? I'm assuming it's 11.5 elo +- some error bar
SF4 release version has a few changes that can influence self tests, the TT is not cleared between games, and Idle threads sleep is set to false, but that only affects matches with threads > 1. For this reason, our regression tests are performed against the non-release version.
Otherwise, I'm not really sure to be honest.
lkaufman wrote:Time limit was 2' + 1.2" for about half the games and 30" + .3" on the other half (on faster hardware), so on average about like yours. Number of games was something like 6 or 7 thousand (I forget exact number and I don't have it handy right now), so error bar was somewhere around 4 elo I think. Does clearing TT make a measurable difference in these direct matches? Any other settings or factors that could explain the discrepancy? I used default settings for both versions.

With 7000 games the 95% error bar is 8 Elo or so, so it's entirely possible this was just an unlucky run.
The PGN has 48,491 games, so we should be okay there.
-
- Posts: 10885
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: 18 days from SF4 release and about ~30+ ELO gain!
Masta wrote:Yeah...seems that SF will run over other engines like a damn TRUCK!
18 days from release date of SF4 and almost +30 ELO gain. -> http://95.47.140.100/tests/view/522bcb1 ... 2ee68dc04a
Have a nice day yo false magicians. Your days are counted..
lkaufman wrote:Since I found this hard to believe, I ran a similar test myself (SF Sept. 8 vs SF4). While the details differ slightly (book, exact time limit, hardware) the test was quite similar. My result showed a gain of just 11.5 elo. The difference is too large to attribute to sample error. Any other theories?
gladius wrote:What were your testing conditions (time control, threads, # of games)? I'm assuming it's 11.5 elo +- some error bar
SF4 release version has a few changes that can influence self tests, the TT is not cleared between games, and Idle threads sleep is set to false, but that only affects matches with threads > 1. For this reason, our regression tests are performed against the non-release version.
Otherwise, I'm not really sure to be honest.
lkaufman wrote:Time limit was 2' + 1.2" for about half the games and 30" + .3" on the other half (on faster hardware), so on average about like yours. Number of games was something like 6 or 7 thousand (I forget exact number and I don't have it handy right now), so error bar was somewhere around 4 elo I think. Does clearing TT make a measurable difference in these direct matches? Any other settings or factors that could explain the discrepancy? I used default settings for both versions.
gladius wrote:With 7000 games the 95% error bar is 8 Elo or so, so it's entirely possible this was just an unlucky run.
The PGN has 48,491 games, so we should be okay there.

I do not see how you get an error bar of 8 Elo for 7000 games; I think that it is 4-5 Elo.
You have an error bar of 2.8 Elo after 20,000 games;
see for example the regression test of the latest Stockfish:
http://tests.stockfishchess.org/tests/v ... 63f25cba49
So you should have 2.8*sqrt(20,000/7000) after 7000 games, which is between 4 and 5 Elo.
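Uri's scaling argument rests on the error bar shrinking with the square root of the number of games. A minimal Python sketch follows, assuming both tests have the same per-game variance (similar draw rate and score); the helper name is hypothetical.

```python
import math


def rescale_error_bar(error_bar: float, games_ref: int, games_new: int) -> float:
    """Rescale an error bar measured over games_ref games to games_new games.

    Assumes the per-game variance is the same in both samples, so the
    error bar is proportional to 1 / sqrt(number of games).
    """
    return error_bar * math.sqrt(games_ref / games_new)


# Reference: ± 2.8 Elo (95%) after 20,000 games on the test framework.
for games in (7000, 6000):
    print(games, "games ->", round(rescale_error_bar(2.8, 20000, games), 2), "Elo")
```

This prints about 4.73 Elo for 7000 games and about 5.11 Elo for 6000 games.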
-
- Posts: 10885
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: 19 days from SF 4 release and about ~30 Elo gain!
lkaufman wrote:Number of games was something like 6 or 7 thousand (I forget exact number and I don't have it handy right now), so error bar was somewhere around 4 elo I think.
Ajedrecista wrote:Hello Larry:
Sure? I think that this error bar of circa ± 4 Elo for 6000 or 7000 games corresponds to a one-sigma confidence level, that is, a ~ 68.27% confidence level. Since we are accustomed to a 95% confidence level (~ 1.96 sigma), and an Elo gap of 11.5 Elo translates into a score of 51.7%-48.3% (near 50%-50%), the error bars for 95% confidence are (to a first approximation) 1.96*(± 4), that is, around ± 8 Elo (from ± 7 to ± 9, because the original ± 4 could be ± 3.6 or ± 4.4 Elo). Please confirm my thought. Thanks in advance.
Regards from Spain.
Ajedrecista.

After 20,000 games the error bar is 2.8 Elo with 95% confidence;
see http://tests.stockfishchess.org/tests/v ... 63f25cba49
It means that in the worst case of 6000 games the error bar is
2.8*sqrt(20,000/6000), which is near 5 Elo.
-
- Posts: 568
- Joined: Tue Dec 12, 2006 10:10 am
- Full name: Gary Linscott
Re: 19 days from SF 4 release and about ~30 Elo gain!
lkaufman wrote:Number of games was something like 6 or 7 thousand (I forget exact number and I don't have it handy right now), so error bar was somewhere around 4 elo I think.
Ajedrecista wrote:Hello Larry:
Sure? I think that this error bar of circa ± 4 Elo for 6000 or 7000 games corresponds to a one-sigma confidence level, that is, a ~ 68.27% confidence level. Since we are accustomed to a 95% confidence level (~ 1.96 sigma), and an Elo gap of 11.5 Elo translates into a score of 51.7%-48.3% (near 50%-50%), the error bars for 95% confidence are (to a first approximation) 1.96*(± 4), that is, around ± 8 Elo (from ± 7 to ± 9, because the original ± 4 could be ± 3.6 or ± 4.4 Elo). Please confirm my thought. Thanks in advance.
Regards from Spain.
Ajedrecista.
Uri Blass wrote:After 20,000 games the error bar is 2.8 Elo with 95% confidence;
see http://tests.stockfishchess.org/tests/v ... 63f25cba49
It means that in the worst case of 6000 games the error bar is
2.8*sqrt(20,000/6000), which is near 5 Elo.

I used my rating calculator http://forwardcoding.com/projects/ajaxchess/rating.html, assuming a 60% draw rate and a 10 Elo advantage. It gives this:
ELO: 9.93 +- 8.15
LOS: 99.99%
Wins: 1600 Losses: 1400 Draws: 4000
However, I just tried it with fishtest's stat_util.py https://github.com/glinscott/fishtest/b ... at_util.py and it gives
ELO: 11.56 +- 6.2
LOS: 99.99%
I would tend to trust stat_util.py more, but I'm honestly not sure.
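For comparison, here is a minimal Python sketch that computes the Elo estimate and a 95% error bar straight from the win/loss/draw counts above. It is not the code of the rating calculator or of stat_util.py; it assumes the logistic Elo model and a normal approximation, and the function names and the `count_draws` switch are made up for illustration. With `count_draws=False` every game is treated as a win/loss coin flip; with `count_draws=True` the draws are counted as half points, which lowers the per-game variance.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_error(wins, losses, draws, z=1.96, count_draws=True):
    """Return (elo, half_width) where half_width is the z-sigma error bar in Elo."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    if count_draws:
        # Per-game variance of a result taking values 1, 0.5 or 0.
        variance = (wins + 0.25 * draws) / n - score ** 2
    else:
        # Binomial variance: ignores that draws carry far less variance.
        variance = score * (1.0 - score)
    margin = z * math.sqrt(variance / n)
    low = elo_from_score(score - margin)
    high = elo_from_score(score + margin)
    return elo_from_score(score), (high - low) / 2.0


for flag in (False, True):
    elo, bar = elo_with_error(1600, 1400, 4000, count_draws=flag)
    print(f"count_draws={flag}: ELO: {elo:.2f} +- {bar:.2f}")
```

Under these assumptions both variants give about 9.93 Elo. The coin-flip variant gives an error bar of about ± 8.1, close to the calculator's ± 8.15, while counting draws gives about ± 5.3, in the 4-5 Elo range Uri estimates. That suggests, but does not prove, that the larger figure comes from ignoring draws; the sketch makes no attempt to reproduce the ± 6.2 reported by stat_util.py.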