Open Source Blitz Rating List: Pepito 1.59, IvanHoe 999946h

Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Open Source Blitz Rating List: Pepito 1.59, IvanHoe 9999

Post by Adam Hair »

lucasart wrote:
Adam Hair wrote:
lucasart wrote:IvanHoe after 100 games (50 vs Stockfish, 50 vs Protector)

Code: Select all

Rank Name                  Elo    +    - games score oppo. draws 
   1 IvanHoe 999946h      3238   59   56   100   73%  3063   29% 
   2 Stockfish 2.2.1      3189   43   41   250   80%  2917   21% 
   3 Protector 1.4        2937   34   34   300   53%  2923   24% 
   4 Umko 1.2             2871   31   30   350   57%  2822   27% 
   5 Toga 1.4.1           2846   29   29   400   59%  2782   24% 
   6 Daydreamer 1.75      2735   29   29   350   56%  2690   29% 
   7 Fruit 2.1            2700   28   28   400   46%  2725   25% 
   8 Crafty 23.4          2694   29   29   400   36%  2816   24% 
   9 GNU Chess 5.07.173b  2656   30   30   350   44%  2701   25% 
  10 Pepito 1.59          2592   35   35   250   42%  2651   24% 
  11 Greko 9.0            2473   38   40   250   21%  2706   18% 
IvanHoe is a beast, and certainly the strongest open source chess engine in existence. A real shame all well established rating lists don't want to test it...
Who do you think we test engines for (besides for our own benefit)? Authors.
Who is the author of IvanHoe?
oh, is that the argument ?
...
It is for me.

I am sorry that I responded in this thread. It should be focused on your list.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Open Source Blitz Rating List: Pepito 1.59, IvanHoe 9999

Post by Adam Hair »

lucasart wrote:
kranium wrote:
lucasart wrote:
kranium wrote:
Adam Hair wrote: Who do you think we test engines for (besides for our own benefit)? Authors.
Who is the author of IvanHoe?
this ugly statement confirms what i have been saying for a long time...
the CCRL cares only about themselves and program authors?
the CCRL has no interest in presenting fair, all-inclusive, and unbiased information to the many millions of 'ordinary enthusiasts'?

the fact that CCRL accepts complimentary copies of commercial engines, while at the same time excluding a comparably strong/stronger 'free' alternative engine
(IvanHoe) from their rating lists is irresponsible to the community...

the CCRL (CEGT and IPON as well) promote commercial engines (2 of which have been severely tainted with allegations of plagiarism) while simultaneously withholding
info that could/should be made available to 'newbies' (and all users) to aid them in their choice of engine, and purchasing decisions?

any group with such influence and power should be held responsible to the community, and expected to deliver inclusive, fair, and unbiased information.
they should absolutely refuse free copies (just like Consumer Reports, etc.), thereby eliminating any and all possibilities of undue influence, abuse, cronyism, graft, etc.

the CCC community should have none of it, and respond simply by boycotting these misguided rating groups.
you're putting it a bit harshly, but in principle, I agree with you
it's a pretty lame excuse if you ask me, but it's all they have left to justify their actions.

IvanHoe is blacklisted, and Rybka/Houdini are actively being tested?
so, according to the CCRL, CEGT, and IPON:
using a 'pseudonym' (if that's even true?) is worse than serious allegations of plagiarism when it comes to being included or not!?
:shock:
agreed. it's not a reason but an excuse, and the only one they have left, since the truth was revealed about the Rybka/Ippolit story (and all the clone war that follows).

even if the authors of Ippolit (and Ippolit's derivatives) revealed their names, it wouldn't change anything. They would find another excuse, mark my words!
The problem is that none of that speculation explains me.

1) I do not protect Rybka, Houdini, or any other commercial engine. Read my posts on Rybka and Houdini. Nor do I support them with testing. There are plenty of people who do that whether they are a part of a rating agency or not. The only commercial engine that I supported with testing was Komodo 4. Why? Because I am highly appreciative that Don made and shared the similarity tool. So I decided to test Komodo 4, even though it is commercial, out of appreciation (and because I had a computer free at that moment).

2) My interests have always been focused on amateur engines. A simple review of my posts will show that to be true.

3) And, as I have stated several times before, I can choose to test whichever engine I want and then submit those games to the CCRL. And the other members would have to accept those games. So the group is not keeping me from testing IvanHoe.

As I keep saying, I do not test IvanHoe simply because there are no authors willing to put their real names with it. That offends some authors that I respect. So I will not test it. If the real authors come forward, then I will test it. There would be no reason for me not to at that point.

I think that I am open and honest with people in CCC. Do you have some reason to suspect I have ulterior reasons for not testing IvanHoe?
lucasart
Posts: 3241
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Open Source Blitz Rating List: Pepito 1.59, IvanHoe 9999

Post by lucasart »

Fair enough, I understand your reasons. At least you will be ready to test IvanHoe if/when the authors come forward and reveal their names.

But there is still something wrong. Any member of CCRL can test anything they want and they won't be censored ? I'm not really convinced this is true. Or maybe it is true now, but wasn't before... Anyway, isn't it a strange coincidence, that out of all the CCRL testers, no one wants to test IvanHoe ? In statistics we call this a selection bias...

For example, what if *I* wanted to join the CCRL testing team and test IvanHoe ? Would I be accepted ? Or would they refuse me...?
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Open Source Blitz Rating List: Pepito 1.59, IvanHoe 9999

Post by Adam Hair »

kranium wrote:
Adam Hair wrote: Who do you think we test engines for (besides for our own benefit)? Authors.
Who is the author of IvanHoe?
this ugly statement confirms what i have been saying for a long time...
the CCRL cares only about themselves and program authors?
the CCRL has no interest in presenting fair, all-inclusive, and unbiased information to the many millions of 'ordinary enthusiasts'?
Who do we test engines for? Authors.

I do not see that the above statement means solely commercial authors. It means all authors.

The "ordinary enthusiasts" are important also. But they are third on the list. Let's be honest here. Why are we even testing engines? #1 by a wide margin is because we want to. In some ways it is foolish, given the electric bills we have to pay. But we enjoy doing it. #2 we are appreciative of every author who shares his (and hopefully her, at some point) work with everyone. And if an author asks for his engine to be tested, we jump on it. #3 we share the results of our testing with everyone in CCC.

If anyone says #3 is supposed to be #1 in that list, I will tell them to make the effort themselves to create a list. Actually, though I try to be polite to everybody, I would give them a few choice words also.

If anyone says #3 should be #2 on that list, what I would tell them is similar to the above statement. If you are simply interested in the engines and not the authors, then create your own list.
kranium wrote: the fact that CCRL accepts complimentary copies of commercial engines, while at the same time excluding a comparably strong/stronger 'free' alternative engine
(IvanHoe) from their rating lists is irresponsible to the community...

the CCRL (CEGT and IPON as well) promote commercial engines (2 of which have been severely tainted with allegations of plagiarism) while simultaneously withholding
info that could/should be made available to 'newbies' (and all users) to aid them in their choice of engine, and purchasing decisions?

any group with such influence and power should be held responsible to the community, and expected to deliver inclusive, fair, and unbiased information.
they should absolutely refuse free copies (just like Consumer Reports, etc.), thereby eliminating any and all possibilities of undue influence, abuse, cronyism, graft, etc.

the CCC community should have none of it, and respond simply by boycotting these misguided rating groups.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Open Source Blitz Rating List: Pepito 1.59, IvanHoe 9999

Post by Adam Hair »

lucasart wrote:Fair enough, I understand your reasons. At least you will be ready to test IvanHoe if/when the authors come forward and reveal their names.

But there is still something wrong. Any member of CCRL can test anything they want and they won't be censored ? I'm not really convinced this is true. Or maybe it is true now, but wasn't before... Anyway, isn't it a strange coincidence, that out of all the CCRL testers, no one wants to test IvanHoe ? In statistics we call this a selection bias...
Each person has their own reason. There honestly has been no discussion in a while on why each person holds that opinion. I will state that IvanHoe has certain properties that make it different from the general population of engines. So the bias is not necessarily surprising.

To be completely honest, I may be the reason (though more likely it is a reason). Though everyone has the freedom to test as they please, we try to develop a consensus before we make changes. We value gentlemen's agreements. At one point, if IvanHoe was included in the lists, I would have left the group (for the reason I have stated). I am not sure now, after everything I have learned over the past two years, that I would do that.
lucasart wrote: For example, what if *I* wanted to join the CCRL testing team and test IvanHoe ? Would I be accepted ? Or would they refuse me...?
I would have to say that you would be turned down. Though I could be wrong on that. I am not certain how I would vote, and I do not want to speak for anybody else.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Open Source Blitz Rating List: Pepito 1.59, IvanHoe 9999

Post by Adam Hair »

lucasart wrote:Finished testing IvanHoe. So its #1 open source place is confirmed, with a margin of 20 elo

Code: Select all

Rank Name                  Elo    +    - games score oppo. draws 
   1 IvanHoe 999946h      3192   45   43   200   79%  2956   24% 
   2 Stockfish 2.2.1      3172   42   40   250   80%  2907   21% 
   3 Protector 1.4        2930   34   33   300   53%  2913   24% 
   4 Umko 1.2             2878   29   29   400   53%  2865   27% 
   5 Toga 1.4.1           2844   28   28   450   53%  2826   23% 
   6 Daydreamer 1.75      2736   29   29   350   56%  2691   29% 
   7 Fruit 2.1            2700   28   28   400   46%  2725   25% 
   8 Crafty 23.4          2693   29   29   400   36%  2814   24% 
   9 GNU Chess 5.07.173b  2656   30   30   350   44%  2702   25% 
  10 Pepito 1.59          2592   35   35   250   42%  2652   24% 
  11 Greko 9.0            2473   38   40   250   21%  2706   18% 
Lucas,

I am sincerely not trying to be a critic, but I have to ask something. As a statistician, how can you make this statement, based on your testing?

Come on now, you are supposed to be chiding the rest of us if we make statements like that :lol:

Just joking around,

Adam
lucasart
Posts: 3241
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Open Source Blitz Rating List: Pepito 1.59, IvanHoe 9999

Post by lucasart »

oh you mean the confidence interval that says it's not 100% clear ivanhoe is number one ?
let me get the LOS matrix for you

Code: Select all

                     Cr Iv St
Critter 1.4             98 99
IvanHoe 999946h       1    76
Stockfish 2.2.1       0 23   
yes... there is a 76% probability ivanhoe is stronger than stockfish, which still means a 24% probability it isn't... good point. As the number of games increases (I only have 1 computer) these elo intervals will hopefully shrink enough to distinguish the two.
but you're right!
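
For a rough sanity check of that figure, the LOS can be approximated from the rating table alone. The sketch below is only an illustration under stated assumptions: it treats the "+" and "-" columns as roughly 95% bounds, treats the two ratings as independent normal estimates, and uses the hypothetical helper name approx_los. The rating program's own LOS is computed from the game results directly, so the two numbers need not match exactly.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approx_los(elo_a, plus_a, minus_a, elo_b, plus_b, minus_b, z95=1.96):
    """Approximate probability that engine A is stronger than engine B.

    Assumption (not from the thread): the +/- columns are ~95% bounds,
    so the average half-width divided by 1.96 is about one standard
    deviation, and the two rating estimates are independent.
    """
    sigma_a = (plus_a + minus_a) / 2.0 / z95
    sigma_b = (plus_b + minus_b) / 2.0 / z95
    # Standard deviation of the rating difference under independence.
    sigma_diff = math.hypot(sigma_a, sigma_b)
    return normal_cdf((elo_a - elo_b) / sigma_diff)


# IvanHoe 999946h vs Stockfish 2.2.1, numbers taken from the 200-game table.
los = approx_los(3192, 45, 43, 3172, 42, 40)
print(f"approx. LOS IvanHoe > Stockfish: {los:.0%}")
```

With the table's numbers this comes out in the same ballpark as the 76% in the LOS matrix: a 20 elo gap is well inside error bars of about 40 elo on each side, which is why the ranking is not settled yet.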
kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: Open Source Blitz Rating List: Pepito 1.59, IvanHoe 9999

Post by kranium »

Adam Hair wrote: Let's be honest here. Why are we even testing engines? #1 by a wide margin is because we want to. In some ways it is foolish, given the electric bills we have to pay. But we enjoy doing it. #2 we are appreciative of every author who shares his (and hopefully her, at some point) work with everyone.
sorry i don't believe your motives are pure as the driven snow...
you have huge electric bills?
:lol:

sorry i also run several computers day and night....also w/ extra costs.
(and electricity costs quite a bit more here in Europe than the US, take it from somebody who has spent many years on both sides of the Atlantic).

you're no martyr Adam...
your testing efforts also involve significant recognition and prestige, (not to mention free copies and a close relationship with authors like Vas, etc.)
Last edited by kranium on Sat Jan 14, 2012 8:28 pm, edited 1 time in total.
kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: Open Source Blitz Rating List: Pepito 1.59, IvanHoe 9999

Post by kranium »

Adam Hair wrote:
lucasart wrote:Finished testing IvanHoe. So its #1 open source place is confirmed, with a margin of 20 elo

Code: Select all

Rank Name                  Elo    +    - games score oppo. draws 
   1 IvanHoe 999946h      3192   45   43   200   79%  2956   24% 
   2 Stockfish 2.2.1      3172   42   40   250   80%  2907   21% 
   3 Protector 1.4        2930   34   33   300   53%  2913   24% 
   4 Umko 1.2             2878   29   29   400   53%  2865   27% 
   5 Toga 1.4.1           2844   28   28   450   53%  2826   23% 
   6 Daydreamer 1.75      2736   29   29   350   56%  2691   29% 
   7 Fruit 2.1            2700   28   28   400   46%  2725   25% 
   8 Crafty 23.4          2693   29   29   400   36%  2814   24% 
   9 GNU Chess 5.07.173b  2656   30   30   350   44%  2702   25% 
  10 Pepito 1.59          2592   35   35   250   42%  2652   24% 
  11 Greko 9.0            2473   38   40   250   21%  2706   18% 
Lucas,

I am sincerely not trying to be a critic, but I have to ask something. As a statistician, how can you make this statement, based on your testing?

Come on now, you are supposed to be chiding the rest of us if we make statements like that :lol:

Just joking around,

Adam

this appears to simply be an effort to deflect criticism to Lucas?

that's sad...
unfortunately, you CCRL testers make much more exaggerated claims:
geots wrote: Only problem is remember KLO's compiles are approx. 30 to 70 elo weaker than PPs in Windows,
:shock:

no, sorry CCRL...
i'm quite sure that Frank Q. ran 1400 games comparing the two compiles and found them extremely close...
how many games did George run?
Last edited by kranium on Sat Jan 14, 2012 8:43 pm, edited 1 time in total.
Graham Banks
Posts: 44611
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: Open Source Blitz Rating List: Pepito 1.59, IvanHoe 9999

Post by Graham Banks »

kranium wrote:unfortunately, you CCRL testers make much more exaggerated claims:
geots wrote: Only problem is remember KLO's compiles are approx. 30 to 70 elo weaker than PPs in Windows,
sorry, CCRL...
i'm quite sure that Frank Q. ran 1400 games comparing the two compiles and found them extremely close...
how many games did George run?
You mean "sorry George" because it's his individual opinion. Don't accuse others of making exaggerated claims when you do so yourself. :wink:
gbanksnz at gmail.com