A match between SF12+NNUE and Leele ver.0.26.2

mwyoung · Post by **mwyoung** » Tue Sep 22, 2020 10:55 pm

OliverBr wrote: ↑Tue Sep 22, 2020 10:49 pm
mwyoung wrote: ↑Tue Sep 22, 2020 10:41 pm Again your stupidity shows.
What, again, is your engine? Could you please post the git link?
Thank you very much.

And what rank is your engine? They are a dime a dozen. I am a tester, and we are talking about testing "LOS". And here you are clueless.

"Why are you playing 200 games? I thought 10 games are more than enough..."

OliverBr · Post by **OliverBr** » Tue Sep 22, 2020 11:24 pm

mwyoung wrote: ↑Tue Sep 22, 2020 10:55 pm
OliverBr wrote: ↑Tue Sep 22, 2020 10:49 pm
mwyoung wrote: ↑Tue Sep 22, 2020 10:41 pm Again your stupidity shows.
What, again, is your engine? Could you please post the git link?
Thank you very much.
And what rank is your engine? They are a dime a dozen. I am a tester, and we are talking about testing "LOS". And here you are clueless.

You are so impolite and uneducated that you answer a question with a question.
Anyway, my engine is rated infinitely higher than yours, because you have none.

mwyoung · Post by **mwyoung** » Tue Sep 22, 2020 11:37 pm

OliverBr wrote: ↑Tue Sep 22, 2020 11:24 pm
mwyoung wrote: ↑Tue Sep 22, 2020 10:55 pm
OliverBr wrote: ↑Tue Sep 22, 2020 10:49 pm
mwyoung wrote: ↑Tue Sep 22, 2020 10:41 pm Again your stupidity shows.
What, again, is your engine? Could you please post the git link?
Thank you very much.
And what rank is your engine? They are a dime a dozen. I am a tester, and we are talking about testing "LOS". And here you are clueless.
You are so impolite and uneducated that you answer a question with a question.
Anyway, my engine is rated infinitely higher than yours, because you have none.

From my post "That is just false. It can be done with 10 to 20 games. What matters is the Elo difference between A and B. For example, Engine A scores 10 wins in 10 games."

From your post.

"Sorry, I have to correct this statement, because it is absolutely wrong. I have seen test series where an engine led after over 1000 games and still was finally beaten."

"For sure, 20 games are not enough. Statistically alone this makes no sense."

"Why are you playing 200 games? I thought 10 games are more than enough..."

"You are so impolite and uneducated that you answer a question with a question."

"What, again, is your engine? Could you please post the git link?
Thank you very much."

You need to stop projecting what you are doing here.

At 14 games LOS for Stockfish 98.7% and Ethereal 1.3%

The match continues until we know with 100% confidence.....

Live Stream:

mwyoung · Post by **mwyoung** » Wed Sep 23, 2020 12:31 am

mwyoung wrote: ↑Tue Sep 22, 2020 11:37 pm
OliverBr wrote: ↑Tue Sep 22, 2020 11:24 pm
mwyoung wrote: ↑Tue Sep 22, 2020 10:55 pm
OliverBr wrote: ↑Tue Sep 22, 2020 10:49 pm
mwyoung wrote: ↑Tue Sep 22, 2020 10:41 pm Again your stupidity shows.
What, again, is your engine? Could you please post the git link?
Thank you very much.
And what rank is your engine? They are a dime a dozen. I am a tester, and we are talking about testing "LOS". And here you are clueless.
You are so impolite and uneducated that you answer a question with a question.
Anyway, my engine is rated infinitely higher than yours, because you have none.
From my post "That is just false. It can be done with 10 to 20 games. What matters is the Elo difference between A and B. For example, Engine A scores 10 wins in 10 games."

From your post.

"Sorry, I have to correct this statement, because it is absolutely wrong. I have seen test series where an engine led after over 1000 games and still was finally beaten."

"For sure, 20 games are not enough. Statistically alone this makes no sense."

"Why are you playing 200 games? I thought 10 games are more than enough..."

"You are so impolite and uneducated that you answer a question with a question."

"What, again, is your engine? Could you please post the git link?
Thank you very much."

You need to stop projecting what you are doing here.

At 14 games LOS for Stockfish 98.7% and Ethereal 1.3%

The match continues until we know with 100% confidence.....

Live Stream:

At 20 games, Stockfish's LOS is 99.9%.
At 20 games, Ethereal's LOS is 0.1%.

Result:
-------------------------------------------------------------------------------------
  #  name                     games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 210920            20       9      11       0    14.5    99.9   168.4
  2. Ethereal 12.50 (POPCNT)     20       0      11       9     5.5     0.1  -168.4

Cross table:
-------------------------------------------------------------------------------------
  #  name                        score   games                    1                    2
  1. Stockfish 210920             14.5      20                    x ===1==11=11===1=11=1
  2. Ethereal 12.50 (POPCNT)       5.5      20 ===0==00=00===0=00=0                    x

Tech:
-------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                       nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 210920           124740K    29553803     39.0      4.2     67.2    283.6
  2. Ethereal 12.50 (POPCNT)    172819K    38734749     29.8      4.5     67.3    300.3
     all ---                    145309K    34275015     34.4      4.3     67.3    292.0

Match continues until we know for sure! 100% confidence.....
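
For anyone who wants to check the los% column by hand: the posted figures are consistent with the usual normal-approximation LOS formula, which ignores draws and uses only wins and losses. Below is a rough Python sketch of that formula. Which formula the match tool actually uses is not stated in this thread, so treat that as an assumption; the function names are made up for illustration.

```python
import math

def count_results(row):
    """Count (wins, draws, losses) in a cross-table row such as '===1==11'."""
    return row.count("1"), row.count("="), row.count("0")

def los(wins, losses):
    """Likelihood of superiority, normal approximation; draws are ignored."""
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games, so no evidence either way
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))

# Stockfish's row from the 20-game cross table above.
row = "===1==11=11===1=11=1"

for n in (14, 20):
    w, d, l = count_results(row[:n])
    print(f"{n} games: +{w} ={d} -{l}  LOS {100 * los(w, l):.1f}%")
```

Under that assumption this gives 98.7% after 14 games and 99.9% after 20 games, the same as the numbers posted above.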

OliverBr · Post by **OliverBr** » Wed Sep 23, 2020 12:42 am

mwyoung wrote: ↑Wed Sep 23, 2020 12:31 am Match continues until we know for sure! 100% confidence.....

And what rank is your engine?

mwyoung · Post by **mwyoung** » Wed Sep 23, 2020 12:50 am

OliverBr wrote: ↑Wed Sep 23, 2020 12:42 am
mwyoung wrote: ↑Wed Sep 23, 2020 12:31 am Match continues until we know for sure! 100% confidence.....
And what rank is your engine?

Asked before and answered.

I am a tester, and we are talking about testing "LOS". And here you are clueless.

Your quote "You are so impolite and uneducated that you answer a question with a question."

And again you are projecting. And keep trying to change the subject. Maybe this is because you have no clue what you are talking about with LOS testing....

mwyoung · Post by **mwyoung** » Wed Sep 23, 2020 1:57 am

mwyoung wrote: ↑Wed Sep 23, 2020 12:50 am
OliverBr wrote: ↑Wed Sep 23, 2020 12:42 am
mwyoung wrote: ↑Wed Sep 23, 2020 12:31 am Match continues until we know for sure! 100% confidence.....
And what rank is your engine?
Asked before and answered.

I am a tester, and we are talking about testing "LOS". And here you are clueless.

Your quote "You are so impolite and uneducated that you answer a question with a question."

And again you are projecting. And keep trying to change the subject. Maybe this is because you have no clue what you are talking about with LOS testing....

At 29 games we know with 100% LOS certainty that Stockfish 210920 is better than Ethereal 12.50.

Result:
-------------------------------------------------------------------------------------
  #  name                     games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 210920            29      11      18       0    20.0   100.0   138.7
  2. Ethereal 12.50 (POPCNT)     29       0      18      11     9.0     0.0  -138.7

Cross table:
-------------------------------------------------------------------------------------
  #  name                        score   games                             1                             2
  1. Stockfish 210920             20.0      29                             x ===1==11=11===1=11=11=======1
  2. Ethereal 12.50 (POPCNT)       9.0      29 ===0==00=00===0=00=00=======0                             x

Tech:
-------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                       nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 210920           131546K    29712474     39.5      4.4     62.2    275.4
  2. Ethereal 12.50 (POPCNT)    180443K    39015862     30.3      4.6     62.4    288.7
     all ---                    152378K    34473412     34.9      4.5     62.3    282.0

The match will continue for the whole 200 games....
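
A side note on the elo+/- column in these tables: it looks like the Elo difference implied by the score percentage under the standard logistic Elo model, not an error margin. Here is a minimal Python sketch, assuming that is how the column is derived (the thread does not say); `elo_from_score` is a hypothetical helper name.

```python
import math

def elo_from_score(points, games):
    """Elo difference implied by a match score under the logistic Elo model."""
    p = points / games
    if not 0.0 < p < 1.0:
        raise ValueError("a 0% or 100% score has no finite Elo difference")
    return -400.0 * math.log10(1.0 / p - 1.0)

print(f"{elo_from_score(14.5, 20):+.1f}")  # Stockfish after 20 games
print(f"{elo_from_score(20.0, 29):+.1f}")  # Stockfish after 29 games
print(f"{elo_from_score(9.0, 29):+.1f}")   # Ethereal after 29 games
```

Under that assumption the values come out at about +168.4, +138.7 and -138.7, which agrees with the posted column.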

Live stream:

AndrewGrant · Post by **AndrewGrant** » Wed Sep 23, 2020 9:04 pm

Only 8 games, not the highly coveted 10, but it looks like I can say with 100% confidence that Ethereal > Houdini

Thanks! I never knew testing was so easy.

AndrewGrant · Post by **AndrewGrant** » Wed Sep 23, 2020 9:07 pm

Alayan wrote: ↑Tue Sep 22, 2020 8:13 pm Ethereal 12.50 vs Stockfish 12 CCRL FRC testing has this chunk of games right at the very end of the 300 games they played :
1 1 0 = 0 0 = 1 0 1 1 1
+6-4=2 for Ethereal. +4-1 in the last 5 games.

Overall results .....

Stop typing Alayan! We've got the 10 (12!) games we need to determine which engine is better. No need to post more data, I can extrapolate from there.

If anyone would like to download Ethereal 12.50, the world's best FRC engine, feel free to use promocode FIRE7
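
To put a number on how little a 12-game chunk says, here is a rough 95% interval for the Elo difference from +6-4=2, using a normal approximation on the per-game score. This is only a sketch: the function names and the z-value of 1.96 are assumptions of mine, and with this few games the approximation itself is crude.

```python
import math

def elo(p):
    """Elo difference for an expected score p, logistic Elo model."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def clamp(p, eps=1e-9):
    """Keep a score strictly inside (0, 1) so elo() stays finite."""
    return min(max(p, eps), 1.0 - eps)

def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, estimate, high) Elo from a W/D/L record."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the score (1, 0.5 or 0 points per game).
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    half_width = z * math.sqrt(var / n)
    return (elo(clamp(mean - half_width)),
            elo(clamp(mean)),
            elo(clamp(mean + half_width)))

low, estimate, high = elo_interval(wins=6, draws=2, losses=4)
print(f"Elo {estimate:+.0f}, rough 95% interval [{low:+.0f}, {high:+.0f}]")
```

For +6-4=2 the interval runs from clearly negative to clearly positive, so that chunk alone does not separate the two engines.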

mwyoung · Post by **mwyoung** » Wed Sep 23, 2020 11:12 pm

AndrewGrant wrote: ↑Wed Sep 23, 2020 9:04 pm Only 8 games, not the highly coveted 10, but it looks like I can say with 100% confidence that Ethereal > Houdini

Thanks! I never knew testing was so easy.

That is because you are not very bright. I will try to make this understandable for you. Did you see 10 wins in the first 10 games in your test, like the example I posted from the LOS chart? I guess you do not understand how to read a LOS chart, even though I posted the LOS chart to help you understand, broadcast the games live, and then posted the cross table.

Stockfish can play an even weaker engine than yours, if you would like to see a live example of 10 wins in the first 10 games.

Let us keep this going. I love engine testing. I will set up a new match.
