SPCC: Testrun of Sugar 5.2a finished


pohl4711
Posts: 2432
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

SPCC: Testrun of Sugar 5.2a finished

Post by pohl4711 »

Testrun of Sugar 5.2a finished. Remarkable result. Take a look at my top bullet rating list.


http://spcc.beepworld.de


Endless RoundRobin-tournament updated, too.

(You may have to clear your browser cache or reload the website.)
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: SPCC: Testrun of Sugar 5.2a finished

Post by Lyudmil Tsvetkov »

Latest update: 2015/04/01 (Sugar 5.2a). Sugar played "only" 7000 games (not against Stockfish), because Sugar should play a testrun that is identical to a testrun of a Stockfish dev version (Sugar is Stockfish with some minor changes). Sugar is now +12 Elo stronger than Stockfish 6 (and newer Stockfish development versions are only +2 Elo stronger than Stockfish 6; see my "Stockfish testing" site), so Sugar is now measurably stronger than the latest Stockfish (around +10 Elo). Impressive! So Sugar should now be called a derivative of Stockfish, no longer a simple clone of Stockfish...



Stefan, you really amaze me.

That is what you write about Sugar on your site, quoted above; I bolded the words that are specifically wrong.

Does it really not occur to you that Sugar is simply using a faster compile and the latest SF is not?

Probably large pages too, while SF is not?

So to say 'impressive', 'measurably stronger', 'not a simple clone' is a complete distortion of the truth.

Obviously, the only reason Sugar is measurably stronger in your rating list is the faster compile.

There was a thread on fishcooking, which I am sure you read, where Vince Negri examined the code of Sugar 5.2. Again, Mr. Zerbinati did not add a single change of his own, not even tuning; he has just been automatically taking all the changes SF makes and putting them into Sugar. Automatically, without a single change.

The other trick Mr. Zerbinati used was to apply to Sugar all the patches that failed yellow in SF. Again, not a single change of his own.

So, if you ask me, Sugar's performance is not impressive at all, but rather disgusting; it confuses the computer chess community, and even experienced testers like you. Sugar is a very simple clone of SF, even simpler than most other clones, as those at least tune their values a bit, but here we do not even have tuning of values: just taking all successful SF patches, and that is it.

I really hope the above statements are just an April Fool's joke; otherwise this community is yielding more and more to clones with every single day...
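Whether a gap of around 10 Elo over 7000 games counts as "measurable" depends on the error margin of the result. The following is only a rough sketch of that calculation, assuming a simplified head-to-head model and a normal approximation; the win/draw/loss counts are hypothetical and are not taken from the SPCC list, and the function names are made up for illustration.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert a score fraction (0 < score < 1) into an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high) for a match result.

    Uses a normal approximation of the mean score; z = 1.96 gives
    roughly a 95% interval.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (a game is worth 1, 0.5 or 0 points).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))


# Hypothetical 7000-game result, NOT real data from the rating list.
elo, low, high = elo_with_margin(wins=2500, draws=2200, losses=2300)
print(f"{elo:+.1f} Elo, approx. 95% interval [{low:+.1f}, {high:+.1f}]")
```

A real rating list pools games against many opponents, so its error bars come from the rating tool and will differ from this two-player approximation.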
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: SPCC: Testrun of Sugar 5.2a finished

Post by Lyudmil Tsvetkov »

On the other hand, even Graham Banks, whom I respect very much, has taken Sugar 5.2 into his latest tournament.

What does that mean?
We are completely clone-plagued.

I wonder how Graham could have done that?

I cannot imagine what will happen after that...
Graham Banks
Posts: 41415
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: SPCC: Testrun of Sugar 5.2a finished

Post by Graham Banks »

My 8 core tourneys are non-CCRL tourneys.
That is why I include DON and Sugar. Purely out of interest.
gbanksnz at gmail.com
fenchel
Posts: 36
Joined: Thu Dec 04, 2014 6:01 am

Re: SPCC: Testrun of Sugar 5.2a finished

Post by fenchel »

Lyudmil, there was some discussion of Sugar on fishcooking. It's apparently a couple of rejected patches (and maybe a faster compile; I don't remember). It's possible that it's highly tuned to very short TC (IIRC some of the failed patches passed STC but not LTC). If you're interested, the discussion had Sugar in the topic title, and/or maybe someone else here can comment...
pohl4711
Posts: 2432
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: SPCC: Testrun of Sugar 5.2a finished

Post by pohl4711 »

My tests are done without large pages, and Sugar is (on my machine) only 2.5% faster than the abrok.eu compiles (2-3 Elo?!). So Sugar is definitely a bit stronger than the latest Stockfish in my tests, whether you like it or not and whether you believe it or not.
And if Sugar additionally uses only patches that were kicked out of Stockfish, why not? Perhaps you should instead ask why these patches are good in my tests against other engines, and why Stockfish does not use them...

Stefan
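The "2.5% faster, 2-3 Elo" estimate can be sanity-checked with a common rule of thumb: playing strength grows roughly with the logarithm of search speed. This is a minimal sketch under that assumption; the Elo gain per speed doubling is a guessed parameter (it varies with engine and time control), not a value measured in this thread, and the function name is hypothetical.

```python
import math


def elo_from_speedup(speedup: float, elo_per_doubling: float) -> float:
    """Estimate the Elo gain from a speed ratio.

    speedup          -- e.g. 1.025 for a binary that is 2.5% faster
    elo_per_doubling -- ASSUMED Elo gain when speed doubles
    """
    return elo_per_doubling * math.log2(speedup)


# Try a few assumed values for the gain per doubling.
for assumed in (50.0, 70.0, 100.0):
    gain = elo_from_speedup(1.025, assumed)
    print(f"{assumed:.0f} Elo per doubling -> {gain:.2f} Elo for +2.5% speed")
```

Under this model, the 2-3 Elo range corresponds to assuming somewhere between about 56 and 84 Elo per doubling of speed.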
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: SPCC: Testrun of Sugar 5.2a finished

Post by Lyudmil Tsvetkov »

OK, here is the direct link to the discussion on fishcooking (thanks to Vince Negri for taking a look at the code):
https://groups.google.com/forum/#!topic ... i0kxn0NCog

I think it is better for Mr. Zerbinati himself to explain what he is doing differently in Sugar.
He reads here, so he can certainly explain that.

Another good option is for someone to provide Stefan with a fast compile of the latest SF development version, with the same compiler settings as Sugar 5.2, so that an objective comparison can be drawn.

Stefan's list is the only reference for intermediate engine versions, and people look at it frequently; that is why it is important that the info is fully objective.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: SPCC: Testrun of Sugar 5.2a finished

Post by Lyudmil Tsvetkov »

3% faster certainly makes a difference, doesn't it, Stefan?
But you do not mention that fact on your page...

Again, it is entirely up to you what to test, but, for the sake of fairness, when you write about Sugar, you should write that it uses only ideas generated on the SF framework, and in that case words like 'impressive' are completely inappropriate. What is so impressive about an automatic transfer of code?

Concerning your statement that Sugar is not a full clone: it is more than that, it is a lazy clone, as there are practically not even eval and search tunings.

I say the above only because I appreciate your list.

And if you would like to measure the real 20 Elo progress of SF over the last 2 months, I would strongly recommend that you use a normal book instead of the opposite-castling one and take a compile of the latest SF development version built the same way as the compile for SF 6; then you will certainly measure those 20 Elo...
voyagerOne
Posts: 154
Joined: Tue May 17, 2011 8:12 pm

Re: SPCC: Testrun of Sugar 5.2a finished

Post by voyagerOne »

Step 1. Make coffee
Step 2. Check the latest tests in the SF framework... that pass STC
Step 3. Ctrl+A, Ctrl+C, Ctrl+V
Step 4. Repeat.


There are many tests in the SF framework that pass STC but not LTC.
They are rejected because they would hurt Elo at LTC.

Sugar is strictly tuned to be good at STC.
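The difference between the two acceptance policies described above can be sketched as a simple filter. This is an illustration only: the patch names and results are invented, the class and function names are hypothetical, and whether Sugar actually selects patches this way is the poster's claim, not something this code verifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchTest:
    """Outcome of one patch at short (STC) and long (LTC) time control."""
    name: str
    passed_stc: bool
    passed_ltc: bool


def accepted_two_stage(tests):
    """Policy described for the SF framework: a patch must pass STC and LTC."""
    return [t.name for t in tests if t.passed_stc and t.passed_ltc]


def accepted_stc_only(tests):
    """Policy alleged in the post above: passing STC alone is enough."""
    return [t.name for t in tests if t.passed_stc]


# Invented example data, not real test results.
tests = [
    PatchTest("patch_a", passed_stc=True, passed_ltc=True),
    PatchTest("patch_b", passed_stc=True, passed_ltc=False),
    PatchTest("patch_c", passed_stc=False, passed_ltc=False),
]

print("two-stage:", accepted_two_stage(tests))
print("STC only: ", accepted_stc_only(tests))
```

The patches that appear only in the second list are the ones that would help at bullet speed while possibly hurting at longer time controls.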
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: SPCC: Testrun of Sugar 5.2a finished

Post by Lyudmil Tsvetkov »

:D :)

And step 5. Make coffee again

Possible step 6. Watch how many testers will fall into that trap.