Gambit Rating List halted


Rebel
Posts: 6946
Joined: Thu Aug 18, 2011 12:04 pm

Gambit Rating List halted

Post by Rebel »

I have decided to stop with the Gambit Rating List.

First of all, I am already far behind because I need all my machine power for the development of NNUE, but that's not the main reason. It's an unfortunate fact that there are two kinds of NNUE engines: 1) those that produce about the same NPS whether you use the AVX2 or the non-AVX2 compile, and 2) those that produce a significantly higher NPS with the AVX2 compile than with the non-AVX2 one.

And here is the problem: for the GRL I use two PCs of about the same speed, but one has AVX2 and the other only AVX. So as a tester you have to be very careful about which PC you run engine X and engine Y on, given the increasing number of NNUE engines and their often conflicting AVX2 and non-AVX2 NPS. And running a rating list on one PC is not doable.

The last and final argument to stop: Rebel 14 is one of those. Its AVX2 version is considerably faster than the SSE4 version. So on which PC shall I run it for the GRL, the AVX2 PC or the other one? It looks like a conflict of interest.
90% of coding is debugging, the other 10% is writing bugs.
Modern Times
Posts: 3517
Joined: Thu Jun 07, 2012 11:02 pm

Re: Gambit Rating List halted

Post by Modern Times »

Indeed, the arrival of NNUE has made some testing machines essentially obsolete.
chrisw
Posts: 4290
Joined: Tue Apr 03, 2012 4:28 pm

Re: Gambit Rating List halted

Post by chrisw »

Rebel wrote: Mon Jan 24, 2022 9:52 am I have decided to stop with the Gambit Rating List.
Are you playing PC vs PC? No, surely not. Why not just run the AVX2 compiles on one PC and the SSE compiles on the other? Then you get two relative lists, and for each engine you can generate a composite Elo: “if this engine were AVX2, it would have an AVX2 relative Elo of XYZ”. You do this by simply averaging the percentage performance of each engine on the two lists, and only then applying the relative-to-absolute Elo function. All that’s really necessary is to not have game results from engineA-AVX2 vs engineB-SSE.
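To make the averaging step concrete, here is a minimal sketch. It assumes the usual logistic score-to-Elo conversion; the function names `score_to_elo` and `composite_elo` are made up for illustration, and a real rating tool may well use a different model.

```python
import math


def score_to_elo(score: float) -> float:
    """Convert a score fraction (0 < score < 1) into an Elo difference.

    Assumption: the standard logistic Elo model. Other rating tools
    may use a different conversion.
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def composite_elo(score_avx2_list: float, score_sse_list: float) -> float:
    """Average the two score fractions first, then convert to Elo.

    Each argument is the engine's score fraction on one of the two
    separate lists (AVX2 PC and SSE PC). The order matters: averaging
    the percentages and converting once is not the same as averaging
    two Elo values.
    """
    average_score = (score_avx2_list + score_sse_list) / 2.0
    return score_to_elo(average_score)


if __name__ == "__main__":
    # Hypothetical numbers: 60% on the AVX2 list, 50% on the SSE list.
    print(round(composite_elo(0.60, 0.50), 1))
```

The result is an Elo difference relative to the pool; anchoring it to an absolute number is a separate step that depends on how the list is calibrated.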
Rebel
Posts: 6946
Joined: Thu Aug 18, 2011 12:04 pm

Re: Gambit Rating List halted

Post by Rebel »

chrisw wrote: Mon Jan 24, 2022 5:13 pm
Are you playing PC vs PC? No, surely not. Why not just run the AVX2 compiles on one PC and the SSE compiles on the other? Then you get two relative lists, and for each engine you can generate a composite Elo: “if this engine were AVX2, it would have an AVX2 relative Elo of XYZ”. You do this by simply averaging the percentage performance of each engine on the two lists, and only then applying the relative-to-absolute Elo function. All that’s really necessary is to not have game results from engineA-AVX2 vs engineB-SSE.
Or, in cases where an NNUE engine's AVX2 and SSE compiles differ too much in NPS, play the engine twice and label the entries engine-AVX2 and engine-SSE. I think that's more elegant. I will think about it, but for the moment: NNUE development.
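As a rough sketch of that labeling rule: the 10% threshold and the function name `list_entries` are assumptions picked only for illustration, and the NPS figures in the example are invented, not measurements.

```python
def list_entries(name: str, nps_avx2: float, nps_sse: float,
                 max_ratio: float = 1.10) -> list[str]:
    """Return the rating-list entries to play for one engine.

    If the two compiles differ in NPS by more than max_ratio
    (a hypothetical threshold), the engine is entered twice with
    -AVX2 and -SSE labels; otherwise a single entry is enough.
    """
    if nps_avx2 <= 0 or nps_sse <= 0:
        raise ValueError("NPS values must be positive")
    ratio = max(nps_avx2, nps_sse) / min(nps_avx2, nps_sse)
    if ratio > max_ratio:
        return [f"{name}-AVX2", f"{name}-SSE"]
    return [name]


if __name__ == "__main__":
    # Invented NPS numbers, for illustration only.
    print(list_entries("EngineX", 1_500_000, 1_000_000))
    print(list_entries("EngineY", 1_020_000, 1_000_000))
```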
90% of coding is debugging, the other 10% is writing bugs.