Scaling from FGRL results with top 3 engines


Uri Blass
Posts: 10801
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Scaling from FGRL results with top 3 engines

Post by Uri Blass »

Cardoso wrote:
Lyudmil Tsvetkov wrote:
Dann Corbit wrote:
Lyudmil Tsvetkov wrote:
Dann Corbit wrote:If an engine scales better, it is most likely search that is better (lower branching factor).

The second most likely thing would be the SMP implementation.

The evaluation will not affect scaling much, except for improvement in the move ordering.
sorry Dann, but that is all BS.

1) SMP is irrelevant, as Andreas' tests are conducted with a single thread

2) BF has nothing to do with elo, are you aware of that? an engine with higher BF might have higher elo.

3) evaluation will not affect scaling much, hmm, I said 'king safety' and not 'evaluation'; king safety might be related to both evaluation and search (and move ordering too, for that matter).

I have not based my observations on pure speculation, but on following an extreme number of games at STC between the tops.

also, see Kai's statistics, which seem to point at exactly my hypothesis.
>>
2) BF has nothing to do with elo, are you aware of that? an engine with higher BF might have higher elo.
<<
You're a funny man.
you are even funnier.
2) BF has nothing to do with elo, are you aware of that? an engine with higher BF might have higher elo.
Ah ah, please understand booze and chess programming are incompatible :)
If BF has nothing to do with elo then it is easy to test this, we artificially increase BF and test the engine.
I understand that the latest patch of SF reduces reductions under some conditions, so I guess it increases the BF (note that I did not test the claim that it increases the BF of Stockfish, and it may be interesting to test it, but my intuition says that if you reduce less, your BF is higher).

http://tests.stockfishchess.org/tests/v ... 16ff64b9bc
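Since much of the argument turns on what "BF" even measures, here is a minimal sketch (not taken from any engine in this thread) of how an effective branching factor is commonly estimated from the node counts of successive iterative-deepening iterations; the sample counts are invented:

```python
def effective_branching_factor(node_counts):
    """Estimate the effective branching factor (EBF) two common ways.

    node_counts[i] = nodes searched by iteration i+1 of iterative deepening.
    Returns (mean ratio of consecutive iterations, d-th root of final count).
    """
    # Ratio of consecutive iteration node counts
    ratios = [b / a for a, b in zip(node_counts, node_counts[1:])]
    mean_ratio = sum(ratios) / len(ratios)
    # Root estimate: take the d-th root of the final iteration's node count
    depth = len(node_counts)
    root_estimate = node_counts[-1] ** (1.0 / depth)
    return mean_ratio, root_estimate

# Invented node counts for depths 1..5 of a toy engine
mean_ratio, root = effective_branching_factor([40, 120, 360, 1080, 3240])
print(round(mean_ratio, 2))  # 3.0 for this geometric series
```

As Lyudmil notes later in the thread, engines count nodes differently, so EBF figures are really only comparable within one engine.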
Uri Blass
Posts: 10801
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Scaling from FGRL results with top 3 engines

Post by Uri Blass »

Cardoso wrote:
to do move ordering, you are using hash moves, where the score is based on eval; killer moves are based on eval
Lyudmil, the hashmove and its associated score come not from the eval, but from a search that has an eval; the same goes for the killer moves, counter moves, follow-up moves, you name it. If you simply assigned the eval result to the hashmove and to killers/countermoves/follow-up moves etc., that would hurt the engine badly.
Do you think Stockfish is so good because of eval? Just drop the SF eval and use a simplistic material-only eval, and play some games against it, and you will finally understand the power of the search; the search also regulates the branching factor.
To me SF success is 90% search and 10% eval!
Before making such assertive comments I think you should take a course on chess programming, and steadily and gradually understand and implement the basics of an alpha-beta searcher.

Also drop the "expert" attitude, as you are not one; you are not even a student. Also, that bit of overbearing pride is not healthy for you and those around you.
I think that it will be an interesting experiment to have 2 versions of stockfish.

version A has the same search but only simple piece square table evaluation(no mobility no pawn structure and no king safety)

version B has the same evaluation but only simple alpha beta search.

Note that I expect version A to win convincingly but it may be interesting if A also scales better than B.

Note that I expect both of them to scale significantly worse than stockfish.

I guess we may see something like the following:

1) Stockfish at 0.1 seconds per move is at the same level as simple Stockfish at 10 seconds per move.

2) Stockfish at 1 second per move is at the same level as simple Stockfish at 300 seconds per move.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Scaling from FGRL results with top 3 engines

Post by Lyudmil Tsvetkov »

Cardoso wrote:Ah ah, please understand booze and chess programming are incompatible :)
If BF has nothing to do with elo then it is easy to test this, we artificially increase BF and test the engine.
on the contrary, it helps a lot. :D

BF=30 will do just fine and place you first on the list, provided you have near-perfect eval and do just ply 2.

also, possibly, different engines count in different nodes to determine BF, so this is pretty much relative.

back to booze again. :)
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Scaling from FGRL results with top 3 engines

Post by Lyudmil Tsvetkov »

Uri Blass wrote:
Lyudmil Tsvetkov wrote:
Dann Corbit wrote:
Lyudmil Tsvetkov wrote:
Dann Corbit wrote:
Lyudmil Tsvetkov wrote:
Dann Corbit wrote:It is also true that better evaluation will reduce branching factor, principally by improvement in move ordering (which is very important to the fundamental alpha-beta step).

There are other things that tangentially improve branching factor like hash tables and IID.

It is also true that pure wood counting is not good enough. But examine the effectiveness of Olithink, which has an incredibly simple eval. It has more than just wood, but an engine can be made very strong almost exclusively through search. I guess that by grafting the Stockfish evaluation into a minimax engine you will get less than 2000 Elo.

I guess that by grafting the Olithink eval into Stockfish you will still get more than 3000 Elo.

Note that I did not test this, it is only a gedankenexperiment.
so, no search without eval.

I guess you are grossly wrong about both the 2000 and 3000 elo mark.

wanna try one of the 2?

Olithink eval into SF will play something like 1500 elo, wanna bet? :)

I guess it is time to change gedankenexperiment for realitaetsueberpruefung... :)
From CCRL 40/40:
216 OliThink 5.3.2 64-bit 2372 +19 −19 48.3% +12.5 25.6% 1011

With a super simple eval and a fairly simple search, it is already 2372.
Adding the incredible, sophisticated search of Stockfish will lower the eval by more than 872 points?
of course, it is all about tuning.

we are not speaking here of downgrading SF, leaving all its search and using just a dozen basic eval terms, in which case SF will still be somewhat strong, but of patching an entirely alien eval onto SF search.

as the eval and search will not be tuned to each other, you will mostly get completely random results.
You are mostly right about that.
While good programming technique demands encapsulation, it is so ultra tempting to pierce that veil and get chummy with other parts of the program and show them your innards that virtually all programs do it.

I must mention Bas Hamstra's program, which was so beautifully crafted. But that is neither here nor there.

I guess that point I wanted to make is that branching factor (DONE PROPERLY) is the golden nail to better program success.

You point to eval. And eval has its place. But once (for instance) the fail high rate goes over 95% on the pv node, the rest is fluff, as far as BF goes. Now, there can be things to aim the engine better, I think everyone agrees on that. But if you are going to shock the world (and look at every world shocker) it is BF gains that drop the jaws and make the eyes bug out.

As I have said elsewhere, you are an interesting person and you know a lot about chess. But until you understand the complete implication of the branching factor, you cannot properly advise chess programmers.

The branching factor is the golden nail upon which all the kings will drape their mantles.

Mark my words,
Marking your words. :)
but before that, I will have to take a course on colloquial American.

neither you nor I are chess programmers, as neither of us has written/published a fully-fledged and functional chess engine from scratch.

I am not advising anyone, just sharing some thoughts.

with all your words and behaviour you want to deliver a single message:
'search is more important than evaluation as concerns the performance of chess engines.'

and you drive your point home each and every time.

but this is simply not true.

just about everything in a chess engine revolves around eval, a specific estimate for each and every node:
- you call a function called eval() or something similar at each and every node, unless you have a hash move
- to do move ordering, you are using hash moves, where the score is based on eval; killer moves are based on eval
- referring to main search functions, alpha is an evaluation estimate, beta too
- going to search routines, be it for futility pruning, razoring, null move reductions or something else, you are always using some kind of evaluation seed
- LMR/LMP, again, you need to order moves first, for which you use eval one way or another, and then reduction specifics again work only within a particular evaluation framework

so that, to start a chess engine, after doing the move stack and generating the moves, the first thing you need is some kind of evaluation. search only comes in second.

of course, as a matter of fact, both are inseparable, but if I had to pick a more important factor, that would be evaluation.

and indeed, you can build a one-ply engine with some basic evaluation that will still pick some reasonable moves, while it is almost impossible to do the same with the most sophisticated search, provided the program does not know what the pieces are worth.
I agree that you need some evaluation to start but you can start with a very simple one.

I agree that search rules are basically based on some type of evaluation, but it does not have to be the evaluation that gives a single number to a position.

I would like engines to evaluate positions not in terms of advantage but in terms of confidence in the same way people know without search.

For example

1)Sure win for white

[d]2r5/Q7/4k3/1p6/2P5/1P1P4/P5PP/6K1 b - - 0 51

2)White does not lose
[d]4k3/3bppp1/8/8/8/8/3NPPPP/4K3 b - - 0 1

3)sure draw
[d]5rk1/5ppp/8/8/8/8/5PPP/5RK1 w - - 0 1

I think that these types of evaluations can be used simply for pruning.
For example, as long as the computer evaluates that black is better, there is no reason to search deeper in lines where white does not lose, and the computer can safely prune such lines regardless of remaining depth.
what's the difference between score and confidence of outcome?
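The entanglement both posters describe, eval feeding the search through leaf scores and move ordering, can be illustrated with a toy negamax over a hand-built tree; the tree, the scores, and the ordering hook are all invented for illustration:

```python
# Toy game tree: an inner node is a dict {move: child}, a leaf is a score
# from the side-to-move's point of view (negamax convention).
def negamax(node, alpha, beta, order_hint=None):
    if isinstance(node, (int, float)):  # leaf: return the static evaluation
        return node
    moves = list(node)
    if order_hint is not None:          # eval-based ordering: best guess first,
        moves.sort(key=order_hint, reverse=True)  # which makes cutoffs earlier
    best = float("-inf")
    for m in moves:
        score = -negamax(node[m], -beta, -alpha, order_hint)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:               # beta cutoff: remaining moves pruned
            break
    return best

tree = {"a": {"x": -3, "y": 5}, "b": {"x": -6, "y": -4}}
print(negamax(tree, float("-inf"), float("inf")))  # -3
```

A real engine orders by hash move, killers and history rather than a static hint, but those scores were themselves produced by searches that bottom out in eval, which is exactly the point under dispute.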
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Scaling from FGRL results with top 3 engines

Post by Lyudmil Tsvetkov »

Cardoso wrote:
to do move ordering, you are using hash moves, where the score is based on eval; killer moves are based on eval
Lyudmil, the hashmove and its associated score come not from the eval, but from a search that has an eval; the same goes for the killer moves, counter moves, follow-up moves, you name it. If you simply assigned the eval result to the hashmove and to killers/countermoves/follow-up moves etc., that would hurt the engine badly.
Do you think Stockfish is so good because of eval? Just drop the SF eval and use a simplistic material-only eval, and play some games against it, and you will finally understand the power of the search; the search also regulates the branching factor.
To me SF success is 90% search and 10% eval!
Before making such assertive comments I think you should take a course on chess programming, and steadily and gradually understand and implement the basics of an alpha-beta searcher.

Also drop the "expert" attitude, as you are not one; you are not even a student. Also, that bit of overbearing pride is not healthy for you and those around you.
and you are a BS.

why are you teaching me?

SF 90% search, 10% eval. are you certain?

drop QS and SF search will lead you nowhere with primitive eval. =less than 1000 elo.

if you have not understood by now that eval and search are completely inseparable, then what programmer are you?

a checkers programmer in a chess forum? :) :)

upgrade to chess, and then we will talk. :)

you say: "the hashmove and its associated score comes not from the eval, but from a search that has an eval".

what is the difference, man?

I guess you are drinking more than me.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Scaling from FGRL results with top 3 engines

Post by Lyudmil Tsvetkov »

Uri Blass wrote:
Lyudmil Tsvetkov wrote:
Dann Corbit wrote:
Lyudmil Tsvetkov wrote:
Dann Corbit wrote:It is also true that better evaluation will reduce branching factor, principally by improvement in move ordering (which is very important to the fundamental alpha-beta step).

There are other things that tangentially improve branching factor like hash tables and IID.

It is also true that pure wood counting is not good enough. But examine the effectiveness of Olithink, which has an incredibly simple eval. It has more than just wood, but an engine can be made very strong almost exclusively through search. I guess that by grafting the Stockfish evaluation into a minimax engine you will get less than 2000 Elo.

I guess that by grafting the Olithink eval into Stockfish you will still get more than 3000 Elo.

Note that I did not test this, it is only a gedankenexperiment.
so, no search without eval.

I guess you are grossly wrong about both the 2000 and 3000 elo mark.

wanna try one of the 2?

Olithink eval into SF will play something like 1500 elo, wanna bet? :)

I guess it is time to change gedankenexperiment for realitaetsueberpruefung... :)
From CCRL 40/40:
216 OliThink 5.3.2 64-bit 2372 +19 −19 48.3% +12.5 25.6% 1011

With a super simple eval and a fairly simple search, it is already 2372.
Adding the incredible, sophisticated search of Stockfish will lower the eval by more than 872 points?
of course, it is all about tuning.

we are not speaking here of downgrading SF, leaving all its search and using just a dozen basic eval terms, in which case SF will still be somewhat strong, but of patching an entirely alien eval onto SF search.

as the eval and search will not be tuned to each other, you will mostly get completely random results.
Based on my experience with a different engine in the past(strelka) it is not the case.
I changed Strelka's evaluation to a simple piece-square table and was surprised to see that it is very strong, at least at fast time control, and beat engines like Joker in most games, when Joker has a CCRL rating near 2300.

Note that Strelka's piece-square table was simply the one from Strelka's code, which is not optimized, and it was clear from watching the games that Strelka could be better by increasing the values of the knight and bishop so it knows that a knight and a bishop are worth more than a rook and a pawn.

In the original code Strelka has an imbalance table, but I threw away all the code except the piece-square table.

I guess that Strelka could get at least a 2400 CCRL rating with piece-square-table evaluation at 40/4 time control, and I have no reason to believe that it is going to be worse for Stockfish, in spite of the fact that Stockfish is not tuned for Olithink's evaluation (Strelka is also not tuned for the simplified Strelka evaluation that does not include pawn structure, mobility and king safety).
don't believe that, you did something wrong.

psqt, no mobility, no piece values, and playing at 2400?

don't buy it.

maybe Joker has just a very basic QS and Strelka (a clone too?) a very refined one, so this might partially explain a result trend, but QS is not search proper.

disable QS for both engines, repeat the test and, if Strelka achieves more than 1500 elo, then I am damned.

btw., I have watched a range of Strelka games, and something strikes the knowledgeable watcher already at first glance: every 3rd or 4th move, Strelka plays completely random moves, obviously the result of random eval changes.

the chess impression is not appealing, I swear solemnly.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Scaling from FGRL results with top 3 engines

Post by Lyudmil Tsvetkov »

Dann Corbit wrote:
Lyudmil Tsvetkov wrote:
Dann Corbit wrote:
Lyudmil Tsvetkov wrote:
Dann Corbit wrote:
Uri Blass wrote:
Dann Corbit wrote:If an engine scales better, it is most likely search that is better (lower branching factor).

The second most likely thing would be the SMP implementation.

The evaluation will not affect scaling much, except for improvement in the move ordering.
I think that better search does not mean lower branching factor

It is easy to get lower branching factor by dubious pruning.

I think that evaluation is important and I expect top engines not to scale well if you change their evaluation to simple piece square table evaluation.
Every single great advancement in chess engines has been due to a reduction in branching factor. While it is obviously a mistake to prune away good stuff, let's take a quick look at the list:

1) Alpha-Beta : Enormous improvement over mini-max
2) Null move reduction: Enormous improvement over plain alpha-beta
3) PVS search: Modest improvement over null move reduction due to zero window searches
4) History Reductions: (As pioneered by Fruit) - huge improvement over plain PVS search
5) Smooth scaling reductions in null move pruning (As, for instance, Stockfish) - significant improvement over ordinary null move
6) Razoring (like Rybka and Strelka): Enormous improvement over plain pvs search
7) Late Move Reductions: (with Tord taking the lead in both effectiveness and publication) -- a huge improvement over not having LMR.

There are, of course, many others that I did not mention here.

It is not a coincidence that the top ten engines all have branching factors of about 2, and it is not a coincidence that most weak engines have a large branching factor.

Now, your point is well taken with individual cases. For instance, ExChess had the best branching factor of all engines at one point. But it was not the strongest engine by far. So poorly tuned reductions are not nearly so beneficial as properly tuned reductions.

But almost every big advancement comes from a reduction in branching factor and the next revolution will come from a reduction in branching factor.

There are, of course, some exceptions. The material imbalance table in Rybka was another revolution, and almost entirely due to evaluation improvement in that case (as a 'for instance'). We can thank Larry Kaufman for that, I think.
so, what makes you think Komodo has better BF than SF?
I did not say that. I do not think it is clear which is better, but both have very good branching factors.
what is the connection to LTC scaling?
Suppose that engine A evaluates twice as many nodes to advance one ply. BF=2

Suppose that engine B evaluates three times as many nodes to advance one ply. BF=3

To get to 30 ply how many more nodes will B examine than A in orders of magnitude?
on the contrary, you said that Komodo might scale well, because BF is conducive to good scaling. that would presume Komodo has lower BF.

I asked you a question, and you reply with a riddle.

LTC or STC, BF always applies, so how does lower BF perform better at LTC?
Ah, I see.
You are mathematically ignorant.
????

meaning?

you don't even know how BF is calculated. you can compute just an average BF, but BF values will differ across plies.
how useful is this in pinpointing the validity and contribution of BF to playing strength?

you set me a mathematical riddle. what did you want me to reply?

pow(3/2,30) ?

I repeat my question: how does lower BF perform better at LTC?
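For what it is worth, Dann's riddle has a concrete numeric answer; a quick check of the arithmetic (nothing engine-specific here):

```python
import math

nodes_a = 2 ** 30            # engine A: BF = 2, 30 plies
nodes_b = 3 ** 30            # engine B: BF = 3, 30 plies
ratio = nodes_b / nodes_a    # equals (3/2) ** 30

print(round(ratio))                 # 191751: B visits ~190,000x more nodes
print(round(math.log10(ratio), 1))  # 5.3: a bit over 5 orders of magnitude
```

This is Dann's point about LTC: at a fixed time budget, the lower-BF engine reaches the same depth vastly faster, and the gap widens with every extra ply, i.e. at longer time controls.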
Uri Blass
Posts: 10801
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Scaling from FGRL results with top 3 engines

Post by Uri Blass »

Lyudmil Tsvetkov wrote:
Uri Blass wrote:
Lyudmil Tsvetkov wrote:
Dann Corbit wrote:
Lyudmil Tsvetkov wrote:
Dann Corbit wrote:It is also true that better evaluation will reduce branching factor, principally by improvement in move ordering (which is very important to the fundamental alpha-beta step).

There are other things that tangentially improve branching factor like hash tables and IID.

It is also true that pure wood counting is not good enough. But examine the effectiveness of Olithink, which has an incredibly simple eval. It has more than just wood, but an engine can be made very strong almost exclusively through search. I guess that by grafting the Stockfish evaluation into a minimax engine you will get less than 2000 Elo.

I guess that by grafting the Olithink eval into Stockfish you will still get more than 3000 Elo.

Note that I did not test this, it is only a gedankenexperiment.
so, no search without eval.

I guess you are grossly wrong about both the 2000 and 3000 elo mark.

wanna try one of the 2?

Olithink eval into SF will play something like 1500 elo, wanna bet? :)

I guess it is time to change gedankenexperiment for realitaetsueberpruefung... :)
From CCRL 40/40:
216 OliThink 5.3.2 64-bit 2372 +19 −19 48.3% +12.5 25.6% 1011

With a super simple eval and a fairly simple search, it is already 2372.
Adding the incredible, sophisticated search of Stockfish will lower the eval by more than 872 points?
of course, it is all about tuning.

we are not speaking here of downgrading SF, leaving all its search and using just a dozen basic eval terms, in which case SF will still be somewhat strong, but of patching an entirely alien eval onto SF search.

as the eval and search will not be tuned to each other, you will mostly get completely random results.
Based on my experience with a different engine in the past(strelka) it is not the case.
I changed Strelka's evaluation to a simple piece-square table and was surprised to see that it is very strong, at least at fast time control, and beat engines like Joker in most games, when Joker has a CCRL rating near 2300.

Note that Strelka's piece-square table was simply the one from Strelka's code, which is not optimized, and it was clear from watching the games that Strelka could be better by increasing the values of the knight and bishop so it knows that a knight and a bishop are worth more than a rook and a pawn.

In the original code Strelka has an imbalance table, but I threw away all the code except the piece-square table.

I guess that Strelka could get at least a 2400 CCRL rating with piece-square-table evaluation at 40/4 time control, and I have no reason to believe that it is going to be worse for Stockfish, in spite of the fact that Stockfish is not tuned for Olithink's evaluation (Strelka is also not tuned for the simplified Strelka evaluation that does not include pawn structure, mobility and king safety).
don't believe that, you did something wrong.

psqt, no mobility, no piece values, and playing at 2400?

don't buy it.

maybe Joker has just a very basic QS and Strelka (a clone too?) a very refined one, so this might partially explain a result trend, but QS is not search proper.

disable QS for both engines, repeat the test and, if Strelka achieves more than 1500 elo, then I am damned.

btw., I have watched a range of Strelka games, and something strikes the knowledgeable watcher already at first glance: every 3rd or 4th move, Strelka plays completely random moves, obviously the result of random eval changes.

the chess impression is not appealing, I swear solemnly.
A piece-square table includes piece values, but nothing more than that.
A piece-square table means that every pair of piece and square has a score.

I remember that I added a side-to-move bonus that was in Strelka, but nothing more, and it was some years ago.

I did not change the search and I consider Strelka's QS as part of the search.

I could see that Strelka did not understand the position from watching the games, but it still won against Joker because Strelka outsearched it.
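A piece-square-table evaluation of the kind Uri describes (piece values folded into per-square scores, nothing else) might look like the sketch below; the table values are invented toy numbers, not Strelka's:

```python
# Piece-square tables in centipawns, indexed 0..63 = a1..h8 from white's
# point of view; a real engine has one 64-entry table per piece type.
PAWN_PSQT = [100] * 48 + [150] * 8 + [0] * 8  # toy: 7th-rank pawns worth more
KNIGHT_PSQT = [300] * 64                      # toy: flat material value only

TABLES = {"P": PAWN_PSQT, "N": KNIGHT_PSQT}

def psqt_eval(white_pieces, black_pieces):
    """Sum white's table entries minus black's, mirroring ranks for black."""
    score = 0
    for piece, square in white_pieces:
        score += TABLES[piece][square]
    for piece, square in black_pieces:
        score -= TABLES[piece][square ^ 56]  # XOR 56 flips the rank
    return score

# White: pawn on e2 (square 12), knight on g1 (6); Black: pawn on e7 (52)
print(psqt_eval([("P", 12), ("N", 6)], [("P", 52)]))  # 300
```

Since the per-square values subsume the piece values, such an eval does know material, which is Uri's point; what it lacks is mobility, pawn structure and king safety.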
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Scaling from FGRL results with top 3 engines

Post by Lyudmil Tsvetkov »

Dann Corbit wrote:
Lyudmil Tsvetkov wrote:
Dann Corbit wrote:
Uri Blass wrote:
Dann Corbit wrote:
Uri Blass wrote:
Dann Corbit wrote:If an engine scales better, it is most likely search that is better (lower branching factor).

The second most likely thing would be the SMP implementation.

The evaluation will not affect scaling much, except for improvement in the move ordering.
I think that better search does not mean lower branching factor

It is easy to get lower branching factor by dubious pruning.

I think that evaluation is important and I expect top engines not to scale well if you change their evaluation to simple piece square table evaluation.
Every single great advancement in chess engines has been due to a reduction in branching factor. While it is obviously a mistake to prune away good stuff, let's take a quick look at the list:

1) Alpha-Beta : Enormous improvement over mini-max
2) Null move reduction: Enormous improvement over plain alpha-beta
3) PVS search: Modest improvement over null move reduction due to zero window searches
4) History Reductions: (As pioneered by Fruit) - huge improvement over plain PVS search
5) Smooth scaling reductions in null move pruning (As, for instance, Stockfish) - significant improvement over ordinary null move
6) Razoring (like Rybka and Strelka): Enormous improvement over plain pvs search
7) Late Move Reductions: (with Tord taking the lead in both effectiveness and publication) -- a huge improvement over not having LMR.

There are, of course, many others that I did not mention here.

It is not a coincidence that the top ten engines all have branching factors of about 2, and it is not a coincidence that most weak engines have a large branching factor.

Now, your point is well taken with individual cases. For instance, ExChess had the best branching factor of all engines at one point. But it was not the strongest engine by far. So poorly tuned reductions are not nearly so beneficial as properly tuned reductions.

But almost every big advancement comes from a reduction in branching factor and the next revolution will come from a reduction in branching factor.

There are, of course, some exceptions. The material imbalance table in Rybka was another revolution, and almost entirely due to evaluation improvement in that case (as a 'for instance'). We can thank Larry Kaufman for that, I think.
I agree about the history.
I do not think it means that the future is always going to be reduction of the branching factor.

The target is to play better, not to reduce the branching factor, and I see no reason to assume that the next improvement is going to be more reductions; it could also be more extensions of the right lines.
Branching factor improvement is exponential improvement.
Other improvements will not be as astounding.
Until branching factor becomes one, it will always be possible to improve it.

I also agree that a perfect evaluation would lead to a branching factor of 1.
It is just that a perfect evaluation is probably many times more difficult to do and exponential improvements via search happen all the time.

In fact, I think that the key to beating SF is simple. Stop focusing on eval and focus on search. The SF team spends way too much time looking at eval and not enough time looking at search.

In this sense, you can call me a disciple of Christophe Theron, who said:
"Search is also knowledge."
if it were that simple, someone would already have done it. :)

from SF framework stats, search and eval patches are split about equal.

so what to do more with search, without improving eval on a par?

someone says, well, just about the most important thing in chess programming is move ordering. well, how do you achieve better move ordering without necessarily resorting to a more advanced move-ordering function, which one way or another has to deal with a more refined eval?
Once move ordering passes 95% correct, the rest is fluff mathematically.

And someone will have already done it?
Yes, of course.
Every drop in branching factor (probably there are 50 by now) is a literal revolution in chess engine strength.

Do you never wonder why the expansion in strength of chess engines is exponential in time (possibly even super-exponential)?
Clearly, this is 99% due to branching factor.

Do you understand what the difference between:
36^40
and 1.8^40
IS?

Hint:
1.7868991024601705453143247728944e+62/16251752663 is a very big number.

The first number is the approximate value for a 40 ply search using mini-max
The second number is the same search using the strongest programs today.

The ratio is a truly enormous number.
why are you using scientific notation to denote double float values?

man, I have read about that?
are not those just basics?

if you have not read about that somewhere, and memorised it, no one would have explained it to you.
the important thing is to think on your own.

I don't know what you are talking about. move ordering passing 95% correct? what does that mean?

no chess engine as of today has yet passed even the 50% correct mark, as modern top engines, SF and Komodo, never guess more than 1 out of every 3 best moves.

so, talking about fluff is simply ridiculous.

branching factor, branching factor... that is just a measure, man, nothing more.

you change things in eval and search, and if the branching factor happens to decrease, that is fine, but it is not the branching factor that is responsible for success, rather the implemented changes.

do you know what a measure is?
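Dann's figures in the quoted post can be checked directly; the arithmetic below reproduces them (pure math, no engine assumptions):

```python
import math

minimax_nodes = 36 ** 40   # ~36 legal moves per ply, full minimax, 40 plies
modern_nodes = 1.8 ** 40   # effective branching factor ~1.8, 40 plies
ratio = minimax_nodes / modern_nodes

print(f"{minimax_nodes:.3e}")    # 1.787e+62, matching the post's first number
print(round(modern_nodes))       # ~16.25 billion, matching the post's divisor
print(round(math.log10(ratio)))  # 52: the ratio spans ~52 orders of magnitude
```

So the "truly enormous number" in the post is about 10^52: the factor by which full minimax outgrows a BF-1.8 search at 40 plies.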
Cardoso
Posts: 363
Joined: Thu Mar 16, 2006 7:39 pm
Location: Portugal
Full name: Alvaro Cardoso

Re: Scaling from FGRL results with top 3 engines

Post by Cardoso »

Uri Blass wrote:
Cardoso wrote:
to do move ordering, you are using hash moves, where the score is based on eval; killer moves are based on eval
Lyudmil, the hashmove and its associated score come not from the eval, but from a search that has an eval; the same goes for the killer moves, counter moves, follow-up moves, you name it. If you simply assigned the eval result to the hashmove and to killers/countermoves/follow-up moves etc., that would hurt the engine badly.
Do you think Stockfish is so good because of eval? Just drop the SF eval and use a simplistic material-only eval, and play some games against it, and you will finally understand the power of the search; the search also regulates the branching factor.
To me SF success is 90% search and 10% eval!
Before making such assertive comments I think you should take a course on chess programming, and steadily and gradually understand and implement the basics of an alpha-beta searcher.

Also drop the "expert" attitude, as you are not one; you are not even a student. Also, that bit of overbearing pride is not healthy for you and those around you.
I think that it will be an interesting experiment to have 2 versions of stockfish.

version A has the same search but only simple piece square table evaluation(no mobility no pawn structure and no king safety)

version B has the same evaluation but only simple alpha beta search.

Note that I expect version A to win convincingly but it may be interesting if A also scales better than B.

Note that I expect both of them to scale significantly worse than stockfish.

I guess we may see something like the following:

1) Stockfish at 0.1 seconds per move is at the same level as simple Stockfish at 10 seconds per move.

2) Stockfish at 1 second per move is at the same level as simple Stockfish at 300 seconds per move.
That was not exactly the experiment I suggested; I said:
and play some games against it
meaning that Lyudmil should try some games against SF himself with only a material eval.
I meant to say he would find that SFmat can play amazing chess against humans.
Last edited by Cardoso on Sun Oct 01, 2017 12:03 am, edited 2 times in total.