Limiting nodes to reduce engine strength

Uri Blass · Post by **Uri Blass** » Mon Nov 04, 2019 11:10 am

D Sceviour wrote: ↑Mon Nov 04, 2019 4:39 am Certainly, it is time for engines to produce more features for human players rather than increases in Elo strength. The Deuterium method is to use a logarithmic function to calculate the rating based on nodes:

rating = 297.12 * ln(nodes) - 976.7

This does not seem correct. Rather, the Elo performance might more realistically follow a sigmoid (S-curve) pattern against ply depths and node counts. It might be an interesting approach to consider. A curve defined as:

#define sigmoid(K, S) ( 1.0 / (1.0 + pow(20.0, -(K) * (S) / 2000.0)) )

can produce a scale something like this:

Elo   Ply depth
---------------
1000      1
1500      2
2000      4
2500      9
3000     18
3500     32
4000     50

With adjustment to the constants, the formula can be fine-tuned for performance. I might look at this further when the opportunity arises.

The rating of a fixed depth or a fixed number of nodes can be defined only for a specific time control.

When we say rating 2000, we mean something different for blitz than for a long time control.
I think the target should be a program that does not perform relatively better or relatively worse against humans when you change the time control.

If it scores 50% against players with a FIDE rating of 2000 Elo at every time control, then you can say it has a rating of 2000, and in that case you should define a table like the following:

elo, ply depth, time control
2000, 1, 1 minute per game
2000, 2, 3 minutes per game or 1+1 time control
2000, 3, 10 minutes per game or 4+4 time control
...

D Sceviour · Post by **D Sceviour** » Mon Nov 04, 2019 1:21 pm

PK wrote: ↑Mon Nov 04, 2019 8:30 am Denis, does your formula allow calculating the percentage of depth 1 and depth 2 searches needed to achieve, say, 1200 Elo?

Edit: stupid me, it's pretty obvious when you see that Elo maps to fractional depth.

The formula is only a starting point for realistic human thinking, using ply depths as a guideline. There can and will be further adjustments for time allotment and game phase. Less material means greater thinking depth in the same amount of time. For the table displayed I used:

ply_depth = (int) (100 * sigmoid(-1.0,4000-elo))

The S-curve idea is empirical. There must be better and younger math experts out there than me to improve on the theory.

PK · Post by **PK** » Mon Nov 04, 2019 2:00 pm

this might be of interest: http://web.ist.utl.pt/diogo.ferreira/pa ... impact.pdf

Using a node limit instead of depth would eliminate the problem of too shallow endgame searches and would silently take position complexity into account. And your formula would be convertible using some kind of nodes-to-depth average.

D Sceviour · Post by **D Sceviour** » Mon Nov 04, 2019 2:47 pm

PK wrote: ↑Mon Nov 04, 2019 2:00 pm this might be of interest: http://web.ist.utl.pt/diogo.ferreira/pa ... impact.pdf

Using a node limit instead of depth would eliminate the problem of too shallow endgame searches and would silently take position complexity into account. And your formula would be convertible using some kind of nodes-to-depth average.

True.
Interesting. Ferreira's Table 8 produces similar results to my findings:

Depth  Strength (Elo)
  20      2894
  19      2828
  18      2761
  17      2695
  16      2629
  15      2563
  14      2496
  13      2430
  12      2364
  11      2298
  10      2231
   9      2165
   8      2099
   7      2033
   6      1966

Table 8: Estimated strength of the engine at different search depths

"It is apparent that these data seem to follow some sort of sigmoid function" (Ferreira). I agree. However, Ferreira's experimental data covers a limited range of grandmaster strengths. Further, he does not discuss how strength relates to time allotment, game phase, or computer hardware. I would like to see the Elo match from beginner to postal master.

bob · Post by **bob** » Thu Nov 07, 2019 7:12 pm

Not a fan of a depth limit, since they can occasionally be quite "coarse" in terms of time.

I have a pretty decent NPS idea in the works. I have normalized on 2800 Elo == 6M nodes per second (on my MacBook). Elo scales down the NPS and introduces evaluation randomness. And it is actually working pretty well. Only issue is getting good elo numbers within my testing approach. My first step is to make "elo 2800", "elo 2600", ... actually produce Elo numbers around those settings. Then I will have to figure out how accurate the 6M = 2800 Elo actually is. I am also working on lower levels, but I had to take a few programs and hack 'em to drop their ratings. I have such a version of Fruit that is clocking in around 1800, and a version of Crafty that is in the 1600's, so I am "getting there". Going to make a few slower versions of Fruit to see if I can get representative numbers down to at least 1000.

I'll look for some testers once I get the numbers settled in... The scaled (and self-adjusting) NPS idea proposed here was really pretty good. For example, right now, if you say "elo 2800" it does nothing on my MacBook. Elo 2600 turns into 6M nps. Elo 2400 turns into 1.6M nps. Note that is not the only change. Eval randomness slowly ramps up until at elo 800 it is 100% random with a very slow search. However, from experience, down there the NPS becomes critical, as a fast search + purely random eval will still produce a program over 1800. You can find the discussion for this in the programmer's forum quite a few years back...
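
A minimal sketch of the evaluation-randomness part only, not Crafty's actual code. The post says the randomness ramps up until it reaches 100% at Elo 800; the linear ramp, the starting point of 2800, and all names here are assumptions made for illustration.

```c
/* Assumed endpoints: no noise at ELO_FULL, pure noise at ELO_RANDOM.
 * Only the 800 endpoint comes from the post; 2800 is an assumption. */
#define ELO_FULL   2800
#define ELO_RANDOM  800

/* Fraction of the evaluation replaced by noise, in [0, 1].
 * A linear ramp is assumed; the post does not give the shape. */
double randomness_fraction(int elo)
{
    if (elo >= ELO_FULL)
        return 0.0;
    if (elo <= ELO_RANDOM)
        return 1.0;
    return (double)(ELO_FULL - elo) / (ELO_FULL - ELO_RANDOM);
}

/* Blend a real evaluation with a caller-supplied random score
 * (both in centipawns), truncating the result toward zero. */
int blend_eval(int real_score, int random_score, int elo)
{
    double r = randomness_fraction(elo);
    return (int)((1.0 - r) * real_score + r * random_score);
}
```

The random score is passed in by the caller so that the blend stays deterministic and testable; the NPS throttling described above would be handled separately.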

Uri Blass · Post by **Uri Blass** » Thu Nov 07, 2019 7:40 pm

bob wrote: ↑Thu Nov 07, 2019 7:12 pm Not a fan of a depth limit, since they can occasionally be quite "coarse" in terms of time.

I have a pretty decent NPS idea in the works. I have normalized on 2800 Elo == 6M nodes per second (on my MacBook). Elo scales down the NPS and introduces evaluation randomness. And it is actually working pretty well. Only issue is getting good elo numbers within my testing approach. My first step is to make "elo 2800", "elo 2600", ... actually produce Elo numbers around those settings. Then I will have to figure out how accurate the 6M = 2800 Elo actually is. I am also working on lower levels, but I had to take a few programs and hack 'em to drop their ratings. I have such a version of Fruit that is clocking in around 1800, and a version of Crafty that is in the 1600's, so I am "getting there". Going to make a few slower versions of Fruit to see if I can get representative numbers down to at least 1000.

I'll look for some testers once I get the numbers settled in... The scaled (and self-adjusting) NPS idea proposed here was really pretty good. For example, right now, if you say "elo 2800" it does nothing on my MacBook. Elo 2600 turns into 6M nps. Elo 2400 turns into 1.6M nps. Note that is not the only change. Eval randomness slowly ramps up until at elo 800 it is 100% random with a very slow search. However, from experience, down there the NPS becomes critical, as a fast search + purely random eval will still produce a program over 1800. You can find the discussion for this in the programmer's forum quite a few years back...

If the target is a weak level against humans, then I think you should use games against humans with known FIDE ratings to determine the Elo of the different levels.

bob · Post by **bob** » Thu Nov 07, 2019 8:47 pm

This is hard. Where do I find such humans? Certainly not on ICC since so many use computers there. If we had enough tournaments here in Birmingham, I might try that but it would probably cause objections. The 1800 version might well be much stronger against humans, disrupting the 1800-1999 class competition.

Probably the best I can do is ask a 2000 (or something else) player to try it out, setting Crafty's rating to match theirs exactly (it does handle any value like 2319 and will scale it appropriately between the 2200 and 2400 level breaks)...

But that only works after I get the base code tuned to at least properly degrade the Elo in 200 point chunks...

Ras · Post by **Ras** » Fri Nov 08, 2019 12:27 am

bob wrote: ↑Thu Nov 07, 2019 7:12 pm Not a fan of a depth limit, since they can occasionally be quite "coarse" in terms of time.

And also unrealistic because humans also calculate deeper if there are fewer moves to consider.

I have a pretty decent NPS idea in the works. I have normalized on 2800 Elo == 6M nodes per second (on my MacBook). Elo scales down the NPS and introduces evaluation randomness.

That's similar to what I did. I also disable selective deepening below some Elo threshold, and even "overlook" mate randomly at the lowest end.

I also added dynamic Elo scaling depending on the time allocated for the current move. Nominal Elo is for 15 seconds per move, which at 80 moves per game would be rapid chess with 20 minutes. The actual Elo gets either scaled up by +50 Elo for tournament time, or scaled down by -50 for blitz, with linear interpolation in between. The idea is that humans are weaker at blitz so that it's nice if the engine auto-adjusts.
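
A sketch of how such a time-dependent adjustment could look, not the actual engine code. The 15-second nominal point and the ±50 Elo range are from the description above; the blitz and tournament anchor times, the interpolation being linear in seconds, and all names are assumptions.

```c
#define NOMINAL_SEC     15.0   /* from the post: nominal Elo applies here */
#define BLITZ_SEC        3.75  /* assumption: 5 minutes / 80 moves */
#define TOURNAMENT_SEC  90.0   /* assumption: 120 minutes / 80 moves */
#define MAX_ADJUST      50.0   /* from the post: +/-50 Elo */

/* Elo offset for the time allocated to the current move: -50 at or below
 * the blitz anchor, 0 at nominal, +50 at or above the tournament anchor,
 * with linear interpolation on each side of the nominal point. */
double elo_adjustment(double sec_per_move)
{
    if (sec_per_move <= BLITZ_SEC)
        return -MAX_ADJUST;
    if (sec_per_move >= TOURNAMENT_SEC)
        return MAX_ADJUST;
    if (sec_per_move < NOMINAL_SEC)
        return -MAX_ADJUST * (NOMINAL_SEC - sec_per_move)
                           / (NOMINAL_SEC - BLITZ_SEC);
    return MAX_ADJUST * (sec_per_move - NOMINAL_SEC)
                      / (TOURNAMENT_SEC - NOMINAL_SEC);
}

/* Elo actually used for this move. */
double effective_elo(double nominal_elo, double sec_per_move)
{
    return nominal_elo + elo_adjustment(sec_per_move);
}
```

With these assumed anchors, a nominal 2000 setting plays at 1950 when only 3.75 seconds or less are allocated to the move, and at 2050 with 90 seconds or more.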

bob · Post by **bob** » Fri Nov 08, 2019 6:42 pm

I am still working on getting reliable (within my testing framework) Elo for 2800, 2600, 2400 ... It is not exactly easy. I have not fiddled with extensions and reductions as of yet, because I believe that reducing the NPS far enough will do effectively the same thing. Since I only have a check extension, I can't really see that being an issue, particularly if the NPS drops enough.

Still turning out to be harder than I expected...

Ponti · Post by **Ponti** » Wed Nov 27, 2019 2:42 am

Uri Blass wrote: ↑Thu Nov 07, 2019 7:40 pm
bob wrote: ↑Thu Nov 07, 2019 7:12 pm Not a fan of a depth limit, since they can occasionally be quite "coarse" in terms of time.
If the target is a weak level against humans, then I think you should use games against humans with known FIDE ratings to determine the Elo of the different levels.

I remember that Ed Schröder's Rebel series has many games against GMs.
You can also search for games on the Internet chess servers.
