Draw rate calculation between two elite players

mbabigian · Post by **mbabigian** » Wed Jan 08, 2025 6:30 pm

I'm entertaining myself by writing a human tournament simulator. I was looking into how to calculate the estimated draw rate between two arbitrary players rated Elo1 and Elo2, where both Elos are 2700+. An internet search turned up https://www.chessprogramming.org/Match_ ... Draw_Ratio which led me to https://kirill-kryukov.com/chess/kcec/draw_rate.html.

So temporarily I've inserted the following formula from Kirill's very interesting analysis:

Draw rate = − R.Diff / 32.49 + exp((Av.Rating - 2254.7) / 208.49) + 23.87
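A minimal Python sketch of the formula above, assuming the result is a percentage and that R.Diff is taken as an absolute value. The function name `kirill_draw_rate` is hypothetical and only used for illustration:

```python
import math

def kirill_draw_rate(rating_a: float, rating_b: float) -> float:
    """Estimated draw rate in percent, per the formula quoted above."""
    r_diff = abs(rating_a - rating_b)            # assumption: R.Diff >= 0
    av_rating = (rating_a + rating_b) / 2.0      # Av.Rating
    return -r_diff / 32.49 + math.exp((av_rating - 2254.7) / 208.49) + 23.87

print(round(kirill_draw_rate(2700, 2685), 2))    # 31.57
```

Nothing here accounts for time control; the coefficients are the ones from Kirill's fit and would need refitting for other data.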

It is clear that time control significantly influences the draw rate in addition to average rating and rating difference.

Since I've never analyzed this, my questions are as follows. 1) Is there a better formula for determining estimated draw rate? 2) Assuming we have a good draw rate formula, what adjustments would be suggested to generate the draw rate for rapid versus classical time controls (again in elite play between humans).

My goal is to first simulate an elite RR tournament at classical time controls, and later simulate something like the CCT knockout format at rapid time controls. In both of these types of events, the vast majority of players are in the 2700 and above club which narrows the scope of the formula. (i.e. I don't need it to be accurate for club level players or when Elo1 and Elo2 are extremely different >+-300.)

Any suggestions, recommended reading, etc. would be appreciated,
Mike.

Brunetti · Post by **Brunetti** » Wed Jan 08, 2025 7:07 pm

mbabigian wrote: ↑Wed Jan 08, 2025 6:30 pm It is clear that time control significantly influences the draw rate in addition to average rating and rating difference.
...
1) Is there a better formula for determining estimated draw rate?

Hello,
I believe a player's style has a significant influence. While having the same Elo implies that two players will, on average, achieve the same score in a given tournament, their paths to that score may differ: one might draw all their games, while the other could achieve a mix of wins, draws, and losses in equal proportions, for example.

Thus, while it's possible to define general statistical formulas, when simulating a tournament with specific players, it's important to also account for an "aggressiveness factor" unique to each of them.

Alex

mbabigian · Post by **mbabigian** » Wed Jan 08, 2025 7:13 pm

True, but that is beyond the scope of what I'm willing to program. I'm just trying to have a reasonable draw rate statistic for a typical player of high Elo at the two different time controls. Assigning different draw values per player pair, although interesting, is likely to be impossible to calculate and needlessly complicated.

Ajedrecista · Post by **Ajedrecista** » Wed Jan 08, 2025 8:32 pm

Hello Mike:

I do not know about other formulas, but I find your project very interesting. TC plays a role, of course, with longer TC resulting in a higher draw rate. The parameter here could be the average time per move, if we guess the number of moves of a typical game (the number of moves can depend on the base time, increments if available, sudden death TC, repeating TC, Fischer TC... we can complicate the whole problem ad infinitum, which does not seem like a good idea). I know a source with an estimate of the relation between thinking time and R.Diff, which should be useless for you since your RR simulator expects normal games without time odds, that is, each player with the same time at the start of the game.

A basic analysis of the formula given in the original post suggests a draw rate of 24.87% (23.87 + exp(0)) with both players rated 2254.7 Elo (Av.Rating = 2254.7 Elo and R.Diff = 0), with the draw rate rising when Av.Rating > 2254.7 Elo (falling when Av.Rating < 2254.7 Elo) and logically falling with a growing R.Diff (the score being more favourable to the stronger player, with more wins and fewer draws).

To be honest, I have been thinking from time to time about something related for a Swiss tournament simulator, without achieving significant results: given an estimated overall draw rate, obtain a rule-of-thumb formula for the number of rounds as a function of the number of players. The simplest, self-evident case is no draws (D = 0), then rounds = ceiling[log2(players)] = rounds(players)... but what about rounds(players, D)? I have thought about some outlier cut like Chauvenet's criterion, a mean (µ) of ½ (in a random game, each player gets ½ point on average, regardless of the outcome of the game; and the average of points of all players after the games is rounds/2, or ½ in the [0, 1] range) and a sample standard deviation from the trinomial distribution s = sqrt{[µ·(1 - µ) - D/4]/(rounds - 1)} = ½·sqrt[(1 - D)/(rounds - 1)]. Chauvenet's criterion would be Prob. > 1 - 1/(2·players) or something similar like Prob. > 1 - k/players, with k < 3/2 in an attempt to ensure fewer than two players winning the tournament.
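The two closed-form pieces of that idea can be sketched in Python as follows. This only evaluates the expressions quoted in the post; it does not attempt the open question of rounds(players, D), and the function names are hypothetical:

```python
import math

def rounds_without_draws(players: int) -> int:
    """rounds = ceiling[log2(players)], the D = 0 case."""
    return math.ceil(math.log2(players))

def score_std(draw_rate: float, rounds: int) -> float:
    """s = 1/2 * sqrt[(1 - D) / (rounds - 1)], with D as a fraction in [0, 1]."""
    return 0.5 * math.sqrt((1.0 - draw_rate) / (rounds - 1))

print(rounds_without_draws(100))   # 7
print(score_std(0.5, 9))           # 0.125
```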

Or the other way around, with a known number of rounds and an unknown number of players, going directly to the trinomial distribution, without means and standard deviations, just with the probability mass function: knowing that rounds = wins + draws + losses, start by computing the probability of a perfect score [knowing that Prob.(1 win) = Prob.(1 loss) = (1 - D)/2], then the probability of points = rounds - ½ and so on, applying Chauvenet's criterion again and trying to bound the number of players for each number of rounds. The drawback might be that small changes in D (the estimated overall draw rate) could lead to big changes in the results.
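As a rough illustration of the trinomial approach, the sketch below tabulates the probability of every final score for a single player, assuming equal opposition in every round so that Prob.(win) = Prob.(loss) = (1 - D)/2. The Chauvenet cut-off is left out, and `score_distribution` is a hypothetical name:

```python
from math import comb

def score_distribution(rounds: int, draw_rate: float) -> dict:
    """Map each possible point total to its probability (trinomial PMF)."""
    p_win = p_loss = (1.0 - draw_rate) / 2.0
    dist = {}
    for wins in range(rounds + 1):
        for draws in range(rounds - wins + 1):
            losses = rounds - wins - draws
            # Number of orderings times the probability of one ordering.
            prob = (comb(rounds, wins) * comb(rounds - wins, draws)
                    * p_win ** wins * draw_rate ** draws * p_loss ** losses)
            points = wins + 0.5 * draws
            dist[points] = dist.get(points, 0.0) + prob
    return dist

dist = score_distribution(rounds=5, draw_rate=0.5)
print(dist[5.0])   # perfect score: 0.25 ** 5
print(dist[4.5])   # points = rounds - 1/2
```

From such a table one could compare tail probabilities against a threshold like 1 - 1/(2·players), which is where the sensitivity to D mentioned above would show up.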

I wish you good luck, and I would welcome some feedback if you achieve something. I'll stay tuned!

Regards from Spain.

Ajedrecista.

mbabigian · Post by **mbabigian** » Wed Jan 08, 2025 10:17 pm

Ajedrecista, your reply is much appreciated. Although I hadn't originally thought about doing Swiss pairings, a conversation with my wife as we watched the World Rapid and Blitz got me thinking about that too. After reading the Swiss pairing rules, however, I shelved that nightmare for some day when I am more motivated.

The conversation with my wife morphed from game theory and why FIDE doesn't hire someone skilled in the topic to analyze their broken rules and tournament construction (beforehand), to an example regarding byes. Say you have a Swiss event with known players (ratings and total participants), and that tournament allows for half or full point byes in certain rounds. I suggested to her that you could simulate your chances of winning the tournament, playing all rounds, and separate simulations taking one bye in each of the allowed rounds. With full point byes for example, there must be a round of the tournament based on your rating and number of players that would give you the maximum boost up the standings. I assume no tournament would allow a full point bye in the final round. In such a situation, if a half or full point bye helped your chances, you could select the best round in advance. Game theory of course would suggest that once one person started doing this everyone would follow suit, but I digress. The whole conversation started with "chess players are smart, if there is a way to exploit the rules, someone will find it."

So if you ever do write a Swiss tournament simulator, it may just change the way players handle allowed byes!

Currently I've got the Tournament input data coded (Players, ratings, any round or partial round results already played etc.), I have the current draw model implemented along with the Elo Calc that determines the percent chance of each possible outcome and am now working on calculating typical tie-breaks for each player (Buchholz, Average Rating of Opponents, Sonneborn-Berger, Sum of Progressive Scores, etc). Once done with that I can select the tie breaks the tournament rules state are in use and run the simulator. Since I'm doing round robin first (easiest pairing scheme), I'll worry about tournaments that use Armageddon to break ties, if and when, I code the knockout pairings. Armageddon is another noodle twister. I suspect the best I'll be able to do is take the Elo calc, assume draws are not possible, and roll the dice, but that topic requires more thought at a future date.

Again thanks for the comments and links,
Mike

towforce · Post by **towforce** » Wed Jan 08, 2025 11:09 pm

Ajedrecista wrote: ↑Wed Jan 08, 2025 8:32 pm A basic analysis on the given formula in the original post suggests that draw rate = 24.87% with both players being rated at 2254.7 Elo (Av.Rating = 2254.7 Elo and R.Diff = 0)...

It looks reasonable on that part of the chart, but this expression gives a draw rate of 100% with both players equal at Elo 3158, which we know to be wrong.
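That figure can be checked by solving the original formula for the rating at which it reaches 100% with R.Diff = 0. A short Python check, assuming the formula returns a percentage:

```python
import math

# Solve 23.87 + exp((R - 2254.7) / 208.49) = 100 for R (equal ratings, R.Diff = 0).
rating_at_100 = 2254.7 + 208.49 * math.log(100.0 - 23.87)
print(round(rating_at_100))   # 3158
```

Above that rating the exponential term keeps growing, so the expression exceeds 100%.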

I would be inclined to drive this expression with a reciprocal rather than an exponential, so that it would get close to 100% at about 4200 (which is probably the upper limit for chess Elo, because at this level a player is basically unbeatable), would get ever closer to 100% as the Elo parameter got higher, but would never exceed 100%.

Two issues with doing this:

1. The Elo/draw curve is probably not actually a smooth curve (this probably doesn't matter in reality - just as many probability and statistical distributions get close enough to normal distributions with a large enough data sample to be able to safely calculate with the normal distribution - which is usually a lot simpler)

2. If you follow this suggestion and build such an expression, you'll be obliged to take a guess at the Elo rating at which the draw rate goes to 100%. My guess was 4400, but Larry Kaufman has given some very good reasons to go with 4200, and I believe that his calculations are better than mine (which was mainly guesswork)

mbabigian · Post by **mbabigian** » Wed Jan 08, 2025 11:43 pm

towforce wrote: ↑Wed Jan 08, 2025 11:09 pm I would be inclined to drive this expression with a reciprocal rather than an exponential, so that it would get close to 100% at about 4200 (which is probably the upper limit for chess Elo, because at this level a player is basically unbeatable), would get ever closer to 100% as the Elo parameter got higher, but would never exceed 100%.

This is very interesting. Say you took a database of games where each player is 2700 or higher and all games were played at classical TC, calculated the draw curve as Kirill did, and then adjusted the coefficients of the (reciprocal) formula to be centered at the average player rating (say ~2750) rather than on 2254 as in the original formula. Wouldn't this reduce the issue of picking the upper limit (4200/4400), as the calculation should be most accurate around the center where it will be used and only drift away once the ratings are well above human level or well below elite play? Then repeat the process using a database of rapid TC games and use those coefficients for faster events.

Does that make sense? This doesn't resolve the upper Elo limit question, but may reduce its significance.

When I originally posted my question I was already wondering if someone in the community had already done such an analysis. For example, Chess.com often shows their computer simulation results at the beginning of their tournaments showing the odds of each player winning and @ChessNumbers (on X) does the same. You'd assume a satisfactory draw rate solution must have been attempted already.

towforce · Post by **towforce** » Thu Jan 09, 2025 12:00 am

mbabigian wrote: ↑Wed Jan 08, 2025 11:43 pm
towforce wrote: ↑Wed Jan 08, 2025 11:09 pm I would be inclined to drive this expression with a reciprocal rather than an exponential, so that it would get close to 100% at about 4200 (which is probably the upper limit for chess Elo, because at this level a player is basically unbeatable), would get ever closer to 100% as the Elo parameter got higher, but would never exceed 100%.
This is very interesting. Say you took a database of games where each player is 2700 or higher and all games were played at classical TC, calculated the draw curve as Kirill did, and then adjusted the coefficients of the (reciprocal) formula to be centered at the average player rating (say ~2750) rather than on 2254 as in the original formula. Wouldn't this reduce the issue of picking the upper limit (4200/4400), as the calculation should be most accurate around the center where it will be used and only drift away once the ratings are well above human level or well below elite play? Then repeat the process using a database of rapid TC games and use those coefficients for faster events.

Does that make sense? This doesn't resolve the upper Elo limit question, but may reduce its significance.

Good point - thank you.

For your usage, this is undoubtedly the better way to go. Obviously I was thinking about building a universal function to give draw rate from Elo rating (with both players having the same rating).

mbabigian · Post by **mbabigian** » Thu Jan 09, 2025 8:19 pm

What do you think of this formula construction?

DrawRate = 1 / (1 + exp(-(a - b*Rdiff + c*Ravg)))

Swagging in some values for a, b, and c (I don't have time to do database mining at the moment). I used:
a=-10.69
b=0.005
c=0.00419

This gives a draw rate of 99.9% for two 4200 players, and a draw rate of 65% for two 2700 players. A draw rate of 58.2% for a 2800 vs 2700.
Obviously a, b, and c would need to be calibrated to available data for classical events and again for rapid events.
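A small Python sketch of this logistic form, useful for checking candidate coefficients. Note the assumption: c = 0.00419 is used here because it is the value that reproduces all three percentages quoted above (99.9%, 65% and 58.2%) together with a = -10.69 and b = 0.005. The function name is hypothetical:

```python
import math

A, B, C = -10.69, 0.005, 0.00419   # C assumed: the value matching the quoted results

def logistic_draw_rate(rating_1, rating_2, a=A, b=B, c=C):
    """DrawRate = 1 / (1 + exp(-(a - b*Rdiff + c*Ravg))), as a fraction in (0, 1)."""
    r_diff = abs(rating_1 - rating_2)
    r_avg = (rating_1 + rating_2) / 2.0
    return 1.0 / (1.0 + math.exp(-(a - b * r_diff + c * r_avg)))

for pair in [(4200, 4200), (2700, 2700), (2800, 2700)]:
    print(pair, f"{100 * logistic_draw_rate(*pair):.1f}%")
```

Because the output is a sigmoid, it approaches but never exceeds 100%, which is the property asked for earlier in the thread.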

Is this more in line with what you were thinking?

Ajedrecista · Post by **Ajedrecista** » Thu Jan 09, 2025 8:57 pm

Hello Mike:

I think I bring good news to you: I suddenly remembered that the author of Gaviota chess engine coded a Round-Robin simulator a while ago:

Round Robin simulations program

The site is gone, but there is an archived copy with working download links!

bleeding edge - gaviota chess engine

The programme is called rrsim-v0.5-win and works on Windows. The direct download link is here.

There was some discussion about draw models in that TalkChess thread, but I think they are not related to average rating.

------------

There are more threads about draw models, which can be dense:

1 draw=1 win + 1 loss (always!)

Re: H4 or S5 !?

And they probably do not relate draw rate to average rating either.

------------

mbabigian wrote: ↑Wed Jan 08, 2025 10:17 pm [...]

[...] Since I'm doing round robin first (easiest pairing scheme), I'll worry about tournaments that use Armageddon to break ties, if and when, I code the knockout pairings. Armageddon is another noodle twister. I suspect the best I'll be able to do is take the Elo calc, assume draws are not possible, and roll the dice, but that topic requires more thought at a future date.

[...]

I see that you write about Armageddon, so my link about time odds can be useful after all! The difference in allocated times creates an artificial additional rating difference, which was modeled as (t_B)/(t_A) = 2^[(Rating_A - Rating_B)/120]; Rating_A - Rating_B = 120*log2[(t_B)/(t_A)] in that source, and I propose adding it to the real rating difference. Since typical time odds for Armageddon are 6 minutes (white) versus 5 minutes (black), or 5 minutes (white) versus 4 minutes (black), you get 120*log2(6/5) ~ 31.56 Elo or 120*log2(5/4) ~ 38.63 Elo of additional rating difference in favour of the white side. Since your original parameter R.Diff >= 0 as I understand it (an absolute value could be helpful in the formula to avoid sign mix-ups, such as -|R.Diff|/32.49), you must be careful about who plays white (the stronger player or the weaker one).
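The time-odds conversion can be sketched in Python as below. The factor of 120 Elo per doubling of time comes from the source mentioned in the post; the function name and the convention of returning the bonus from white's point of view are assumptions made for illustration:

```python
import math

def time_odds_elo(white_minutes: float, black_minutes: float) -> float:
    """Extra Elo in white's favour: 120 * log2(t_white / t_black)."""
    return 120.0 * math.log2(white_minutes / black_minutes)

print(round(time_odds_elo(6, 5), 2))   # 31.56
print(round(time_odds_elo(5, 4), 2))   # 38.63
```

With equal times the bonus is zero, and it turns negative if white has less time than black.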

I would go this way: if you have W playing white with a rating of 2700 Elo and B playing black with a rating of 2685 Elo (ordinary R.Diff = 15 Elo), with a supposed artificial additional rating difference of 35 Elo in Armageddon, then R.Diff = 2700 - 2685 + 35 = 50 Elo in favour of W. OTOH, with the colours reversed (the weaker player getting a little help from the time odds), R.Diff = 2700 - 2685 + (-35) = -20 Elo in favour of W, so go with 20 Elo in favour of B (the importance of the absolute value once again). Furthermore, you can keep Av.Rating unchanged if you split this new artificial additional rating difference in half between the two players, in an attempt not to complicate the problem further. In other words, with players W and B as before, you feed the formula with R.Diff = 15 Elo and Av.Rating = 2692.5 Elo in a normal game; but, when playing Armageddon, you can set:

W vs. B: Rating_W = 2700 + 35/2 = 2717.5 Elo; Rating_B = 2685 - 35/2 = 2667.5 Elo; R.Diff = |2717.5 - 2667.5| = 50 Elo; Av.Rating = (2717.5 + 2667.5)/2 = 2692.5 Elo.
B vs. W: Rating_B = 2685 + 35/2 = 2702.5 Elo; Rating_W = 2700 - 35/2 = 2682.5 Elo; R.Diff = |2702.5 - 2682.5| = 20 Elo; Av.Rating = (2702.5 + 2682.5)/2 = 2692.5 Elo.

Once you have computed the draw rate Prob.D from the input parameters, the probabilities of a white win (Prob.W), a draw and a black win (Prob.B) must add up to 1 (Prob.W + Prob.D + Prob.B = 1), that is, 100%. Since Prob.W + Prob.B = 1 - Prob.D by virtue of the former equation, you can set up a second equation for Prob.W - Prob.B:

Prob.W + Prob.B = 1 - Prob.D  // (Eq. 1)

------------

From white's POV:
R.Diff = 400*log10{[½ + ½·(Prob.W - Prob.B)]/[½ - ½·(Prob.W - Prob.B)]}
10^[(R.Diff)/400] = [1 + (Prob.W - Prob.B)]/[1 - (Prob.W - Prob.B)]
[...]  // Doing the math...
Prob.W - Prob.B = {10^[(R.Diff)/400] - 1}/{10^[(R.Diff)/400] + 1}  // (Eq. 2)

------------

(Eq. 1) + (Eq. 2)
2*Prob.W = 1 - Prob.D + {10^[(R.Diff)/400] - 1}/{10^[(R.Diff)/400] + 1}

######
Prob.W = ½·(1 - Prob.D + {10^[(R.Diff)/400] - 1}/{10^[(R.Diff)/400] + 1})
######

------------

(Eq. 1) - (Eq. 2)
2*Prob.B = 1 - Prob.D - {10^[(R.Diff)/400] - 1}/{10^[(R.Diff)/400] + 1}

######
Prob.B = ½·(1 - Prob.D - {10^[(R.Diff)/400] - 1}/{10^[(R.Diff)/400] + 1})
######

Finally, the player with the white pieces wins the Armageddon tie-break with probability Prob.W and loses with probability Prob.D + Prob.B, which you can sample with a proper PRNG ('rolling the dice' in your own words).
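Putting the pieces together, here is a hedged Python sketch of the procedure described above: the draw rate from the original post's formula, Prob.W and Prob.B from (Eq. 1) and (Eq. 2), and an Armageddon roll in which black has draw odds. The 35 Elo bonus is the illustrative value from this post, and all function names are hypothetical:

```python
import math
import random

def draw_rate(r_diff, av_rating):
    """Draw probability as a fraction, from the formula in the original post."""
    percent = -abs(r_diff) / 32.49 + math.exp((av_rating - 2254.7) / 208.49) + 23.87
    return percent / 100.0

def outcome_probs(rating_white, rating_black, white_bonus=0.0):
    """Return (Prob.W, Prob.D, Prob.B); white_bonus is split half to each player."""
    rw = rating_white + white_bonus / 2.0
    rb = rating_black - white_bonus / 2.0
    signed_diff = rw - rb                        # R.Diff from white's point of view
    p_draw = draw_rate(signed_diff, (rw + rb) / 2.0)
    q = 10.0 ** (signed_diff / 400.0)
    edge = (q - 1.0) / (q + 1.0)                 # Prob.W - Prob.B  (Eq. 2)
    p_white = 0.5 * (1.0 - p_draw + edge)
    p_black = 0.5 * (1.0 - p_draw - edge)
    return p_white, p_draw, p_black

def white_wins_armageddon(rating_white, rating_black, bonus=35.0, rng=random):
    """Black has draw odds, so white advances only by winning the game."""
    p_white, _, _ = outcome_probs(rating_white, rating_black, bonus)
    return rng.random() < p_white

print(outcome_probs(2700, 2685))          # normal game
print(outcome_probs(2700, 2685, 35.0))    # Armageddon, stronger player has white
print(outcome_probs(2685, 2700, 35.0))    # Armageddon, weaker player has white
```

Using a signed difference inside `outcome_probs` and taking the absolute value only in the draw-rate term is one way to avoid the sign pitfalls mentioned earlier; first-move advantage is still not modelled.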

Checking the original post formula of draw rate plus my own math with real numbers:

W(2700), B(2685)
B(2685), W(2700)
NORMAL GAME (regardless of W-B or B-W because first move advantage is not part of the model):
W wins ~  36.37%
Draw   ~  31.57%
B wins ~  32.06%
----------------
SUM    = 100.00%

############

W(2717.5), B(2667.5)
ARMAGEDDON W-B:
W wins ~  41.90%
Draw   ~  30.50%
B wins ~  27.60%
----------------
SUM    = 100.00%
W wins Armageddon ~ 41.90%
B wins Armageddon ~ 58.10%

############

B(2702.5), W(2682.5)
ARMAGEDDON B-W:
B wins ~  37.17%
Draw   ~  31.42%
W wins ~  31.41%
----------------
SUM    = 100.00%
B wins Armageddon ~ 37.17%
W wins Armageddon ~ 62.83%

I hope there are no typos.

Regards from Spain.

Ajedrecista.
