Sat. UPDATE- Houdini & Rainbow @ 400 games!

geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Sat. UPDATE- Houdini & Rainbow @ 400 games!

Post by geots »

Houdini 2.0c x64 vs Rainbow UNLimited


I guess nothing lasts forever, and neither did Rainbow's run. Though the Elo difference is still respectable, Houdini has increased his lead from 12 games back to 16 games. I'm afraid that pretty much seals the deal. There is not one doubt left in my mind that over the long haul, Houdini is the stronger engine. By exactly what Elo difference is debatable; for me the rest is written in stone.


Intel i5 w/4TCs
Fritz 13 gui
1CPU/64bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
Perfect 12.32 book w/12-move limit

5'+5"
Match=500 games


[after 400 games]

Houdini 2.0c x64     +14    +111/-95/=194   52.00%   208.0/400
Rainbow UNLimited    -14    +95/-111/=194   48.00%   192.0/400

Well, I came to post this update, and stopped in "General" first. Read Ansari's thread and clicked on his link to the ChessBase article concerning Fischer-Spassky in '72. Ended up playing through "The Game of the Century" against Byrne 2 or 3 more times, and almost forgot to post this update.

By the time I posted the above, Houdini had run his lead from the 16 games shown here back up to 19 games. Mop-up time, I would think.



Tally-ho,

george
ZirconiumX
Posts: 1361
Joined: Sun Jul 17, 2011 11:14 am
Full name: Hannah Ravensloft

Re: Sat. UPDATE- Houdini & Rainbow @ 400 games!

Post by ZirconiumX »

Try pitting Fern's latest beta version of Moron against Houdini. That'll be dramatic.

Sarcasm aside, I think the next version of Strelka is the only one with even a hint of a chance.

Matthew:out
tu ne cede malis, sed contra audentior ito
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: Sat. UPDATE- Houdini & Rainbow @ 400 games!

Post by geots »

When it's all said and done, I think Moron's chances are about as good as the rest. :lol: :lol:


george
Ajedrecista
Posts: 2179
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Sat. UPDATE - Houdini & Rainbow @ 400 games!

Post by Ajedrecista »

Hello!
Referring to the first 400 games, Rainbow's performance is very decent IMHO; here are my LOS (likelihood of superiority) values for this match (from Rainbow's POV) and my error bars for 95% confidence:

LOS_and_Elo_uncertainties_calculator, ® 2012.

----------------------------------------------------------------
Calculation of Elo uncertainties in a match between two engines:
----------------------------------------------------------------

(The input and output data is referred to the first engine).

Please write down non-negative integers.

Write down the number of wins (up to 1825361100):

95

Write down the number of loses (up to 1825361100):

111

Write down the number of draws (up to 2147483646):

194

 Write down the confidence level (in percentage) between 65% and 99.9% (it will be rounded up to 0.01%):

95

Write down the clock rate of the CPU (in GHz), only for timing the elapsed time of the calculations:

3

---------------------------------------
Elo interval for 95.00 % confidence:

Elo rating difference:    -13.90 Elo

Lower rating difference:  -38.45 Elo
Upper rating difference:   10.50 Elo

Lower bound uncertainty:  -24.54 Elo
Upper bound uncertainty:   24.41 Elo
Average error:        +/-  24.48 Elo

K = (average error)*[sqrt(n)] =  489.52

Elo interval: ] -38.45,   10.50[
---------------------------------------

Number of games of the match:       400
Score: 48.00 %
Elo rating difference:  -13.90 Elo
Draw ratio: 48.50 %

*********************************************************
Standard deviation:  3.5109 % of the points of the match.
*********************************************************

 Error bars were calculated with two-sided tests; values are rounded up to 0.01 Elo, or 0.01 in the case of K.

-------------------------------------------------------------------
Calculation of likelihood of superiority (LOS) in a one-sided test:
-------------------------------------------------------------------

LOS (taking into account draws) is always calculated, if possible.

LOS (not taking into account draws) is only calculated if wins + loses < 16001.

LOS (average value) is calculated only when LOS (not taking into account draws) is calculated.
______________________________________________

LOS:  13.21 % (taking into account draws).
LOS:  13.30 % (not taking into account draws).
LOS:  13.26 % (average value).
______________________________________________

These values of LOS are rounded up to 0.01%

End of the calculations. Approximated elapsed time:   54 ms.

Thanks for using LOS_and_Elo_uncertainties_calculator. Press Enter to exit.
------------------------
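
The error bars and the draw-aware LOS above can be cross-checked with a plain normal approximation. What follows is only a sketch under that assumption, not the calculator's actual code: the function names are hypothetical, and the per-game variance is assumed to come from the observed win/draw/loss frequencies. With the match figures it gives about -13.9 Elo, an interval of roughly ]-38.45, 10.50[ and a LOS near 13.2%, in line with the output. The LOS that ignores draws is not covered here.

```python
from math import log10, sqrt
from statistics import NormalDist


def elo(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return 400 * log10(score / (1 - score))


def elo_interval_and_los(wins, losses, draws, confidence=0.95):
    """Normal-approximation error bars and LOS (hypothetical helper)."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0 - score) ** 2) / n
    sigma = sqrt(var / n)                            # standard error of the score
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # two-sided quantile
    half_width = z * sigma
    return {
        "elo": elo(score),
        "lower": elo(score - half_width),
        "upper": elo(score + half_width),
        "half_width": half_width,
        "los": NormalDist().cdf((score - 0.5) / sigma),  # one-sided test
    }


result = elo_interval_and_los(wins=95, losses=111, draws=194)
for key, value in result.items():
    print(f"{key}: {value:.4f}")
```

Note that the 3.5109 % that the output labels "standard deviation" corresponds to `half_width` in this sketch, i.e. the standard error already multiplied by the 95% quantile.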

Houdini would have needed to score at least 214 points out of 400 games to reach a LOS of 97.5% or more:

Minimum_score_for_no_regression, ® 2012.

 Calculation of the minimum score for no regression (i.e. negative Elo gain) in a match between two engines:

 Write down the number of games of the match (it must be a positive integer, up to 1073741823):

400

Write down the draw ratio (in percentage):

48.5

 Write down the likelihood of superiority (in percentage) between 75% and 99.9% (LOS will be rounded up to 0.01%):

97.5

Write down the clock rate of the CPU (in GHz), only for timing the elapsed time of the calculations:

3
_______________________________________________________________________________

Theoretical minimum score for no regression: 53.4996 %
Theoretical standard deviation in this case:  3.4996 %

Minimum number of won points for the engine in this match:       214.0 points.

Minimum Elo advantage, which is also the negative part of the error bar:
 24.3603 Elo (for a LOS value of 97.50 %).

A LOS value of 97.50 % is equivalent to 95.00 % confidence in a two-sided test.
_______________________________________________________________________________

End of the calculations. Approximated elapsed time:  17 ms.

Thanks for using Minimum_score_for_no_regression. Press Enter to exit.
------------------------
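
The minimum score above can be approximated in closed form. This is a sketch, not the tool's own method: it assumes a normal approximation in which the per-game variance at score s and draw ratio D is s(1 - s) - D/4, and solves for the score whose lead over 50% equals the one-sided quantile times the standard error. The name `minimum_score` is hypothetical. Under these assumptions the result agrees with the 53.4996 % and 214.0 points printed above, and the Elo figure matches when it is computed from the rounded points.

```python
from math import ceil, log10, sqrt
from statistics import NormalDist


def minimum_score(games, draw_ratio, los):
    """Smallest score fraction whose lead over 50% reaches the given LOS."""
    z = NormalDist().inv_cdf(los)
    # With x = score - 0.5 the variance is 0.25 - x**2 - draw_ratio / 4.
    # Solving x = z * sqrt(variance / games) for x gives:
    x = z * sqrt((1 - draw_ratio) / (4 * (games + z * z)))
    return 0.5 + x


games = 400
score = minimum_score(games, draw_ratio=0.485, los=0.975)
points = ceil(2 * score * games) / 2            # round up to the next half point
elo_advantage = 400 * log10(points / (games - points))

print(f"minimum score: {100 * score:.4f} %")
print(f"minimum points: {points}")
print(f"Elo advantage at that result: {elo_advantage:.4f}")
```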

Minimum_number_of_games, ® 2012.

 Calculation of the minimum number of games in a match between two engines to ensure an Elo gain with a given LOS value:

Write down the wanted Elo gain between 0.1 and 40 Elo (it will be rounded up to 0.01 Elo):

13.9

 Write down the likelihood of superiority (in percentage) between 90% and 99.9% (LOS will be rounded up to 0.01%):

97.5

Write down the clock rate of the CPU (in GHz), only for timing the elapsed time of the calculations:

3
_______________________________________________________________________________

Score for a wanted gain of 13.90 Elo:  51.9993 %
Standard deviation for 97.50 % of LOS:  1.9993 %

A LOS value of 97.50 % is equivalent to 95.00 % confidence in a two-sided test.

Minimum number of needed games:       2400 games.
_______________________________________________________________________________

End of the calculations. Approximated elapsed time:  18 ms.

Thanks for using Minimum_number_of_games. Press Enter to exit.
If this rating difference is maintained over the match, Houdini would be shown to be better than Rainbow with a LOS of 97.5% only after 2400 games... but 500 games are better than nothing! ;) Please keep up the good work. I'll stay tuned.
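
The order of magnitude of that 2400-game figure can be checked with a short sketch. It assumes a normal approximation with no draws (the largest possible per-game variance), which is a guess at what the tool does rather than something stated in the thread; `minimum_games` is a hypothetical name. Under that assumption the answer comes out just under 2400, and the tool's exact rounding is left unexplained.

```python
from math import ceil
from statistics import NormalDist


def minimum_games(elo_gain, los):
    """Games needed before a lead of elo_gain reaches the given LOS.

    Assumes no draws, so the per-game variance is score * (1 - score).
    """
    score = 1 / (1 + 10 ** (-elo_gain / 400))   # expected score for that gain
    lead = score - 0.5
    z = NormalDist().inv_cdf(los)
    # Require z * sqrt(score * (1 - score) / n) <= lead and solve for n.
    n = ceil(z * z * score * (1 - score) / (lead * lead))
    return score, n


score, n = minimum_games(elo_gain=13.9, los=0.975)
print(f"score for 13.9 Elo: {100 * score:.4f} %")
print(f"games needed (no-draw assumption): {n}")
```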

Regards from Spain.

Ajedrecista.
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: Sat. UPDATE - For Jesus!

Post by geots »

Jesus, I realize that a 5-game margin (and I am making these figures up) after 20 games will represent a much, much bigger Elo difference than a 5-game margin after 300 games.

But what surprised me is this: if an engine leads by 13 games after 360 games, and in the next 62 games he increases the lead by 7 more games, to lead at 422 games by a total of 20 games, I would have thought that increasing the lead by 7 games would have increased the Elo difference by at least 1 Elo point. But it doesn't. It is +14 either way. There is no argument there for me to make; it either does or it doesn't. It just sort of surprises me, that's all. (Of course it wouldn't when you are up around 5000 games or something like that. Then 30 more wins wouldn't affect it.) But I thought it would make at least 1 Elo point of difference at 350 or 400 games. Oh well. No big deal.


george
Ajedrecista
Posts: 2179
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Rating difference in terms of (wins - loses).

Post by Ajedrecista »

Hello again:
For the first 360 games, I get a rating difference of ~ 12.55 Elo, not 14 Elo; if I am not wrong, a lead of 20 games after 422 games means a rating difference of 400·log(221/201) ~ 16.48 Elo, so the difference is 400·[log(221/201) - log(186.5/173.5)] ~ 3.93 Elo, more or less.

As you well know, the rating difference is computed by the formula 400·log(points_A/points_B), where log is the base-10 logarithm and points_A and points_B are the points scored by engines A and B. Of course: points_A + points_B = n = number of games of the match.

I tried to put this rating difference in terms of (wins - loses)... I thought that I would not reach anything useful, but I achieved it at last. It is indeed very easy:

n = wins + loses + draws = w + l + d; d = n - w - l

(Rating difference) = rd = 400·log[(w + d/2)/(l + d/2)]
                         = 400·log[(2w + d)/(2l + d)]
                         = 400·log{[2w + (n - w - l)]/[2l + (n - w - l)]}
                         = 400·log{[n + (w - l)]/[n - (w - l)]}

(Rating difference) = rd = 400·log{[n + (w - l)]/[n - (w - l)]}
So, in your first case (from Houdini POV), where n = 360 and (wins - loses) = 13, rd = 400·log[(360 + 13)/(360 - 13)] ~ 12.55 Elo, as I said before; when n = 422 and (wins - loses) = 20, then rd = 400·log[(422 + 20)/(422 - 20)] ~ 16.48 Elo (from Houdini POV), as I already wrote.
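
As a quick numerical check of the identity above, here is a minimal sketch (the function names are illustrative, not from any tool mentioned in the thread) that evaluates the rating difference both from points and from n and (wins - loses), using base-10 logarithms:

```python
from math import log10


def rd_from_points(points_a, points_b):
    """Rating difference from the points scored by each engine."""
    return 400 * log10(points_a / points_b)


def rd_from_lead(n, lead):
    """Rating difference from the number of games n and lead = wins - loses."""
    return 400 * log10((n + lead) / (n - lead))


for n, lead in [(360, 13), (422, 20)]:
    points_a = (n + lead) / 2      # e.g. 186.5 of 360
    points_b = (n - lead) / 2      # e.g. 173.5 of 360
    print(n, lead,
          round(rd_from_points(points_a, points_b), 2),
          round(rd_from_lead(n, lead), 2))
```

Both forms give the same value for each case because (n + lead)/(n - lead) is just points_A/points_B with numerator and denominator doubled.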

I think that you wrongly got those +14 Elo for Houdini due to rounding: when 360 games were played, the result was 186.5 - 173.5 in favour of Houdini, which is around 51.81% - 48.19% (which leads to ~ 12.55 Elo difference), but you rounded up to 52% - 48%, getting ~ 13.9 ~ 14 Elo difference. The same applies when 422 games are played: a 20-game lead means a provisional result of 221 - 201 (in favour of Houdini), which is more or less 52.37% - 47.63% (so, around 16.48 Elo difference), instead of your same rounding of 52% - 48%, getting again ~ 13.9 ~ 14 Elo difference. Please correct me if my guesses are wrong.

Summarizing: if you are going to compute the rating difference manually, I recommend doing it with the number of points of each engine instead of the relative scores (I mean, the percentages): you will avoid nasty rounding errors that hurt the calculation. Please do not hesitate to ask if you have any doubts.

Regards from Spain.

Ajedrecista.
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: Rating difference in terms of (wins - loses).

Post by geots »

Jesus, what I do is compute it the same way the ChessBase Fritz GUI computes it. I learned it from Peter Osterlund:

I take the two %s, let's say 52% and 48%:

I then compute as follows: 0.52 / 0.48, take the log (base 10), then multiply by 400, which gives (in this case) 13.9, and round off to 14. I checked and that is the way ChessBase does it in their GUIs. It is not bayeselo, and Peter I think said it might not be quite as accurate, but close.
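
A small sketch of that calculation, assuming the two percentages are first rounded to whole numbers as displayed (whether the Fritz GUI itself rounds this way internally is not confirmed in this thread, and the function name is made up for illustration). It shows why the 360-game and 422-game results both come out at 13.9:

```python
from math import log10


def elo_from_rounded_percent(points_a, points_b):
    """400 * log10 of the ratio of whole-number percentages."""
    total = points_a + points_b
    pct_a = round(100 * points_a / total)   # e.g. 51.81 -> 52
    pct_b = 100 - pct_a                     # e.g. 48
    return 400 * log10(pct_a / pct_b)


# 186.5 - 173.5 after 360 games and 221 - 201 after 422 games
for points_a, points_b in [(186.5, 173.5), (221, 201)]:
    print(points_a, points_b, round(elo_from_rounded_percent(points_a, points_b), 1))
```

Both results round to 52% - 48% before the logarithm is taken, so the extra 7-game lead disappears in the rounding step.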


george

PS: If you answer back yourself I may not see it for a while. I can't keep my eyes open. Gotta get some sleep.
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: Rating difference in terms of (wins - loses).

Post by geots »

Jesus- one thing before I head to get some sleep. I am not sure it means anything other than whatever. But in the last 153 games, the score for those games only is- Houdini leading: +39/-38/=76. I know very well the old saying: "If 'ifs and buts' were candy and nuts, oh what a Christmas we'd have!" But if...........................................


george