Another attempt at comparing Evals ELO-wise


Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Another attempt at comparing Evals ELO-wise

Post by Lyudmil Tsvetkov »

[d]r2r2k1/pp3p2/2p1bPpp/4p3/1Pn1P2P/P1NqQ1P1/3B1P2/R2R2K1 w - - 0 22

another interesting position.
SF does not find Qh6 here, with mate in 2 moves; instead, it prefers the queen exchange on d3. (capturing a whole queen is of course worth more than simply attacking the king shelter, but not when the enemy queen is defended by another piece, so that the capture is only an equal exchange)

so far so good.
will stop posting similar examples, as otherwise main SF developers will start lashing out at me...
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Another attempt at comparing Evals ELO-wise

Post by Rebel »

Lyudmil Tsvetkov wrote:[d]r2r2k1/pp3p2/2p1bPpp/4p3/1Pn1P2P/P1NqQ1P1/3B1P2/R2R2K1 w - - 0 22

another interesting position.
SF does not find Qh6 here, with mate in 2 moves; instead, it prefers the queen exchange on d3. (capturing a whole queen is of course worth more than simply attacking the king shelter, but not when the enemy queen is defended by another piece, so that the capture is only an equal exchange)

so far so good.
will stop posting similar examples, as otherwise main SF developers will start lashing out at me...
They should do anyway :lol:

Basically, don't criticize what you don't understand, in this case the SF concept (model).
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Another attempt at comparing Evals ELO-wise

Post by Lyudmil Tsvetkov »

Rebel wrote:
Lyudmil Tsvetkov wrote:[d]r2r2k1/pp3p2/2p1bPpp/4p3/1Pn1P2P/P1NqQ1P1/3B1P2/R2R2K1 w - - 0 22

another interesting position.
SF does not find Qh6 here, with mate in 2 moves; instead, it prefers the queen exchange on d3. (capturing a whole queen is of course worth more than simply attacking the king shelter, but not when the enemy queen is defended by another piece, so that the capture is only an equal exchange)

so far so good.
will stop posting similar examples, as otherwise main SF developers will start lashing out at me...
They should do anyway :lol:

Basically, don't criticize what you don't understand, in this case the SF concept (model).
above sentence was meant only for people with a sense of humour.

besides, we have been discussing eval prowess of current engines, so I do not understand where my post is off-topic.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Another attempt at comparing Evals ELO-wise

Post by Rebel »

Lyudmil Tsvetkov wrote:
Rebel wrote:
Lyudmil Tsvetkov wrote:[d]r2r2k1/pp3p2/2p1bPpp/4p3/1Pn1P2P/P1NqQ1P1/3B1P2/R2R2K1 w - - 0 22

another interesting position.
SF does not find Qh6 here, with mate in 2 moves; instead, it prefers the queen exchange on d3. (capturing a whole queen is of course worth more than simply attacking the king shelter, but not when the enemy queen is defended by another piece, so that the capture is only an equal exchange)

so far so good.
will stop posting similar examples, as otherwise main SF developers will start lashing out at me...
They should do anyway :lol:

Basically, don't criticize what you don't understand, in this case the SF concept (model).
above sentence was meant only for people with a sense of humour.
And then you missed mine?
Lyudmil Tsvetkov wrote:besides, we have been discussing eval prowess of current engines, so I do not understand where my post is off-topic.
Who said your post was off-topic?
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Another attempt at comparing Evals ELO-wise

Post by Lyudmil Tsvetkov »

Rebel wrote:
Lyudmil Tsvetkov wrote:
Rebel wrote:
Lyudmil Tsvetkov wrote:[d]r2r2k1/pp3p2/2p1bPpp/4p3/1Pn1P2P/P1NqQ1P1/3B1P2/R2R2K1 w - - 0 22

another interesting position.
SF does not find Qh6 here, with mate in 2 moves; instead, it prefers the queen exchange on d3. (capturing a whole queen is of course worth more than simply attacking the king shelter, but not when the enemy queen is defended by another piece, so that the capture is only an equal exchange)

so far so good.
will stop posting similar examples, as otherwise main SF developers will start lashing out at me...
They should do anyway :lol:

Basically, don't criticize what you don't understand, in this case the SF concept (model).
above sentence was meant only for people with a sense of humour.
And then you missed mine?
Lyudmil Tsvetkov wrote:besides, we have been discussing eval prowess of current engines, so I do not understand where my post is off-topic.
Who said your post was off-topic?
I hate personalising the thread, so it is better to stop somewhere here.

you said that I should not be criticising the SF concept/model, that I do not understand.
what is so difficult to understand? once the recursion reaches depth 0, SF switches to quiescence search, like any other engine around, so that it only evaluates 'quiet' positions with no available captures, checks or other relevant threat moves.
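
A minimal sketch of the idea described above, not Stockfish's actual code. The `Node` class is a hypothetical toy position that stores only a static score and the positions reachable by captures; a real engine would also consider checks and other forcing moves, as the post says. The two example positions are made up to mirror the defended-queen case from the start of the thread.

```python
from dataclasses import dataclass, field
from typing import List

INF = 10**9  # stands in for an infinite search window


@dataclass
class Node:
    """Hypothetical toy position (an assumption for illustration only)."""
    static_eval: int  # centipawns, from the point of view of the side to move
    captures: List["Node"] = field(default_factory=list)  # positions after each capture


def quiescence(node: Node, alpha: int, beta: int) -> int:
    """Negamax quiescence search over captures only, with a stand-pat score."""
    stand_pat = node.static_eval      # the side to move may decline every capture
    if stand_pat >= beta:
        return beta                   # already too good: fail high
    alpha = max(alpha, stand_pat)
    for child in node.captures:
        # The child is scored from the opponent's point of view, hence the negation.
        score = -quiescence(child, -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha


# Queen capture with no recapture available: the capture really wins material.
undefended_queen = Node(-100, [Node(-800)])

# Queen capture answered by a recapture: an equal exchange, back to 0.
defended_queen = Node(0, [Node(-900, [Node(0)])])

print(quiescence(undefended_queen, -INF, INF))  # 800
print(quiescence(defended_queen, -INF, INF))    # 0
```

In the second position a static evaluation taken right after the capture would report +900 for the capturing side, while the quiescence score stays at 0 because the recapture is searched before anything is evaluated.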

without quiescence search and depth 1 SF probably plays around 1500 elo
without quiescence search and depth 10 SF probably plays around 2000 elo
with qs and depth 1 probably around same 2000 elo
with qs and depth 10 around 2500 elo, etc.

so, basically, qs should be worth at least around 500-1000 elo alone.

etc., etc., etc., too boring to discuss when minds do not quite meet.

I guess the best thing you could currently do is offer another engine that would play 200 elo stronger, but only on Fridays.
BrendanJNorman
Posts: 2526
Joined: Mon Feb 08, 2016 12:43 am
Full name: Brendan J Norman

Re: Another attempt at comparing Evals ELO-wise

Post by BrendanJNorman »

Lyudmil Tsvetkov wrote:I guess the best thing you could currently do is offer another engine that would play 200 elo stronger, but only on Fridays.
*Shots Fired, Shots Fired! Get Down! This Guy is the Rebel of the Decade!*

(see what I did there? :lol:)

Okay, back to work - nothing to see here folks. :x
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Another attempt at comparing Evals ELO-wise

Post by Rebel »

lkaufman wrote:
cdani wrote:
lkaufman wrote: Details like whether you include checks in qsearch, whether you prune "losing" moves in qsearch, etc. have a huge effect on one ply search results. Unless these things are all very similar, it's not a test of eval. Also check extension rules play a huge role even at one ply.
Of course. I tried to make the two executables equal:

* At root search do not prune/extend.
* At first qsearch depth do not generate checks, unless a capture is a check.
* Qsearch futility pruning is in effect.

Probably the two versions are not 100% equal, but they should be mostly.
I wouldn't be too concerned about the 90 elo gap you report from SF at one ply, even if the searches are identical. The evals of all of the top engines are totally unsuitable for one ply games. Similarly an absolutely optimum one-ply eval would probably be more than a hundred elo worse than what top engines use currently for games with reasonable time limits. Are your weights for things like potential checks, or threats to enemy pieces, significantly higher or lower than those of SF? If so that could account for the one-ply results.
Wouldn't it be interesting to port the eval of Komodo and SF to TSCP? Then play time control and fixed depth matches. You get all sorts of insights and probably can profit. Surely Mark is able to port SF also :wink:
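
For readers who want to relate a fixed depth match result to a gap like the 90 elo quoted above, here is a rough sketch. It assumes the usual logistic formula with a 400-point scale; the function name `elo_diff_from_score` is made up for this example, and rating tools used for engine testing may apply more elaborate models.

```python
import math


def elo_diff_from_score(wins: int, draws: int, losses: int) -> float:
    """Rating difference implied by a match score, under the logistic model (assumed)."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("rating difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)


# A score of roughly 62.7% corresponds to a gap of about 90 elo.
print(round(elo_diff_from_score(627, 0, 373)))
```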
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Another attempt at comparing Evals ELO-wise

Post by Rebel »

Lyudmil Tsvetkov wrote: I hate personalising the thread, so it is better to stop somewhere here.

you said that I should not be criticising the SF concept/model, that I do not understand.
what is so difficult to understand? once the recursion reaches depth 0, SF switches to quiescence search, like any other engine around, so that it only evaluates 'quiet' positions with no available captures, checks or other relevant threat moves.

without quiescence search and depth 1 SF probably plays around 1500 elo
without quiescence search and depth 10 SF probably plays around 2000 elo
with qs and depth 1 probably around same 2000 elo
with qs and depth 10 around 2500 elo, etc.

so, basically, qs should be worth at least around 500-1000 elo alone.

etc., etc., etc., too boring to discuss when minds do not quite meet.
??

SF is a search engine with an excellent eval. What we are discussing is the eval part while you constantly want to involve search into the discussion.
Lyudmil Tsvetkov wrote:I guess the best thing you could currently do is offer another engine that would play 200 elo stronger, but only on Fridays.
:lol:

Your memory is good.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Another attempt at comparing Evals ELO-wise

Post by corres »

Laskos wrote:Also, ELO as used in computer chess is not FIDE Elo and is not what Arpad Elo did (he used normal distribution, for one).

You are right, but why not use "bayeselo" or "laskoselo" instead of "ELO"?
That would mark the difference between FIDE Elo and the computer chess "Elo" you use more precisely.
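
To illustrate the distinction being discussed, here is a small sketch comparing the two expected-score curves. It assumes the logistic curve with a 400-point scale for the computer chess side, and a normal model in which each player's performance has a standard deviation of 200 points for Arpad Elo's original scheme; both function names and the default `sigma` are assumptions made for this example.

```python
import math


def expected_logistic(diff: float) -> float:
    """Expected score for a rating advantage `diff` under the logistic curve."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


def expected_normal(diff: float, sigma: float = 200.0) -> float:
    """Expected score under a normal model.

    The difference of two performances, each with standard deviation `sigma`,
    has standard deviation sigma * sqrt(2), which gives erf(diff / (2 * sigma)).
    """
    return 0.5 * (1.0 + math.erf(diff / (2.0 * sigma)))


# Print both curves side by side for a few rating gaps.
for diff in (0, 100, 200, 400, 800):
    print(diff, round(expected_logistic(diff), 3), round(expected_normal(diff), 3))
```

The two curves agree closely for small rating gaps and drift apart in the tails, where the logistic curve gives the weaker side more chances.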
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Another attempt at comparing Evals ELO-wise

Post by Lyudmil Tsvetkov »

BrendanJNorman wrote:
Lyudmil Tsvetkov wrote:I guess the best thing you could currently do is offer another engine that would play 200 elo stronger, but only on Fridays.
*Shots Fired, Shots Fired! Get Down! This Guy is the Rebel of the Decade!*

(see what I did there? :lol:)

Okay, back to work - nothing to see here folks. :x
tomorrow is Friday.