STS [1-10] Crab 1.0


swami
Posts: 6664
Joined: Thu Mar 09, 2006 4:21 am

STS [1-10] Crab 1.0

Post by swami »

Image

http://sites.google.com/site/strategict ... st-results

1000 Positions
10 seconds per position
Hardware: Q6600, 32-bit, 2 GB RAM, 2.4 GHz. Arena 2.01 GUI.
Look

Re: STS [1-10] Crab 1.0

Post by Look »

Thanks. As expected, it is slightly weaker than SF 1.7.1. I have a question, though: have you tried to map Elo ratings to your results?
swami

Re: STS [1-10] Crab 1.0

Post by swami »

Hi Adam,

No, I haven't found a way to assign Elo ratings to the STS results.
All I did was predict/guesstimate. So far, it appears easy to estimate the rating of a new engine, or of an update to an existing engine, this way.

It would be interesting if someone could offer an exact formula to do so.

You can review this thread for a related topic:
http://www.talkchess.com/forum/viewtopic.php?t=31700
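
No exact formula is given in this thread. As one possible starting point, a straight-line least-squares fit from STS score to Elo could be sketched as below. The function names `fit_line` and `estimate_elo` are hypothetical, the linear relationship is an assumption, and the numbers in the example are made-up placeholders, not real STS results or ratings.

```python
def fit_line(scores, elos):
    """Ordinary least-squares fit of elo = slope * score + intercept."""
    n = len(scores)
    if n < 2 or n != len(elos):
        raise ValueError("need at least two (score, elo) pairs of equal length")
    mean_x = sum(scores) / n
    mean_y = sum(elos) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    if sxx == 0:
        raise ValueError("all scores are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, elos))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_elo(score, slope, intercept):
    """Apply the fitted line to a new engine's STS score."""
    return slope * score + intercept


# Made-up placeholder data, only to show the call pattern.
placeholder_scores = [600, 700, 800]
placeholder_elos = [2400, 2600, 2800]
slope, intercept = fit_line(placeholder_scores, placeholder_elos)
print(estimate_elo(750, slope, intercept))
```

With only a handful of engines the fitted line is a rough guide at best, and any estimate outside the range of scores used for the fit should be treated with extra caution.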
Look

Re: STS [1-10] Crab 1.0

Post by Look »


OK, I am not sure, but I suppose the positions in your suite are all checked only by Rybka, right? In that case, it might be interesting to keep only those positions for which at least one other engine from, say, the top 5 gives the same move, and to add new positions that satisfy that condition in place of the ones dropped. The point is that an engine may prefer some moves because of its style, and such moves should not be in a test suite.
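
The agreement filter suggested above could be sketched roughly as follows. This is only an illustration: the function name `consensus_positions`, the input layout (position id mapped to each engine's preferred move), and the engine labels and moves in the example are all hypothetical, and the threshold is a free parameter.

```python
from collections import Counter


def consensus_positions(engine_moves, min_agree=2):
    """Keep positions where at least `min_agree` engines pick the same move.

    engine_moves: dict mapping position id -> dict of engine name -> move.
    Returns a dict mapping each kept position id -> the agreed move.
    """
    kept = {}
    for pos_id, moves in engine_moves.items():
        if not moves:
            continue  # no engine output for this position
        move, votes = Counter(moves.values()).most_common(1)[0]
        if votes >= min_agree:
            kept[pos_id] = move
    return kept


# Hypothetical engine labels and moves, for illustration only.
example = {
    "pos1": {"EngineA": "e4", "EngineB": "e4", "EngineC": "d4"},
    "pos2": {"EngineA": "Nf3", "EngineB": "c4", "EngineC": "d4"},
}
print(consensus_positions(example, min_agree=2))
```

Positions that fail the filter would then be candidates for replacement with new positions that pass it.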
swami

Re: STS [1-10] Crab 1.0

Post by swami »

No, it's not checked by Rybka alone, but by Stockfish, Ivanhoe, Zappa, Naum, and Rybka.

http://www.talkchess.com/forum/viewtopic.php?t=34778
Look

Re: STS [1-10] Crab 1.0

Post by Look »


OK, there are some problems with the Elo mapping. First, there are tactics too; one cannot rely only on strategic positions. Then comes the low number of samples; maybe something like 500 or more engines would be more to the point. After that, one may try various ways of mapping, but note that the result would be an estimate and could be quite misleading for some engines.
swami

Re: STS [1-10] Crab 1.0

Post by swami »

Yes, you're right. At the moment, we can only guesstimate.

My observation is that tactics rarely occur in computer vs. computer games, which is why most engines' ratings correlate closely with their STS results. If tactics do occur in engine games, it's usually when one engine already has a large advantage.

Tactics occur significantly more often in human vs. human games.
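
The claim that ratings correlate closely with STS results could be checked with a Pearson correlation coefficient over a list of engines. A minimal sketch, assuming paired lists of STS scores and rating-list Elo values are available; the function name `pearson` is hypothetical, and no real data is included here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired values of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

A value near 1 would support the observation; a noticeably lower value would suggest that something STS does not measure, such as tactics, also matters.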
Look

Re: STS [1-10] Crab 1.0

Post by Look »

My observations tell me otherwise. Surely tactics at the 3000 level are different from those at the 2000-2300 level, but they exist. Analyse the games carefully, especially those at fast time controls. If you want, I can post CC games too. As you know, there are no clear boundaries here, but one can decide based on various factors whether something is a tactical shot (or mistake) or a positional one.