Mystery engine at CCC

mig2004 · Post by **mig2004** » Fri Jul 21, 2023 4:45 am

Graham Banks wrote: ↑Fri Jul 21, 2023 4:40 am
mig2004 wrote: ↑Fri Jul 21, 2023 4:36 am
Ras wrote: ↑Thu Jul 20, 2023 12:53 pm Unbalanced openings are perfectly fine if each engine gets to play it with both sides against the same opponents. A better engine might convert the advantage into a win or hold the disadvantage to a draw, compared to drawing with advantage and losing with disadvantage. It just shouldn't be so unbalanced that the results are obvious, such as starting with a piece handicap.
That's my reason for using UHOs. And they are not "contrived" opening lines. They are lines played in the past by human FIDE-rated masters with 2400 Elo or more.
They're contrived to achieve decisive results, at the expense of having fair opening lines.

There are only 3 possible results in a chess game. All lines are then "contrived", by your definition. All lines will achieve a "decisive" result.

Graham Banks · Post by **Graham Banks** » Fri Jul 21, 2023 4:47 am

mig2004 wrote: ↑Fri Jul 21, 2023 4:45 am
Graham Banks wrote: ↑Fri Jul 21, 2023 4:40 am
mig2004 wrote: ↑Fri Jul 21, 2023 4:36 am
Ras wrote: ↑Thu Jul 20, 2023 12:53 pm Unbalanced openings are perfectly fine if each engine gets to play it with both sides against the same opponents. A better engine might convert the advantage into a win or hold the disadvantage to a draw, compared to drawing with advantage and losing with disadvantage. It just shouldn't be so unbalanced that the results are obvious, such as starting with a piece handicap.
That's my reason for using UHOs. And they are not "contrived" opening lines. They are lines played in the past by human FIDE-rated masters with 2400 Elo or more.
They're contrived to achieve decisive results, at the expense of having fair opening lines.
There are only 3 possible results in a chess game. All lines are then "contrived", by your definition. All lines will achieve a "decisive" result.

By decisive, I meant not drawn, but I suspect that you knew that anyway.

lkaufman · Post by **lkaufman** » Fri Jul 21, 2023 7:56 am

Peter Berger wrote: ↑Thu Jul 20, 2023 10:30 pm
lkaufman wrote: ↑Thu Jul 20, 2023 5:13 pm I agree with you that an eval of less than 0.70 from the opening is desirable, I just would characterize any evals above 0.50 as "unbalanced". No top human GM would intentionally play a defense that led to a score worse than 0.50 against another top GM in an important classical game except perhaps for surprise value or due to needing to win with Black due to match score. Only rarely do they choose defenses that are worse than -.40. Top engines would rarely choose defenses worse than about -.30 on their own, even with MP randomness. But it is necessary to include openings that are close to your 0.70 line in order to avoid draws. You are in effect just doing a mild version of the UHO idea when you do this, which I think is fine. It's a good compromise between "correct" chess and the need to avoid 100% draws.
But then – isn’t it a bit unclear what these rating lists actually measure?
We have a clear idea when it is about the usual setup – two opponents facing each other, doing their very best from move 1. This experience mainly comes from a long history of human games.
As chess is so drawish, you need a bazillion games to decide who is the stronger one in a match or tournament between strong chess engines.
So you introduce very uneven setups with strange and unusual openings. Is it +that+ obvious that you need the same qualities as in a „normal“ game this way?
Yes, you get clearer answers strength-wise, but how do they translate to the traditional setup if you do enough games?
As engines become ever stronger, the starting positions will become ever more lopsided this way; at least this is what intuition suggests (I have no data to prove this).
This might lead to a strange evolution. Human players are obviously mostly interested in „breaking the draw“ while doing analysis, getting unexpected wins from what they perceive as drawish positions. But math-wise it is just as valuable to get „undeserved“ draws. And maybe Stockfish has followed this path for a little too long already.

You are quite correct that forcing inferior openings will measure something different than letting the engines choose the openings or limiting the choices to those frequently seen in Elite GM Classical play. This is indeed a problem when rating lists use a variety of opening books, some of which may include many dubious lines while others only play the "best" lines. It means that the tester can influence the result significantly by his choice of opening book. But it is difficult to get everyone to agree on one book. If you use a book with only "best" lines, the top engine ratings will depend primarily on which ones are better at avoiding draws against much weaker engines, which may basically mean which ones do "Contempt" better, since almost all the games will be drawn between the top ones.

I just ran 620 blitz (2' + 1") games of SF 16 vs 15.1 on two threads (to get variety) with no opening book. Of course many openings were repeated (especially the Berlin), but there was decent variety, and even within the Berlin the games usually varied by about move ten, so I think it was a meaningful test. I got 16 wins for SF 16, 4 wins for SF 15.1, and 600 draws, so 97% draws. This was blitz on just two threads; imagine Rapid with even four threads: surely at least 99% draws.

So either you measure the ability to find critical moves in positions near the win/draw line, or you measure the ability to trick weak engines; take your pick! For me the first is more interesting, although the second might be more useful for top players preparing for weaker ones. It would be nice though if we could settle on some "rule" for generating the unbalanced positions rather than having it depend on the judgment of the tester. I proposed chess 324 for this, which follows all normal chess rules except for the start positions. There are many possible solutions, but each has pros and cons.
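
For anyone who wants to check the arithmetic on the 620-game match above, here is a minimal sketch. It assumes the usual logistic Elo model (a 400-point scale) and counts a draw as half a point; the function names are made up for illustration, and no error bars are computed.

```python
import math


def draw_rate(wins, losses, draws):
    """Fraction of games that ended in a draw."""
    return draws / (wins + losses + draws)


def elo_difference(wins, losses, draws):
    """Elo difference implied by a match score under the logistic model.

    Assumes the score is strictly between 0 and 1; a 100% or 0% score
    has no finite Elo difference and is not handled here.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return 400.0 * math.log10(score / (1.0 - score))


# Figures quoted in the post: SF 16 vs SF 15.1, 620 blitz games.
wins, losses, draws = 16, 4, 600
print(f"draw rate: {draw_rate(wins, losses, draws):.1%}")  # draw rate: 96.8%
print(f"Elo difference: {elo_difference(wins, losses, draws):+.1f}")
```

With these numbers the implied gap comes out a bit under +7 Elo, which shows how little a 97% draw rate leaves to measure even over several hundred games.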

Modern Times · Post by **Modern Times** » Fri Jul 21, 2023 8:38 am

lkaufman wrote: ↑Fri Jul 21, 2023 7:56 am I proposed chess 324 for this, which follows all normal chess rules except for the start positions. There are many possible solutions, but each has pros and cons.

Yes, I ran a one-off tournament 6 months ago:

https://ccrl.chessdom.com/ccrl/324-1/index.html

I never did any more work on it, because I don't know of a GUI where you can set up a gauntlet, with the gauntlet engine playing all 324 openings against each opponent. In that tournament I ran, it was actually individual engine vs engine matches played one after the other - horrendously cumbersome.
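
The gauntlet described above (one engine playing all 324 openings against each opponent) can at least be enumerated in a few lines. This is only a rough sketch of building the game list: the engine names are placeholders, playing each opening with both colors is an assumption, and actually running the games is left to whatever tool is used.

```python
from itertools import product


def gauntlet_schedule(gauntlet_engine, opponents, num_openings=324, both_colors=True):
    """Return a list of (opening_number, white, black) tuples.

    The gauntlet engine meets every opponent in every opening; with
    both_colors=True each opening is played once with each color.
    """
    games = []
    for opponent, opening in product(opponents, range(1, num_openings + 1)):
        games.append((opening, gauntlet_engine, opponent))
        if both_colors:
            games.append((opening, opponent, gauntlet_engine))
    return games


# Hypothetical engine names, for illustration only.
schedule = gauntlet_schedule("EngineA", ["EngineB", "EngineC"])
print(len(schedule))  # 2 opponents * 324 openings * 2 colors = 1296
```

Each tuple could then be turned into one game or one mini-match in a script, which would avoid setting up the engine vs engine matches by hand.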

chessica · Post by **chessica** » Fri Jul 21, 2023 11:27 am

What is Mystery, Mystery or Torch?

Graham Banks · Post by **Graham Banks** » Fri Jul 21, 2023 11:45 am

chessica wrote: ↑Fri Jul 21, 2023 11:27 am What is Mystery, Mystery or Torch?

Torch.

Marek Soszynski · Post by **Marek Soszynski** » Fri Jul 21, 2023 2:52 pm

Steve Maughan wrote: ↑Fri Jul 14, 2023 3:57 am
Graham Banks wrote: ↑Fri Jul 14, 2023 3:24 am Torch is a brand-new chess engine built from the ground up by top chess engine developers...
Fascinating! I’m looking forward to testing it. Hopefully it’ll be available as a UCI engine and not just on chess.com

— Steve

"Torch is not yet publicly available but will be made available to certain relevant third parties for ratings. It will also be available through Chess.com’s analysis in the future."

The sales model may not be to make it available as an ordinary UCI engine — at all.

Eduard · Post by **Eduard** » Fri Jul 21, 2023 4:23 pm

ImNotStockfish wrote: ↑Thu Jul 20, 2023 11:11 pm
Eduard wrote: ↑Thu Jul 20, 2023 9:38 pm An engine that only gets 50% with normal openings but has +100 Elo with unbalanced openings, touting it as a 100 Elo better engine is strange. Then there is the question of what it looks like in position tests. If the +100 Elo engine is not better in position tests, then the 100 Elo makes absolutely no sense and suggests a distorted playing strength.

For this I have a test suite for 200 games with Ponder ON plus a position test suite, and then I go onto the server and play a few hundred games live.
A chess game is nothing more than a dozen "test positions". A "test suite" is just a very very very small sample of what Fishtest and tournaments like CCC and TCEC do: test engines in a lot of positions, and the one that is able to solve more of them wins.

Stockfish might not do great at some small-sample selection of positions, but it's the best at solving big-sample-size real-world test positions.

Is it that hard to understand? I play engine prize tournaments and freestyle chess (analysis mode). I want to have the best engines for this, I use up to 3 engines in parallel. Results alone with unbalanced openings + bullet are totally unimportant to me. The engine must also achieve good results in my check (analysis, my own opening test and server test). If not, then I choose another engine.

ImNotStockfish · Post by **ImNotStockfish** » Fri Jul 21, 2023 4:29 pm

Eduard wrote: ↑Fri Jul 21, 2023 4:23 pm Is it that hard to understand? I play engine prize tournaments and freestyle chess (analysis mode). I want to have the best engines for this, I use up to 3 engines in parallel. Results alone with unbalanced openings + bullet are totally unimportant to me. The engine must also achieve good results in my check (analysis, my own opening test and server test). If not, then I choose another engine.

Is it that hard to understand? You are free to choose whatever engine you want for whatever reason you want, just as other people like me are free to criticize those reasons if we find them nonsensical.

Eduard · Post by **Eduard** » Fri Jul 21, 2023 4:45 pm

I'm just stating my point of view here as you do yours. I just want you to know that your tournaments on chess.com don't tell me anything about the engines I want to use. Have fun enjoying your tournaments with unbalanced openings; I don't find it fun to watch something like that.
