The trouble with UHO

lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: The trouble with UHO

Post by lkaufman »

Graham Banks wrote: Tue Dec 20, 2022 11:34 pm My condition for a balanced opening line is that, in the first 10 moves out of book, there should be at least one evaluation that is less than 0.70.
An evaluation no longer has any meaning without specifying the engine (or at least the definition used by the engine). Before NNUE, a "clean" pawn up (middlegame, no compensation) was supposed to be +1.00. With NNUE, such positions got higher and higher evals, exceeding 2.5 in Stockfish 15. Now, with the new definition of 1.0 as the win/draw line, a clean pawn should get something like +1.3 in Stockfish 15.1 or Dragon 3.2. So 0.70 was quite a big advantage in older engines with UCI-based evals, very near the win/draw line and typical of UHO books. With Stockfish 15, 0.70 was just a good opening edge, quite often seen in GM-approved lines. With Stockfish 15.1 or Dragon 3.2, 0.70 is a big but not winning advantage; it means that the opening is not suitable for serious high-level competition, only perhaps for human blitz games. So which engine or engines are you basing this on? I would agree with you if you mean Stockfish 15 or perhaps Dragon 3 or 3.1, but otherwise I would call 0.70 unbalanced.
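
To make the scale problem concrete, here is a rough sketch that expresses the same 0.70 eval as a fraction of a "clean pawn" on each scale. The per-engine pawn values are only the approximate figures mentioned in this post (1.00 before NNUE, about 2.5 for Stockfish 15, about 1.3 for Stockfish 15.1 or Dragon 3.2), not measured constants, and the names `CLEAN_PAWN_EVAL` and `eval_in_clean_pawns` are made up for illustration.

```python
# Approximate eval of a "clean" extra pawn on each scale, taken from the
# rough figures in this post. Illustrative assumptions, not measured constants.
CLEAN_PAWN_EVAL = {
    "pre-NNUE engines": 1.00,
    "Stockfish 15": 2.5,
    "Stockfish 15.1 / Dragon 3.2": 1.3,
}


def eval_in_clean_pawns(eval_score: float, engine: str) -> float:
    """Express an engine eval as a fraction of that engine's clean-pawn eval."""
    return eval_score / CLEAN_PAWN_EVAL[engine]


if __name__ == "__main__":
    for engine in CLEAN_PAWN_EVAL:
        share = eval_in_clean_pawns(0.70, engine)
        print(f"{engine}: 0.70 is about {share:.2f} of a clean pawn")
```

On these rough figures the same 0.70 is worth well under a third of a pawn on the Stockfish 15 scale but more than half a pawn on the 15.1 scale, which is why the engine has to be named.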
Komodo rules!
Graham Banks
Posts: 45104
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: The trouble with UHO

Post by Graham Banks »

lkaufman wrote: Wed Dec 21, 2022 3:56 am
Graham Banks wrote: Tue Dec 20, 2022 11:34 pm My condition for a balanced opening line is that, in the first 10 moves out of book, there should be at least one evaluation that is less than 0.70.
An evaluation no longer has any meaning without specifying the engine (or at least the definition used by the engine). Before NNUE, a "clean" pawn up (middlegame, no compensation) was supposed to be +1.00. With NNUE, such positions got higher and higher evals, exceeding 2.5 in Stockfish 15. Now, with the new definition of 1.0 as the win/draw line, a clean pawn should get something like +1.3 in Stockfish 15.1 or Dragon 3.2. So 0.70 was quite a big advantage in older engines with UCI-based evals, very near the win/draw line and typical of UHO books. With Stockfish 15, 0.70 was just a good opening edge, quite often seen in GM-approved lines. With Stockfish 15.1 or Dragon 3.2, 0.70 is a big but not winning advantage; it means that the opening is not suitable for serious high-level competition, only perhaps for human blitz games. So which engine or engines are you basing this on? I would agree with you if you mean Stockfish 15 or perhaps Dragon 3 or 3.1, but otherwise I would call 0.70 unbalanced.
With hand-picked lines, I use both SF and Komodo Dragon to assess the ensuing moves.
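
For anyone who wants to apply the same rule automatically, a minimal sketch of the check might look like the following. It assumes one eval per move, in pawns, from each engine, and it compares the absolute value so that an edge for Black counts as well; both are assumptions, as is reporting a separate verdict per engine. The function names and the sample numbers are invented for illustration.

```python
from typing import Dict, Sequence


def is_balanced(evals: Sequence[float], threshold: float = 0.70, window: int = 10) -> bool:
    """True if at least one of the first `window` evals is below `threshold`.

    Assumption: evals are in pawns, one per move out of book, and the sign
    is ignored so that an advantage for either side is treated the same way.
    """
    return any(abs(e) < threshold for e in evals[:window])


def balanced_by_engine(evals_by_engine: Dict[str, Sequence[float]],
                       threshold: float = 0.70, window: int = 10) -> Dict[str, bool]:
    """Apply the rule separately to each engine's list of evals."""
    return {name: is_balanced(evals, threshold, window)
            for name, evals in evals_by_engine.items()}


if __name__ == "__main__":
    # Invented sample evals, only to show the call.
    sample = {
        "Stockfish": [0.82, 0.78, 0.75, 0.71, 0.69, 0.74, 0.80, 0.77, 0.76, 0.79],
        "Komodo Dragon": [0.95, 0.91, 0.88, 0.90, 0.86, 0.84, 0.87, 0.89, 0.92, 0.90],
    }
    print(balanced_by_engine(sample))
```

Reporting the verdict per engine keeps visible the case where the two engines disagree about the same line.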

However, most of the books I use are based on drawn games between players rated 2600+.
gbanksnz at gmail.com
Krzysztof Grzelak
Posts: 1588
Joined: Tue Jul 15, 2014 12:47 pm

Re: The trouble with UHO

Post by Krzysztof Grzelak »

pohl4711 wrote: Wed Dec 21, 2022 3:46 am
lkaufman wrote: Wed Dec 21, 2022 1:30 am
dkappe wrote: Tue Dec 20, 2022 11:20 pm I find Pohl’s UHO openings very useful for testing. But there are some questions I have, in particular this one:

Is a test between two engines using the most common uho book predictive of their performance using a balanced book? It seems intuitively correct, but I don’t recall anyone doing an experiment on this.
The question needs clarification. It is extremely obvious that testing with UHO books is not predictive of Elo differences between unequal engines; it greatly exaggerates such differences (unless the differences are huge). I suppose you mean, is the engine that wins with UHO books also likely to win a long match against the same engine with balanced books, never mind the score? Based on all my testing, I would say the correlation (in this latter sense) is high but not perfect. Especially with neural nets, some may be better trained on "normal" openings, and others on positions not likely to arise by choice. So search changes probably will benefit the same engine regardless of book, but net changes could well favor one book or the other. But then, you can also say that with long time controls and many cores and top engines, normal books will always show zero Elo (roughly) since nearly every game will be a draw, in which case the question has no meaningful answer.
IMO, you have said all that can be said here. I don't think there is any need for more clarification, especially because on my website you can find test runs with classical books compared with UHO (and other unbalanced opening concepts), with many games and 2 different thinking times:
3min+1sec here: https://www.sp-cc.de/anti-draw-openings.htm
5min+3sec here: https://www.sp-cc.de/uho_2022.htm

If there is any doubt because of engines with NNUE nets trained on UHO, my AntiDraw openings collection offers many more opening concepts containing unbalanced openings (Chess324, Drawkiller, NBC, NBSC etc.). Just download the whole package and make your choice!
Please make a book with evaluations limited to the 0.30 - 0.40 range. Not more.
Ozymandias
Posts: 1537
Joined: Sun Oct 25, 2009 2:30 am

Re: The trouble with UHO

Post by Ozymandias »

pohl4711 wrote: Wed Dec 21, 2022 3:46 am unbalanced openings (Chess324, Drawkiller, NBC, NBSC etc.)
I see my favorite SALC has been relegated under the "etc" umbrella.
Krzysztof Grzelak
Posts: 1588
Joined: Tue Jul 15, 2014 12:47 pm

Re: The trouble with UHO

Post by Krzysztof Grzelak »

lkaufman wrote: Wed Dec 21, 2022 3:56 am
An evaluation no longer has any meaning without specifying the engine (or at least the definition used by the engine). Before NNUE, a "clean" pawn up (middlegame, no compensation) was supposed to be +1.00. With NNUE, such positions got higher and higher evals, exceeding 2.5 in Stockfish 15. Now, with the new definition of 1.0 as the win/draw line, a clean pawn should get something like +1.3 in Stockfish 15.1 or Dragon 3.2. So 0.70 was quite a big advantage in older engines with UCI-based evals, very near the win/draw line and typical of UHO books. With Stockfish 15, 0.70 was just a good opening edge, quite often seen in GM-approved lines. With Stockfish 15.1 or Dragon 3.2, 0.70 is a big but not winning advantage; it means that the opening is not suitable for serious high-level competition, only perhaps for human blitz games. So which engine or engines are you basing this on? I would agree with you if you mean Stockfish 15 or perhaps Dragon 3 or 3.1, but otherwise I would call 0.70 unbalanced.
A little Christmas gift.

Match Stockfish 15.1 - Lc0 0.30: https://files.fm/u/ey7e6nybb
pohl4711
Posts: 2843
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: The trouble with UHO

Post by pohl4711 »

Ozymandias wrote: Wed Dec 21, 2022 8:10 am
pohl4711 wrote: Wed Dec 21, 2022 3:46 am unbalanced openings (Chess324, Drawkiller, NBC, NBSC etc.)
I see my favorite SALC has been relegated under the "etc" umbrella.
SALC has the problem that there are not many human games in the Megabase. Only balanced SALC openings yield enough games, but in these days of superstrong NNUE engines, balanced SALC openings do not work anymore (too many draws). And unbalanced (human) SALC openings are impossible to build, because there are not enough games/lines...
But the good news is: you can try my (unbalanced) Drawkiller human openings instead:
"Drawkiller human 4,5,6 moves: One of the 4 possible Drawkiller "endpositions" (2x classic and 2x QR-switch) as
FEN-code followed by 4,5,6 human opening moves out of the Megabase 2020 (both players 2400+ Elo). Because the human
openings are played out of the normal chess starting position, not out of one of the 4 Drawkiller positions, a lot
of move-lines were illegal and had to be sorted out."

Because in Drawkiller the kings are on opposite sides of the board, and then (in Drawkiller human openings) some human moves are added, these openings are very similar to SALC and could be called "constructed SALCs". Here the number of lines is big enough to build unbalanced openings that work well.
Look at the amazing testing results on my AntiDraw openings site:
https://www.sp-cc.de/anti-draw-openings.htm
(Very good Elo spreading and amazingly low draw rates!)
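
For readers who want to see how draw rate and Elo spreading are connected, here is a small sketch using the standard logistic Elo formula. The win/draw/loss counts are invented for illustration and are not taken from the results on the site; the function names are hypothetical too.

```python
import math


def draw_rate(wins: int, draws: int, losses: int) -> float:
    """Fraction of games that ended in a draw."""
    return draws / (wins + draws + losses)


def elo_diff(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by a match score, via the logistic Elo formula."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if not 0.0 < score < 1.0:
        raise ValueError("Elo difference is undefined for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # Two invented 1000-game matches with the same 3:2 ratio of wins to
    # losses, but very different numbers of draws.
    for label, wdl in (("many draws", (150, 750, 100)), ("few draws", (450, 250, 300))):
        print(f"{label}: draw rate {draw_rate(*wdl):.0%}, Elo diff {elo_diff(*wdl):+.1f}")
```

With the same ratio of wins to losses, the match with fewer draws produces a noticeably larger Elo gap, which is the spreading effect that unbalanced openings aim for.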

Here, as an example, is the first line of the file dk_+0.90_+0.99_human_6mvs.pgn (containing 6878 different lines!):
[pgn]
[Event "Valjevo"]
[Site "Valjevo"]
[Date "2007.06.20"]
[Round "8"]
[White "Ivanisevic, Ivan"]
[Black "Roiz, Michael"]
[Result "1/2-1/2"]
[ECO "D15"]
[WhiteElo "2614"]
[BlackElo "2605"]
[Annotator "depth=26 eval=+097"]
[SetUp "1"]
[FEN "rnbqrbnk/pppppppp/8/8/8/8/PPPPPPPP/KNBRQBNR w - - 0 1"]
[PlyCount "12"]
[EventDate "2007.??.??"]

1. d4 d5 2. c4 c6 3. Nf3 Nf6 4. Nc3 a6 5. h3 e6 6. e3 Nbd7 1/2-1/2
[/pgn]

Of course, the moves and tags of this game refer to the game taken from the Megabase, where it started from the normal chess starting position, so just ignore the tags with player names, tournament etc. Only the end position of this line matters. IMO these end positions look very "SALCish"... You will get amazing games when using these openings! So, Drawkiller human is "SALC - The Next Generation..." (I am a Trekkie, sorry for this)
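
As a rough sketch of how a tool could read such a line while ignoring the leftover human-game tags, the following uses only the Python standard library to pull out the start FEN, the moves, and the annotated eval. It assumes that `eval=+097` means centipawns (so +0.97 pawns, which would fit the 0.90 to 0.99 range in the file name); the function name `parse_opening` is made up for illustration.

```python
import re

# The example game from this post, reduced to the tags that matter for testing.
PGN_TEXT = """[Annotator "depth=26 eval=+097"]
[SetUp "1"]
[FEN "rnbqrbnk/pppppppp/8/8/8/8/PPPPPPPP/KNBRQBNR w - - 0 1"]
[PlyCount "12"]

1. d4 d5 2. c4 c6 3. Nf3 Nf6 4. Nc3 a6 5. h3 e6 6. e3 Nbd7 1/2-1/2
"""

TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]$')
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}


def parse_opening(pgn_text):
    """Split one PGN game into start FEN, SAN moves, and the annotated eval."""
    tags, tokens = {}, []
    for raw in pgn_text.splitlines():
        line = raw.strip()
        match = TAG_RE.match(line)
        if match:
            tags[match.group(1)] = match.group(2)
        elif line:
            tokens.extend(line.split())
    # Keep only SAN moves: drop move numbers such as "1." and the result token.
    moves = [t for t in tokens if not re.fullmatch(r"\d+\.+", t) and t not in RESULTS]
    # Assumption: "eval=+097" is in centipawns, i.e. +0.97 pawns.
    note = re.search(r"depth=(\d+)\s+eval=([+-]?\d+)", tags.get("Annotator", ""))
    return {
        "start_fen": tags.get("FEN"),
        "moves": moves,
        "depth": int(note.group(1)) if note else None,
        "eval_pawns": int(note.group(2)) / 100 if note else None,
    }


if __name__ == "__main__":
    opening = parse_opening(PGN_TEXT)
    print(opening["start_fen"])
    print(" ".join(opening["moves"]))
    print(opening["depth"], opening["eval_pawns"])
```

Getting the actual end position would additionally need a chess library (python-chess, for example) to replay the moves from the FEN; the sketch stops at extracting the pieces of the line that matter.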
Ozymandias
Posts: 1537
Joined: Sun Oct 25, 2009 2:30 am

Re: The trouble with UHO

Post by Ozymandias »

pohl4711 wrote: Wed Dec 21, 2022 12:14 pm SALC has the problem that there are not many human games in the Megabase.
I seem to have enough with a subset of the v5 10m version. Just 3000 opening lines give me +18 for SF15.1 vs an intermediate SF dev between 14.1 and 15. Draw rate at 6s + 0.1s is 65%; I'll try to get 1m+1s results if Banksia GUI takes kindly to concurrency.

First I have to figure out which versions you tested in mid-August, the ones topping the graph here:

Image

I want to test one of those against SF 15.1 too.
pohl4711
Posts: 2843
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: The trouble with UHO

Post by pohl4711 »

Ozymandias wrote: Wed Dec 21, 2022 1:11 pm
pohl4711 wrote: Wed Dec 21, 2022 12:14 pm SALC has the problem that there are not many human games in the Megabase.
I seem to have enough with a subset of the v5 10m version. Just 3000 opening lines give me +18 for SF15.1 vs an intermediate SF dev between 14.1 and 15. Draw rate at 6s + 0.1s is 65%; I'll try to get 1m+1s results if Banksia GUI takes kindly to concurrency.

First I have to figure out which versions you tested in mid-August, the ones topping the graph here:

Image

I want to test one of those against SF 15.1 too.
You can find this information on my website, above the rating list:

Latest update: 2022/12/20: Igel 3.2.0 (+33 Elo to Igel 3.1.0)
(best Stockfish Elo so far: Stockfish 220817 3814 SPCC-Elo)

So the latest dev on abrok.eu, from August 17th, is the best SF dev so far in my rating-list test runs.
Ozymandias
Posts: 1537
Joined: Sun Oct 25, 2009 2:30 am

Re: The trouble with UHO

Post by Ozymandias »

Looks like a good candidate to test against 15.1.