The number of plies in an opening matters much less than the exit position. A long line can produce a high Elo spread because it forces engines into complications; conversely, a short line ending in something like the Slav Exchange will produce a poor Elo spread. Obviously, forced wins and forced threefold repetitions lower the Elo spread and are pretty much pointless.
Different books will give different Elo spreads, but there is no standard for what the "correct" Elo spread is. As soon as you do anything other than pure start-position testing (which is also flawed, because it explores only a tiny subset of lines and of the engines' abilities), you're introducing subjective choices.
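To make "Elo spread" a bit more concrete, here is a minimal sketch, assuming the usual logistic Elo model, of how a match score translates into an Elo difference. The function names and the two sample results are made up for illustration; they are not taken from any test in this thread.

```python
import math


def match_score(wins: int, draws: int, losses: int) -> float:
    """Score fraction of a match: a win counts 1, a draw 0.5."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games


def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Hypothetical results for the same engine pair with two different books.
sharp_book = match_score(wins=40, draws=40, losses=20)    # score 0.60
drawish_book = match_score(wins=20, draws=70, losses=10)  # score 0.55

print(f"sharp book:   {elo_diff(sharp_book):+.1f} Elo")
print(f"drawish book: {elo_diff(drawish_book):+.1f} Elo")
```

With these made-up numbers the drawish book shows roughly half the Elo difference of the sharp one, even though nothing about the engines changed.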
Time handicap tournament. LC0 and SF11
-
- Posts: 550
- Joined: Tue Nov 19, 2019 8:48 pm
- Full name: Alayan Feh
-
- Posts: 343
- Joined: Sun Aug 25, 2019 8:33 am
- Full name: .
Re: Time handicap tournament. LC0 and SF11
Guenther wrote: ↑Tue Feb 18, 2020 1:58 pm
This is wrong and will defeat the whole purpose of the test. Bad openings cannot be cured by playing them for both sides.
That is a widely spread illogical opinion. The only thing it does, is to help the suspected weaker program to push its score further to equality.
No, at worst it makes a weaker opening result in one win for each side, which has as big an effect as playing drawish openings. I don't think any of these openings are lost, and the initial score shown by the program is not the truth (or it'd be 0.0 or mate in N). How the program plays weaker openings from both sides is important. If you only play openings that are as even as possible, your tests will give you too many draws and you won't learn much about how the program plays.
Guenther wrote: ↑Tue Feb 18, 2020 1:58 pm Also long openings generally make those statistically tests unreliable for various reasons.
I noticed e.g. 120+ and 140+ games with early 3 time reps below move 30! in your pgn files. (same effect, pushing the weaker one towards equality)
Sometimes even directly after book end. IMO long books should be abolished at all for serious and statistical tests.
I think the opposite is true. Leaving openings too early results in repeated games, which messes up the stats, and is unrealistic because it won't happen in real games.
Guenther wrote: ↑Tue Feb 18, 2020 1:58 pm
Moreover I have some doubts now how to set the tc at all. Most programs now have too clever time management and I noticed that in crucial positions sometimes the program, which should use half of the time actually used more time than the other.
Just for curiosity I did a match myself until today between SF and SFx2 (half time) on my slow hardware with 1 cpu each, at a very fast tc with given time per move. 1move/0.5s vs. 1move/0.25s (128MB) in cutechess-cli with a 6 plies general book and the diff was around 160 rating points.
(Need to calculate average depth for midgame to compare)
Even here I find artifacts of asymmetric time usage and I am not sure how much noise this adds to the outcome.
I don't understand the problem you're seeing. A program doesn't get more time than it has unless something is really broken on both sides.
Guenther wrote: ↑Tue Feb 18, 2020 1:58 pm
There is also an effect I completely forgot (but it is very rare) and we talked about long ago here.
Sometimes it is even negative to see further than your opponent, because you see more and more how much worse your position might become and defend against something the other won't see at all and play suboptimal against the time handicapped program until it even wins.
Yes, of course, it can be negative to see further (unless you reach a mate). It's just normal behavior that happens sometimes with all programs.
Yes, that's the whole idea for these time-handicapped self-play tests. One thing I'm hoping to see is how the Elo (Ordo) curve flattens and where for SF and LC0.
Contempt is set to 0.
-
- Posts: 4622
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: Time handicap tournament. LC0 and SF11
mmt wrote: ↑Wed Feb 19, 2020 2:20 am
Guenther wrote: ↑Tue Feb 18, 2020 1:58 pm
This is wrong and will defeat the whole purpose of the test. Bad openings cannot be cured by playing them for both sides.
That is a widely spread illogical opinion. The only thing it does, is to help the suspected weaker program to push its score further to equality.
(1) No, at worst it makes weaker opening result in one win for each side, which has as big of an effect as playing drawish openings. I don't think any of these openings are lost and the initial score shown by the program is not the truth (or it'd be 0.0 or mate in N). How the program plays weaker openings from both sides is important. If you only play openings that are as even as possible, your tests will give you too many draws and you won't learn much about how the program plays.
Guenther wrote: ↑Tue Feb 18, 2020 1:58 pm
Also long openings generally make those statistically tests unreliable for various reasons.
I noticed e.g. 120+ and 140+ games with early 3 time reps below move 30! in your pgn files. (same effect, pushing the weaker one towards equality)
Sometimes even directly after book end. IMO long books should be abolished at all for serious and statistical tests.
(2) I think the opposite is true. Leaving openings too early results in repeated games, which messes up the stats, and is unrealistic because it won't happen in real games.
Guenther wrote: ↑Tue Feb 18, 2020 1:58 pm
Moreover I have some doubts now how to set the tc at all. Most programs now have too clever time management and I noticed that in crucial positions sometimes the program, which should use half of the time actually used more time than the other.
Just for curiosity I did a match myself until today between SF and SFx2 (half time) on my slow hardware with 1 cpu each, at a very fast tc with given time per move. 1move/0.5s vs. 1move/0.25s (128MB) in cutechess-cli with a 6 plies general book and the diff was around 160 rating points.
(Need to calculate average depth for midgame to compare)
Even here I find artifacts of asymmetric time usage and I am not sure how much noise this adds to the outcome.
(3) I don't understand the problem you're seeing. A program doesn't get more time than it has unless something is really broken on both sides.
...snip...
You did not understand most of what I had written, I don't know why I should invest my precious time in more explaining.
BTW do you ever look at the games? Just a few corrections on your wrong assumptions above.
(1) I can show you dozens of examples of lost openings out of that opening file.
(2) This is complete nonsense, if the opening file contains enough lines there are no repeated games.
(actually it is even funny you claimed this, as you had several repeated games in your test matches,
it seems your opening file besides lost positions, also contains just 761 openings which puts some pressure
on randomization for 500 start positions.)
(3) You don't understand, probably you never looked at the games at all. Even if you get twice as much time over the whole game on average, time can be accumulated completely differently (and will! - also depending on the exact type of tc).
Some games will have a very few crucial moments, which decide the outcome, and it is not nice if the side with half of the time suddenly spends more time here than the other (which for whatever reason now has less time saved or did not grasp the crucial situation) for a very few moves and wins just because of this.
I guess you'll try an ovyron now, but don't hold your breath for another answer from my side.
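As a rough sanity check on the "761 openings for 500 start positions" remark, here is a minimal sketch of how many repeated openings one would expect. It assumes each start position is picked uniformly at random with replacement, which is only an assumption about how the GUI randomizes; if the openings were drawn without replacement there would be no repeats at all. The function names are made up for illustration.

```python
import random


def expected_repeats(num_openings: int, num_draws: int) -> float:
    """Expected number of draws that repeat an already-drawn opening,
    assuming uniform sampling with replacement."""
    p_never_drawn = (1.0 - 1.0 / num_openings) ** num_draws
    expected_distinct = num_openings * (1.0 - p_never_drawn)
    return num_draws - expected_distinct


def simulated_repeats(num_openings: int, num_draws: int,
                      trials: int = 2000, seed: int = 1) -> float:
    """Monte Carlo check of the formula above."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen = {rng.randrange(num_openings) for _ in range(num_draws)}
        total += num_draws - len(seen)
    return total / trials


print(round(expected_repeats(761, 500)))
print(round(simulated_repeats(761, 500)))
```

Under that assumption roughly 130 of the 500 picks would land on an opening that was already used, so repeated lines are to be expected with a file of this size.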
appendix (1) some examples why SF/2 could win due to lopsided openings in your test
Code:
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.12"]
[Round "31"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "B99"]
[Opening "Sicilian"]
[Time "05:11:27"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "94"]
1. e4 c5
2. Nf3 d6
3. d4 cxd4
4. Nxd4 Nf6
5. Nc3 a6
6. Bg5 e6
7. f4 Be7
8. Qf3 Qc7
9. O-O-O Nbd7
10. Bd3 h6
11. Bh4 b5
12. e5 {+2.69/29 7} Bb7 {-3.02/28 4}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.12"]
[Round "139"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "B99"]
[Opening "Sicilian"]
[Time "11:07:09"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "104"]
1. e4 c5
2. Nf3 d6
3. d4 cxd4
4. Nxd4 Nf6
5. Nc3 a6
6. Bg5 e6
7. f4 Be7
8. Qf3 Qc7
9. O-O-O Nbd7
10. Bd3 h6
11. Bh4 b5
12. e5 {+2.35/30 7} Bb7 {-2.77/27 4}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.13"]
[Round "369"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "A40"]
[Opening "Englund Gambit"]
[Time "12:26:26"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "132"]
1. d4 e5
2. dxe5 Nc6
3. Nf3 Qe7
4. Bg5 {+1.68/23 2} f6 {-2.23/26 7}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.13"]
[Round "443"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "E94"]
[Opening "King's Indian"]
[Time "17:08:26"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "108"]
1. d4 Nf6
2. c4 g6
3. Nc3 Bg7
4. e4 d6
5. Nf3 O-O
6. Be2 e5
7. O-O Nbd7
8. Be3 c6
9. d5 {+1.29/23 1} cxd5 {-1.61/29 16}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.13"]
[Round "449"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "B01"]
[Opening "Scandinavian"]
[Time "17:28:49"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "82"]
1. e4 d5
2. exd5 Qxd5
3. Nc3 Qd6
4. d4 Nf6
5. Nf3 a6
6. g3 Bg4 {-1.56/27 11}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.13"]
[Round "473"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "E97"]
[Opening "King's Indian"]
[Time "19:04:05"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "136"]
1. d4 Nf6
2. c4 g6
3. Nc3 Bg7
4. e4 d6
5. Nf3 O-O
6. Be2 e5
7. O-O Nc6
8. d5 Ne7
9. b4 Nh5
10. Re1 f5
11. Ng5 Nf6 {-1.00/26 5}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.14"]
[Round "585"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "C41"]
[Opening "Philidor"]
[Time "04:41:52"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "110"]
1. e4 e5
2. Nf3 d6
3. d4 f5
4. Bc4 {+2.32/21 3} Nc6 {-3.07/27 16}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.14"]
[Round "609"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "C63"]
[Opening "Spanish"]
[Time "06:01:56"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "172"]
1. e4 e5
2. Nf3 Nc6
3. Bb5 f5
4. d3 {+0.53/20 1} fxe4 {-1.30/22 2}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.14"]
[Round "645"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "B92"]
[Opening "Sicilian"]
[Time "08:25:51"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "148"]
1. e4 c5
2. Nf3 d6
3. d4 cxd4
4. Nxd4 Nf6
5. Nc3 a6
6. Be2 e5
7. Nb3 Be7
8. Be3 O-O
9. g4 Be6
10. g5 Nfd7 {-1.08/24 2}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.14"]
[Round "679"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "B03"]
[Opening "Alekhine"]
[Time "10:50:49"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "108"]
1. e4 Nf6
2. e5 Nd5
3. d4 d6
4. c4 Nb6
5. f4 dxe5
6. fxe5 c5
7. d5 e6
8. Nc3 exd5
9. cxd5 c4
10. d6 {+1.77/21 1} Nc6 {-2.27/24 3}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.14"]
[Round "721"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "C61"]
[Opening "Spanish"]
[Time "13:44:52"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "106"]
1. e4 e5
2. Nf3 Nc6
3. Bb5 Nd4
4. Nxd4 exd4
5. O-O c6 {-1.19/24 2}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.14"]
[Round "805"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "C11"]
[Opening "French"]
[Time "18:41:29"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "98"]
1. e4 e6
2. d4 d5
3. Nc3 Nf6
4. e5 Nfd7
5. f4 c5
6. Nf3 Nc6
7. Be3 Qb6
8. Na4 Qa5+
9. c3 cxd4
10. b4 Nxb4 {-1.20/26 3}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.14"]
[Round "853"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "B03"]
[Opening "Alekhine"]
[Time "21:24:53"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "122"]
1. e4 Nf6
2. e5 Nd5
3. d4 d6
4. c4 Nb6
5. f4 dxe5
6. fxe5 c5
7. d5 e6
8. Nc3 exd5
9. cxd5 c4
10. d6 {+1.47/20 2} Nc6 {-2.16/25 7}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.15"]
[Round "915"]
[White "SF 2"]
[Black "SF"]
[Result "1-0"]
[ECO "A02"]
[Opening "Bird Opening"]
[Time "01:26:12"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "130"]
1. f4 e5
2. fxe5 d6 {-1.10/24 4}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.12"]
[Round "112"]
[White "SF"]
[Black "SF 2"]
[Result "0-1"]
[ECO "C47"]
[Opening "Four Knights"]
[Time "09:43:05"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "111"]
1. e4 e5
2. Nf3 Nc6
3. Nc3 Nf6
4. Nxe5 Nxe5 {+1.69/22 1}
[Event "SF vs SF half q5"]
[Site "MAIN"]
[Date "2020.02.13"]
[Round "462"]
[White "SF"]
[Black "SF 2"]
[Result "0-1"]
[ECO "C37"]
[Opening "KGA"]
[Time "18:32:01"]
[TimeControl "20+2"]
[Termination "normal"]
[PlyCount "123"]
1. e4 e5
2. f4 exf4
3. Nf3 g5
4. Bc4 g4 {+1.03/23 1}
5. Ne5 {-1.39/27 9} Qh4+ {+1.36/18 0}
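For anyone who wants to scan a whole PGN for such lines instead of picking them by hand, a minimal sketch follows. It assumes the move comments have the shape seen above, which appears to be score/depth followed by a time field, and it uses an arbitrary threshold of 1.0 pawn to call an exit position lopsided. The function names and the threshold are illustrative, not something used in the test.

```python
import re

# Matches comments such as {+2.69/29 7}: signed score, "/", depth, then a time field.
# Mate scores or other comment styles are not handled in this sketch.
EVAL_RE = re.compile(r"\{([+-]\d+\.\d+)/(\d+)\s+[\d.]+s?\}")


def exit_evals(movetext: str, count: int = 2) -> list[tuple[float, int]]:
    """Return up to `count` (score, depth) pairs from the first engine comments."""
    matches = EVAL_RE.findall(movetext)[:count]
    return [(float(score), int(depth)) for score, depth in matches]


def is_lopsided(movetext: str, threshold: float = 1.0) -> bool:
    """True if any of the first evals is at least `threshold` pawns from zero."""
    return any(abs(score) >= threshold for score, _ in exit_evals(movetext))


print(exit_evals("12. e5 {+2.69/29 7} Bb7 {-3.02/28 4}"))
print(is_lopsided("4. d3 {+0.53/20 1} fxe4 {-1.30/22 2}"))
```

Looking at the first two evals rather than only the first one is a design choice: the two engines often disagree right after book exit, as in the Spanish example above.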
https://rwbc-chess.de
trollwatch:
Talkchess nowadays is a joke - it is full of trolls/idiots/people stuck in the pleistocene > 80% of the posts fall into this category...
-
- Posts: 550
- Joined: Tue Nov 19, 2019 8:48 pm
- Full name: Alayan Feh
Re: Time handicap tournament. LC0 and SF11
Guenther wrote: ↑Wed Feb 19, 2020 8:29 am
(3) You don't understand, probably you never looked at the games at all. Even if you get twice as much time over the whole game on average, time can be accumulated completely differently (and will! - also depending on the exact type of tc).
Some games will have a very few crucial moments, which decide the outcome, and it is not nice if the side with half of the time suddenly spends more time here than the other (which for whatever reason now has less time saved or did not grasp the crucial situation) for a very few moves and wins just because of this.
Time management is doing what it can with what it has. There is nothing to fix there: how the engine deals with its time is its business, and this doesn't taint the comparison of how the version with more time does against the one with less time. Each manages its time just like it would against any other opponent.
That being said, I'd avoid too high an increment-to-base ratio.
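To put a number on the increment-to-base ratio, a minimal sketch, taking the 20+2 from the PGN headers posted earlier and reading it as 20 s base plus 2 s per move. The game length of 50 moves and the 10+1 for the handicapped side are assumptions for illustration; the thread does not say how the handicap was configured.

```python
def time_budget(base_s: float, inc_s: float, moves: int) -> tuple[float, float]:
    """Total clock time over `moves` moves and the fraction that comes from increment."""
    if base_s < 0 or inc_s < 0 or moves < 0:
        raise ValueError("base, increment and moves must be non-negative")
    from_increment = inc_s * moves
    total = base_s + from_increment
    share = from_increment / total if total > 0 else 0.0
    return total, share


full_total, full_share = time_budget(base_s=20, inc_s=2, moves=50)
half_total, half_share = time_budget(base_s=10, inc_s=1, moves=50)  # assumed handicap

print(f"20+2: {full_total:.0f} s total, {full_share:.0%} from increment")
print(f"10+1: {half_total:.0f} s total, {half_share:.0%} from increment")
```

In this example most of the budget arrives as increment, so it only becomes available move by move and neither side can bank much of it early for a later critical moment.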
-
- Posts: 4622
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: Time handicap tournament. LC0 and SF11
Alayan wrote: ↑Wed Feb 19, 2020 8:48 am
Guenther wrote: ↑Wed Feb 19, 2020 8:29 am
(3) You don't understand, probably you never looked at the games at all. Even if you get twice as much time over the whole game on average, time can be accumulated completely differently (and will! - also depending on the exact type of tc).
Some games will have a very few crucial moments, which decide the outcome, and it is not nice if the side with half of the time suddenly spends more time here than the other (which for whatever reason now has less time saved or did not grasp the crucial situation) for a very few moves and wins just because of this.
Time management is doing what it can with what it has. There is nothing to fix there, how the engine deals with its time is its business and this doesn't taint the comparison of how the version with more time do against the one with less time. Each manage its time just like it would against any other opponent.
Yawn, yes, but you can choose time controls which more or less guarantee the exact time proportions you want to test.
-
- Posts: 343
- Joined: Sun Aug 25, 2019 8:33 am
- Full name: .
Re: Time handicap tournament. LC0 and SF11
Yep, that's the answer when you don't have an argument.
First, even if there were some losing ones, it doesn't matter much, as I explained, and it is probably beneficial. Second, go for it. Let's see the winning lines.
Guenther wrote: ↑Wed Feb 19, 2020 8:29 am
(2) This is complete nonsense, if the opening file contains enough lines there are no repeated games.
(actually it is even funny you claimed this, as you had several repeated games in your test matches, it seems your opening file besides lost positions, also contains just 761 openings which puts some pressure on randomization for 500 start positions.)
You've missed the point. If you _leave the opening book early_, you'll obviously have fewer positions than if you stay with it longer.
Guenther wrote: ↑Wed Feb 19, 2020 8:29 am
(3) You don't understand, probably you never looked at the games at all. Even if you get twice as much time over the whole game on average, time can be accumulated completely differently (and will! - also depending on the exact type of tc).
Some games will have a very few crucial moments, which decide the outcome, and it is not nice if the side with half of the time suddenly spends more time here than the other (which for whatever reason now has less time saved or did not grasp the crucial situation) for a very few moves and wins just because of this.
Sorry, but this makes absolutely no sense. If the engine wastes time where it shouldn't have wasted time, it's its own fault. Who cares if it's nice or not.