Dylan Sharp Vs. Harvey Williamson (G4)

Jouni
Posts: 3293
Joined: Wed Mar 08, 2006 8:15 pm

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by Jouni »

What kind of hardware for 60 plies? I only got some 40 :( .
Jouni
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by zullil »

Jouni wrote: Fri Oct 25, 2019 4:34 pm What kind of hardware for 60 plies? I only got some 40 :( .
20 physical cores at 3.2 GHz + 64 GB hash table. And since both players have been following Stockfish's mainline, everything of importance has been hashed at high depth already.
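For anyone wanting to try a similar setup, here is a minimal sketch of the UCI commands that would configure an engine this way. It assumes the engine exposes the usual "Threads" and "Hash" options (Stockfish does; "Hash" is given in megabytes). The helper name uci_setup_commands is made up for illustration, and sending the lines to the engine process is left out.

```python
from typing import List


def uci_setup_commands(threads: int, hash_gb: int) -> List[str]:
    """Build the UCI lines that set thread count and hash size.

    Assumption: the engine supports the "Threads" and "Hash" options.
    """
    hash_mb = hash_gb * 1024  # the UCI "Hash" option is specified in MB
    return [
        "uci",
        f"setoption name Threads value {threads}",
        f"setoption name Hash value {hash_mb}",
        "isready",
    ]


# The hardware described above: 20 threads, 64 GB hash.
for line in uci_setup_commands(threads=20, hash_gb=64):
    print(line)
```

With a 64 GB table the Hash value comes out as 65536.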
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by zullil »

zullil wrote: Fri Oct 25, 2019 3:28 pm
jp wrote: Fri Oct 25, 2019 2:17 pm
Ovyron wrote: Fri Oct 25, 2019 10:28 am I hope Zullil is saving the Depth 60 PVs and can confirm what I say: every time he reports a score, it's coming from an entirely different PV from the last report, and the first thing Stockfish dev does when starting the next iteration is realize how dumb the moves at the tail of the variation are and find new ones that don't suck as badly.
Of course the tails of the variations should not be trusted. It's better to go to depth 60 not because you demand that the 60th ply makes sense but because its root choice at depth 60 is probably more reliable than its root choice at depth 30.
I'm not saving any of the PVs. As I said, I couldn't care less about anything other than the move at hand. FWIW, the PVs are about 60 plies long, but have been stable to maybe 30 plies for the last 12 plies of this game. Just waiting for someone to reveal a move that is different from Stockfish's choice. My hunch is that, if White ever does that, he loses.
I believe it's also the case that the position at the end of a Stockfish PV need not be a "quiet" position, so the evaluation is actually coming from additional quiescence searching, which is not included in the PV output at all. In any case, as jp said, tails of PVs are typically crap.
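To illustrate the point about quiescence, here is a toy sketch of a generic fail-hard quiescence search. It is not Stockfish's code: the Node class, the hand-built tree and the centipawn numbers are all invented for the example. The idea is only that the score reported for the last PV position can come from captures searched beyond it, which never show up in the PV.

```python
from dataclasses import dataclass, field
from typing import List

INF = 10**9


@dataclass
class Node:
    # Static evaluation in centipawns, from the side to move's point of view.
    static_eval: int
    # Positions reachable by a capture (hypothetical toy tree, not real chess).
    captures: List["Node"] = field(default_factory=list)


def quiescence(node: Node, alpha: int, beta: int) -> int:
    """Generic fail-hard quiescence search over captures only."""
    stand_pat = node.static_eval  # score if the side to move declines to capture
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)
    for child in node.captures:
        score = -quiescence(child, -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha


# Last position of a PV: statically the side to move looks a pawn down,
# but it has a capture that leaves the opponent two pawns down with no recapture.
pv_leaf = Node(static_eval=-100, captures=[Node(static_eval=-200)])
print(pv_leaf.static_eval)              # -100
print(quiescence(pv_leaf, -INF, INF))   # 200
```

The 300 centipawn gap between the two numbers comes entirely from a move that would not be printed in the PV.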
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by zullil »

zullil wrote: Fri Oct 25, 2019 3:28 pm
jp wrote: Fri Oct 25, 2019 2:17 pm
Ovyron wrote: Fri Oct 25, 2019 10:28 am I hope Zullil is saving the Depth 60 PVs and can confirm what I say: every time he reports a score, it's coming from an entirely different PV from the last report, and the first thing Stockfish dev does when starting the next iteration is realize how dumb the moves at the tail of the variation are and find new ones that don't suck as badly.
Of course the tails of the variations should not be trusted. It's better to go to depth 60 not because you demand that the 60th ply makes sense but because its root choice at depth 60 is probably more reliable than its root choice at depth 30.
I'm not saving any of the PVs. As I said, I couldn't care less about anything other than the move at hand. FWIW, the PVs are about 60 plies long, but have been stable to maybe 30 plies for the last 12 plies of this game. Just waiting for someone to reveal a move that is different from Stockfish's choice. My hunch is that, if White ever does that, he loses.
[d]r1bqkb1r/pp3ppp/2n3n1/2p3P1/3pN3/5N2/PP2PPBP/R1BQ1RK1 w kq - 2 10

Stockfish-dev has -1.35 at depth 63. The PV extends from White's 10th move to White's 48th move. At least the first twenty plies of the PV are unchanged from the line Stockfish has shown for many moves.
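As a quick check on how long that PV is in plies, here is a small sketch that reads the side to move and move number from the FEN above and counts half-moves up to White's 48th move. The function names are made up for illustration.

```python
def ply_index(fullmove: int, side: str) -> int:
    """0-based number of half-moves played before this one."""
    return 2 * (fullmove - 1) + (0 if side == "w" else 1)


def pv_length(fen: str, last_fullmove: int, last_side: str) -> int:
    """Plies in a PV starting at the FEN's side to move and ending,
    inclusive, with last_side's move number last_fullmove."""
    fields = fen.split()
    side_to_move, fullmove = fields[1], int(fields[5])
    return ply_index(last_fullmove, last_side) - ply_index(fullmove, side_to_move) + 1


FEN = "r1bqkb1r/pp3ppp/2n3n1/2p3P1/3pN3/5N2/PP2PPBP/R1BQ1RK1 w kq - 2 10"
print(pv_length(FEN, 48, "w"))  # 77
```

So the printed line is 77 plies, longer than the nominal depth of 63, presumably because some branches get extended during the search.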
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by MikeB »

zullil wrote: Fri Oct 25, 2019 10:59 pm
zullil wrote: Fri Oct 25, 2019 3:28 pm
jp wrote: Fri Oct 25, 2019 2:17 pm
Ovyron wrote: Fri Oct 25, 2019 10:28 am I hope Zullil is saving the Depth 60 PVs and can confirm what I say: every time he reports a score, it's coming from an entirely different PV from the last report, and the first thing Stockfish dev does when starting the next iteration is realize how dumb the moves at the tail of the variation are and find new ones that don't suck as badly.
Of course the tails of the variations should not be trusted. It's better to go to depth 60 not because you demand that the 60th ply makes sense but because its root choice at depth 60 is probably more reliable than its root choice at depth 30.
I'm not saving any of the PVs. As I said, I couldn't care less about anything other than the move at hand. FWIW, the PVs are about 60 plies long, but have been stable to maybe 30 plies for the last 12 plies of this game. Just waiting for someone to reveal a move that is different from Stockfish's choice. My hunch is that, if White ever does that, he loses.
[d]r1bqkb1r/pp3ppp/2n3n1/2p3P1/3pN3/5N2/PP2PPBP/R1BQ1RK1 w kq - 2 10

Stockfish-dev has -1.35 at depth 63. The PV extends from White's 10th move to White's 48th move. At least the first twenty plies of the PV are unchanged from the line Stockfish has shown for many moves.
It's too bad no one is trying Lc0 ;>)
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by Ovyron »

zullil wrote: Fri Oct 25, 2019 10:59 pm Stockfish-dev has -1.35 at depth 63. The PV extends from White's 10th move to White's 48th move. At least the first twenty plies of the PV are unchanged from the line Stockfish has shown for many moves.
So it jumped from -1.50 to -1.35 in one depth? So who knows if in the next one it'll go back to -1.50 or fall further to -1.20? That's what I call unreliable.

My eval has been more stable: after 1...g5 it has stayed at -1.18 all game. After 5...Ne7 it briefly improved, but after covering the holes it went back to -1.18 again (surprisingly exact, considering the score is coming from a very different line). I'm currently in the process of failing high, to -1.14. I have said such evals are meaningless, but at this scale -0.99 and closer to zero are the scores of "trivially drawn" positions, positions Harvey wouldn't be able to win against unassisted Stockfish, so I'm some 15 centipawns away from that.

It's quite a shame that zullil is using 20 physical cores at 3.2 GHz + 64 GB hash to reach depth 63 and then goes and discards what he gets (what does that tell you about the value of those PVs? Are they discarded because they're worthless? :D .) It's like a scientist that has a nice laboratory but does not know how to run experiments.
Your beliefs create your reality, so be careful what you wish for.
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by zullil »

zullil wrote: Fri Oct 25, 2019 10:59 pm
[d]r1bqkb1r/pp3ppp/2n3n1/2p3P1/3pN3/5N2/PP2PPBP/R1BQ1RK1 w kq - 2 10

Stockfish-dev has -1.35 at depth 63. The PV extends from White's 10th move to White's 48th move. At least the first twenty plies of the PV are unchanged from the line Stockfish has shown for many moves.
-1.56 at depth 64. A long wait for one additional iteration.
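For the record, here is a small sketch of the swing between iterations, in centipawns. The depth 63 and 64 scores are the ones reported in this thread; the -1.50 at depth 62 is an assumption taken from Ovyron's remark that the score moved from -1.50 to -1.35 "in one depth".

```python
from typing import Dict, List, Tuple


def swings(scores_cp: Dict[int, int]) -> List[Tuple[int, int]]:
    """Return (depth, change versus the previous depth) in centipawns."""
    depths = sorted(scores_cp)
    return [(d, scores_cp[d] - scores_cp[prev]) for prev, d in zip(depths, depths[1:])]


# Depth 62 is assumed; depths 63 and 64 are as reported.
scores_cp = {62: -150, 63: -135, 64: -156}

for depth, delta in swings(scores_cp):
    print(f"depth {depth}: {delta:+d} cp")

print(max(abs(delta) for _, delta in swings(scores_cp)))  # 21
```

On these numbers the largest change between consecutive iterations is 21 centipawns.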
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by zullil »

Ovyron wrote: Sat Oct 26, 2019 7:01 am
It's quite a shame that zullil is using 20 physical cores at 3.2 GHz + 64 GB hash to reach depth 63 and then goes and discards what he gets (what does that tell you about the value of those PVs? Are they discarded because they're worthless? :D .) It's like a scientist that has a nice laboratory but does not know how to run experiments.
On the contrary, I'm running my experiment properly. I'm trying to see if whatever you are doing produces better results than simply running Stockfish unassisted. So I'm providing the "control", so to speak.

By the way, I don't understand your interest in tails of PVs or "precise" evaluations. Whatever Stockfish is looking at 80-120 plies from the root, it's not changing the first dozen moves of the PV. Stockfish is a chess engine. It's designed to play a "best" move in the current root position. I'm content to just let it do its thing.

If this experiment reveals that whatever you are doing produces better results, I'll be surprised. But since that would be the unexpected outcome, I'm actually hoping it occurs.
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by jp »

zullil wrote: Sat Oct 26, 2019 12:15 pm If this experiment reveals that whatever you are doing produces better results, I'll be surprised. But since that would be the unexpected outcome, I'm actually hoping it occurs.
To increase the chances of that, we should ask these two to play games from positions that we suspect engines don't handle well.
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Dylan Sharp Vs. Harvey Williamson (G4)

Post by zullil »

jp wrote: Sat Oct 26, 2019 12:29 pm
zullil wrote: Sat Oct 26, 2019 12:15 pm If this experiment reveals that whatever you are doing produces better results, I'll be surprised. But since that would be the unexpected outcome, I'm actually hoping it occurs.
To increase the chances of that, we should ask these two to play games from positions that we suspect engines don't handle well.
Without grant support, I would not be able to compensate the "test subjects". So you'll need to ask them. :wink: