Re: Komodo vs. Larry K on chess.com

Posted: Mon Sep 02, 2019 2:37 pm
by jp
Ovyron wrote: Mon Sep 02, 2019 9:43 am Interesting. For more than 3 years I have a 0.19 hard limit for a Stockfish scale advantage, that is, I haven't been able to break it and if Stockfish shows >0.19 it means black has already made some lousy moves that allowed white an edge this big. So if "0.30 SF" is considered normal then I guess people consider lousy black defenses normal.
No GM thinks differences of 0.1 or 0.2 in SF evals are meaningful.

Re: Komodo vs. Larry K on chess.com

Posted: Mon Sep 02, 2019 5:32 pm
by leavenfish
lkaufman wrote: Sun Sep 01, 2019 6:08 pm
jp wrote: Sun Sep 01, 2019 1:09 pm
Marcus9 wrote: Sun Sep 01, 2019 1:07 pm They are rare, and most are with openings that benefit black, such as king's gambit, I'm talking about engines with similar elo
I looked at the TCEC archive, and among Komodo, Houdini, Leela, and SF there are Black wins, and they come out of tame openings that the engines evaluate as even.
Of course if the opening eval is near zero (I suppose that's what you mean by "even"), then color means nothing. But with normal openings played by strong grandmasters, White is always noticeably better after 6-8 moves or so. When the openings are limited to those seen in top GM play, I suspect that the White to Black win ratio between the top engines in long games on big hardware is huge. Giving White more time to increase wins is a reasonable idea, especially for human play (the Armageddon idea), but if the time ratio gets too big, like 10 to 1, it seems rather silly.
Chess being a game of mistakes...and openings having little to do with the mistakes you see in Super GM play...the logical (non-radical) approach to fewer draws is...less time for them to think.

Increment for an older person like myself is great (although the 3-5 sec delay in the normal weekend Swiss is terribly small) but +30 added like I see these guys getting - ON TOP of their 1 game a day and slow time control (and openings worked out already)...it's just silly. Even a 5 sec delay (not adding 30 sec!) should prevent flagging due to too little time. If one is that low on time, it is often because their position is already desperate so +30 sec is really largely pointless...a really bad invention.

Being a game of mistakes...and at that level, after the opening, just tighten up the time controls (no, not blitz level) and I think a lot of the perceived problems would work themselves out.

And I've been thinking about the ideas of giving black more points for a win. No, in a closed tournament with equal blacks/whites...that might have the unintended consequence of encouraging WHITE to play less dynamically! Think about it...

Re: Komodo vs. Larry K on chess.com

Posted: Mon Sep 02, 2019 5:41 pm
by lkaufman
Ovyron wrote: Mon Sep 02, 2019 9:43 am
lkaufman wrote: Mon Sep 02, 2019 5:03 am I would say a normal White edge is maybe 0.25 Komodo scale, 0.30 Stockfish scale, 0.40 Lc0 scale.
Interesting. For more than 3 years I have a 0.19 hard limit for a Stockfish scale advantage, that is, I haven't been able to break it and if Stockfish shows >0.19 it means black has already made some lousy moves that allowed white an edge this big. So if "0.30 SF" is considered normal then I guess people consider lousy black defenses normal.
SF changed its scale rather noticeably in the past year or so. What used to be a +.20 might now get close to a +.30.
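
For anyone who wants to compare numbers across the three scales quoted above, here is a rough sketch. It assumes the scales differ only by a constant factor, which nobody in this thread claims; the function name `rescale_eval` and the lookup table are made up for illustration, and only the three "normal White edge" figures come from the quoted post.

```python
# "Normal White edge" figures quoted in the post above, per engine scale.
NORMAL_WHITE_EDGE = {"komodo": 0.25, "stockfish": 0.30, "lc0": 0.40}


def rescale_eval(score, src, dst):
    """Rescale an eval from one engine's scale to another's.

    ASSUMPTION: the relation between scales is purely linear and anchored
    on the quoted "normal White edge" values. Real engine scales need not
    behave this way, so treat the result as a ballpark figure only.
    """
    return score * NORMAL_WHITE_EDGE[dst] / NORMAL_WHITE_EDGE[src]


if __name__ == "__main__":
    for engine in ("komodo", "lc0"):
        value = rescale_eval(0.19, "stockfish", engine)
        print(f"0.19 on the Stockfish scale ~ {value:.2f} on the {engine} scale")
```

Under that linear assumption, a fixed Stockfish number always maps to a smaller Komodo number and a larger Lc0 number, so a threshold like 0.19 only makes sense once the scale is named.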

Re: Komodo vs. Larry K on chess.com

Posted: Mon Sep 02, 2019 5:42 pm
by lkaufman
jp wrote: Mon Sep 02, 2019 2:37 pm
Ovyron wrote: Mon Sep 02, 2019 9:43 am Interesting. For more than 3 years I have a 0.19 hard limit for a Stockfish scale advantage, that is, I haven't been able to break it and if Stockfish shows >0.19 it means black has already made some lousy moves that allowed white an edge this big. So if "0.30 SF" is considered normal then I guess people consider lousy black defenses normal.
No GM thinks differences of 0.1 or 0.2 in SF evals are meaningful.
I'm a GM and I think a 0.2 diff in SF eval is very meaningful. 0.1 not so much...I'm sure I'm not the only GM with this opinion.

Re: Komodo vs. Larry K on chess.com

Posted: Mon Sep 02, 2019 5:56 pm
by jp
lkaufman wrote: Mon Sep 02, 2019 5:42 pm I think a 0.2 diff in SF eval is very meaningful. 0.1 not so much...I'm sure I'm not the only GM with this opinion.
That's interesting. I'm a bit surprised you find it meaningful. Maybe your involvement in computer chess makes you think differently from others. The usual talk is almost opposite. Amateurs with their engines will make noise about small differences in evaluation, and GMs will say those differences should be ignored.

Here's something that makes me doubt the meaningfulness. When you watch an engine analysing, you'll see, even when it stays with the same PV, the evals changing often by 0.2 or more as it calculates; or it'll keep switching between different moves that are within 0.2 or 0.3 of each other.

Re: Komodo vs. Larry K on chess.com

Posted: Mon Sep 02, 2019 7:59 pm
by leavenfish
Seriously Larry - you think 1/10th of a pawn (.2) is "Very meaningful" compared to .1 in Stockfish? Being so close to 'zero' (and perhaps given their large contempt for play purposes) one might think it closer to negligible than very meaningful. And are you speaking of pure evaluation or chances in OTB play or even computer vs computer play?

I am watching Komodo MCTS (not Stockfish so, different animal) churn away right now on a position at 3 pv and 3 core.

1 .02
2 .09 (okay, not a full 1/10th...)
3 .39

Now to me, the difference between 1 and 2 is really tiny (negligible really - might even change with an eval tweaking in your next (or a previous) iteration)...given their relation to .39, well that I notice more...and would be more likely to toss it aside.

Re: Komodo vs. Larry K on chess.com

Posted: Tue Sep 03, 2019 12:22 am
by lkaufman
leavenfish wrote: Mon Sep 02, 2019 7:59 pm Seriously Larry - you think 1/10th of a pawn (.2) is "Very meaningful" compared to .1 in Stockfish? Being so close to 'zero' (and perhaps given their large contempt for play purposes) one might think it closer to negligible than very meaningful. And are you speaking of pure evaluation or chances in OTB play or even computer vs computer play?

I am watching Komodo MCTS (not Stockfish so, different animal) churn away right now on a position at 3 pv and 3 core.

1 .02
2 .09 (okay, not a full 1/10th...)
3 .39

Now to me, the difference between 1 and 2 is really tiny (negligible really - might even change with an eval tweaking in your next (or a previous) iteration)...given their relation to .39, well that I notice more...and would be more likely to toss it aside.
I'm not talking about a score of 0.2, but a score difference of 0.2, so Contempt doesn't enter into the discussion. If the top move is evaluated as 0.2 higher than the second move, the probability that it is really the better move is very high, maybe 90 % or so. That is true regardless of who is playing. Of course if the players are weak players, then playing the better move isn't likely to matter much, they make too many errors.

Re: Komodo vs. Larry K on chess.com

Posted: Tue Sep 03, 2019 11:08 am
by Vinvin
lkaufman wrote: Fri Aug 30, 2019 6:31 am
Vinvin wrote: Fri Aug 30, 2019 2:53 am
lkaufman wrote: Thu Aug 29, 2019 8:01 pm ... since the top shogi engines hadn't yet proven overwhelming superiority vs. the top humans in the way they have in chess.
...
Hi Larry,
Do you have any news about top humans vs. computers in shogi?
Have there been any matches in recent years?
How do top humans evaluate top computers?

I read here https://en.wikipedia.org/wiki/Computer_ ... n_3_(2014) : in 2014 top engines were already stronger than top humans.
And from here: https://en.wikipedia.org/wiki/Computer_shogi#Floodgate : computers have improved by 1300 (4550-3250) rating points since 2014.

Thanks,
Vincent
Yes, time flies, I hadn't realized it was already five years since the engines clobbered the elite pros in a match. We just never had a serious match (like say 5 games with 6 hours per side or more time limit) between the number 1 pro and an engine the way we did with chess computers and Kasparov. I'm confident that the top engine could now give the top human pro lance handicap and win a match, but lance handicap is much less than f7 handicap in chess. In other words, the gap is now huge, but much less than in chess. When you talk about gaining 1300 elo in shogi, you have to compare that to what the elo gain would be in the same interval in chess with only wins and losses rated, it would probably be in the same ballpark.
It's probably already too late to have an interesting match between the very top professional players and the top computers.
Even with the constraint of 1 CPU (24 cores) and 1 hour for the computer (vs 6 hours for the human), the engine would be 1000 to 1500 Elo over the human world champion.
Who would want to play this ?
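
To put gaps of that size in perspective, here is a small sketch using the standard Elo expected-score formula. Whether that formula still holds at gaps of 1000+ points, and whether the 1000 to 1500 estimate itself is right, are assumptions; `expected_score` is just an illustrative name.

```python
def expected_score(elo_gap):
    """Expected score of the higher-rated side under the standard Elo
    logistic formula: E = 1 / (1 + 10 ** (-gap / 400)).

    ASSUMPTION: the formula is extrapolated to very large gaps, where
    real results may not follow it closely.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))


if __name__ == "__main__":
    for gap in (1000, 1300, 1500):
        print(f"{gap} Elo gap: expected score {expected_score(gap):.4f}")
```

At every gap in that range the formula gives the stronger side an expected score above 99%, which is the arithmetic behind "who would want to play this".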

BTW, do top human players use engines to train?
Are there any unofficial matches between computers and humans?

In the milestones section, I found this:

https://en.wikipedia.org/wiki/Computer_shogi#Milestones

    2005: at the Amateur Ryuo tournament, program Gekisashi defeated Eiji Ogawa in a 40-minute game of the first knockout round.
    2005: Program Gekisashi defeated amateur 6-dan Masato Shinoda in a 40-minute exhibition game.
    2007: highest rating for a computer on Shogi Club 24 is 2744 for YSS.[72]
    2008: May, computer program Tanase Shogi beat Asahi Amateur Meijin title holder Yukio Kato. 75 moves played in a 15-minute exhibition game.
    2008: May, computer program Gekisashi beat Amateur Meijin Toru Shimizugami. 100 moves played in a 15-minute exhibition game.[73]
    2008: November, Gekisashi beat Amateur Meijin Shimizugami in a 1-hour game with 1-minute byoyomi.[74]
    2010: October, first time a computer beat a shogi champion. Akara beat the women's Osho champion Shimizu in 6 hours and 3 minutes.
    2011: May, highest rated player on Shogi Club 24 is computer program Ponanza, rated 3211.[citation needed]
    2011: December, highest rated player on Shogi Club 24 is computer program Bonkras, rated 3364 after 2116 games.[citation needed]
    2012: January, Bonkras defeated the 1993 Meijin Yonenaga. They played 113 moves with main time 3 hours and then 1 minute per move.[6]
    2013: 20 April, GPS Shogi defeated Hiroyuki Miura, ranked 15. Game was 102 moves with main time 4 hours then 1 minute per move.[75]
    2013: 12 May, highest rated player on Shogi Club 24 is computer program Ponanza, rated 3453.[citation needed]
    2014: 12 April, Ponanza defeated Yashiki Nobuyuki, ranked 12. Game was 130 moves with main time 5 hours then 1 minute per move.[76]
    2016: 10 April, Ponanza defeated Takayuki Yamasaki, 8-dan. Game was 85 moves. Takayuki used 7 hours 9 minutes.[77]
    2017: 20 May, Ponanza defeated Meijin Amahiko Satō in 2 games.[78][79]
    2017: 5 December, Google DeepMind's AlphaZero convincingly defeats 2017 World Computer Shogi Champion program elmo [80][81]
Vincent

Re: Komodo vs. Larry K on chess.com

Posted: Tue Sep 03, 2019 6:27 pm
by lkaufman
I suppose the 2017 two game match will have to do as the final competitive match between top engine and top pro with no handicap. Of course there are plenty of unofficial games between pros and engines. For me the interesting question is whether a super engine can give a top pro bishop handicap in a serious match successfully. Bishop handicap is a huge part of shogi history. Every year, the top pro played (plays? not sure if it continues) the top amateur a bishop handicap game. I don't think we're at that point yet, for one thing NN engines are poor at giving handicaps unless they train that way, so we're talking about the top normal engine. But that should be the goal for shogi programmers. The hard part will be getting the pros to play with a handicap on the record, as losing might be too embarrassing. In chess it hasn't stopped MVL and Nakamura from playing with handicaps.

Re: Komodo vs. Larry K on chess.com

Posted: Wed Sep 04, 2019 12:49 am
by Uri Blass
lkaufman wrote: Tue Sep 03, 2019 12:22 am
leavenfish wrote: Mon Sep 02, 2019 7:59 pm Seriously Larry - you think 1/10th of a pawn (.2) is "Very meaningful" compared to .1 in Stockfish? Being so close to 'zero' (and perhaps given their large contempt for play purposes) one might think it closer to negligible than very meaningful. And are you speaking of pure evaluation or chances in OTB play or even computer vs computer play?

I am watching Komodo MCTS (not Stockfish so, different animal) churn away right now on a position at 3 pv and 3 core.

1 .02
2 .09 (okay, not a full 1/10th...)
3 .39

Now to me, the difference between 1 and 2 is really tiny (negligible really - might even change with an eval tweaking in your next (or a previous) iteration)...given their relation to .39, well that I notice more...and would be more likely to toss it aside.
I'm not talking about a score of 0.2, but a score difference of 0.2, so Contempt doesn't enter into the discussion. If the top move is evaluated as 0.2 higher than the second move, the probability that it is really the better move is very high, maybe 90 % or so. That is true regardless of who is playing. Of course if the players are weak players, then playing the better move isn't likely to matter much, they make too many errors.
The term "better move" is not defined.
Better based on what?
You can decide based on computer analysis at higher depth, but the computer may be biased by the same wrong evaluation.

For example, it is going to insist that Kxh1 is better than Ra2 in the following position, but I consider both moves equal.

[d]7k/8/8/1p1p1p1p/pPpPpPpP/P1P1P1P1/8/R5Kb w - - 0 1
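
The [d] tag draws a diagram on the forum. For plain-text readers, here is a minimal sketch that expands the piece-placement field of the FEN above into an 8x8 grid (uppercase is White, lowercase is Black, "." is an empty square). The helper name `fen_to_rows` is made up for illustration, and it only handles the board field, not side to move, castling rights and so on.

```python
def fen_to_rows(fen):
    """Expand the piece-placement field of a FEN string into 8 strings
    of 8 characters each, listed from rank 8 down to rank 1."""
    placement = fen.split()[0]
    rows = []
    for rank in placement.split("/"):
        row = ""
        for ch in rank:
            # A digit means that many consecutive empty squares.
            row += "." * int(ch) if ch.isdigit() else ch
        rows.append(row)
    return rows


FEN = "7k/8/8/1p1p1p1p/pPpPpPpP/P1P1P1P1/8/R5Kb w - - 0 1"

if __name__ == "__main__":
    for rank_number, row in zip(range(8, 0, -1), fen_to_rows(FEN)):
        print(rank_number, row)
    print("  abcdefgh")
```

Printed this way, it is easy to see that every pawn on the board is directly blocked by an enemy pawn in front of it, and that neither side has a pawn capture available.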