aphirst wrote: ↑Fri Mar 20, 2020 9:43 am
As you can see, I get lost very quickly, so there's currently no chance of me working out what the equivalent would be for FIDE chess. I'd be very interested to see this tried out though, as NNUE-shogi engines are now obscenely powerful - orqha1018+dolphin1 is no AlphaZero but it's still a monster.
From what I read, orqha/dolphin beats Elmo by about the same margin as AlphaZero did, so what is the basis for your statement that it "is no AlphaZero"? It's different, sure, but apparently just as strong, without using a GPU. I've played lots of games with it, and while I can usually win if I'm careful with rook and bishop handicap, I almost invariably lose with rook and lance handicap. Against the top human pros, I've generally scored well with just a rook handicap. I would love to see a top pro take a bishop handicap from dolphin.
The Stockfish of shogi
Moderators: hgm, Rebel, chrisw
-
- Posts: 5960
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
Re: The Stockfish of shogi
Komodo rules!
-
- Posts: 1470
- Joined: Mon Apr 23, 2018 7:54 am
Re: The Stockfish of shogi
If its results are as good on much weaker hardware, then it's actually much stronger than A0.
The Elmo/shogi community also complained about the conditions of the Elmo-A0 games, much as the SF/chess community complained about the conditions of the SF-A0 games.
-
- Posts: 8
- Joined: Thu Mar 19, 2020 3:24 pm
- Full name: Adam Hirst
Re: The Stockfish of shogi
lkaufman? As in, Larry Kaufman? I'm honoured, I'm sure.
lkaufman wrote: what is the basis for your statement that it "is no AlphaZero"? It's different, sure, but apparently just as strong, without using a GPU
It's entirely possible that my understanding of its strength is outdated. I was under the impression that the AlphaZero published ELO was still higher even than orqha1018/dolphin1, though with a much narrower gap.
On qhapaq's site the listed ELO for dolphin1/orqha1018 is 4393. AlphaZero[1] Shogi's ELO is apparently around 4400, and DeepMind's MuZero[2] apparently has a shogi ELO of about 4700. I had to eyeball those last two since I can't see an explicit ELO figure given - presumably I'm overlooking something. If someone has exact figures, I hope they can correct me.
[1] https://arxiv.org/abs/1712.01815
[2] https://arxiv.org/abs/1911.08265
(As an aside, MuZero seems to not have the rules pre-programmed. Interesting! I hadn't heard of it until I started searching for AlphaZero ELO.)
I'm certainly not qualified to judge the fairness of the ELO evaluations, and I'm aware that's a fiercely-contended (and rightfully so) aspect, but this would seem to suggest that orqha1018/dolphin1 is at least approximately level with the 2017 AlphaZero publication, but currently outclassed by MuZero by a similar degree to how AlphaZero outclassed Elmo (et al). I will of course defer to the experts - that's the reason I came here, after all.
-
- Posts: 5960
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
Re: The Stockfish of shogi
aphirst wrote: ↑Fri Mar 20, 2020 8:02 pm
lkaufman? As in, Larry Kaufman? I'm honoured, I'm sure.
On qhapaq's site the listed ELO for dolphin1/orqha1018 is 4393. AlphaZero[1] Shogi's ELO is apparently around 4400, and DeepMind's MuZero[2] apparently has a shogi ELO of about 4700. I had to eyeball those last two since I can't see an explicit ELO figure given - presumably I'm overlooking something. If someone has exact figures, I hope they can correct me.
Yes, I'm Larry. As you say, the elos quoted for AlphaZero and Dolphin are virtually the same, but I think the Dolphin rating may have been measured on less than optimal hardware; I don't think it was based on anything like a 32 or 64 core Threadripper. So it may be that, given the best PC you can buy for around $10k or so, Dolphin is already closer to MuZero (which I hadn't heard about) than to AlphaZero.
Komodo rules!
-
- Posts: 1470
- Joined: Mon Apr 23, 2018 7:54 am
Re: The Stockfish of shogi
aphirst wrote: ↑Fri Mar 20, 2020 8:02 pm
On qhapaq's site the listed ELO for dolphin1/orqha1018 is 4393. AlphaZero[1] Shogi's ELO is apparently around 4400, and DeepMind's MuZero[2] apparently has a shogi ELO of about 4700. I had to eyeball those last two since I can't see an explicit ELO figure given - presumably I'm overlooking something. If someone has exact figures, I hope they can correct me.
You should be very cautious interpreting what DM claim. There's been endless discussion on this board and outside it about the conditions of AZ's games against SF8. What is less well known is that similar complaints were made by computer Shogi people about the conditions of the Elmo-AZ games.
If dolphin1/orqha1018 played other Shogi engines in "fairer" conditions, that alone would put it higher than AZ.
You should not look at [1] either. If you want to read it, at least look at the published paper, which has big differences.
-
- Posts: 8
- Joined: Thu Mar 19, 2020 3:24 pm
- Full name: Adam Hirst
Re: The Stockfish of shogi
jp wrote: You should be very cautious interpreting what DM claim.
I'm aware, hence the disclaimers I put into my previous post.
jp wrote: You should not look at [1] either. If you want to read it, at least look at the published paper, which has big differences.
OHH, I wasn't aware that the arxiv preprint was substantially different. Do I have the correct link [1] for the paper now? If so, I can make a clarification edit to my previous post.
[1] https://science.sciencemag.org/content/362/6419/1140
EDIT: It seems I'm past the cooldown for being able to edit my previous post anyway. I suspect the "report" feature is inappropriate for this. I apologise for any confusion this may cause future readers.
-
- Posts: 188
- Joined: Sun Dec 25, 2016 4:59 pm
Re: The Stockfish of shogi
aphirst wrote: ↑Fri Mar 20, 2020 8:02 pm
On qhapaq's site the listed ELO for dolphin1/orqha1018 is 4393. AlphaZero[1] Shogi's ELO is apparently around 4400, and DeepMind's MuZero[2] apparently has a shogi ELO of about 4700. I had to eyeball those last two since I can't see an explicit ELO figure given - presumably I'm overlooking something. If someone has exact figures, I hope they can correct me.
I don't see any indication of a 300 point improvement for MuZero over AlphaZero in the MuZero paper, for the following reasons (in ascending order of strength, burying the lede):
First, the authors only say that MuZero "matched" the performance of AlphaZero in Shogi, which seems very unlike them if they actually exceeded AlphaZero's performance by 300 points.
Second, AlphaZero's rating is graphed along with MuZero's (it's the horizontal orange line) and I see no indication that MuZero's blue line ever exceeded the AlphaZero orange line at all, much less by 300 points.
Third, the JSON for the graphs is published with the paper (board_game_elos.json in the ancillary files). It shows that in Shogi, the AlphaZero line was at 4666, while MuZero maxed out at 4646.
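To put those gaps in perspective, here is a minimal sketch of what a rating difference means in expected score. It assumes the standard logistic Elo model with a 400-point scale, and it assumes both ratings come from the same rating pool, which may well not hold for the figures from different sources in this thread. The function name expected_score is just an illustrative choice, not something from either paper.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B (win = 1, draw = 0.5, loss = 0).

    Assumption: standard logistic Elo model with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Rating pairs taken from the figures quoted in this thread.
pairs = {
    "MuZero 4646 vs AlphaZero 4666 (JSON values)": (4646, 4666),
    "4700 vs 4400 (the eyeballed 300-point gap)": (4700, 4400),
}

for label, (a, b) in pairs.items():
    print(f"{label}: expected score {expected_score(a, b):.3f}")
```

Under that model a 20-point deficit is an expected score of roughly 0.47, i.e. close to even, while a 300-point lead would be roughly 0.85.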
-
- Posts: 8
- Joined: Thu Mar 19, 2020 3:24 pm
- Full name: Adam Hirst
Re: The Stockfish of shogi
Evidently I eyeballed it wrong. Thanks for providing clarification!
-
- Posts: 1470
- Joined: Mon Apr 23, 2018 7:54 am