Stockfish 13 crushed and other 1500 elo engines

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

ydebilloez
Posts: 163
Joined: Tue Jun 27, 2017 11:01 pm
Location: Lubumbashi
Full name: Yves De Billoëz

Stockfish 13 crushed and other 1500 elo engines

Post by ydebilloez »

I configured stockfish 11/13 to play with UCI_LimitStrength and elo set to 1500. However, my engine (1200 elo) beats stockfish easily with that setting. Looking at the game, stockfish when limited blunders a lot of times and gives a piece away. So we have an issue in stockfish.

When I play against human players of around 1500, they don't get certain concepts such as passed pawns or weak versus strong bishops. They blunder pieces away from time to time, but most often they overlook moves in a complex situation or cannot handle attacks on different fronts at the same time... but they do not make the kind of errors Stockfish makes. A fix for stockfish would be nice. It would be nice to include concepts according to the chess learning curve and the elo level that goes with it.

I tried with other engines that have the limit strength options but they behave more like a 2000 elo engine when set to 1500.

Here comes the problem that led me to this. I am looking for xboard/uci engines for linux in the 1400 range, 1500 range, 1600 range (1700, 1800 will come later when my engine improves).
Yves De Billoëz @ macchess belofte chess
Once owner of a Mephisto I, II, challenger, ... chess computer.
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Stockfish 13 crushed and other 1500 elo engines

Post by mvanthoor »

ydebilloez wrote: Mon Sep 20, 2021 11:03 am I configured stockfish 11/13 to play with UCI_LimitStrength and elo set to 1500. However, my engine (1200 elo) beats stockfish easily with that setting. Looking at the game, stockfish when limited blunders a lot of times and gives a piece away. So we have an issue in stockfish.

When I play against human players of around 1500, they don't get certain concepts such as passed pawns or weak versus strong bishops. They blunder pieces away from time to time, but most often they overlook moves in a complex situation or cannot handle attacks on different fronts at the same time... but they do not make the kind of errors Stockfish makes. A fix for stockfish would be nice. It would be nice to include concepts according to the chess learning curve and the elo level that goes with it.

I tried with other engines that have the limit strength options but they behave more like a 2000 elo engine when set to 1500.

Here comes the problem that led me to this. I am looking for xboard/uci engines for linux in the 1400 range, 1500 range, 1600 range (1700, 1800 will come later when my engine improves).
Are you looking for engines in those ranges when talking about FIDE ratings, or CCRL ratings?

In case of the first, it's almost impossible. FIDE is not correlated to the CCRL list. In case of looking for engines around a certain CCRL rating: just find a few engines in the CCRL list, and then try to search for them on the internet. You may be able to find them using the Wayback Machine if the site is down, or find alternative links through Günther's UCI / XBoard Chronology. I found many engines that way. (At least, Windows executables.)

You said that 1700-1800 will come later, when your engine improves. I'm a bit hesitant to say this, but your engine and rewrites seem to be in the 1200-1300 range, and get stuck there. Do you have any idea why? I haven't even been able to write an engine in that range. The first version of Rustic is 1675. It only has the (bitboard) move generator, make/unmake move, MVV-LVA move sorting, and for evaluation, material count and hand-written PSTs. A transposition table with TT move priority increased strength to 1815 CCRL (1840 in my own test). So even if you implement only the bare minimum, 1800 should be doable.

Is your engine missing essential features such as move sorting?
Author of Rustic, an engine written in Rust.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Stockfish 13 crushed and other 1500 elo engines

Post by lkaufman »

mvanthoor wrote: Mon Sep 20, 2021 12:28 pm

Are you looking for engines in those ranges when talking about FIDE ratings, or CCRL ratings?

In case of the first, it's almost impossible. FIDE is not correlated to the CCRL list.
I think that you picked the wrong word here, FIDE and CCRL ratings are obviously highly "correlated". I suppose you mean they are not calibrated to be equal to each other. Actually I think they were intended to be equal at the 2700 or 2800 level long ago, but with faster hardware that is probably no longer accurate, and there is the question of whether the human scale is more compressed than the CCRL scale, and if so by how much. My opinion is that CCRL ratings run lower than FIDE ratings, more so at the low end. They are probably comparable at some super-human level like 3500, and at the 1500 CCRL level you probably need to add something like 300 to predict the FIDE rating at a Rapid time control vs humans. But I'm open to being convinced that I'm either overrating or underrating the engines with this guideline.
Komodo rules!
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Stockfish 13 crushed and other 1500 elo engines

Post by mvanthoor »

lkaufman wrote: Tue Sep 21, 2021 1:50 am
mvanthoor wrote: Mon Sep 20, 2021 12:28 pm

Are you looking for engines in those ranges when talking about FIDE ratings, or CCRL ratings?

In case of the first, it's almost impossible. FIDE is not correlated to the CCRL list.
I think that you picked the wrong word here, FIDE and CCRL ratings are obviously highly "correlated". I suppose you mean they are not calibrated to be equal to each other.
Yes. That is what I meant. Thanks for the correction.
Author of Rustic, an engine written in Rust.
ydebilloez
Posts: 163
Joined: Tue Jun 27, 2017 11:01 pm
Location: Lubumbashi
Full name: Yves De Billoëz

Re: Stockfish 13 crushed and other 1500 elo engines

Post by ydebilloez »

mvanthoor wrote: Mon Sep 20, 2021 12:28 pm Are you looking for engines in those ranges when talking about FIDE ratings, or CCRL ratings?

In case of the first, it's almost impossible. FIDE is not correlated to the CCRL list. In case of looking for engines around a certain CCRL rating: just find a few engines in the CCRL list, and then try to search for them on the internet. You may be able to find them using the Wayback Machine if the site is down, or find alternative links through Günther's UCI / XBoard Chronology. I found many engines that way. (At least, Windows executables.)

You said that 1700-1800 will come later, when your engine improves. I'm a bit hesitant to say this, but your engine and rewrites seem to be in the 1200-1300 range, and get stuck there. Do you have any idea why? I haven't even been able to write an engine in that range. The first version of Rustic is 1675. It only has the (bitboard) move generator, make/unmake move, MVV-LVA move sorting, and for evaluation, material count and hand-written PSTs. A transposition table with TT move priority increased strength to 1815 CCRL (1840 in my own test). So even if you implement only the bare minimum, 1800 should be doable.

Is your engine missing essential features such as move sorting?
It took me some time to finish my belofte concepts wiki (http://macchess.internetcontact.be/belo ... oncepts.md) document that will illustrate my answer.

Yes, I am looking for engines in the CCRL 1400-1800 elo range, not FIDE elo for now. lkaufman explained the difference, as you acknowledged.

But I am looking for Linux binaries, as my main machine is Linux only. (I also have the Windows version of belofte running through wine, so I could try the other Windows engines, but it would be nicer to have Linux engines.) Engines that correctly implement a strength limit are fine as well. Below is the list of Linux engines I installed.


Adamant
Aice 0.922
Alfil 12
Alouette 0.0.8
Alouette 0.0.9
Arasan
Arminius 2017-01-01
Cassandre 0.26
Cicada 0.1
Cinnamon 1.0
Counter 3.4
Crafty 23.4
Danasah 6.50
DarkTemplar 0.1
Deepov 0.4.1
Dumb 1.4
Elturco 0.89.2
Elturco 0.90.1
EnkoChess 290818
Faille 1.4.4
Fairymax 5.0b
Fimbulwinter 5.05
Fruit 2.1
Gaviota 1.0
Ges 1.32
Glaurung 2.2
Gnuchess 6.2.5
Herman 2.7.1
Hippocampe 0.4.2
Hippocampe 0.4.2.0.2
Hoichess 0.10.3
Hoichess 0.21.0.2
Invincible 2.04
Iota 0.1
K2 0.87
Kace 0.82
Komodo 9.02
Little Wing 0.2
Little Wing 0.3
MCSC 16h
Megalodon 1.0
MinimalChess 0.2
Monchester 1.0
Moustique 0.3
Nalwald 1.8
Neg 1.2
Neg 1.3
Neophyte 0.1
Olithink 5.3.3
Phalanx XXII-pg
Pigeon 1.5.1
Polyglot 2.0.4
PyChess 0.12.2
Ram 2.0
Random 0.0.9
Raven 0.30
Raven 0.40
Ronja
Rustic alpha 1.1
Sachy 0.2
Sachy 0.2.01
Samchess
Sjaakii 1.4.1
Sjeng 11.2
Sloppy 0.1.0
Sloppy 0.1.1
Sloppy 0.2.0
Sloppy 0.2.2
Spike 1.2
Squared chess 1.0
Squared chess 1.1
Stockfish
Sungorus 1.4
Toga II 3.0
Toledo Nanochess
Tscp 1.81e
Zahak 0.2.1
Zahak 0.3.0
Zoe 0.1
Zzzzzz 3.51
As you will see in the belofte chronology, I started version 2.0 as a 700 elo engine, and am now at 1210. My target is 2000 elo, while keeping the code readable. I do not have make/unmake move but copy the board with the move applied... I run 200-300K NPS in brute force (up from 20K in prior versions).
I will give make/unmake a try at some point to see what it does to NPS. In the upcoming version, I am replacing the material evaluation with an inline update of the score.
I have no bitboards, no TT, no MVV-LVA and my PSTs are not yet tuned.
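To illustrate what an inline (incremental) material update can look like with a copy-board approach, here is a minimal sketch. It is not belofte's code: the dictionary board, the piece values and the function names are assumptions made up for this example, and promotions, castling and en passant are ignored.

```python
# Hypothetical piece values in centipawns (assumption, not taken from belofte).
PIECE_VALUE = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900, "K": 0}


def full_material(board):
    """Recount material from scratch: white pieces are uppercase, black lowercase."""
    score = 0
    for piece in board.values():
        value = PIECE_VALUE[piece.upper()]
        score += value if piece.isupper() else -value
    return score


def apply_move(board, score, src, dst):
    """Copy the board with the move applied and update the score inline.

    Only a capture changes material, so the score is adjusted by the value
    of the captured piece instead of looping over the whole board again.
    """
    new_board = dict(board)
    captured = new_board.get(dst)
    if captured is not None:
        value = PIECE_VALUE[captured.upper()]
        # Losing a white piece lowers the score, losing a black piece raises it.
        score += -value if captured.isupper() else value
    new_board[dst] = new_board.pop(src)
    return new_board, score


if __name__ == "__main__":
    start = {"e4": "P", "d5": "p", "a1": "R", "h8": "q"}
    score = full_material(start)
    after, score = apply_move(start, score, "e4", "d5")
    print(score, full_material(after))
```

The incremental score should always match a full recount, which makes the recount a handy debugging check while the inline update is being introduced.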
mvanthoor wrote: Mon Sep 20, 2021 12:28 pm I haven't even been able to write an engine in that range.
Everyone excels at something. I excel at writing engines in the 1200 range :-)

This is all about marketing. I got remarks in the past that belofte excels in being bad. In marketing, it is not important how they speak about you, but that they speak about you. :-) No kidding, during testing, belofte was able to win against Rustic 1.0 (remember the time control bug). And beating Rustic 1.1 is one of my goals for the upcoming releases :-). You'd better watch out or belofte might even become world champion.
Yves De Billoëz @ macchess belofte chess
Once owner of a Mephisto I, II, challenger, ... chess computer.
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Stockfish 13 crushed and other 1500 elo engines

Post by mvanthoor »

ydebilloez wrote: Wed Sep 22, 2021 1:04 am This is all about marketing. I got remarks in the past that belofte excels in being bad. In marketing, it is not important how they speak about you, but that they speak about you. :-) No kidding, during testing, belofte was able to win against Rustic 1.0 (remember the time control bug).
Not only that, but I see Rustic's current development version lose games it should have drawn, and draw games it should have won. I'm not too worried about that, because I know that this is caused by a massive lack of chess knowledge. When I start fleshing out the evaluation function, this will get much better quite quickly.
And beating Rustic 1.1 is one of my goals for the upcoming releases :-). You'd better watch out or belofte might even become world champion.
I've been watching your engine and rewrites for some time. See if you can get out of the 1200-1300 range and into the 1500-1800 range. No bugs, average speed, alpha/beta and qsearch, MVV-LVA sorting, material counting and a decent set of PST's is enough to reach at least CCRL Blitz 1500. If your engine is very fast and has good PST's, 1650-1700 is possible.
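As a rough illustration of the MVV-LVA sorting mentioned above, here is a minimal sketch. It is not Rustic's implementation: the `Capture` tuple, the piece values and the function name are assumptions for this example only. Captures are ordered by most valuable victim first, with ties broken by the least valuable attacker.

```python
from typing import List, NamedTuple

# Hypothetical piece values, used only to rank captures (assumption).
PIECE_VALUE = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900, "K": 20000}


class Capture(NamedTuple):
    attacker: str  # letter of the capturing piece, e.g. "P"
    victim: str    # letter of the captured piece, e.g. "Q"


def order_captures(captures: List[Capture]) -> List[Capture]:
    """Sort captures MVV-LVA style.

    The sort key is (negative victim value, attacker value), so the most
    valuable victim comes first and, among equal victims, the cheapest
    attacker is tried first.
    """
    return sorted(
        captures,
        key=lambda m: (-PIECE_VALUE[m.victim], PIECE_VALUE[m.attacker]),
    )


if __name__ == "__main__":
    moves = [Capture("Q", "P"), Capture("N", "Q"), Capture("R", "R"), Capture("P", "Q")]
    for move in order_captures(moves):
        print(f"{move.attacker}x{move.victim}")
```

Trying pawn-takes-queen before queen-takes-pawn tends to produce alpha/beta cutoffs earlier, which is why this cheap ordering is usually counted among the bare minimum.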

Some day I hope to see your engine beating the older Rustic versions.
Author of Rustic, an engine written in Rust.
ydebilloez
Posts: 163
Joined: Tue Jun 27, 2017 11:01 pm
Location: Lubumbashi
Full name: Yves De Billoëz

Re: Stockfish 13 crushed and other 1500 elo engines

Post by ydebilloez »

ydebilloez wrote: Mon Sep 20, 2021 11:03 am I configured stockfish 11/13 to play with UCI_LimitStrength and elo set to 1500. However, my engine (1200 elo) beats stockfish easily with that setting. Looking at the game, stockfish when limited blunders a lot of times and gives a piece away. So we have an issue in stockfish.

...

I am looking for xboard/uci engines for linux in the 1400 range, 1500 range, 1600 range (1700, 1800 will come later when my engine improves).
These two questions remain unanswered.

1) Have others observed the problems of UCI_LimitStrength in stockfish?
2) Can we configure stockfish or any other engine to play at a certain level? Some have hinted at limiting nodes, since stockfish is deterministic when the search is limited by the number of nodes.

The problem has been asked several times in different ways (reference engines) but I have not seen any answer.
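For reference, the two approaches in these questions map onto plain UCI commands. The sketch below only builds the command strings: `UCI_LimitStrength` and `UCI_Elo` are the standard UCI option names and `go nodes` is the standard fixed-node search command, while the helper names and the values 1500 and 10000 are placeholders chosen for illustration. Which node count corresponds to which rating is not something this thread establishes.

```python
from typing import List


def limit_strength_commands(elo: int) -> List[str]:
    """UCI commands that switch on strength limiting and set the target Elo.

    UCI_LimitStrength is an on/off option; the rating itself goes in UCI_Elo.
    """
    return [
        "setoption name UCI_LimitStrength value true",
        f"setoption name UCI_Elo value {elo}",
    ]


def fixed_nodes_go(nodes: int) -> str:
    """UCI command that stops the search after a fixed number of nodes."""
    return f"go nodes {nodes}"


if __name__ == "__main__":
    # Placeholder values; send these lines to the engine's stdin.
    for line in limit_strength_commands(1500):
        print(line)
    print(fixed_nodes_go(10000))
```

The engine's own `option` lines, printed in reply to `uci`, show the Elo range it actually accepts.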
Yves De Billoëz @ macchess belofte chess
Once owner of a Mephisto I, II, challenger, ... chess computer.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Stockfish 13 crushed and other 1500 elo engines

Post by lkaufman »

ydebilloez wrote: Sun Dec 19, 2021 3:53 pm
ydebilloez wrote: Mon Sep 20, 2021 11:03 am I configured stockfish 11/13 to play with UCI_LimitStrength and elo set to 1500. However, my engine (1200 elo) beats stockfish easily with that setting. Looking at the game, stockfish when limited blunders a lot of times and gives a piece away. So we have an issue in stockfish.

...

I am looking for xboard/uci engines for linux in the 1400 range, 1500 range, 1600 range (1700, 1800 will come later when my engine improves).
These two questions remain unanswered.

1) Have others observed the problems of UCI_LimitStrength in stockfish?
2) Can we configure stockfish or any other engine to play at a certain level? Some have hinted at limiting nodes, since stockfish is deterministic when the search is limited by the number of nodes.

The problem has been asked several times in different ways (reference engines) but I have not seen any answer.
The levels 15 and up (=1500 elo and up) in Komodo Dragon 2.5 are primarily node-based and should not make blunders that a one ply search would be expected to avoid; lower levels will make increasingly large blunders as the level drops below 14. I believe the levels are at least in the ballpark of how humans with such elo ratings would perform in Rapid (15' + 10") chess.
Komodo rules!
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Stockfish 13 crushed and other 1500 elo engines

Post by mvanthoor »

ydebilloez wrote: Mon Sep 20, 2021 11:03 am When I play against human players of around 1500, they don't get certain concepts such as passed pawns or weak versus strong bishops. They blunder pieces away from time to time, but most often they overlook moves in a complex situation or cannot handle attacks on different fronts at the same time... but they do not make the kind of errors Stockfish makes.
Elo-comparisons are difficult. I've often played against engines in the 1200-1400 Elo CCRL range, and at around 1500 Elo, they become hard to beat for me. Even when playing against many of the 1200-1300 Elo engines I often have the feeling that they are MUCH stronger than a 1200 Elo human.

Someone once posted that the CCRL list is compressed, and calibrated at 2800 Elo. The formula to convert from CCRL Elo to FIDE Elo would be (according to that post) FIDE = (CCRL * 0.7) + 840.

That would indeed set the calibration to 2800 Elo, because (2800 CCRL * 0.7) + 840 = 2800 FIDE.

If that is (still) correct, a 1200 CCRL engine would be 1680 FIDE Elo. As UCI_LimitStrength is intended to simulate human / FIDE Elo, it would explain why your 1200 CCRL engine defeats Stockfish with UCI_LimitStrength enabled and UCI_Elo set to 1500. Try it at 1680 - 1700.
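The arithmetic above can be checked with a few lines of Python. This is only a sketch of the formula as quoted from that post, not an established conversion; the function names are made up for this example.

```python
def ccrl_to_fide(ccrl: float) -> float:
    """FIDE = (CCRL * 0.7) + 840, as quoted above (unverified rule of thumb)."""
    # Written as * 7 / 10 to avoid floating-point noise from 0.7.
    return ccrl * 7 / 10 + 840


def fide_to_ccrl(fide: float) -> float:
    """Inverse of the same formula: CCRL = (FIDE - 840) / 0.7."""
    return (fide - 840) * 10 / 7


if __name__ == "__main__":
    print(ccrl_to_fide(2800))         # calibration point: 2800.0
    print(ccrl_to_fide(1200))         # 1680.0
    print(round(fide_to_ccrl(1500)))  # CCRL equivalent of a 1500 FIDE setting
```

Read the other way round, the same formula puts a 1500 FIDE-like setting at roughly 943 CCRL, which would be well below a 1200 CCRL engine.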
Author of Rustic, an engine written in Rust.
lithander
Posts: 880
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Stockfish 13 crushed and other 1500 elo engines

Post by lithander »

I play a bit on chess.com for fun and to improve. They have a lot of bots there that are supposed to play not only like humans in general but like very specific humans such as chess streamers. (https://www.chess.com/play/computer) But to be honest, at my level (~1000 rating) I don't think they play like a human at all. It just seems like playing a strong computer that has a 10% chance of making a random move. The blunders are so obvious that I can't believe any human of whatever skill level would make them, like moving a piece just to lose it immediately to a pawn with no compensation.

Compare that to a game against a human opponent I played today:

[pgn][Event "Live Chess"]
[Site "Chess.com"]
[Date "2021.12.19"]
[Round "?"]
[White "Hayk0202"]
[Black "bitsquid"]
[Result "0-1"]
[ECO "A40"]
[WhiteElo "1043"]
[BlackElo "1067"]
[TimeControl "1800"]
[EndTime "2:32:11 PST"]
[Termination "bitsquid won by checkmate"]

1. d4 Nc6 2. d5 Nb4 3. c4 Na6 4. a3 Nc5 5. b4 Na6 6. Be3 d6 7. b5 Nb8 8. g3 a6
9. Nc3 c5 10. bxc6 bxc6 11. Qa4 Bd7 12. Bg2 c5 13. Qb3 g6 14. Qb7 Bg7 15. Bd2
Qa5 16. Rc1 Qxa3 17. Qxa8 Bxc3 18. Bxc3 Qxc1# 0-1[/pgn]

While playing I have a pretty strong sense of what the other guy is thinking and planning.

When he pushed the queen's pawn and attacked the knight I was already "out of my opening" and just moved the knight forward and to safety. He continued chasing the knight and after 8 moves we were in a very strange position where I had nothing developed and his pieces were rushing in on Black's queenside. I tried to push my own pawns and resist him but forgot about en passant, which was the first real mistake in the game (but a typical human one to make!)

On move 11 he brought the queen out, trying to pressure my queenside, and when I lost confidence that I could stop him from invading there and saw how he would threaten and take my rook, I failed to find a good response. Instead (typical human, I think) I thought: well, I'll just start to counter-attack and pin your knight to your own rook. Move 13...g6 was in preparation of that idea and is considered a blunder, I'm sure, but unlike the random blunders of engines it was one made with a plan in mind. So he did what I expected, and 14. Qb7 attacks my rook; it's considered a "great" move by chess.com analysis. However, I'm not sure why it's great at that point. Nothing too complicated, easy enough to see this coming. And now I do the counter-attack on his knight/rook.

Now another typical human thing (in my experience) happens, and that's that we both focus on this new front. We pile up pieces threatening and protecting the knight. He says "okay, let's trade. I'll win in the end!" and his queen takes my rook. My bishop takes the knight. His bishop takes my bishop (move 18) almost instantly because that's the material exchange he was tracking in his head. When I placed my queen there originally I just wanted another attacker on the knight. But when the exchange happened I was hopeful already: "Please take with the bishop!" And he did. It just felt so natural that I was anticipating this error, hoping for it (bots' blunders are always a surprise coming out of nowhere).

And all the while, after the queen took my rook, there was a mate in two looming! Any AI with a search a few plies deep would have focused on preventing that, even though that would certainly have lost the game at that point. Only a human can try to distract another human the way I (almost accidentally) did. So, I was laughing loud enough in the end that my wife asked what's up, and I explained it to her and she got the humor despite not really playing chess. Never would something entertaining like that have happened against a bot.

Or would it? If you know an engine that is modeled around replicating the human traits at work in this game I would really love to play it.
Last edited by lithander on Sun Dec 19, 2021 6:40 pm, edited 4 times in total.
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess