An engine for beginners?

bob · Post by **bob** » Fri Mar 04, 2011 8:33 pm

Don wrote:
bob wrote:
Don wrote:
kaissa wrote:Hello,

I need a chess engine that will play so weak that a beginner should have a chance once in a while. It should not play like 2600 for some time and then drop a rook. I need it to play at 1000-1400 all the way. Any chance?

Thanks for your time,
If you want it to play like a 1400 player, it must make a couple of blunders per game in addition to playing something like a 2 or 3 ply search.

However, programs have too much good evaluation to simulate the crappy positional play of a weaker player. So it would be playing more like a much stronger player who is not thinking very clearly tactically and making blunders once in a while!

I think you can simulate this somewhat by randomizing the evaluation function. I think Bob Hyatt has some good ideas on how to realistically weaken the play.
Crafty has this feature built in, but it is not yet clear how well (or actually how poorly) it works. It is quite hard to take a program with a lot of knowledge, and a very fast / deep search, and make it play like a beginner. A beginner is characterized by two distinct features: (1) he is tactically weak and is not going to be tossing out mates in 15 or 20 and (2) he is positionally weak and won't understand pawn structure, king safety, mobility, space, etc.

Some previous attempts have relied on depth limits. But what you get is a program with GM-like knowledge and idiot-like tactics. That doesn't feel like a beginner because it won't wreck its pawn structure, and so forth.

In Crafty, the "skill" command does both. It introduces a random component into the evaluation which slowly eliminates the concepts of material value and positional value. And at the same time, it reduces the search depth (by slowing the program down, not by making it move quicker) so that we don't see the "Beal effect" of a random evaluation still playing good chess.

The only problem is, it is very difficult to calibrate the skill command to Elo. At one point, skill 70 would reduce Elo by 200. Skill 50 would reduce it by 400. But that is still way too strong for a beginner. Skill 1 should play like an absolute idiot, but I don't have any opponents weak enough to get a feel for exactly how weak it is...

Would be an interesting experiment to do. And would be a nice publication to produce something that is hardware independent while being able to say something like "skill 1450" and get a 1450-level opponent no matter what the hardware platform.

Not so easy, however. But interesting.
I have pondered this and I think the right approach may be to write a separate evaluation function for weaker play. It could have a rudimentary form of pawn structure but nothing sophisticated. It could have just a little bit of king safety and some generally missing terms that are concepts of more sophisticated players. And it could have the classical 1,3,5,9 piece terms since beginners primarily use those. So it might not be aware of the bishop pair for instance and make some positionally bad trades.

Evaluation is so huge in Komodo that I could lower the strength by 300+ ELO just by building a simple but not ridiculous evaluation function and it might simulate weaker players better. In addition to low depths, we could use overly aggressive LMR or LMR without re-search at later plies and it might even make blunders on its own without my having to artificially induce them.

I suspect that won't work. The searches are so large, they are almost self-correcting. You would think a purely random eval would toss pieces right and left. But not if the depth is great and pruning is limited... I don't want the thing to make a 2-move blunder and then sniff out a mate in 15 a couple of moves later, as an example...

yanquis1972 · Post by **yanquis1972** » Fri Mar 04, 2011 8:42 pm

Don wrote:
yanquis1972 wrote:
what you describe in the first paragraph is exactly what chessmaster does & has been doing for years upon years.
I was not aware of that. So you are saying chessmaster slows the pace down significantly and also has a separate evaluation function? Are you sure? By separate evaluation function I don't just mean a switch that turns off pawn structure and king safety, but a complete rethink with more naive concepts.

If that's what it does, I think it's the right idea.

Another twist on this is that you could also interpolate the naive evaluation with the strong evaluation. Depending on settings you could elegantly scale the playing quality from poor to strong, for example by using 90% poor evaluation and 10% good evaluation. Having any concept represented at all is half the battle, so mixing in 10% of the good will have a bigger impact than the next 10%, so I believe this interpolation should not be linear.

Don

well, as i understand it, which is very rudimentarily, yes. there are specific sliders for the eval functions, piece values & speed, ranging from 1 to 100.
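For readers who want to picture the interpolation Don describes, here is a minimal sketch. It is not Komodo's or Chessmaster's code; the function names, the centipawn units, and the power-law curve with `exponent = 2.0` are all assumptions made for illustration. The idea is only that the share of the strong evaluation grows slowly at low settings, because even a small share of good knowledge is said to have a disproportionate effect.

```python
def blend_weight(setting: float, exponent: float = 2.0) -> float:
    """Map a strength setting in [0, 1] to the share of the strong evaluation.

    Hypothetical curve: with exponent > 1 the share grows slowly at first,
    so the first steps of the setting mix in very little strong knowledge.
    """
    if not 0.0 <= setting <= 1.0:
        raise ValueError("setting must be in [0, 1]")
    return setting ** exponent


def blended_eval(naive_cp: int, strong_cp: int, setting: float,
                 exponent: float = 2.0) -> int:
    """Interpolate between a naive and a strong evaluation (centipawns)."""
    w = blend_weight(setting, exponent)
    return round((1.0 - w) * naive_cp + w * strong_cp)


if __name__ == "__main__":
    # Halfway setting: only 25% of the strong evaluation is mixed in.
    print(blended_eval(naive_cp=100, strong_cp=300, setting=0.5))
```

Whether a power law is the right shape, and what exponent fits real Elo measurements, would have to be found by testing.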

Don · Post by **Don** » Fri Mar 04, 2011 9:05 pm

bob wrote:
I suspect that won't work. The searches are so large, they are almost self-correcting. You would think a purely random eval would toss pieces right and left. But not if the depth is great and pruning is limited... I don't want the thing to make a 2-move blunder and then sniff out a mate in 15 a couple of moves later, as an example...

The searches are not large; for the very lowest "weak" levels it will slow down the search so much that it will only be doing 2 or 3 ply searches. To get 1300 ELO play for instance you cannot do more than 3 ply. And instead of reducing we could simply forward prune - so it would definitely miss some tactics.

However if we go very deep you are right, there will be no silly tactical blunders but there will be plenty of tactical errors at 5-7 ply if we forward prune after looking at the first N moves.

I'm not sure we actually want to purposely create blunders just to get weak play. Even for a 1200 ELO player you don't want to win just because the program hung the queen outright just so you could win! If you attack the computer it should move the piece that is attacked. The goal should be to limit the strength, not to proactively look for blunders.
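A toy sketch of the "forward prune after looking at the first N moves" idea, under heavy assumptions: the game tree is an abstract nested list rather than a chess position, leaves are scores from the point of view of the side to move at the leaf, and children are assumed to be already sorted by some move-ordering heuristic. The names `negamax` and `best_move` are illustrative only.

```python
from typing import List, Optional, Union

# A node is either a leaf score (for the side to move) or a list of children.
Tree = Union[int, List["Tree"]]


def negamax(node: Tree, max_moves: Optional[int] = None) -> int:
    """Plain negamax; if max_moves is set, only the first max_moves
    children of each node are searched (hard forward pruning)."""
    if isinstance(node, int):
        return node
    children = node if max_moves is None else node[:max_moves]
    return max(-negamax(child, max_moves) for child in children)


def best_move(root: List[Tree], max_moves: Optional[int] = None) -> int:
    """Index of the best root move; the root itself is never pruned."""
    scores = [-negamax(child, max_moves) for child in root]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Move 0 looks good unless the opponent finds the late-ordered
    # third reply (-900 for us); move 1 is quiet and safe.
    tree = [[50, 40, -900], [10, 0, 5]]
    print(best_move(tree))               # full search picks the safe move
    print(best_move(tree, max_moves=2))  # pruned search walks into the tactic
```

The pruned search never hangs a piece to a reply that is ordered early, which matches the goal of limiting strength without manufacturing obvious blunders.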

UncombedCoconut · Post by **UncombedCoconut** » Fri Mar 04, 2011 9:25 pm

Don wrote:Having any concept represented at all is half the battle, so mixing in 10% of the good will have a bigger impact than the next 10%, so I believe this interpolation should not be linear.

Indeed, to reach the lower end of the spectrum it should not be continuous, nor should it scale the terms uniformly. A beginner will learn about development, then king safety, then passed pawns... for more subtle positional considerations, it's unclear and perhaps a matter of personality. Another behavior which is probably common is to over-value a recently learned idea. A player might happily make a bad trade to give his opponent doubled pawns, and be worse off than before he learned about them.

I believe Dann had a large database of "junk" games. I wonder whether it could be mined to discover patterns of mis-evaluations... and also whether blitz games or longer games would be more revealing.
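One hedged way to picture the staged, non-uniform scaling described above: evaluation terms are switched on one at a time in a fixed learning order, and the most recently learned term is over-weighted. Everything here is an assumption for illustration: the concept list (material is added first as a guess), the 1.5 over-valuation factor, and the function names.

```python
from typing import Dict

# Assumed learning order; only development, king safety and passed pawns
# come from the post above, material first is a guess.
CONCEPT_ORDER = ["material", "development", "king_safety", "passed_pawns"]


def term_weights(level: int, overvalue: float = 1.5) -> Dict[str, float]:
    """Weights for a player who has learned the first `level` concepts.

    Unlearned concepts get 0.0, older concepts 1.0, and the newest
    concept (if it is not the only one) gets the `overvalue` factor.
    """
    if not 1 <= level <= len(CONCEPT_ORDER):
        raise ValueError("level out of range")
    weights = {}
    for i, name in enumerate(CONCEPT_ORDER):
        if i >= level:
            weights[name] = 0.0
        elif i == level - 1 and level > 1:
            weights[name] = overvalue
        else:
            weights[name] = 1.0
    return weights


def staged_eval(terms: Dict[str, int], level: int) -> int:
    """Combine per-concept scores (centipawns) using the staged weights."""
    w = term_weights(level)
    return round(sum(w[name] * terms.get(name, 0) for name in CONCEPT_ORDER))


if __name__ == "__main__":
    terms = {"material": 100, "development": 20,
             "king_safety": -40, "passed_pawns": 30}
    print(staged_eval(terms, level=3))
```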

UncombedCoconut · Post by **UncombedCoconut** » Fri Mar 04, 2011 9:32 pm

Don wrote:To get 1300 ELO play for instance you cannot do more than 3 ply. And instead of reducing we could simply forward prune - so it would definitely miss some tactics.

Don't forget about the "I have an exciting idea" extension: when it fires, a human can reach a deeper depth with a very selective search.

(Better yet, do forget about it. Tord's experience shows that you can be extremely clever in modeling human mistakes, and still get poor results. Er, good results?)

Don · Post by **Don** » Fri Mar 04, 2011 9:47 pm

UncombedCoconut wrote:
Don wrote:Having any concept represented at all is half the battle, so mixing in 10% of the good will have a bigger impact than the next 10%, so I believe this interpolation should not be linear.
Indeed, to reach the lower end of the spectrum it should not be continuous, nor should it scale the terms uniformly. A beginner will learn about development, then king safety, then passed pawns... for more subtle positional considerations, it's unclear and perhaps a matter of personality. Another behavior which is probably common is to over-value a recently learned idea. A player might happily make a bad trade to give his opponent doubled pawns, and be worse off than before he learned about them.

I agree, but for practical reasons I cannot take this concept too far.

I like the scalability of introducing a timing delay for every node searched, because it makes the program play the same regardless of the performance of the computer (more or less). For example, if I introduce a 1/1000-second delay for every node, you have a program that basically does 1000 nodes per second, on a slow computer or a very fast one. In fact it will never do more than 1000 nodes per second regardless of the speed of the computer. But as the level increases you have to gradually introduce the superior evaluation function, and you cannot do that unless you specify that the program always plays game-in-5-minutes or some other fixed time control. Otherwise, it's very difficult to calibrate the levels. What you would like to do is set the program to play at 1700 ELO strength and have it figure out how much to slow the program down and how much to interpolate the scores; these are two separate variables.
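A minimal sketch of such a hardware-independent node-rate cap, assuming the search calls a hook once per node. The class name `NodeThrottle` and its `tick()` method are hypothetical; the clock and sleep functions are injectable only so the behaviour can be checked without real waiting.

```python
import time
from typing import Callable


class NodeThrottle:
    """Caps the search at a fixed number of nodes per second."""

    def __init__(self, nodes_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.interval = 1.0 / nodes_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = clock()

    def tick(self) -> None:
        """Call once per node; sleeps if the node arrived too early."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
        # A fast machine waits until next_allowed; a slow machine that is
        # already late is not penalised further.
        self.next_allowed = max(now, self.next_allowed) + self.interval


if __name__ == "__main__":
    throttle = NodeThrottle(nodes_per_second=1000)
    start = time.monotonic()
    for _ in range(200):      # stand-in for visiting 200 search nodes
        throttle.tick()
    print(f"200 nodes took {time.monotonic() - start:.3f}s")
```

On real hardware `time.sleep` granularity may make the effective rate lower than requested, which is why Bob's spin-loop approach is a plausible alternative.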

UncombedCoconut wrote:I believe Dann had a large database of "junk" games. I wonder whether it could be mined to discover patterns of mis-evaluations... and also whether blitz games or longer games would be more revealing.

bob · Post by **bob** » Fri Mar 04, 2011 10:10 pm

Don wrote:
The searches are not large; for the very lowest "weak" levels it will slow down the search so much that it will only be doing 2 or 3 ply searches. To get 1300 ELO play for instance you cannot do more than 3 ply. And instead of reducing we could simply forward prune - so it would definitely miss some tactics.

However if we go very deep you are right, there will be no silly tactical blunders but there will be plenty of tactical errors at 5-7 ply if we forward prune after looking at the first N moves.

I'm not sure we actually want to purposely create blunders just to get weak play. Even for a 1200 ELO player you don't want to win just because the program hung the queen outright just so you could win! If you attack the computer it should move the piece that is attacked. The goal should be to limit the strength, not to proactively look for blunders.

OK, perhaps I misunderstood. And, in fact, that is what I am doing as well. I slow the search down artificially with a spin loop in Evaluate() that executes more and more dummy iterations as skill level is reduced. The randomness also ramps up. At skill 50, half the eval is normal and half is a random number between plus and minus one whole pawn. This is not enough to just throw rooks away, but it can now sac a knight for a pawn or two, and it will certainly toss a pawn here and there. And the reduced depth, with extensions that have been turned way down, causes it to overlook shallow tactics. But it won't just hang a rook.

Yet I have seen a GM player make a queen move where his opponent replied and he could either instantly lose the queen, or get mated if he made any attempt to save it. This was in a WC cycle match a few years back. That is a bit harder to factor in, although I suppose there could be a second tier of randomness where a random number chooses the eval (uniformly distributed) and a non-uniform PRNG provides the material value to scale. So rather than one pawn all the time, it could be one pawn 90% of the time, two pawns 5% of the time, 3 pawns 3% of the time, and a whole rook 2%. Those are obviously ad hoc numbers and would likely not be optimal, but it would help me to actually drop a knight some small percentage of the time.
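The two-tier randomness described above can be sketched roughly as follows. This is not Crafty's source; the function names, the centipawn scale, and valuing a rook at five pawns are assumptions, and the tier probabilities are simply the ad hoc numbers from the post.

```python
import random

PAWN = 100  # centipawns (assumed unit)

# (probability, noise magnitude) tiers from the post; rook assumed = 5 pawns.
TIER_PROBS = [0.90, 0.05, 0.03, 0.02]
TIER_SCALES = [1 * PAWN, 2 * PAWN, 3 * PAWN, 5 * PAWN]


def noise_scale(rng: random.Random) -> int:
    """Second tier: a non-uniform draw picks how large the noise may be."""
    return rng.choices(TIER_SCALES, weights=TIER_PROBS)[0]


def skill_eval(true_cp: int, skill: int, rng: random.Random) -> int:
    """Blend the real evaluation with uniform noise.

    skill=100 returns the real score; skill=50 is half real score and
    half a uniform random number within +/- the chosen magnitude.
    """
    if not 0 <= skill <= 100:
        raise ValueError("skill must be in [0, 100]")
    keep = skill / 100.0
    scale = noise_scale(rng)
    noise = rng.uniform(-scale, scale)  # first tier: uniformly distributed
    return round(keep * true_cp + (1.0 - keep) * noise)


if __name__ == "__main__":
    rng = random.Random(2011)  # seeded only to make the demo repeatable
    print([skill_eval(250, 50, rng) for _ in range(5)])
```

In a real engine the noise would also need to be paired with the search slowdown, since a random evaluation searched deeply can still play reasonable chess.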

slobo · Post by **slobo** » Fri Mar 04, 2011 10:59 pm

NATIONAL12 wrote:
kaissa wrote:Hello,

I need a chess engine that will play so weak that a beginner should have a chance once in a while. It should not play like 2600 for some time and then drop a rook. I need it to play at 1000-1400 all the way. Any chance?

Thanks for your time,
i think many on this forum believe Rybka would meet your needs.

sorry,just a bit of fun on my part.

Rybka before using the Fruit 2.1 code, of course.

muxecoid · Post by **muxecoid** » Sat Mar 05, 2011 12:53 am

UncombedCoconut wrote:
Don wrote:To get 1300 ELO play for instance you cannot do more than 3 ply. And instead of reducing we could simply forward prune - so it would definitely miss some tactics.
Don't forget about the "I have an exciting idea" extension: when it fires, a human can reach a deeper depth with a very selective search.
(Better yet, do forget about it. Tord's experience shows that you can be extremely clever in modeling human mistakes, and still get poor results. Er, good results?)

The thread is an interesting read. I think the main point raised there is that humans scale differently as thinking time increases.

In theory I know a little about pawn weaknesses, but only when the time control is above 20 minutes per game do I actually think about pawn structure during play. When I play a bullet game there is only one component to my eval: king safety (I am not even looking at material at 1 minute per game). In a blitz game material becomes part of my eval. As time controls get to 10 minutes per game I find time to think about mobility.
