obvious/easy move

Discussion of chess software programming and technical issues.

Moderators: hgm, Dann Corbit, Harvey Williamson

Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: obvious/easy move - final results

Post by Don »

AlvaroBegue wrote:
Pio wrote: What I think is interesting is to manually study the games of your own engine against other engines since that will show flaws in your own engine.
That's how we did basically all development of Ruy-López in the 90s. And we'll probably continue to do that as a source of ideas. But you need the intensive testing to filter what modifications to accept.
We have come to appreciate over time that you must be pretty fussy about what changes to accept. So we don't easily accept changes but actually run multiple tests. If they don't all show improvement we run even more. Of course it depends on the type of change - some changes are relatively "safe" compared to others.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Pio
Posts: 334
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: obvious/easy move - final results

Post by Pio »

Hi Álvaro and Don!

Actually, what I meant in my previous post (see below) was that I think testing against other opponents should mostly be done to find errors in your evaluation/search, while self-testing should be done to confirm whether a change was good or not.

If I had a working chess engine, I would confirm a change only after running lots of games at many different really fast time controls (or low node counts). To see whether the change might also be good at long time controls, you could for example extrapolate your findings from the many short-time tests by fitting a plausible function and doing a maximum-likelihood estimation of its parameters. That would give you an indication of whether the change will scale and work at long time controls.
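As a concrete (and entirely hypothetical) instance of that idea: measure the Elo difference at a handful of fast time controls, fit something like Elo(t) = a + b*log(t), and look at the fitted parameters. Under Gaussian noise the least-squares fit is exactly the maximum-likelihood estimate. All numbers below are made up for illustration:

Code: Select all

#include <math.h>
#include <stdio.h>

/* Fit elo(t) = a + b*log(t) by least squares (the ML estimate under
   Gaussian noise). t[] are time controls in seconds per game, elo[]
   the measured Elo differences; the data here is invented. */
int main(void)
{
    double t[]   = { 1, 2, 4, 8, 16 };
    double elo[] = { 12, 10, 9, 7, 6 };
    int n = 5;
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (int i = 0; i < n; i++) {
        double x = log(t[i]);
        sx += x; sy += elo[i]; sxx += x * x; sxy += x * elo[i];
    }
    double b = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    double a = (sy - b * sx) / n;
    /* extrapolate to a long time control, e.g. 960 seconds */
    printf("a=%.1f b=%.1f predicted Elo at 960s: %.1f\n",
           a, b, a + b * log(960));
    return 0;
}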
Hi Robert!

I think both you and Don are right. I would primarily test the same way as Don does because:

1) You need fewer games to be sure of an improvement of your new version A' relative to your old version A

2) You (Robert) make an extra assumption: that your opponents will not change. The more the opponents move towards your version A in the future, the worse that assumption turns out to be. I guess this is more dangerous if you test against opponents weaker than yourself (in terms of evaluation and search algorithms, though not speed), since they will probably converge towards your chess engine in the future

3) You will ruthlessly exploit the weaknesses of your pool of opponents, but they might not be as representative of all chess engines of today and tomorrow as you might think. Let's say you have 20 opponents and you test 100 different ideas, and suppose (a simplified and not quite correct assumption) that each idea has a 50 percent chance of being better against any given opponent and a 50 percent chance of being worse. For one of those 100 ideas you will probably get really lucky: it will work against maybe 17 opponents in your pool and look like a brilliant idea even though it was not a general improvement at all. Of course the reverse can happen to ideas that are great but unfortunately do not work against your pool of opponents
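That multiple-testing effect is easy to quantify under the coin-flip assumption in point 3. A small check (the 15-17 thresholds are just illustrative):

Code: Select all

#include <math.h>
#include <stdio.h>

/* P(X >= k) for X ~ Binomial(20, 0.5): the chance that a no-op idea
   "wins" against at least k of 20 opponents, and the chance that at
   least one of 100 such ideas does so. */
static double choose(int n, int k)
{
    double c = 1.0;
    for (int i = 1; i <= k; i++)
        c = c * (n - k + i) / i;
    return c;
}

int main(void)
{
    for (int k = 15; k <= 17; k++) {
        double p = 0.0;
        for (int j = k; j <= 20; j++)
            p += choose(20, j) / pow(2.0, 20.0);
        printf("P(>=%d of 20) = %.4f; over 100 ideas: %.2f\n",
               k, p, 1.0 - pow(1.0 - p, 100.0));
    }
    return 0;
}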

What I think is interesting is to manually study the games of your own engine against other engines since that will show flaws in your own engine.

If you really would like to beat the opponents in your pool you could do asymmetrical evaluation/search. Let's say you have made a small change "a" to your engine version A, and your new engine A' = A + a plays badly against A but really well against your pool of opponents. Why not set your own engine's evaluation/search to A + a but model the opponents' evaluation/search as A - a, where by -a I mean the inverse of a? If that does not work, do the opposite: model your engine's evaluation/search as A - a and the opponents' as A + a. Just an idea.

Good luck with your testing!!!
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: obvious/easy move - final results

Post by Don »

Pio wrote:Hi Álvaro and Don!

Actually, what I meant in my previous post (see below) was that I think testing against other opponents should mostly be done to find errors in your evaluation/search, while self-testing should be done to confirm whether a change was good or not.

If I had a working chess engine, I would confirm a change only after running lots of games at many different really fast time controls (or low node counts). To see whether the change might also be good at long time controls, you could for example extrapolate your findings from the many short-time tests by fitting a plausible function and doing a maximum-likelihood estimation of its parameters. That would give you an indication of whether the change will scale and work at long time controls.
Hi Robert!

I think both you and Don are right. I would primarily test the same way as Don does because:

1) You need fewer games to be sure of an improvement of your new version A' relative to your old version A

2) You (Robert) make an extra assumption: that your opponents will not change. The more the opponents move towards your version A in the future, the worse that assumption turns out to be. I guess this is more dangerous if you test against opponents weaker than yourself (in terms of evaluation and search algorithms, though not speed), since they will probably converge towards your chess engine in the future

3) You will ruthlessly exploit the weaknesses of your pool of opponents, but they might not be as representative of all chess engines of today and tomorrow as you might think. Let's say you have 20 opponents and you test 100 different ideas, and suppose (a simplified and not quite correct assumption) that each idea has a 50 percent chance of being better against any given opponent and a 50 percent chance of being worse. For one of those 100 ideas you will probably get really lucky: it will work against maybe 17 opponents in your pool and look like a brilliant idea even though it was not a general improvement at all. Of course the reverse can happen to ideas that are great but unfortunately do not work against your pool of opponents

What I think is interesting is to manually study the games of your own engine against other engines since that will show flaws in your own engine.

If you really would like to beat the opponents in your pool you could do asymmetrical evaluation/search. Let's say you have made a small change "a" to your engine version A, and your new engine A' = A + a plays badly against A but really well against your pool of opponents. Why not set your own engine's evaluation/search to A + a but model the opponents' evaluation/search as A - a, where by -a I mean the inverse of a? If that does not work, do the opposite: model your engine's evaluation/search as A - a and the opponents' as A + a. Just an idea.

Good luck with your testing!!!
Actually, we DO perform some gauntlet testing and used to do mostly gauntlet testing, but now we depend 95% on self-testing. It's probably more superstition on our part than anything else, but I don't like to go weeks without having run against a "known" opponent. It is also good for our morale: psychologically we need to see progress against other engines, not just a chain of improvements. But the gauntlets are just an occasional thing.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Rebel
Posts: 6946
Joined: Thu Aug 18, 2011 12:04 pm

Re: obvious/easy move - final results

Post by Rebel »

Don wrote: The easy move in Komodo is very powerful and saves a lot of time. When it was implemented we had to make our normal time control significantly more aggressive to compensate.

The basic concept is that EVERY move is an easy move (if the program doesn't change its mind) and it's just a matter of degrees.
Bells are ringing from the past. I re-activated some old code from the 90's, and what did not work back then due to the low search depths of that time now works smoothly. The caveats are well known, but statistically it works. Thanks Don for triggering my memory - post of the month.

My coding is simple: when the best move hasn't changed over the last 3 iterations and the score does not fluctuate too much, divide the time by 2.
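A minimal sketch of that rule (the names and the 25 centipawn threshold are illustrative, not Rebel's actual code):

Code: Select all

#include <stdlib.h>

/* Easy-move time reduction in the spirit of the rule above.
   best_move[i] / score[i] hold the best root move and its score
   (centipawns) after iteration i. Threshold is a guess, not tuned. */
int easy_move_time(int target_ms, int iter,
                   const int best_move[], const int score[])
{
    if (iter >= 2 &&
        best_move[iter] == best_move[iter - 1] &&
        best_move[iter] == best_move[iter - 2] &&
        abs(score[iter] - score[iter - 2]) < 25)  /* score is stable */
        return target_ms / 2;                     /* halve the time  */
    return target_ms;
}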
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: obvious/easy move - final results

Post by Don »

Rebel wrote:
Don wrote: The easy move in Komodo is very powerful and saves a lot of time. When it was implemented we had to make our normal time control significantly more aggressive to compensate.

The basic concept is that EVERY move is an easy move (if the program doesn't change its mind) and it's just a matter of degrees.
Bells are ringing from the past. I re-activated some old code from the 90's, and what did not work back then due to the low search depths of that time now works smoothly. The caveats are well known, but statistically it works. Thanks Don for triggering my memory - post of the month.

My coding is simple: when the best move hasn't changed over the last 3 iterations and the score does not fluctuate too much, divide the time by 2.
It turns out that what happened on previous iterations is relevant. So the way you are doing this is a good idea and I'm not surprised that it helps.

What we are doing, however, is different. We used to consider whether the best move changed; I'm not sure, but I think this completely replaces that. It is often the case that a different way of doing something captures the same principles, and that may be the case here. Our method is very fine-grained, however, so we could end up dividing the time by almost any value from 1.0 to about 20.
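One way to picture such a graded scheme (purely a speculative sketch; Komodo's actual method is not public) is to grow a time divisor with each iteration that confirms the same best move, capped near 20:

Code: Select all

/* Degrees of easiness: each consecutive iteration that keeps the same
   best move raises the divisor a little, so anything from 1.0 up to
   the cap of 20 can come out. Speculative, not Komodo's code. */
double easy_divisor(int stable_iters)
{
    double d = 1.0;
    for (int i = 0; i < stable_iters; i++)
        d *= 1.5;                /* illustrative growth per iteration */
    return d < 20.0 ? d : 20.0;
}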

I am convinced a lot more could be done, but it's not trivial to estimate how much time is reasonable to spend on a move.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: obvious/easy move - final results

Post by bob »

Pio wrote:Hi Robert!

I think both you and Don are right. I would primarily test the same way as Don does because:

1) You need less games to be sure of an improvement of your new version A' relative your old version A

2) You (Robert) make an extra assumption saying that your opponents will not change. The more the opponents change in the future towards your version A the worse your assumption was that the opponents will not make progress. I guess this is more dangerous if you test against weaker (in terms of evaluation and search algorithms but not speed) opponents than yourself since they will probably converge to your chess engine in the future

3) You will ruthlessly exploit the weaknesses of your pool of opponents but they might not be as representative to all chess-engines of today and tomorrow as you might think. Lets say you have 20 opponents and you test 100 different ideas. Lets say each idea (a simplified and not correct assumption) has a 50 percentage chance to be better against every opponent and 50 percentage chance to be worse against every opponent. For one of those 100 ideas you will probably be really lucky and it will work for maybe 17 opponents in your pool and will look like a brilliant idea even though it was not a general improvement at all. Of course you might do the same for some ideas that are great but unfortunately will not work for your pool of opponents

What I think is interesting is to manually study the games of your own engine against other engines since that will show flaws in your own engine.

If you really would like to beat the opponents in your pool you could do asymmetrical evaluation/search. Lets say you have made a small change "a" to your engine version A and your new engine A' = A + a works bad against A but really good against your pool of opponents. Why not change your engine's evaluation/search to A + a but model the opponent's evaluation/search to A - a where I with -a mean the inverse of a. If that does not work do the opposite and model your engine's evaluation/search to A - a and the opponents to A + a. Just an idea.

Good luck with your testing!!!
I'd rather compete/test against a pool than against a single program. You can easily tune your eval to take advantage of a single opponent, but will that change translate to better performance against other players? That's my concern, because I have run into multiple cases where self-testing said "this is good" but a cluster test then said "not so fast, this is worse."

I occasionally do self-testing when I am out of town with no internet access and want to test a change. Most of the time the self-testing is verified by the cluster results, but not always. That gave me reason to be cautious with this kind of testing.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: obvious/easy move - final results

Post by bob »

Rebel wrote:
Don wrote: The easy move in Komodo is very powerful and saves a lot of time. When it was implemented we had to make our normal time control significantly more aggressive to compensate.

The basic concept is that EVERY move is an easy move (if the program doesn't change its mind) and it's just a matter of degrees.
Bells are ringing from the past. I re-activated some old code from the 90's, and what did not work back then due to the low search depths of that time now works smoothly. The caveats are well known, but statistically it works. Thanks Don for triggering my memory - post of the month.

My coding is simple: when the best move hasn't changed over the last 3 iterations and the score does not fluctuate too much, divide the time by 2.
One other "limit" used in Cray Blitz. The iteration-to-iteration time ratio should not shift dramatically, indicating that either you are beginning to recognize a new best move, or the current move is getting harder to prove best. Hsu actually used the inverse of this idea to trigger a "panic time" event to use more time when a move was not "easy".

I've experimented, off and on, with using the node counts for each root move to trigger this. If one move's count is clearly "bigger" than the rest, the move can be considered easy. I've been testing this off and on for a month to see if it is dependable.
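A sketch of that node-count test (the 10x ratio is an illustrative threshold; this is not Crafty's actual code):

Code: Select all

/* Node-count easy-move test: if one root move consumed the lion's
   share of the search effort, consider the move easy. */
int looks_easy(const long long nodes[], int nmoves)
{
    long long best = 0, rest = 0;
    for (int i = 0; i < nmoves; i++) {
        if (nodes[i] > best) {
            rest += best;        /* old leader joins the rest */
            best = nodes[i];
        } else
            rest += nodes[i];
    }
    return best > 10 * rest;     /* one move dominates the others */
}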
JBNielsen
Posts: 267
Joined: Thu Jul 07, 2011 10:31 pm
Location: Denmark

Re: obvious/easy move - final results

Post by JBNielsen »

I have made my own version of play-easy-moves-fast.

I simply calculate how much the moves' scores on average differ from the score of the best move.

To spice things up I have added these rules for each move:
1) if the difference is more than 75 I add 100 to the difference
2) if the difference is more than 25 I add 25 to the difference

If the total difference is more than 500 the difference is restricted to 500.

This results in an average difference between 0 and 500, and it lets my engine end its calculation anywhere (gradually) between 10% and 90% of the time allotted for the move.

I do not look at the result of previous iterations, but I phase this play-easy-moves-fast in only gradually over the first iterations.

If I have a mate-score I use a default average difference of 50.
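In code the scheme might look roughly like this (a reconstruction from the description above, not Dabbaba's actual source; in particular, the direction of the final mapping is a guess):

Code: Select all

/* Average score gap at the root per the rules above, mapped to a
   fraction of the allotted time. Scores are in centipawns; the
   caller guarantees nmoves > 0. With a mate score, a default
   average of 50 would be used instead, as described above. */
int stop_time_pct(const int score[], int nmoves, int best_score)
{
    int sum = 0;
    for (int i = 0; i < nmoves; i++) {
        int d = best_score - score[i];
        if (d > 75)      d += 100;   /* rule 1 */
        else if (d > 25) d += 25;    /* rule 2 */
        if (d > 500)     d = 500;    /* cap    */
        sum += d;
    }
    int avg = sum / nmoves;          /* 0..500 */
    /* big average gap -> the best move stands out -> stop early;
       this direction is a guess from the post */
    return 90 - (80 * avg) / 500;    /* 90% .. 10% of move time */
}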

I have not tried to measure any Elo effect from this, but it looks fine when I watch Dabbaba play matches.

PS. Sorry for returning to the subject in this thread :wink:
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: obvious/easy move - final results

Post by bob »

JBNielsen wrote:I have made my own version of play-easy-moves-fast.

I simply calculate how much the moves' scores on average differ from the score of the best move.

To spice things up I have added these rules for each move:
1) if the difference is more than 75 I add 100 to the difference
2) if the difference is more than 25 I add 25 to the difference

If the total difference is more than 500 the difference is restricted to 500.

This results in an average difference between 0 and 500, and it lets my engine end its calculation anywhere (gradually) between 10% and 90% of the time allotted for the move.

I do not look at the result of previous iterations, but I phase this play-easy-moves-fast in only gradually over the first iterations.

If I have a mate-score I use a default average difference of 50.

I have not tried to measure any Elo effect from this, but it looks fine when I watch Dabbaba play matches.

PS. Sorry for returning to the subject in this thread :wink:
How do you know "the moves' scores on average" that you compare to the score of the first move??? Static eval? Root move ordering only?
JBNielsen
Posts: 267
Joined: Thu Jul 07, 2011 10:31 pm
Location: Denmark

Re: obvious/easy move - final results

Post by JBNielsen »

bob wrote:
JBNielsen wrote:I have made my own version of play-easy-moves-fast.

I simply calculate how much the moves' scores on average differ from the score of the best move.

To spice things up I have added these rules for each move:
1) if the difference is more than 75 I add 100 to the difference
2) if the difference is more than 25 I add 25 to the difference

If the total difference is more than 500 the difference is restricted to 500.

This results in an average difference between 0 and 500, and it lets my engine end its calculation anywhere (gradually) between 10% and 90% of the time allotted for the move.

I do not look at the result of previous iterations, but I phase this play-easy-moves-fast in only gradually over the first iterations.

If I have a mate-score I use a default average difference of 50.

I have not tried to measure any Elo effect from this, but it looks fine when I watch Dabbaba play matches.

PS. Sorry for returning to the subject in this thread :wink:
How do you know "the moves' scores on average" that you compare to the score of the first move??? Static eval? Root move ordering only?
As I wrote in an earlier post here ( http://www.talkchess.com/forum/viewtop ... 5&t=46605 )
My engine is probably not written the standard way. I started it 18 years ago based on what I had read and my intuition and logic.
When a white move at the root is refused by a given black move, I store the score of the move that gave the cut-off.