Easy engine to use for testing

lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Easy engine to use for testing

Post by lucasart »

stevemulligan wrote:
lucasart wrote: Even with a fairly simple but bugfree search, your engine should be a lot stronger than that, without eval (only material and basic parametric piece on square tables).
Hang on a sec. I have hash table, qsearch, null move, rotated bb's for move gen. The only thing I'm missing is aspiration search which I didn't add because it looked very tricky to debug and from what I understood, it didn't add a lot of Elo.

Are you saying that an eval with mobility like you describe in this thread and PST that you describe in this thread should be able to get > 2000 Elo?? If so then I have to start over :(
Yes
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Easy engine to use for testing

Post by Don »

stevemulligan wrote:
lucasart wrote: Even with a fairly simple but bugfree search, your engine should be a lot stronger than that, without eval (only material and basic parametric piece on square tables).
Hang on a sec. I have hash table, qsearch, null move, rotated bb's for move gen. The only thing I'm missing is aspiration search which I didn't add because it looked very tricky to debug and from what I understood, it didn't add a lot of Elo.

Are you saying that an eval with mobility like you describe in this thread and PST that you describe in this thread should be able to get > 2000 Elo?? If so then I have to start over :(
You don't have to start over, you just have to debug. There are probably some simple bugs.

I suspect the hash table or the search. A simple >= where it should be >, or vice versa, in the comparisons against alpha or beta could break the search.

Play some fixed-depth games between the normal version and a version with the hash table turned off. The score should be close; otherwise there is a bug in the hash table implementation.
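
For reference, here is a minimal sketch of the alpha/beta comparisons mentioned above, assuming a fail-hard negamax over a toy tree. The tree shape, the leaf scores and the function names are all made up for illustration and are not taken from anyone's engine. The plain negamax is only there as a cross-check: with a full window, both searches should return the same value.

```c
/* Toy tree: every node has BRANCH children and the leaves sit DEPTH plies
   down. Node indices count from the left within each level. Leaf scores
   are invented and are from the point of view of the side to move. */
#define BRANCH 3
#define DEPTH  3
#define INF    1000000

static const int leaf_score[27] = {
     3, -2,  5,    0,  7, -4,    1,  1, -6,
     8, -1,  2,   -3,  4,  6,   -5,  0,  9,
     2, -7,  3,    5, -8,  1,    4,  0, -9
};

/* Reference search: plain negamax, no pruning at all. */
int negamax(int node, int depth)
{
    int i, score, best = -INF;

    if (depth == 0) return leaf_score[node];
    for (i = 0; i < BRANCH; i++) {
        score = -negamax(node * BRANCH + i, depth - 1);
        if (score > best) best = score;
    }
    return best;
}

/* Fail-hard alpha-beta with the conventional comparisons. */
int alphabeta(int node, int depth, int alpha, int beta)
{
    int i, score;

    if (depth == 0) return leaf_score[node];
    for (i = 0; i < BRANCH; i++) {
        score = -alphabeta(node * BRANCH + i, depth - 1, -beta, -alpha);
        if (score >= beta) return beta;    /* cutoff: >= beta */
        if (score > alpha) alpha = score;  /* new best: strictly > alpha */
    }
    return alpha;
}
```

Calling alphabeta(0, DEPTH, -INF, INF) and negamax(0, DEPTH) should give the same number. The same kind of cross-check works in a real engine at shallow depth if the pruning can be switched off.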

Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
yl84
Posts: 21
Joined: Tue Sep 07, 2010 6:37 pm

Re: Easy engine to use for testing

Post by yl84 »

Adam Hair wrote:
stevemulligan wrote:I've been working on my standard chess engine (C#) for what feels like a very long time and I'm ready to start testing against other engines. What is a good choice of opponents for a beginner like me? I'm at the point where I want to try to make my eval a bit smarter.

I started with Warrior but I'm wondering if there are easier engines for me to play against? Against Warrior I get about 10% wins, 40% draws, 50% loss. Maybe that W/L ratio is ok for testing eval changes? I'm not sure...

Also any tips on how to make my eval smarter would be much appreciated.
Here is a list of engines: http://adamsccpages.blogspot.com/p/also ... t.html?m=1
The engines with more than 300 games are relatively stable. Tscp is a common engine to measure against at this level. MSCP is also a good engine. Perhaps Crafty, set at an appropriate skill level, would be a good choice.

Most authors, if they are not conducting self-testing, will use a pool of 8 to 10 reliable opponents. I can go over my notes and see which of these engines appear to be most stable.

Adam
Hi,
Adam, I would also be interested in having my engine tested for your rating list. My chess engine, Milady, is relatively weak. The newest version, 3.04, is available for download at http://milady-chess.blogspot.fr/
Cheers
Yves
stevemulligan
Posts: 117
Joined: Wed Jul 20, 2011 2:54 pm
Location: Ottawa, Canada

Re: Easy engine to use for testing

Post by stevemulligan »

Don wrote: I suspect the hash table or the search

Play some fixed depth games between a version with hash table turned off.
With the new simplified eval at fixed depth 5 vs Warrior, 1000 games, I get -201 with hash on and -198 with hash off. Guess I know where to start then :) Thanks Don!
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Easy engine to use for testing

Post by Don »

stevemulligan wrote:
Don wrote: I suspect the hash table or the search

Play some fixed depth games between a version with hash table turned off.
With the new simplified eval at fixed depth 5 vs Warrior, 1000 games, I get -201 with hash on and -198 with hash off. Guess I know where to start then :) Thanks Don!
So you are losing by 200 in either case? I don't think you have shown that the hash table is the problem so why are you thanking me? Have no fear, you have made progress because you have eliminated the hash table as the cause of the problem. One less thing to worry about.

Next step? I strongly suggest that you write a simple perft function and run it against known positions if you have not already done that. That is your best first "sanity test" to run. Do you have a single move generator or multiple generators, for example do you have a separate capture generator? You have to check each one out separately if you do. Make a version of your program that only uses the full generator that has been checked out by perft and do the fixed-depth test again.

By the way, you don't have to run two tests. For example, the hash test could have been a single test: the version of your program with the hash on against the version without hash. Your answer will be much better resolved with far fewer games.

Once you prove that is not the problem you move on to the thing you next suspect the most. Sometimes you have to be very creative in designing a test but there is always a way.

When you get to the search part I would suspect the quiescence search logic the most, but honestly it could be just about anything in your program, so you have to use the process of elimination. But you clearly have a bug somewhere, so don't be too hasty to say "it couldn't be that", no matter what it is.



Here is perft from the chessprogramming wiki:


typedef unsigned long long u64;

/* Counts the leaf nodes reachable in exactly `depth` plies. */
u64 Perft(int depth)
{
    MOVE move_list[256];
    int n_moves, i;
    u64 nodes = 0;

    if (depth == 0) return 1;

    n_moves = GenerateMoves(move_list);
    for (i = 0; i < n_moves; i++) {
        MakeMove(move_list[i]);
        nodes += Perft(depth - 1);
        UndoMove(move_list[i]);
    }
    return nodes;
}
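
As a sketch of the "run it against known positions" step: the snippet below assumes the engine has already collected its own perft counts for the standard starting position into an array, one entry per depth. The function name and the array layout are made up for illustration. The reference counts for depths 1 to 6 are the widely published ones.

```c
typedef unsigned long long u64;

/* Published perft counts for the standard starting position, depths 1..6. */
static const u64 startpos_perft[6] = {
    20ULL, 400ULL, 8902ULL, 197281ULL, 4865609ULL, 119060324ULL
};

/* counts[0] holds the engine's perft(1), counts[1] its perft(2), and so on.
   Returns the first depth whose count disagrees with the reference, or 0
   if every depth that was checked matches. At most 6 depths are checked. */
int first_perft_mismatch(const u64 *counts, int n_depths)
{
    int d;

    if (n_depths > 6) n_depths = 6;
    for (d = 1; d <= n_depths; d++) {
        if (counts[d - 1] != startpos_perft[d - 1])
            return d;
    }
    return 0;
}
```

The shallowest depth that disagrees is usually the most useful one to look at, since the tree below it is still small enough to compare move by move against another engine.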
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Easy engine to use for testing

Post by Adam Hair »

yl84 wrote:
Adam Hair wrote:
stevemulligan wrote:I've been working on my standard chess engine (C#) for what feels like a very long time and I'm ready to start testing against other engines. What is a good choice of opponents for a beginner like me? I'm at the point where I want to try to make my eval a bit smarter.

I started with Warrior but I'm wondering if there are easier engines for me to play against? Against Warrior I get about 10% wins, 40% draws, 50% loss. Maybe that W/L ratio is ok for testing eval changes? I'm not sure...

Also any tips on how to make my eval smarter would be much appreciated.
Here is a list of engines: http://adamsccpages.blogspot.com/p/also ... t.html?m=1
The engines with more than 300 games are relatively stable. Tscp is a common engine to measure against at this level. MSCP is also a good engine. Perhaps Crafty, set at an appropriate skill level, would be a good choice.

Most authors, if they are not conducting self-testing, will use a pool of 8 to 10 reliable opponents. I can go over my notes and see which of these engines appear to be most stable.

Adam
Hi,
Adam, I would also be interested in having my engine tested for your rating list. My chess engine, Milady, is relatively weak. The newest version, 3.04, is available for download at http://milady-chess.blogspot.fr/
Cheers
Yves
Hi Yves,

I am sorry that I did not see your request to me at Open Chess until today. After I finish my experiment concerning move selection, Elo, and search depth, I will test Milady for the Also-Rans list. It may be a couple of weeks or more before I do.

Adam
stevemulligan
Posts: 117
Joined: Wed Jul 20, 2011 2:54 pm
Location: Ottawa, Canada

Re: Easy engine to use for testing

Post by stevemulligan »

Don wrote:So you are losing by 200 in either case? I don't think you have shown that the hash table is the problem
Shouldn't I gain at least 60 Elo when the hash table is enabled? Right now on or off I get the same score.
Next step? I strongly suggest that you write a simple perft function and run it against known positions if you have not already done that. That is your best first "sanity test" to run. Do you have a single move generator or multiple generators, for example do you have a separate capture generator?
I have 3 generators: Caps, NonCaps & Check evasions. My perft command uses all 3 generators and my internal perft tests all pass (126 positions up to depth 6).

I'm at a loss on what to try next...
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Easy engine to use for testing

Post by Don »

stevemulligan wrote:
Don wrote:So you are losing by 200 in either case? I don't think you have shown that the hash table is the problem
Shouldn't I gain at least 60 Elo when the hash table is enabled? Right now on or off I get the same score.
With fixed-depth searches you will gain very little from hash tables. With deeper searches you will gain more, but it will mostly be speed. It does get confusing with LMR and other junk, though. I would suggest that in your debugging process you turn most things OFF and then gradually turn them back on to study their impact. A simple experiment is to turn off LMR and see how much Elo you lose at fixed depth. Maybe 6 or 7 ply searches are better, if they are fast enough to get a meaningful sample quickly.
Next step? I strongly suggest that you write a simple perft function and run it against known positions if you have not already done that. That is your best first "sanity test" to run. Do you have a single move generator or multiple generators, for example do you have a separate capture generator?
I have 3 generators: Caps, NonCaps & Check evasions. My perft command uses all 3 generators and my internal perft tests all pass (126 positions up to depth 6).

I'm at a loss on what to try next...
It's likely you have a search bug - and that takes in a lot. But your evaluation could also have a bug in it - even if it's simple it could be buggy.

Presumably you have a switch to make it easy to turn things off for debugging. Turn off LMR and any sort of pruning, and test this at fixed depth against an unmodified version of your program. The Elo should not vary too much, except for LMR, which takes a bite out of the strength but should be providing a massive compensating speedup. But it's still useful to see the impact.
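
One way such switches could look, as a rough sketch only: a small options struct that the search consults before using a feature. The struct, the feature names and the minimum null-move depth of 3 are all assumptions for illustration, not anyone's actual code.

```c
#include <stdbool.h>
#include <string.h>

/* One flag per search feature that can be turned off for debugging. */
typedef struct {
    bool hash_table;
    bool null_move;
    bool check_extension;
} SearchOptions;

/* Everything on: the normal playing configuration. */
SearchOptions options_default(void)
{
    SearchOptions opt = { true, true, true };
    return opt;
}

/* Turns one feature off by name. Returns false for an unknown name. */
bool options_disable(SearchOptions *opt, const char *name)
{
    if (strcmp(name, "hash") == 0)          opt->hash_table = false;
    else if (strcmp(name, "nullmove") == 0) opt->null_move = false;
    else if (strcmp(name, "checkext") == 0) opt->check_extension = false;
    else return false;
    return true;
}

/* The search asks this before trying a null move. The depth limit of 3
   is an arbitrary illustrative value. */
bool may_try_null_move(const SearchOptions *opt, int depth, bool in_check)
{
    return opt->null_move && !in_check && depth >= 3;
}
```

With the flags settable from the command line or an engine option, two copies of the same binary can be matched against each other at fixed depth with exactly one feature different.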

Don
stevemulligan
Posts: 117
Joined: Wed Jul 20, 2011 2:54 pm
Location: Ottawa, Canada

Re: Easy engine to use for testing

Post by stevemulligan »

Don wrote: Presumably you have a switch to make it easy to turn things off for debugging. Turn off LMR and any sort of pruning, and test this at fixed depth against an unmodified version of your program.
I don't have any LMR. The most advanced I get is null move and extending the depth while in check.

Is it possible my code is just "slow" and that's why I can't get above 2000?
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Easy engine to use for testing

Post by michiguel »

stevemulligan wrote:
Don wrote: Presumably you have a switch to make it easy to turn things off for debugging. Turn off LMR and any sort of pruning, and test this at fixed depth against an unmodified version of your program.
I don't have any LMR. The most advanced I get is null move and extending the depth while in check.

Is it possible my code is just "slow" and that's why I can't get above 2000?
If the values of your eval parameters are a bit off, the performance could take a big hit. For instance, a few years ago I got an improvement by *removing* the passed pawn code. I got a bigger one by putting it back with better numbers. That was because my code was very elaborate, but the numbers were not good, so it was counterproductive. Passers on a5 and b5 preferred not to move because they were covering each other...

So, my advice to a beginner is to write very simple and straightforward eval code the first time, until things can be tuned appropriately.
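
As an illustration of how small such an eval can be, here is a sketch that scores the piece-placement field of a FEN string using material plus a single parametric centralisation bonus for knights. The piece values (100/300/300/500/900) and the weight of 5 are conventional placeholder numbers chosen for the example, not tuned values, and the function names are made up.

```c
#include <ctype.h>

#define KNIGHT_CENTER_WEIGHT 5   /* centipawns per step towards the centre */

/* Placeholder material values in centipawns; the king counts as 0. */
static int piece_value(char piece)
{
    switch (tolower((unsigned char)piece)) {
    case 'p': return 100;
    case 'n': return 300;
    case 'b': return 300;
    case 'r': return 500;
    case 'q': return 900;
    default:  return 0;
    }
}

/* 0 in the corners, up to 6 on the four centre squares. */
static int center_bonus(int file, int rank)
{
    int fd = file < 4 ? 3 - file : file - 4;  /* files away from the centre */
    int rd = rank < 4 ? 3 - rank : rank - 4;  /* ranks away from the centre */
    return (3 - fd) + (3 - rd);
}

/* Score in centipawns from White's point of view. Only the first FEN
   field (piece placement) is read. */
int evaluate_fen(const char *fen)
{
    int score = 0, file = 0, rank = 7;
    const char *c;

    for (c = fen; *c != '\0' && *c != ' '; c++) {
        unsigned char ch = (unsigned char)*c;
        if (ch == '/') {
            rank--;
            file = 0;
        } else if (isdigit(ch)) {
            file += ch - '0';            /* run of empty squares */
        } else {
            int v = piece_value((char)ch);
            if (tolower(ch) == 'n')
                v += KNIGHT_CENTER_WEIGHT * center_bonus(file, rank);
            score += isupper(ch) ? v : -v;
            file++;
        }
    }
    return score;
}
```

Because there is only one parameter besides the material values, it is easy to see what each number does before adding more terms.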

Maybe this does not apply to you in this case.

Miguel