Easy engine to use for testing

Adam Hair · Post by **Adam Hair** » Tue Jun 05, 2012 3:26 am

stevemulligan wrote:If I have a bunch of old laptops, can I run a few games on those machines and then merge all the PGN files together to save time? Or do the specs on the machines need to be identical for that?

Identical specs are best, and necessary if you are looking for small changes in Elo, as Bob is with Crafty. In your case, your engine is young and has much opportunity for gaining Elo. You could run the Crafty 19.17 benchmark (here is a link to it: http://www.mediafire.com/?m25vbp8iy3qbrxd ) on each laptop. Use your chosen time control on the fastest computer. Then, on each of the other laptops, multiply the chosen time control by the ratio of that laptop's benchmark to the fastest laptop's benchmark.

Just to be clear, let's say the chosen time control is 40/120+1 and the benchmark on the fastest laptop is 40. If the benchmark on another laptop is 50, the ratio is 50/40 = 1.25, so use the time control 40/150+1.25 on that laptop. Given the sort of gains you may experience at this point in your engine's development, you should be able to confirm positive changes, provided you play enough games. I guess that would be a function of how many laptops you have. Anyway, the CCRL lists consist of games played on various computers where the time controls are determined by the Crafty benchmark. It is not ideal, but it is better than using too few games to measure changes.
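The scaling rule described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it assumes the benchmark result is a running time, so a larger number means a slower machine.

```python
def scale_time_control(moves, base_seconds, increment, bench_fastest, bench_this):
    """Scale a moves/base+increment time control by the benchmark ratio.

    Assumption: the benchmark value is a running time (larger = slower),
    so slower machines get proportionally more time.
    """
    ratio = bench_this / bench_fastest
    return moves, base_seconds * ratio, increment * ratio


def format_tc(moves, base_seconds, increment):
    """Render the time control in the moves/base+increment notation used in the thread."""
    return f"{moves}/{base_seconds:g}+{increment:g}"


# Benchmark of 40 on the fastest laptop, 50 on a slower one: ratio 50/40 = 1.25.
print(format_tc(*scale_time_control(40, 120, 1, 40, 50)))  # prints 40/150+1.25
```

The fastest laptop itself gets a ratio of 1, so its time control is unchanged.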

By the way, if your engine can handle it, you could decrease the time control you are using. The gain in the number of games you can accumulate will more than offset anything that might be missed by using the shorter time control. Most (but not all) gains found at short time controls do not vanish at long time controls.
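As a rough way to see why the number of games matters, here is a small sketch that turns a win/draw/loss record into an Elo difference with an approximate 95% interval. It assumes the usual logistic Elo model and a normal approximation for the score, and the function names are invented for this example. The sample numbers use the 10% wins, 40% draws, 50% losses split reported against Warrior in this thread.

```python
import math


def elo_from_score(score):
    """Elo difference implied by an average score, using the logistic Elo model."""
    score = min(max(score, 1e-6), 1 - 1e-6)  # keep the logarithm finite
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result.

    The interval is a normal approximation; z=1.96 gives roughly 95%.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score around its mean.
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / n
    std_err = math.sqrt(variance / n)
    return (elo_from_score(score),
            elo_from_score(score - z * std_err),
            elo_from_score(score + z * std_err))


# Same 10/40/50 split, played over 100 games and over 1000 games.
for games in (100, 1000):
    w, d, l = games // 10, 4 * games // 10, 5 * games // 10
    elo, low, high = match_elo(w, d, l)
    print(f"{games} games: {elo:+.0f} Elo (about {low:+.0f} to {high:+.0f})")
```

The interval shrinks roughly with the square root of the number of games, which is why more, shorter games tend to be the better trade.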

emadsen · Post by **emadsen** » Tue Jun 05, 2012 8:19 pm

stevemulligan wrote:Running 1000 games against Erik's RumbleMinze since it's a C# engine like mine

Steve, I'm interested to know your results... I usually test against MicroMax 4.8, Roce, Predateur 2.1, TSCP, Vapor, IQ23, Predateur 0.1.5, MicroMax 1.6, and Toledo NanoChess. As Bob mentioned, this gives a good range of opponents for a weak engine. Not sure how strong your engine is.

Don · Post by **Don** » Wed Jun 06, 2012 1:45 am

stevemulligan wrote:I've been working on my standard chess engine (C#) for what feels like a very long time and I'm ready to start testing against other engines. What is a good choice of opponents to use against a beginner like me? I'm at the point where I want to try to make my eval a bit smarter.

I started with Warrior but I'm wondering if there are easier engines for me to play against? Against Warrior I get about 10% wins, 40% draws, 50% loss. Maybe that W/L ratio is ok for testing eval changes? I'm not sure...

Also any tips on how to make my eval smarter would be much appreciated.

In the development of Komodo we always used the strongest opponents but gave them huge handicaps - gradually reducing those handicaps as Komodo got stronger. This is smart because it means you are wasting almost no time testing your opponents; the CPU time will be spent almost solely on your own program, at least until you start getting relatively close. In the Doch days Glaurung and Spike were more than what Doch could handle.

You don't need to use a top 5 program of course, or you might get depressed, but find some that are perhaps 500 Elo stronger than yours and give them the appropriate handicaps to make the matches fair. You will get a baseline for the handicap level and improve from there until at some point you will discover that you have to reduce the handicap. After 2 or 3 iterations of this you may have to move to a stronger program!
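To put a number on why a handicap is needed at such a gap, here is a minimal sketch of the expected score under the standard logistic Elo model. The function name is chosen for this example only.

```python
def expected_score(elo_diff):
    """Expected score for a player rated elo_diff above the opponent.

    A negative elo_diff means the player is the weaker side.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# How the weaker engine is expected to score without any handicap.
for gap in (0, 100, 200, 500):
    print(f"{gap:>3} Elo weaker: expected score {expected_score(-gap):.3f}")
```

At a 500 Elo deficit the expected score is only about 5%, so an unhandicapped match would produce very few informative results per game played.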

stevemulligan · Post by **stevemulligan** » Fri Jun 08, 2012 9:25 pm

emadsen wrote:Steve, I'm interested to know your results...

Well, it took a lot longer than 24 hours (about 3-4 days), and I didn't finish all 1000 games. I ran my engine with incorrect settings (EGTB on, opening book on, 84 MB hash), so I need to start over after I tweak a few things. I think the Elo difference at the end was around -140 for me.

RumbleMinze stalled on me though, after about 800 games. It was using around 650 MB of RAM when I last checked it. Pwned was around 250 MB.

Which target did you use when you compiled RumbleMinze? (x86, x64 or Any CPU) I want to match the target arch before I try again.

emadsen · Post by **emadsen** » Sat Jun 09, 2012 5:33 am

stevemulligan wrote:Which target did you use when you compiled RumbleMinze? (x86, x64 or Any CPU)

Any CPU.

My hash management is rather simple. Your GUI / tournament manager must leave the engine EXE running rather than restart it with each game. This exposes a problem with memory management I don't see when running tournaments in the Shredder or Fritz GUIs.

emadsen · Post by **emadsen** » Sat Jun 09, 2012 8:58 pm

Actually I do see the hash memory leak when running tournaments in the Shredder and Fritz GUIs. I've added the bug to my to-do list.

emadsen · Post by **emadsen** » Sat Jun 09, 2012 9:10 pm

Don wrote:In the development of Komodo we always used the strongest opponents but gave them huge handicaps - gradually reducing those handicaps as Komodo got stronger. This is smart because it means you are wasting almost no time testing your opponents; the CPU time will be spent almost solely on your own program.

Don,

Thanks. That is very good advice. I've started testing my engine against Shredder, Hiarcs, and Junior with UCI_LimitStrength and UCI_Elo set. Unfortunately the Fritz and Shredder GUIs do not honor UCI_LimitStrength and UCI_Elo settings in engine-engine matches. However, Little Blitzer does, and is a better solution for rapid testing.

I'm checking out Cute Chess CLI now, which seems an even better solution for testing...
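For reference, UCI_LimitStrength and UCI_Elo are standard option names in the UCI protocol, set with `setoption` commands sent to the engine's standard input. The sketch below only builds those command strings; the helper name and the 1800 rating are made up for illustration, and the range of accepted ratings depends on the engine.

```python
def limit_strength_commands(elo):
    """Build the UCI commands that ask an engine to play at a reduced rating.

    Only engines that advertise UCI_LimitStrength and UCI_Elo will honor these.
    """
    return [
        "setoption name UCI_LimitStrength value true",
        f"setoption name UCI_Elo value {int(elo)}",
    ]


# Hypothetical target rating; a GUI or tournament manager would write
# these lines to the engine process before the game starts.
for command in limit_strength_commands(1800):
    print(command)
```

A tournament manager that honors these settings sends the same commands on your behalf once the options are configured for that engine.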

Don · Post by **Don** » Sat Jun 09, 2012 10:33 pm

emadsen wrote:
Don wrote:In the development of Komodo we always used the strongest opponents but gave them huge handicaps - gradually reducing those handicaps as Komodo got stronger. This is smart because it means you are wasting almost no time testing your opponents; the CPU time will be spent almost solely on your own program.

Don,

Thanks. That is very good advice. I've started testing my engine against Shredder, Hiarcs, and Junior with UCI_LimitStrength and UCI_Elo set. Unfortunately the Fritz and Shredder GUIs do not honor UCI_LimitStrength and UCI_Elo settings in engine-engine matches. However, Little Blitzer does, and is a better solution for rapid testing.

I'm checking out Cute Chess CLI now, which seems an even better solution for testing...

Cool! Another benefit of this is that you will be automatically provided with milestones if your program continues to improve. You will remember that you used to have to give a large handicap to program X, but now it's not even a worthy opponent. That will give you a better sense of progress than just looking at a dry Elo rating.

jdart · Post by **jdart** » Sat Jun 09, 2012 10:58 pm

I found using a time handicap to be not very effective, but if the engine actually has a strength limit option (like Stockfish), that is a good solution. Mine is doing pretty well against Stockfish at strength 15 (range is 0-20).

--Jon

lucasart · Post by **lucasart** » Sun Jun 10, 2012 4:35 am

stevemulligan wrote:I've been working on my standard chess engine (C#) for what feels like a very long time and I'm ready to start testing against other engines. What is a good choice of opponents to use against a beginner like me? I'm at the point where I want to try to make my eval a bit smarter.

I started with Warrior but I'm wondering if there are easier engines for me to play against? Against Warrior I get about 10% wins, 40% draws, 50% loss. Maybe that W/L ratio is ok for testing eval changes? I'm not sure...

Also any tips on how to make my eval smarter would be much appreciated.

Even with a fairly simple but bug-free search, your engine should be a lot stronger than that, even with almost no eval (only material and basic parametric piece-square tables). You should probably try to fix your search rather than write any more code in the eval. A bug-free search with material + piece-square tables + linear mobility should get you above 2300 Elo already.
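A rough sketch of what "material plus parametric piece-square tables" could look like, in Python for brevity. Everything here is an assumption for illustration: the board is a plain dict from square index (0 = a1, 63 = h8) to a piece letter (uppercase for White), the piece values and centralization weights are arbitrary, and the mobility term is left out because it needs a move generator.

```python
# Illustrative values in centipawns; not tuned and not taken from any engine.
PIECE_VALUE = {"p": 100, "n": 320, "b": 330, "r": 500, "q": 900, "k": 0}
# Bonus per step toward the center; negative keeps the king away from it.
CENTER_WEIGHT = {"p": 2, "n": 8, "b": 4, "r": 0, "q": 2, "k": -6}


def center_bonus(square, weight):
    """Parametric piece-square term: 3 steps of bonus in the center, 0 on the edge."""
    file, rank = square % 8, square // 8
    dist = max(abs(2 * file - 7), abs(2 * rank - 7)) // 2  # 0 (center) .. 3 (edge)
    return weight * (3 - dist)


def evaluate(board):
    """Score from White's point of view: material plus centralization."""
    score = 0
    for square, piece in board.items():
        kind = piece.lower()
        value = PIECE_VALUE[kind] + center_bonus(square, CENTER_WEIGHT[kind])
        score += value if piece.isupper() else -value
    return score


def start_position():
    """Standard starting position in the dict representation assumed above."""
    back_rank = "rnbqkbnr"
    board = {}
    for file in range(8):
        board[file] = back_rank[file].upper()  # White pieces on rank 1
        board[8 + file] = "P"                  # White pawns on rank 2
        board[48 + file] = "p"                 # Black pawns on rank 7
        board[56 + file] = back_rank[file]     # Black pieces on rank 8
    return board


print(evaluate(start_position()))  # symmetric position, prints 0
```

Because the bonus is computed from a formula rather than typed in as 64 numbers per piece, there are only a handful of parameters to get wrong, which keeps the eval out of the way while the search is being debugged.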
