Are there any testbeds that try to systematically test engine understanding of basic positional factors (and tradeoffs between them) such as:
king safety
activity
exchange sacs
pawn sacs
weak squares
piece sacs
center
weak pawns
when to trade
etc. etc.?
Something where you can run the test really quickly (such as 1 second per position) and see where your engine needs work?
I looked at SBD, and it seems like it has a compatible goal, but does not cover that large a variety of positional themes (when to castle kingside is covered quite well ... perhaps too well ... ).
-Sam
positional testbed
Moderator: Ras
-
- Posts: 6662
- Joined: Thu Mar 09, 2006 4:21 am
Re: positional testbed
Actually, I'm currently working on it. It's nearly half complete.
The PGN beta that I'm working on consists of all kinds of positional tests, organized by chapter. I'm now in the process of adding more test games to certain chapters.
I'm not sure if I should keep it private or release it as GPL once I get them done.
What's your suggestion?



-
- Posts: 4406
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: positional testbed
I think for many of these things you really want to directly test eval rather than run a search and see if you have the best move. Search tests lots of things but the root cause of many failures to find the best move is a mis-evaluation of a position.
And the root cause of mis-evaluating a position can be a bug. But it can also be failure to code something essential or having the wrong score weight.
Unfortunately it is hard to automate this. I have a little collection of FEN files that have various degrees of king safety and I have used that to test that part, but it's not automated - I just eyeball the position and see if the score looks way off.
I've also used a trick from Bob Hyatt to help find bugs - flip the board, first vertically (changing the side to move) and then horizontally, and see if the score changes.
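For anyone who wants to try the flip trick on FEN strings, here is a rough sketch of how it could look. It is not Bob Hyatt's code or anyone's actual engine code. The function names, `toy_eval`, and its deliberate bug are all made up for illustration, and it assumes standard chess (no Chess960 castling letters) and an eval that scores from the side to move's point of view.

```python
FILES = "abcdefgh"

def flip_vertical(fen: str) -> str:
    """Mirror the ranks and swap colors, including the side to move."""
    board, stm, castling, ep, *rest = fen.split()
    # Reverse the rank order; swapcase() turns White pieces into Black ones.
    board = "/".join(rank.swapcase() for rank in reversed(board.split("/")))
    stm = "b" if stm == "w" else "w"
    if castling != "-":
        castling = "".join(sorted(castling.swapcase(), key="KQkq".index))
    if ep != "-":
        ep = ep[0] + str(9 - int(ep[1]))
    return " ".join([board, stm, castling, ep, *rest])

def flip_horizontal(fen: str) -> str:
    """Mirror the files (a<->h). Castling rights are dropped because they
    cannot be mirrored in standard chess."""
    board, stm, _castling, ep, *rest = fen.split()
    board = "/".join(rank[::-1] for rank in board.split("/"))
    if ep != "-":
        ep = FILES[7 - FILES.index(ep[0])] + ep[1]
    return " ".join([board, stm, "-", ep, *rest])

def strip_castling(fen: str) -> str:
    """Same position without castling rights, for a fair horizontal comparison."""
    fields = fen.split()
    fields[2] = "-"
    return " ".join(fields)

def symmetry_failures(fens, evaluate, tolerance=0):
    """Return (fen, kind, score, mirrored_score) for every mismatch.
    `evaluate` is assumed to score from the side to move's point of view."""
    failures = []
    for fen in fens:
        pairs = [("vertical", fen, flip_vertical(fen)),
                 ("horizontal", strip_castling(fen), flip_horizontal(fen))]
        for kind, original, mirrored in pairs:
            a, b = evaluate(original), evaluate(mirrored)
            if abs(a - b) > tolerance:
                failures.append((fen, kind, a, b))
    return failures

# Hypothetical stand-in eval (centipawns) with a deliberate asymmetry bug.
VALUES = {"p": 100, "n": 300, "b": 300, "r": 500, "q": 900, "k": 0}

def toy_eval(fen: str) -> int:
    board, stm = fen.split()[:2]
    score = 0
    for ch in board:
        if ch.isalpha():
            score += VALUES[ch.lower()] if ch.isupper() else -VALUES[ch.lower()]
    if "P" in board.split("/")[4]:  # bug: bonus for a White pawn on rank 4 only
        score += 10
    return score if stm == "w" else -score

fen = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1"
for failure in symmetry_failures([fen], toy_eval):
    print(failure)
```

With the planted bug, the vertical flip reports a mismatch and the horizontal flip does not, which is the kind of hint that points at a one-sided eval term. In a real engine you would swap `toy_eval` for a call into your own static eval.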
-
- Posts: 1154
- Joined: Fri Jun 23, 2006 5:18 am
Re: positional testbed
"I think for many of these things you really want to directly test eval rather than run a search"
Yes, it is one of my pet peeves that automatic testing is just for best move... One of my main tuning mechanisms is just eyeballing a position, deciding who I think has the edge, and tweaking things until my engine agrees. Generally such positions are found just watching my engine play, and every once in a while it becomes clear from the eval that my engine has no idea what is going on. I am sure most engine developers have had situations like this... staring at the screen and shouting at their stupid engine "why did you just do that trade, don't you realize that endgame is lost!" and then trying to figure out how to mess with the eval to fix things.
An automated test mechanism that gave points for eval ranges (score is < -0.2, between -0.2 and 0.2, above 0.2 kind of thing) would be very helpful. Testing it with a short search (instead of just eval) seems fine to me, so such a tool does not have to be integrated with the engine. I suppose I could write one and it would save me effort in the long run, but it sounds like a big bother.
-Sam
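A minimal sketch of what such an eval-range tool could look like, as an illustration only. `EvalCase`, `run_suite`, and the three toy positions are hypothetical, and `material_score` is just a stand-in for whatever produces a score (static eval, or a short search run through the engine). Scores are assumed to be in pawns from White's point of view.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, NamedTuple, Optional, Tuple

class EvalCase(NamedTuple):
    theme: str            # e.g. "king safety", "weak pawns"
    fen: str
    lo: Optional[float]   # lower bound in pawns, None means unbounded
    hi: Optional[float]   # upper bound in pawns, None means unbounded

def in_range(score: float, lo: Optional[float], hi: Optional[float]) -> bool:
    """True if the score falls inside the expected (inclusive) range."""
    return (lo is None or score >= lo) and (hi is None or score <= hi)

def run_suite(cases: Iterable[EvalCase],
              score_fn: Callable[[str], float]) -> Dict[str, Tuple[int, int]]:
    """Return {theme: (passed, total)} so weak themes stand out."""
    totals = defaultdict(lambda: [0, 0])
    for case in cases:
        if in_range(score_fn(case.fen), case.lo, case.hi):
            totals[case.theme][0] += 1
        totals[case.theme][1] += 1
    return {theme: (passed, total) for theme, (passed, total) in totals.items()}

# Stand-in scorer: plain material balance in pawns, White's point of view.
PIECE_VALUES = {"p": 1.0, "n": 3.0, "b": 3.0, "r": 5.0, "q": 9.0, "k": 0.0}

def material_score(fen: str) -> float:
    score = 0.0
    for ch in fen.split()[0]:
        if ch.isalpha():
            value = PIECE_VALUES[ch.lower()]
            score += value if ch.isupper() else -value
    return score

# Toy positions, made up for this example; a real suite would hold
# hand-picked positions per positional theme.
CASES = [
    EvalCase("toy material", "4k3/8/8/8/8/8/8/4KQ2 w - - 0 1", 0.2, None),
    EvalCase("toy material", "4k3/8/8/8/8/8/8/4K3 w - - 0 1", -0.2, 0.2),
    EvalCase("toy material", "4kq2/8/8/8/8/8/8/4K3 w - - 0 1", None, -0.2),
]

for theme, (passed, total) in run_suite(CASES, material_score).items():
    print(f"{theme}: {passed}/{total}")
```

Because `score_fn` is just a callable, the same harness would work whether the number comes from a direct eval call or from a one-second search, so the tool can stay outside the engine.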
-
- Posts: 2851
- Joined: Wed Mar 08, 2006 10:01 pm
- Location: Irvine, CA, USA
Re: positional testbed
swami wrote:
Actually, I'm currently working on it. It's nearly half complete :)
The PGN beta that I'm working on consists of all kinds of positional tests, organized by chapter. I'm now in the process of adding more test games to certain chapters.
I'm not sure if I should keep it private or release it as GPL once I get them done :P What's your suggestion? ;)

Why would you want to keep it private? Anyway, if you're just talking about a test suite, I think the GPL is a bit awkward. A Creative Commons license might be better.
-
- Posts: 6662
- Joined: Thu Mar 09, 2006 4:21 am
Re: positional testbed
Dirt wrote:
Why would you want to keep it private? Anyway, if you're just talking about a test suite, I think the GPL is a bit awkward. A Creative Commons license might be better.

I was just joking about the GPL thingy.

Was there even a license for test suites... I didn't know that!
