UnEvI: separating eval from search?

Onno Garms · Post by **Onno Garms** » Sat Apr 30, 2011 7:01 pm

What do you think about the idea to separate eval from search? I mean,
have eval in an exchangeable DLL that is loaded dynamically according
to the engine settings.

I searched the forum a bit. Not surprisingly, this idea is not new:
http://talkchess.com/forum/viewtopic.php?t=32397
Were there any more threads on this topic?
However it seems that nobody has yet set out yet to define a protocol to
make such a separation efficient.

Would this make sense? Some thoughts on this:

1. Modularization is important for software development in
general. For example, separating the GUI from the engine through the
Winboard or UCI protocol makes it possible for the user to combine his
favorite engine with his favorite GUI or to compare engines by
automated matches. For developers this makes it possible to write just
either a GUI or an engine if they are not interested in or don't have
time for both. The same could be true for search and eval. So the
idea is to introduce UnEvI, unified evaluation interface.

2. Assuming that the technical issues are solvable, the core question
is, if it makes sense to combine search from one author with eval from
another.

There is the theory that an engine is much more than the sum of eval
and search. A simple example is that one search might decide to call
eval when in check. The eval author might not have expected this, and
even if he has, it is not clear if he penalize the player who is
in check or just ignores the check.

There might be more subtile dependencies. There was a discussion what
would happen if the Ippo-eval was pasted into Stockfish. Or the other
way round. Nobody knows. And apparently nobody tries because of the
required effort and because of legal reasons due to the reverse
engineering that is involved in the Ippo-series. A unified interface
would make the effort more worthwhile (as it makes the modules
combinable with any engine that supports UnEvI) and move the legal
trouble to the people who work on the Ippo-series anyway.

Still the outcome might be that the result of combining search from
one engine with eval from another is crap, but at least we know that
then. Or does anybody know now?

I will go on with the technical issues. In case of major objections
concerning this point 2, you can ignore the rest of this posting.

3. An I/O-based interface as Winboard or UCI would certainly be much
too slow for eval. Also a protocol that just required FEN parsing is
not a good idea. Even making functions such as Board::piece_on_square
(Square q) virtual might be too expensive.

I think that an abstract board interface that encapsulates the whole
representation, whether bitboard or mailbox, slows down "board feature
functions" (see following point) considerably. Even with two bitboard
engines this might be a problem: One might have orthogonal and
diagonal sliders' bitboards and compute bishop, rook, queen from this,
while the other stores the pieces bitboards. Knowing which function is
fast should influence your implementation of board feature functions.

4. By "board feature functions" I mean for example
- square_is_under_knight_attack(Board& b, Square s)
- has_pawn_on_7th(Board& b)
- move_is_check (Board& b, Move m)
Such functions:
- have to be called not just from the eval, but also from the search
(to decide on extensions or reductions for example)
- can only be implemented efficiently when the internal board
representation is known
- are specific to the engine. One engine might need "has_pawn_on_7th".
Another might not need it, but need "move_is_fancy" instead.
For this reason, they cannot be part of the interface.

5. Summarizing 3 and 4, I think that a too generic interface is not
practical. Only an interface that returns directly what is physically
stored (getter) would be useful. Having realized that, a functional
interface (class Board with member functions) does not make
sense. Better simply pass a data structure.

6. However the internal representation might be similar enough if
authors commit on some standards. I think that converting an existing
bitboard engine to these standards should be possible without
affecting its design or introducing inefficiencies. (However, I have
not studied the board representations of many engines, so I might be
completely wrong. I just observed that Onno and Stockfish have come up
with a similar representation independently.)
Suggested standards:
- Use bitboards
- Map the squares to 0..63 this way: a1=0, b1=1, ..., a2=8, ...
(Other mappings would require conversion of published magics, so I
think most engines use above mapping.)
- have an array piece[64] in addition to the bitboards, using this
encoding:
000=empty,
001=white pawn, 002=white knight, ... 006=white king,
011=black pawn, ...
(octal numbers, so 011 = 0x9)
Should we choose an int8 (smallest possible) or better int32 (used for
enums in most compilers)?
- Which bitboards should be passed?
- Two of all_pieces, white_pieces, black_pieces are sufficient.
I think we should pass all three. If an engine has only two,
computation of the third is not too expensive.
- We might have the bitboards for the pieces (pawn, knight, bishop,
rook, queen, king) once for each color or only one per piece-type.
Again, conversion seems affordable compared to the cost of eval.
But still we should have the more common / better structure in the
interface. Which one is that?
(Onno has per-color-bitboards, Stockfish only has one per piece.)
- Similarly: Should we have bishop, rook, queen (3 bitboard) or
should we have orthogonal_sliders and diagonal_sliders (2 bitboards)?
(Stockfish has the first, Onno has the latter.)
- Some standards for side_to_move, castle_rights, ep_square.
- Scale of evaluation.
Centipawns would be easy. I had converted Onno to milli-pawns shortly
before I quit (part of my anti-0.00-attempt); this makes things more
complicated because the int16 range is not sufficient when one side
has extreme advantage. With units of 256ths of pawns (1/0x100) as in
Stockfish the int16 range should be sufficient.
- Additional parameters
- Some evals get alpha and beta for lazy evaluation.
- Stockfish has an additional return parameter "margin".
- hashes, especially pawn hash?
- What else? (Worst case: There is a specific parameter for each
engine.)

7. Both sides should try to work on the Board structure directly instead
of copying into it and out of it. If this is absolutely impossible
they could still use things like the pieces array and some of the bitboards.

There will remain a slight performance overhead:
- bitboard adaption if required
- eval must calculate PST by looping over the pieces. An engine with
integrated eval can just keep track of the PST.

8. I think a C interface (not C++) is the best compromise for simplicity,
speed, and compatibility. We should not restrict the authors in their
choice of the programming language, and most advanced languages
provide C interfaces while linking to C++ is at least much more difficult.
On the other hand, some advanced technology such as SOAP, XML-RPC,
even C# is certainly an overkill.

An interface file, UnEvI.h, should be available under the Creative
Commons License.

9. While things as enums for bitboards or pieces are certainly useful
for implementation, the UnEvI interface should be simple and not
obtrude the programmer such names.

10. It would be nice to have a sample implementation of eval under the
Creative Commons License too. A good candidate might be a PST only
evaluation based on the values from the Chess Programming Wiki.

11. We would need additional mechanisms to allow the eval DLL to send
its parameters (such as contempt, maybe weights) to the search engine.
The search engine should pass them as engine options. But those are
minor technical issues that certainly can be solved.

12. One might think of future extensions by passing an array of
pointers to extension structs, but for a first discussion, keep it
simple.

13. To simplify implementation of thread safety, a pointer can be passed
back and forth. The eval DLL can create eval objects with this
or just have it point to the thread id. If the eval.dll always returns
different pointers, it can rely on not having
two calls with the same pointer simultaneously.

14. How much additional effort would it mean for an other of an engine
(say Foo) to support UnEvI when he still wants to write both eval and
search? I think not much, neither in terms of implementation time nor
in terms of object oriented overhead. He can use the following design:

1a. Design a class Board, with basic functions such as
Int64 Board::white_pieces_bb ();
Place this class in the directory "eval".
This class only needs one initialization from UnEvIBoard
and getters (accessors, const-functions) otherwise.

1b. Write board feature functions, such as
has_pawn_on_7th (const Board&)
place them in the directory "base". They use class Board.

1c. Complete the directory "eval" and compile eval+base to your
UnEvI-Foo.DLL

2a. Write a class Board with the same interface as in step 1a,
either by copying (maybe without the initialization from UnEvIBoard)
or with a different implementation.
Add some functions as do_move(Move) or set_piece(Square)
Place the class in directory "search".

2b. Complete the directory "search" and compile search+base to
your EngineFoo.EXE

Alternatively, place class Board in base and derive SearchBoard
for the EXE. However, resist the "nice" design of an abstract base class
for performance reasons.

So here is a first suggestion:

Code: Select all

// UnEvI.h
// ###################################################################

typedef void* UnEvI;

UnEvI create_UnEvI ();

void  destroy_UnEvI (UnEvI);

typedef struct
{
  Int64 all_pieces,
        colored_pieces[2],
        pawns[2],
        knights[2],
        bishops[2],
        rooks[2],
        queens[2],
        kings[2],
  int   square[],
  int   side2move,
  int   ep_square,
  int   castle, // maybe some more
} UnEvIBoard;


int   eval (UnEvI,
            UnEvIBoard* p_board,
            int         p_alpha,
            int         p_beta,
            int*        r_margin);


// eval implementation example 1
// ###################################################################

// if you don't need a pointer for thread safety

static int dummy;


UnEvI create_UnEvI ()
{
  return &dummy;
}


UnEvI destroy_UnEvI (UnEvI)
{
}

int eval (UnEvI,
          UnEvIBoard* p_board,
          int         p_alpha,
          int         p_beta,
          int*        r_margin)
{
  // hack in you eval right here, ignoring first parameter
}


// eval implementation example 2
// ###################################################################

// more advanced, uses a hack to create an objeoct of class Board

enum Piece;
enum Square;
enum Color;

class Board
{
private:
  UnEvIBoard d_unevi_board;
  // don't add any members as you cannot initialize them with the hack
  
public:
  Piece    piece        (Square p_sq) const { return static_cast<Piece>(d_unevi_board.square[p_sq]); }
  Bitboard orth_sliders (Color p_c)   const { return d_unevi_board.rooks[p_c] | d_unevi_board.queens[p_c]; }
};

class Eval
{
private:
  PawnHash d_pawn_hash;
  
public:
  int operator() (const Board&, int p_alpha, int p_beta, int* r_margin);
};


UnEvI create_UnEvI ()
{
  return new Eval();
}


void destroy_UnEvI (UnEvI p_unevi)
{
  delete reinterpret_cast<Eval*> (p_unevi);
}


int eval (UnEvI       p_unevi,
          UnEvIBoard* p_board,
          int         p_alpha,
          int         p_beta,
          int*        r_margin)
{
  Eval* eval = reinterpret_cast<Eval*> (p_unevi);
  return (*eval) (*reinterpret_cast<Board*>(p_board), p_alpha, p_beta, r_margin); // reinterpret_cast is a hack
}


// eval implementation example 3
// ###################################################################

// Better implementation, but with additional pointer overhead
// In example 2, replace the following

class Board
{
private:
  const UnEvIBoard* d_unevi_board;
  Bitboard d_my_fancy_member;

public:
  Board (const UnEvIBoard* p_unevi_board)
    : d_unevi_board (p_unevi_board)
  {
    d_my_fancy_member[white] = p_unevi_board->rooks[white] | p_unevi_board->queens[white];
  }
};


int eval (UnEvI       p_unevi,
          UnEvIBoard* p_board,
          int         p_alpha,
          int         p_beta,
          int*        r_margin)
{
  Board board (p_board);
  Eval* eval = reinterpret_cast<Eval*> (p_unevi);
  return (*eval) (board, p_alpha, p_beta, r_margin);
}


// search implementation example
// ###################################################################

class Board
{
private:
  UnEvIBoard d_unevi_board;
  Int64 d_hash_key;

public:
  const UnEvIBoard& unevi_board() const;
  Piece piece   (Square p_sq) { return static_cast<Piece>(d_unevi_board.square[p_sq]); }
  void  do_move (Move p_move);
};
  

class SearchThread
{
private:
  UnEvI d_unevi;

public:
  SearchThread ()
    : d_unevi (create_UnEvI())
  {}
    
  ~SearchThread () { destroy_UnEvi (d_unevi); }

  void search (Board& p_board)
  {
    // ...
    eval (d_unevi, const_cast<UnEvIBoard&>(p_board.unevi_board(), ...); // eval gets const UnEvIBoard, but C does not know const, so we have to cast
    // ...
  }
}

sje · Post by **sje** » Sat Apr 30, 2011 8:20 pm

Some issues:

1) Not every platform is a Windows platform, so using Windows DLLs is not a general solution.

2) If the evaluator can be placed in a library, why not the search as well?

3) What happens if the evaluator and the search are the same routine, or if they are separate but mutually recursive?

4) Sixteen bit long integers are not big enough to hold scores for some programs, either because of resolution deficiency or because of the lack of space for special (mate/lose) scores. Also, on modern processors, 16 bit integer arithmetic is not measurably faster than 32 bit or even 64 bit arithmetic.

5) Programs that don't use the binary position format of the specification will have to translate on every call and this just might cost more than FEN encoding/decoding.

Onno Garms · Post by **Onno Garms** » Sat Apr 30, 2011 9:52 pm

sje wrote:Some issues:

1) Not every platform is a Windows platform, so using Windows DLLs is not a general solution.

ACK. Unfortunately I don't know about dynamic loading mechanisms on other platforms. On Windows, it would be possible to select the DLL after the EXE has been started, while the EXE is NOT linked against the DLL, neigther statically not dynamically. I hope there is some similar mechanism for Linux?

2) If the evaluator can be placed in a library, why not the search as well?

You mean that engine.exe uses search.dll and eval.dll? Well, I think this would mostly reduce engine.exe to a converter of Winboard/UCI to the interface of search.dll, and the interface of search.dll would be very similar to Winboard/UCI, i.e. relatively complex. The only advantage I can see now is that the library loading mechanism must be implemented only in one engine.exe instead of having every author of a search have implement it.

(Sorry for using Windows suffixes again...)

3) What happens if the evaluator and the search are the same routine, or if they are separate but mutually recursive?

You mean that pieces of evaluation are scattered over the search? Or that eval calls search? Can you give examples of engines doing that?

4) Sixteen bit long integers are not big enough to hold scores for some programs, either because of resolution deficiency or because of the lack of space for special (mate/lose) scores. Also, on modern processors, 16 bit integer arithmetic is not measurably faster than 32 bit or even 64 bit arithmetic.

I know the latter. Where I wrote int I meant 32 bit. Onno uses 32 bit eval. The issue with 16 bit comes up when I store the values in the transposition table. In earlier versions I could simply truncate to 16 bit, now I have to scale deadly won or lost positions.

Assuming that 16 bit are enough when the eval unit is 1/256, your statement that 16 bit are not enough for some programs implies that they use finer values. So if one attempts to find a standard, something finer than 1/256 should be used?

Has such a fine eval been shown to be useful? For Onno, changing from centipawns to millipawns didn't make much difference.

5) Programs that don't use the binary position format of the specification will have to translate on every call and this just might cost more than FEN encoding/decoding.

In deed, that is the major problem. But:
1. For a well designed engine it should not be too difficult to change something like the mapping of pieces to integers in the internal representation, which reduces the need of conversion.
2. If the engine has something like class Board, it should be easy to add UnEvIBoard as a private member and use members of UnEvIBoard for all previous members of Board that are found there (hopefully the piece array, most likely the ep square). If the program has other bitboards, conversion is just a few & | ^, so affordable.

If the program maps the squares to 0..63 in a different way, or only has completely different bitboards such as 45 degree rotated, there is trouble, in deed.

I'm sure that only some programs can be adapted to use UnEvI. I don't know too many source codes and made this post to learn how big the differences in board representation are. I think that both Onno and Stockfish could be converted, though their board representation was developed independently.

EDIT: Moreover I think that while heavy conversation might be too expensive for a call of eval, it is almost certainly cheaper than FEN parsing.

Daniel Shawul · Post by **Daniel Shawul** » Sat Apr 30, 2011 10:14 pm

ome issues:

1) Not every platform is a Windows platform, so using Windows DLLs is not a general solution.

ACK. Unfortunately I don't know about dynamic loading mechanisms on other platforms. On Windows, it would be possible to select the DLL after the EXE has been started, while the EXE is NOT linked against the DLL, neigther statically not dynamically. I hope there is some similar mechanism for Linux?

Scorpio egbbs work that way. There is an equivalent statically/dynamically linked libraries for unix & mac. I chose to use this method to disassociate the probing code from the engine's code. And it works quite well.

"Dirty" engine uses this method for eval & search. Two geographically separted persons work on the eval and search separately. Makes sense too.

What is your reason for doing this ? You do realize you have to share the board representation completely. This could be very difficult even for two signicantly different version of the same engine let alone two completely different engines.
And also there will be a significant penalty in speed. You have to "setboard" at every node. This could be hundred times slower. For egbbs it was worth it one because disk probes are slow anyway and another 5 or less pieces on the board means easy to set up.

Also there isn't a lot of interesting stuff in eval(). More interesting is experiments with search and same eval. There is already tournaments of such kind with material only eval. And it doesn't need dlls. If you want to do this for search , specifiying what to be used there sounds much easier. For example use NullMove R=3, LMR = 1 etc...

I think you should think about your goal carefully before worrying about implementation details. Maybe you have mentioned it in your post but I didn't read all of it.

Onno Garms · Post by **Onno Garms** » Sat Apr 30, 2011 10:47 pm

Daniel Shawul wrote: What is your reason for doing this ?

Make it possible to test the same search with two different evals or the same eval with two different searches to compare just the evals or just the searches. (Or confirm the theory that an engine is considerably more than the sum of search and eval.) Make it possible for authors to concentrate on one without having to find a partner for the other.

You do realize you have to share the board representation completely. This could be very difficult even for two signicantly different version of the same engine let alone two completely different engines.
And also there will be a significant penalty in speed. You have to "setboard" at every node. This could be hundred times slower. For egbbs it was worth it one because disk probes are slow anyway and another 5 or less pieces on the board means easy to set up.

Why both? I think either a significant penalty in speed or a shared board representation.

Also there isn't a lot of interesting stuff in eval(). More interesting is experiments with search and same eval. There is already tournaments of such kind with material only eval. And it doesn't need dlls.

Sure, such a simple eval can easily be implated into any program and doesn't need dlls. I am not the eval guy either, but I thought some people would object your opinion that there isn't a lot of interesting stuff in eval().

I think you should think about your goal before worrying about implementation details.

OK, some details go too far. Other details have been considered to find out if there is an implementation that reduces the speed penalty to a minimum while leaving the amount that must be shared on an acceptable level.

Onno Garms · Post by **Onno Garms** » Sat Apr 30, 2011 10:54 pm

Daniel Shawul wrote: "Dirty" engine uses this method for eval & search. Two geographically separted persons work on the eval and search separately. Makes sense too.

I read about that (at least that eval is in a dll there; I didn't know that search and eval were developed by different persons). What are their experiences? Was there a need to change the interface? Is their interface public?

Daniel Shawul · Post by **Daniel Shawul** » Sat Apr 30, 2011 11:16 pm

You do realize you have to share the board representation completely. This could be very difficult even for two signicantly different version of the same engine let alone two completely different engines.
And also there will be a significant penalty in speed. You have to "setboard" at every node. This could be hundred times slower. For egbbs it was worth it one because disk probes are slow anyway and another 5 or less pieces on the board means easy to set up.

Why both? I think either a significant penalty in speed or a shared board representation.

No. You have to 'setboard' in any case. You are making moves in search but not eval. So the next time you want to do your eval() it will be a different board all in all. So you have to either update both states in parallel or 'setboard' every time. This is significant with 32 pieces.
You are proposing an eval() method which could be as slow as egbb probes say in RAM.

By using same board representation ,you only avoid an extra board conversion . Scorpio & Egbbprobe do infact use the same board representation but I still have to 'setboard' during probe.

Sven · Post by **Sven** » Sun May 01, 2011 12:41 am

Daniel Shawul wrote:

You do realize you have to share the board representation completely. This could be very difficult even for two signicantly different version of the same engine let alone two completely different engines.
And also there will be a significant penalty in speed. You have to "setboard" at every node. This could be hundred times slower. For egbbs it was worth it one because disk probes are slow anyway and another 5 or less pieces on the board means easy to set up.

Why both? I think either a significant penalty in speed or a shared board representation.
No. You have to 'setboard' in any case. You are making moves in search but not eval. So the next time you want to do your eval() it will be a different board all in all. So you have to either update both states in parallel or 'setboard' every time. This is significant with 32 pieces.
You are proposing an eval() method which could be as slow as egbb probes say in RAM.

By using same board representation ,you only avoid an extra board conversion . Scorpio & Egbbprobe do infact use the same board representation but I still have to 'setboard' during probe.

As I understood the idea was to pass a pointer to the board data structure, owned by the search, to the evaluation function - so not much different to what is currently done in most engines.

I understand your comment such that you think the evaluation function would maintain an own copy of the board, implying that each eval call would have to perform a 'setboard'. But the board state would not exist twice.

Sven

UncombedCoconut · Post by **UncombedCoconut** » Sun May 01, 2011 1:21 am

Onno Garms wrote:ACK. Unfortunately I don't know about dynamic loading mechanisms on other platforms. On Windows, it would be possible to select the DLL after the EXE has been started, while the EXE is NOT linked against the DLL, neigther statically not dynamically. I hope there is some similar mechanism for Linux?

Sure. Functionally, dlopen() = LoadLibrary()/LoadLibraryEx(), dlsym() = GetProcAddress(), dlclose() = FreeLibrary(). These work on most Unix-like systems, including Mac OS X.

Daniel Shawul · Post by **Daniel Shawul** » Sun May 01, 2011 3:00 am

Ok.the binary data is going to be shared as well. But that will exclude many existing engines, unless they rewrite their make/unmake to follow exactly same format. Even with the same "board representation", say bitboards, it could be difficult with different orientation . It has to be compiled for 32bit/64bit platforms. Precomputed stuff such as magic attack/mobility tables will not work otherwise. Other board representations such as piece lists are out of the question ofcourse. I had to provide 6 dlls for the egbb for windows/linux/mac 32/64 bit. If you share data , the engine has to be compiled for 6 platforms as well.

Edit: I finished reading it and it seems he tried to address many of them. But there are problems even making stockfish & onno with similar board structure work together.

UnEvI: separating eval from search?

UnEvI: separating eval from search?

Re: UnEvI: separating eval from search?

Re: UnEvI: separating eval from search?

Re: UnEvI: separating eval from search?

Re: UnEvI: separating eval from search?

Re: UnEvI: separating eval from search?

Re: UnEvI: separating eval from search?

Re: UnEvI: separating eval from search?

Re: UnEvI: separating eval from search?

Re: UnEvI: separating eval from search?