How to speed up my engine

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

lauriet
Posts: 199
Joined: Sun Nov 03, 2013 9:32 am

How to speed up my engine

Post by lauriet »

My engine gets to about 8 ply in 60 seconds. Following the addage of 'write it first and then optimize it' I would now like to improve things. So where should the effort go....move generation or search.
At the moment it has an 0x88 board and I have to scan the whole board to find a piece to generate its move. I could change this to a piece table that gives me the piece location which would reduce the scan to the 16 pieces.

My search has null move and LMR only.
Move ordering is PV move first, captures next, killers next, then the rest.
Im thinking about changing to PVS.

Im wandering what sort of 'bang for my buck' I will get for various changes.
I know bit boards would be an advantage but that would be a big rewrite, so I will leave that to my next engine.

Can anyone suggest what sort of changes and what improvement to expect.

Thanks
Laurie
lauriet
Posts: 199
Joined: Sun Nov 03, 2013 9:32 am

Re: How to speed up my engine

Post by lauriet »

Sorry, for got to add iterative deepening to the list of whats implemented.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How to speed up my engine

Post by bob »

lauriet wrote:My engine gets to about 8 ply in 60 seconds. Following the addage of 'write it first and then optimize it' I would now like to improve things. So where should the effort go....move generation or search.
At the moment it has an 0x88 board and I have to scan the whole board to find a piece to generate its move. I could change this to a piece table that gives me the piece location which would reduce the scan to the 16 pieces.

My search has null move and LMR only.
Move ordering is PV move first, captures next, killers next, then the rest.
Im thinking about changing to PVS.

Im wandering what sort of 'bang for my buck' I will get for various changes.
I know bit boards would be an advantage but that would be a big rewrite, so I will leave that to my next engine.

Can anyone suggest what sort of changes and what improvement to expect.

Thanks
Laurie
Since your comment is based on search (depth=8 in 60 seconds) that's the place to start. And not by optimizing the code for speed, but by adding things like null-move search, making sure you are using reasonable move ordering strategies, then off into LMR and forward pruning ideas...

When you say you have null-move and LMR, what does that mean? How do you set the R value for null move? static or dynamic based on depth and such? What about the reduction amounts for LMR?
User avatar
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: How to speed up my engine

Post by hgm »

Well, Fairy-Max is also 0x88 mailbox, it also scans the whole board to find its pieces, it has the pretty minimal reductions of R=2 for null move and 1 ply for late moves (but all non-capture, non-Pawn moves other than hash move are considered late), and uses vanilla alpha-beta rather than PVS. And on a 2.4GHz i7 it finishes the d=10 iteration in the initial position after 51 sec. (It reaches 567 knps under those conditions.) I do have to switch off its randomization for that. (Normally Fairy-Max randomizes heavily during the first 8 ply of the game, and as it adds different random scores in each iteration this interferes quite badly with its search efficiency.) Fairy-Max has no killers, and in fact the only move ordering is that it searches the hash move (or the IID move if there was no hash move) first. All other moves are searched in move-generation order. (Even the hash move would be searched a second time...) So all in all it utterly sucks, but still it searches two ply deeper in the same time.

So it seems there is ample room for improvement in your engine. First thing to check is your nodes per second, to see if there is a problem there.

Do you have a quiescence search? How do you order captures there? Having your queiescence search explode is a very effective way to keep the depth small.
lauriet
Posts: 199
Joined: Sun Nov 03, 2013 9:32 am

Re: How to speed up my engine

Post by lauriet »

The null move uses R=2 no matter what.
LMR (R=1) is based on the move ordering so it reduces moves that are < 0 based on PST.

Quiescence could be an issue.....I search ALL captures, good or bad. Can I improve this ??


Laurie.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: How to speed up my engine

Post by Sven »

lauriet wrote:My engine gets to about 8 ply in 60 seconds. Following the addage of 'write it first and then optimize it' I would now like to improve things. So where should the effort go....move generation or search.
At the moment it has an 0x88 board and I have to scan the whole board to find a piece to generate its move. I could change this to a piece table that gives me the piece location which would reduce the scan to the 16 pieces.

My search has null move and LMR only.
Move ordering is PV move first, captures next, killers next, then the rest.
Im thinking about changing to PVS.

Im wandering what sort of 'bang for my buck' I will get for various changes.
I know bit boards would be an advantage but that would be a big rewrite, so I will leave that to my next engine.

Can anyone suggest what sort of changes and what improvement to expect.

Thanks
Laurie
First of all, swear that your program has no bugs :D Otherwise most optimizations won't make any sense.

Now I have these questions:
- Do you use a transposition hash table?
- How do you handle legality and is-in-check?
- In quiescence search, do you generate all moves and skip all non-captures/non-promotions, or do you have a dedicated move generator for captures and promotions only?
- How exactly do you order moves within quiescence search?
- What is your static evaluation function like: more like material + PST only, more like a full-blown, feature-rich evaluation function, or something inbetween?
- Have you done some profiling to find out whether your program behaves "as usual" performance-wise, i.e. uses most of its time during static evaluation, then search, move generation, and make/unmake? If it doesn't then you could try to find out why.
- Which is roughly the EBF (effective branching factor) of your engine, and about how many nodes per second does it search usually (opening/middlegame)?
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: How to speed up my engine

Post by Sven »

Two more questions:
- Are you collecting the PV during search (so that you are able to judge whether your search makes sense - if it doesn't make much sense then it would also be hard to improve its performance)?
- Can you show some example engine output documenting search progress (e.g. for WinBoard engines typically depth, score, time elapsed, nodes, PV, for UCI engines something similar)?
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: How to speed up my engine

Post by jdart »

Those are all good questions.

But behind the questions I think is this assumption: if you are getting 8 ply in 60 seconds on modern hardware very likely the problem is not the speed of the code but the algorithm. I suspect your branching factor is terrible (see https://chessprogramming.wikispaces.com ... ing+Factor).

--Jon
lauriet
Posts: 199
Joined: Sun Nov 03, 2013 9:32 am

Re: How to speed up my engine

Post by lauriet »

Sven Schüle wrote:
lauriet wrote:My engine gets to about 8 ply in 60 seconds. Following the addage of 'write it first and then optimize it' I would now like to improve things. So where should the effort go....move generation or search.
At the moment it has an 0x88 board and I have to scan the whole board to find a piece to generate its move. I could change this to a piece table that gives me the piece location which would reduce the scan to the 16 pieces.

My search has null move and LMR only.
Move ordering is PV move first, captures next, killers next, then the rest.
Im thinking about changing to PVS.

Im wandering what sort of 'bang for my buck' I will get for various changes.
I know bit boards would be an advantage but that would be a big rewrite, so I will leave that to my next engine.

Can anyone suggest what sort of changes and what improvement to expect.

Thanks
Laurie
First of all, swear that your program has no bugs :D Otherwise most optimizations won't make any sense.

Now I have these questions:
- Do you use a transposition hash table?
- How do you handle legality and is-in-check?
- In quiescence search, do you generate all moves and skip all non-captures/non-promotions, or do you have a dedicated move generator for captures and promotions only?
- How exactly do you order moves within quiescence search?
- What is your static evaluation function like: more like material + PST only, more like a full-blown, feature-rich evaluation function, or something inbetween?
- Have you done some profiling to find out whether your program behaves "as usual" performance-wise, i.e. uses most of its time during static evaluation, then search, move generation, and make/unmake? If it doesn't then you could try to find out why.
- Which is roughly the EBF (effective branching factor) of your engine, and about how many nodes per second does it search usually (opening/middlegame)?

No i cant swear that i have no bugs.......can anyone?
No transposition table.
In check, scans out from the king looking for attackers.
Quiescense has its own move gen that only gives captures.....no checks. Moves are ordered via captured - capturing.
Eval has pst, mobility, pawn structure, king safety, 2 rook/knight/bishop reward.
havent done any profiling yet......i will look into this.
branchinf factor is probably very poor as it can take 7 times longer to search the next ply.
nps is maybe about 1000-2000 per second.

thanks for the help
Laurie
lauriet
Posts: 199
Joined: Sun Nov 03, 2013 9:32 am

Re: How to speed up my engine

Post by lauriet »

Sven Schüle wrote:Two more questions:
- Are you collecting the PV during search (so that you are able to judge whether your search makes sense - if it doesn't make much sense then it would also be hard to improve its performance)?
- Can you show some example engine output documenting search progress (e.g. for WinBoard engines typically depth, score, time elapsed, nodes, PV, for UCI engines something similar)?

Yes i do collect the PV via triangular array method and it does make sense and the search score tallys with it ok. The program does beat some of my old desktop chess computers, and im guessing its around 1700 elo.
I will need to program in the some output during the search, so cant provide this just yet.

Laurie