about speed et profiling tools

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: about speed et profiling tools

Post by bob »

Another thought. NPS is highest when you do a pure minimax search: every move you generate gets searched, so none of the generation effort is wasted. As you improve ordering, you begin to see NPS drop, because in lots of positions you search just one move and throw the rest away when you get a beta cutoff. You can recover a lot of that with tricks like not generating moves until you try the hash move or killer moves. But even then there is more cost than in pure minimax, because you still have to invest the time to do the ordering. Bottom line: ignore NPS and work on time to solution. Any change that lets you complete a search to a fixed depth faster is usually better, whether the NPS goes up or down.
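
A minimal sketch of that comparison, assuming you log node count and wall-clock time for two builds searching the same position to the same fixed depth. The `SearchRun` struct and both helper functions are hypothetical names introduced only for illustration.

```cpp
#include <cstdint>

// Hypothetical record of one search run to a fixed depth (names are illustrative).
struct SearchRun {
    std::uint64_t nodes;  // nodes visited before the target depth completed
    double seconds;       // wall-clock time needed to complete the target depth
};

// Raw speed: how many nodes per second the run processed.
double nodesPerSecond(const SearchRun& run) {
    return static_cast<double>(run.nodes) / run.seconds;
}

// The comparison that matters here: did the candidate finish the same depth sooner?
bool fasterToDepth(const SearchRun& baseline, const SearchRun& candidate) {
    return candidate.seconds < baseline.seconds;
}
```

With made-up numbers, a run of 10,000,000 nodes in 5.0 s has a higher NPS than a run of 3,000,000 nodes in 2.0 s, yet the second one reaches the depth sooner, and that is the one to keep.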
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: about speed et profiling tools

Post by xr_a_y »

bob wrote: Sun Sep 23, 2018 6:59 pm You can recover a lot of that with tricks like not generating moves until you try the hash move or killer moves.
Weini's former move hashing was not bijective, so I had to rebuild the whole move list for the position before knowing which move was the TT move or the killer move...

Now Weini hashes moves in a way that lets me rebuild the move from the hash, so that I can try the TT move and the killers before building the move list. I started investigating this a few weeks ago but switched to something else. I'll give it another try soon.
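
As a rough illustration of a reversible move encoding, here is a sketch that packs a move into 16 bits and unpacks it again. The `Move` layout, the bit assignment and the function names are assumptions made for this example; Weini's actual encoding is not shown in the thread.

```cpp
#include <cstdint>

// Hypothetical move layout; Weini's actual types are not shown in this thread.
struct Move {
    std::uint8_t from;  // source square, 0..63
    std::uint8_t to;    // target square, 0..63
    std::uint8_t flag;  // 0..15, e.g. promotion piece, castling, en passant
};

bool operator==(const Move& a, const Move& b) {
    return a.from == b.from && a.to == b.to && a.flag == b.flag;
}

// Pack into 16 bits: bits 0-5 = from, bits 6-11 = to, bits 12-15 = flag.
std::uint16_t encodeMove(const Move& m) {
    return static_cast<std::uint16_t>(
        (m.from & 0x3F) | ((m.to & 0x3F) << 6) | ((m.flag & 0x0F) << 12));
}

// Exact inverse of encodeMove for in-range fields, so no move list is needed.
Move decodeMove(std::uint16_t code) {
    Move m;
    m.from = static_cast<std::uint8_t>(code & 0x3F);
    m.to = static_cast<std::uint8_t>((code >> 6) & 0x3F);
    m.flag = static_cast<std::uint8_t>((code >> 12) & 0x0F);
    return m;
}
```

A move decoded this way still has to be validated against the current position before it is played, since a TT entry or killer slot can hold a move that is not playable here.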

Thanks
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: about speed et profiling tools

Post by Sven »

Note that killer moves are usually tried after all "winning" or "equal" captures, so you can only save generating quiet moves.
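
A sketch of that stage order, under several simplifying assumptions: moves are plain integers, `Position` is a stub that only counts generator calls, captures are not sorted, and losing captures are not split off into a later stage. None of these names come from Weini or any other engine.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Move = int;            // stand-in for a real move type
constexpr Move NoMove = -1;

// Hypothetical position stub: it only counts how often each generator runs.
struct Position {
    std::vector<Move> captures;  // pretend result of capture generation
    std::vector<Move> quiets;    // pretend result of quiet-move generation
    int captureGenCalls = 0;
    int quietGenCalls = 0;

    std::vector<Move> generateCaptures() { ++captureGenCalls; return captures; }
    std::vector<Move> generateQuiets() { ++quietGenCalls; return quiets; }

    // A real engine would validate the move against the board without generating anything.
    bool isQuietPseudoLegal(Move m) const { return contains(quiets, m); }
    bool isPseudoLegal(Move m) const { return contains(captures, m) || contains(quiets, m); }

    static bool contains(const std::vector<Move>& v, Move m) {
        return std::find(v.begin(), v.end(), m) != v.end();
    }
};

class MovePicker {
public:
    MovePicker(Position& pos, Move ttMove, Move killer1, Move killer2)
        : pos_(pos), tt_(ttMove), killers_{killer1, killer2} {}

    // Returns the next move to try, or NoMove when nothing is left.
    Move next() {
        for (;;) {
            switch (stage_) {
            case Stage::TTMove:  // tried before any generation
                stage_ = Stage::GenCaptures;
                if (tt_ != NoMove && pos_.isPseudoLegal(tt_)) return tt_;
                break;
            case Stage::GenCaptures:
                list_ = pos_.generateCaptures();
                idx_ = 0;
                stage_ = Stage::Captures;
                break;
            case Stage::Captures:
                while (idx_ < list_.size()) {
                    Move m = list_[idx_++];
                    if (m != tt_) return m;
                }
                idx_ = 0;
                stage_ = Stage::Killers;
                break;
            case Stage::Killers:  // after captures, before quiet generation
                while (idx_ < 2) {
                    std::size_t i = idx_++;
                    Move k = killers_[i];
                    bool duplicate = (k == tt_) || (i == 1 && k == killers_[0]);
                    if (k != NoMove && !duplicate && pos_.isQuietPseudoLegal(k)) return k;
                }
                stage_ = Stage::GenQuiets;
                break;
            case Stage::GenQuiets:
                list_ = pos_.generateQuiets();
                idx_ = 0;
                stage_ = Stage::Quiets;
                break;
            case Stage::Quiets:  // skip anything already tried
                while (idx_ < list_.size()) {
                    Move m = list_[idx_++];
                    if (m != tt_ && m != killers_[0] && m != killers_[1]) return m;
                }
                stage_ = Stage::Done;
                break;
            case Stage::Done:
                return NoMove;
            }
        }
    }

private:
    enum class Stage { TTMove, GenCaptures, Captures, Killers, GenQuiets, Quiets, Done };
    Position& pos_;
    Move tt_;
    Move killers_[2];
    Stage stage_ = Stage::TTMove;
    std::vector<Move> list_;
    std::size_t idx_ = 0;
};
```

If a cutoff happens on the TT move, nothing is generated at all. If it happens on a capture or a killer, only the quiet generation is saved.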
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)
odomobo
Posts: 96
Joined: Fri Jul 06, 2018 1:09 am
Location: Chicago, IL
Full name: Josh Odom

Re: about speed et profiling tools

Post by odomobo »

PSA on Intel VTune: you can get a free renewable 90-day commercial license. Technically it's for Intel System Studio, but the license includes VTune. It took me a while to figure this out when researching this, so I figured I'd share. https://software.intel.com/en-us/system ... e-download
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: about speed et profiling tools

Post by xr_a_y »

OK, downloaded and installed. A first try at a 1 ms sampling rate gives the same result as valgrind. I'll try compiling the code with the Intel compiler to see if line-by-line hotspot detection is better. Thanks for the link!
odomobo
Posts: 96
Joined: Fri Jul 06, 2018 1:09 am
Location: Chicago, IL
Full name: Josh Odom

Re: about speed et profiling tools

Post by odomobo »

You should be able to get a lot more info than valgrind gives you. You don't need to use ICC, you can use MSVC, GCC, clang.

You can see line-by-line hotspots (in optimized builds) and assembly-instruction hotspots. It can also tell you about CPU-level things that are slowing down your code -- cache misses, branch prediction failures, etc. This is the real value of VTune, IMO. It can also do a lot more, but I've never used it beyond these basics.
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: about speed et profiling tools

Post by xr_a_y »

So here is the final result of the experiment.

Starting with a very, very simple Weini that includes only PST in the eval and no search tricks, not even ordering, and then adding more and more search features, I get these results. Version N in the table adds feature N from the list below the table to version N-1.

Rank Name                          Elo     +/-   Games   Score    Draw
   1 dorpsgek                      370      27    1174   89.4%   11.1%
   2 29                            175      18    1174   73.3%   28.4%
   3 28                            174      18    1173   73.2%   29.1%
   4 25                            164      18    1174   72.0%   27.4%
   5 26                            163      18    1173   71.9%   29.8%
   6 24                            162      18    1174   71.8%   29.6%
   7 27                            150      17    1174   70.3%   30.7%
   8 22                            115      17    1174   66.0%   29.0%
   9 23                            109      17    1174   65.2%   28.4%
  10 fairymax                       85      18    1173   61.9%   22.9%
  11 18                             72      17    1174   60.3%   30.2%
  12 17                             71      17    1174   60.1%   32.2%
  13 15                             59      17    1173   58.4%   31.7%
  14 19                             54      17    1174   57.8%   29.0%
  15 20                             48      16    1174   56.9%   31.8%
  16 16                             45      17    1173   56.5%   31.1%
  17 21                             44      17    1174   56.3%   30.8%
  18 13                             -2      16    1173   49.7%   32.8%
  19 14                            -10      17    1173   48.6%   30.2%
  20 12                            -25      17    1173   46.4%   29.2%
  21 11                            -29      17    1173   45.8%   30.3%
  22 9                             -97      18    1174   36.5%   25.6%
  23 7                             -97      18    1174   36.3%   22.6%
  24 8                            -104      18    1174   35.5%   24.4%
  25 10                           -115      18    1174   34.0%   23.0%
  26 6                            -142      18    1174   30.6%   22.5%
  27 5                            -185      20    1174   25.6%   17.7%
  28 3                            -352      26    1174   11.7%   11.6%
  29 4                            -372      28    1174   10.5%    9.1%
  30 2                            -386      29    1174    9.8%    9.4%
  31 1                            -425      32    1173    8.0%    7.4%
1 : nothing, not even move ordering
2 : PVS at root : good boost +40
3 : PVS in search : good boost +34
4 : aspiration : loses some strength (I think that is OK here, as the evaluation and search tricks are very poor relative to the small window size of +/- 6)
5 : MVV-LVA : mega boost +187
6 : MVV-LVA in qsearch : good boost +43
7 : killers : good boost +45
8 : history : nothing... I suspect something is wrong here. What is the expected Elo gain of history?
9 : counter moves : nothing... I suspect something is wrong here. What is the expected Elo gain of counter moves?
10 : use PST for ordering : small loss, but inside the error margin. Any comments?
11 : null move : mega boost +86
12 : TT used for sorting root moves : nothing; it probably adds strength but costs a little time
13 : TT used for sorting search moves : +20, but inside the error margin; it probably adds strength but costs a little time
14 : TT used for sorting qsearch moves : nothing (inside the error margin); it probably adds strength but costs a little time
15 : TT used in search : mega boost +69
16 : TT used in qsearch : -14, inside the error margin; storing to and probing the TT in qsearch probably costs too much
17 : dedicated TT for evaluation : +26
18 : allow recursive null move : nothing. I am surprised: with everything else activated and a full evaluation, this adds something like +30
19 : use SEE for sorting moves in search : -20. I am surprised: with everything else activated and a full evaluation, this adds some Elo
20 : use SEE for sorting moves in qsearch : small loss. I am surprised: with everything else activated and a full evaluation, this adds some Elo
21 : add a sort bonus for recapturing the piece that just moved, if that move was a capture : nothing
22 : LMR in search : mega boost +71
23 : LMR at root : nothing
24 : LMP in search : mega boost +53
25 : check extension (root, search) : nothing. I am surprised: with everything else activated and a full evaluation, this adds some Elo
26 : single-reply extension in search : nothing. I am surprised: with everything else activated and a full evaluation, this adds some Elo
27 : "near promotion" extension : small loss. I am surprised: with everything else activated and a full evaluation, this adds some Elo
28 : recapture extension : +24
29 : SEE pruning in qsearch : nothing. I was expecting more.
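
One rough way to decide whether a step is "inside the error margin" is to compare the Elo difference between two consecutive versions with their combined margins. The sketch below combines the two +/- values in quadrature, which assumes the two estimates are independent. That is only an approximation, since all versions played in the same pool. `Rating`, `combinedMargin` and `gainOutsideMargin` are hypothetical names.

```cpp
#include <cmath>

// Elo estimate and its +/- margin, as listed in the table above.
struct Rating {
    double elo;
    double margin;
};

// Combined margin of a difference, assuming (roughly) independent estimates.
double combinedMargin(const Rating& a, const Rating& b) {
    return std::sqrt(a.margin * a.margin + b.margin * b.margin);
}

// True if the step from `before` to `after` is larger than the combined margin.
bool gainOutsideMargin(const Rating& before, const Rating& after) {
    return std::fabs(after.elo - before.elo) > combinedMargin(before, after);
}
```

With the table values, the MVV-LVA step (version 4 at -372 +/- 28 to version 5 at -185 +/- 20) is far outside the combined margin of about 34, while the step from version 12 (-25 +/- 17) to version 13 (-2 +/- 16) is a difference of 23 against a combined margin of about 23.3, so it cannot be told apart from noise at this number of games.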

So this is a total of +600 Elo over the most basic engine.
The current dev version of Weini is at -20 versus dorpsgek, so around +175 more than version 29, and it includes a better evaluation and some more search tricks.
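
For reading the table, the Elo and Score columns are consistent (within rounding) with the usual logistic Elo formula, where a score fraction s maps to -400 * log10(1/s - 1). A small sketch with hypothetical function names:

```cpp
#include <cmath>

// Elo difference implied by a score fraction in (0, 1) under the logistic model.
double eloFromScore(double score) {
    return -400.0 * std::log10(1.0 / score - 1.0);
}

// Inverse: expected score fraction for a given Elo difference.
double scoreFromElo(double elo) {
    return 1.0 / (1.0 + std::pow(10.0, -elo / 400.0));
}
```

For example, dorpsgek's 89.4% score maps to roughly +370, matching its Elo entry in the table.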

I would really appreciate your analysis on the strength scale.

Best regards.