Daniel Shawul wrote:
I did a profile on generation of 4-men bitbases, and guess what comes out at the top. Yes, it is the complex indexing, which is the price I pay for elegance. After making non-capture moves (forward or retro), I calculate the index of the position from _scratch_. This is because I used all kinds of tricks in the indexing. If it were a simple 64x64x... scheme, it would have been possible to update incrementally by just taking out the contribution of the from square and then adding the contribution of the to square. I guess I will have to try naive indexing methods for speedy generation. It may even be possible to update incrementally with the indexing I have, which could give me some speedup. OTOH, move generation is fast despite what someone here might claim:

Code: Select all
Each sample counts as 0.01 seconds.
  %   cumulative    self               self     total
 time   seconds   seconds     calls   s/call   s/call  name
22.96      5.08      5.08  80073496     0.00     0.00  ENUMERATOR::get_index(unsigned long&, bool)
21.67      9.87      4.79         8     0.60     1.94  ENUMERATOR::backward_pass(unsigned char*, unsigned char*, unsigned char*, unsigned char*, unsigned long const&, unsigned long const&, bool)
10.04     12.09      2.22  12024640     0.00     0.00  ENUMERATOR::get_retro_score(unsigned char*, unsigned char*, unsigned char*, unsigned char*, int, bool)
 9.64     14.22      2.13 195361391     0.00     0.00  SEARCHER::do_move(int const&)
 7.80     15.94      1.73  19882129     0.00     0.00  ENUMERATOR::get_pos(unsigned long)
 7.01     17.49      1.55 195361391     0.00     0.00  SEARCHER::undo_move(int const&)
 5.56     18.72      1.23  14874174     0.00     0.00  ENUMERATOR::get_init_score(unsigned char&, int, int, bool)
 5.29     19.89      1.17 210660367     0.00     0.00  SEARCHER::attacks(int, int) const
 3.44     20.65      0.76         8     0.10     0.77  ENUMERATOR::initial_pass(unsigned char*, unsigned char*, unsigned char*, unsigned char*, unsigned long const&, unsigned long const&, bool)
 2.96     21.31      0.66  14874174     0.00     0.00  SEARCHER::gen_all()
 1.36     21.61      0.30  12024640     0.00     0.00  SEARCHER::gen_retro()
 1.18     21.87      0.26         2     0.13    10.96  generate_slice(ENUMERATOR*, _IO_FILE*, _IO_FILE*)
 0.41     21.96      0.09                              print_sq(int const&)
 0.36     22.04      0.08                              SEARCHER::SEARCHER()
 0.27     22.10      0.06   4439741     0.00     0.00  SEARCHER::probe_bitbases(int&)
 0.05     22.11      0.01         7     0.00     0.00  ENUMERATOR::init()
 0.02     22.11      0.01                              SEARCHER::gen_noncaps()

If I did generation of 5- or 6-men tables involving disk access, then move generation speed would be pretty irrelevant.

What I do is a tad more work. I have a function that loops over positions in the same order as the indexing works: the way it puts pieces on the board is exactly the same as the order in which the index enumerates positions. That is WAY FASTER. Then, for each lookup you do, you figure out where the position sits in the index incrementally. What you don't want is sophisticated tricks. What you can't avoid is mirroring and handling n identical men, because the savings from those are too huge. Koistinen also wanted to save on the 63 * 62 * ... factors and just use either 48 for pawns or 64 for pieces there, but that's something you have to test yourself. With a few tables you already get quite far, and then you just replace the tables.
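To make the two ideas concrete, here is a minimal sketch in C++ (my own illustration with made-up names, not the ENUMERATOR code profiled above): a plain 64^k product index, the incremental update Daniel describes (subtract the from-square term, add the to-square term), and an enumerator that walks placements in exactly index order, so that during generation the index is just a running counter. A pure product scheme like this wastes slots on illegal overlaps and ignores mirroring and identical-men reductions, which is precisely the trade-off being discussed.

Code: Select all
#include <cstdint>
#include <cstdio>

enum { NPIECES = 3 };                    // e.g. a 3-man table

// Naive index: sq[0] + 64*sq[1] + 64*64*sq[2] + ...
uint32_t index_of(const int sq[NPIECES]) {
    uint32_t idx = 0;
    for (int i = NPIECES - 1; i >= 0; --i)
        idx = idx * 64 + sq[i];
    return idx;
}

// Incremental update: each piece k contributes sq[k] * 64^k
// independently, so moving piece k from 'from' to 'to' changes
// only that one term -- no recomputation from scratch.
uint32_t index_after_move(uint32_t idx, int k, int from, int to) {
    uint32_t weight = 1;
    for (int i = 0; i < k; ++i) weight *= 64;
    return idx - (uint32_t)from * weight + (uint32_t)to * weight;
}

// Enumerate all placements in the same nested order the index uses:
// 'idx' then simply increments by one per position visited.
void enumerate() {
    uint32_t idx = 0;
    int sq[NPIECES];
    for (sq[2] = 0; sq[2] < 64; ++sq[2])
        for (sq[1] = 0; sq[1] < 64; ++sq[1])
            for (sq[0] = 0; sq[0] < 64; ++sq[0], ++idx) {
                // idx == index_of(sq) holds here by construction;
                // score the position and store it at slot 'idx'.
            }
}

int main() {
    int sq[NPIECES] = { 4, 0, 60 };                 // e1, a1, e8
    uint32_t idx  = index_of(sq);
    uint32_t idx2 = index_after_move(idx, 1, 0, 7); // piece 1: a1 -> h1
    sq[1] = 7;
    std::printf("%u == %u\n", idx2, index_of(sq));  // prints equal values
    return 0;
}

The point of the enumerate() shape is memory behavior: generation sweeps the table strictly sequentially, while a from-scratch get_index() per move shows up at the top of the profile, as in the output above.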
Note that most profilers do not correctly reflect how much you would lose to RAM once the code is optimized. Please remember this remark some time from now. The GCC/valgrind tooling is notorious for this; Intel's tools do a tad better job there.
p.s. Doing things backwards is too slow. Do things forward: it is easier, faster, and less bug-prone.
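For what "forward" means here, a toy sketch (my own illustration, with the game abstracted to an index-to-successors table rather than real chess): repeated forward sweeps pull values from successor positions until a fixpoint, so no retro (un-move) generator is ever written, which is where many bugs live. A real generator would use the enumerator above in place of the hand-made graph.

Code: Select all
#include <cstdio>
#include <vector>

enum Value { UNKNOWN, WIN, LOSS };       // from the side to move's view

int main() {
    // Tiny hand-made game graph; position 3 is a terminal loss
    // (pre-seeded, as mates would be in a real generator).
    std::vector<std::vector<int>> succ = {
        {1, 2},   // position 0 can move to 1 or 2
        {3},      // position 1 can move to 3
        {1},      // position 2 can move to 1
        {}        // position 3: terminal
    };
    std::vector<Value> val(succ.size(), UNKNOWN);
    val[3] = LOSS;

    // Forward sweeps until nothing changes (fixpoint).
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t p = 0; p < succ.size(); ++p) {
            if (val[p] != UNKNOWN) continue;
            bool all_win = !succ[p].empty();
            Value nv = UNKNOWN;
            for (int s : succ[p]) {
                if (val[s] == LOSS) { nv = WIN; break; } // move to a lost position wins
                if (val[s] != WIN) all_win = false;      // a successor is still open
            }
            if (nv == UNKNOWN && all_win) nv = LOSS;     // every move reaches a won position
            if (nv != UNKNOWN) { val[p] = nv; changed = true; }
        }
    }
    for (size_t p = 0; p < succ.size(); ++p)             // unresolved entries are draws
        std::printf("pos %zu: %s\n", p,
                    val[p] == WIN ? "win" : val[p] == LOSS ? "loss" : "draw");
    return 0;
}

The cost is that undecided positions get re-scanned each sweep; the gain is that every sweep reads the table in index order and reuses the ordinary forward move generator, already debugged by the engine's search.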
