Uri Blass wrote: I found that for me bigger eval hash is better.
Note that I evaluate every node and with small eval hash tables I am afraid I will get almost no hits.
I also found that a pawn hash table of a few MB makes Movei slightly faster than using no hash.
Well, if your evaluation takes longer to compute than a hash probe divided by the hit rate, then a bigger table is always better. What I suggest applies to the opposite case. There will be an upper limit to the hit rate, as even an asymptotically large table will have misses when a position is encountered for the first time. If even in that case computing the eval is still faster than probing for it, the only thing that could help is to make the probing faster. And for that you would have to make the table fit in the L2 cache.
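The break-even condition above can be written out as a small Python sketch. The function names and the example costs are made up for illustration (only the 286-clock probe figure comes from this thread), and the model assumes every node is probed first and the eval is computed only on a miss.

```python
def expected_cost_with_hash(eval_cost, probe_cost, hit_rate):
    """Average cost per node when every node is probed first.

    A probe is always paid; the evaluation is paid only on a miss.
    All costs are in the same unit (e.g. CPU clocks).
    """
    return probe_cost + (1.0 - hit_rate) * eval_cost


def hashing_pays_off(eval_cost, probe_cost, hit_rate):
    """True if probing is cheaper on average than always evaluating.

    probe + (1 - h) * eval < eval  is the same as  eval > probe / h.
    """
    if hit_rate <= 0:
        return False  # a table that never hits can only cost time
    return eval_cost > probe_cost / hit_rate


# Hypothetical numbers: a slow eval with a main-memory probe,
# a fast eval with a main-memory probe, and a fast eval with a
# probe that stays in cache (20 clocks is an assumed figure).
slow_eval_ram = hashing_pays_off(eval_cost=1000, probe_cost=286, hit_rate=0.5)
fast_eval_ram = hashing_pays_off(eval_cost=200, probe_cost=286, hit_rate=0.9)
fast_eval_cache = hashing_pays_off(eval_cost=200, probe_cost=20, hit_rate=0.9)
```

With these assumed numbers the fast eval only benefits from hashing once the probe itself becomes cheap, which is the case where the table has to fit in cache.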
I do not understand the meaning of 286 clocks on your slow Pentium M machine, and I would like to have some translation to times in terms of 1/10^6 seconds.
Do you think that the times are the same on a faster machine, or do you think that a faster machine is simply not enough faster, so that it may cost less time but more than 286 cycles?
My Pentium M runs at 1.3 GHz, so that is 13 CPU clocks per 10 ns. The multiplier ratio was 13, so the Front-Side Bus was running at 100 MHz (this is advertised by Intel as 400 MHz, because the bus is 'quad pumped', meaning that 4 data words (of 64 bits) can be transferred per FSB clock cycle). So one memory access of 286 CPU clocks takes 286/13 = 22 FSB clocks, or 220 ns = 0.22/10^6 sec.
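For anyone who wants to redo this conversion for their own machine, here is a minimal Python sketch of the arithmetic. The function names are my own; the only inputs are the figures quoted above (1300 MHz CPU clock, multiplier 13, 286 CPU clocks per access).

```python
def fsb_mhz(cpu_mhz, multiplier):
    """Real (not 'quad pumped') front-side bus clock in MHz."""
    return cpu_mhz / multiplier


def cpu_clocks_to_fsb_clocks(cpu_clocks, multiplier):
    """Number of FSB clock cycles spanned by a count of CPU clocks."""
    return cpu_clocks / multiplier


def cpu_clocks_to_ns(cpu_clocks, cpu_mhz):
    """Wall-clock time in nanoseconds for a count of CPU clocks."""
    # One clock at f MHz lasts 1000 / f nanoseconds.
    return cpu_clocks * 1000 / cpu_mhz


bus = fsb_mhz(1300, 13)                         # real FSB clock in MHz
fsb_clocks = cpu_clocks_to_fsb_clocks(286, 13)  # FSB cycles per access
access_ns = cpu_clocks_to_ns(286, 1300)         # access time in ns
access_us = access_ns / 1000                    # same, in 1/10^6 sec
```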
This includes the write-back time of the old cache contents, so it is actually the average time it takes to do 2 memory accesses, one read and one write. As these seem to occur partly in parallel, access with only clean cache lines (that have not been written, and so can be discarded rather than having to be written back) is slightly faster. (Something like 234 clocks, IIRC.)
In principle this time is constant: if I had a Pentium M at 1.6 GHz the multiplier would be 16, but such a Pentium M would still have the '400 MHz' FSB and the same chipset (containing the memory controller), and would thus also take 22 FSB cycles for the memory operations. But in that case it would be 22 x 16 = 352 CPU clocks.
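The scaling in this paragraph is easy to tabulate. The sketch below assumes, as stated above, that the access always costs 22 FSB cycles on the same bus and chipset; the function name and the multiplier list are just for illustration, and the 20x entry is a hypothetical part, not a real Pentium M.

```python
FSB_CLOCKS_PER_ACCESS = 22  # measured on the 1.3 GHz Pentium M above


def access_cost_in_cpu_clocks(multiplier, fsb_clocks=FSB_CLOCKS_PER_ACCESS):
    """CPU clocks lost per memory access for a given CPU/FSB multiplier.

    Assumes the access takes a fixed number of FSB cycles, so the cost
    in CPU clocks grows in proportion to the multiplier.
    """
    return fsb_clocks * multiplier


# Multiplier 13 -> 1.3 GHz, 16 -> 1.6 GHz, 20 -> a hypothetical 2.0 GHz part
# on the same 100 MHz bus.
costs = {m: access_cost_in_cpu_clocks(m) for m in (13, 16, 20)}

for multiplier, clocks in costs.items():
    print(f"multiplier {multiplier}: {clocks} CPU clocks per access")
```

The access time in nanoseconds is the same in every row; only the number of CPU clocks that go by while waiting changes.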
Now with Intel chips with a faster FSB (e.g. 533 MHz or 1066 MHz, meaning that the bus speed in reality is 133 MHz or 266 MHz) you cannot count on it also taking 22 FSB cycles: the access time is mainly determined by memory technology. Only the transfer between the North bridge of the chipset and the CPU will be faster; the transfer from memory to the North bridge will take the same time (unless your machine also requires faster memory modules). I have not yet measured exactly how much this matters.
On an Athlon64 in principle the same holds, except that memory access is systematically faster because the North bridge is taken out of the loop, and the CPU is connected to the memory directly. But there, too, the access time is determined by the memory technology and is independent of the CPU speed. So the faster your CPU, the less competitive it will be to hash things in main memory.