kranium wrote:the significant improvement in hyperthreading has been known for some time...
Microsoft worked very closely w/Intel during the development of Windows 7, and significantly beefed up support for HT.
a large proportion of the best engines on playchess are now running w/ hyperthreading enabled,
why? the operators noticed a significant performance improvement
there's no Houdini 'magic' here...regardless of what the author may want you to believe...
all SMP engines should benefit equally (if run on a modern Intel architecture w/ Windows 7)
Houdini: 0 net ELO gain when tested against other SMP opponents on same architecture
I tested Stockfish a bit (Stockfish 2.3.1 JA 64bit SSE4.2), and it seems to benefit from HT too. 100 hard positions, 30 seconds each, 1024MB hash, average of 2 runs:
4 threads: 28.5/100, average time: 10.75s
8 threads: 36/100, average time: 9.85s
I think this is pretty conclusive (if there is no massive distortion of the tree from more threads). So yes, it seems Houdini 3 is not the only engine benefiting from HT.
Kai
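To put the two Stockfish runs above side by side, here is a minimal Python sketch of the relative change in score and in average time. The function name `relative_gain` is made up for illustration, and the numbers are simply the ones posted above; it says nothing about whether the difference is statistically significant.

```python
def relative_gain(before, after):
    """Relative change going from `before` to `after` (0.10 means +10%)."""
    return (after - before) / before


# Figures from the post: 100 positions, average of 2 runs.
score_4_threads, score_8_threads = 28.5, 36.0
time_4_threads, time_8_threads = 10.75, 9.85

score_gain = relative_gain(score_4_threads, score_8_threads)
time_change = relative_gain(time_4_threads, time_8_threads)

print(f"score change: {score_gain:+.1%}")
print(f"average time change: {time_change:+.1%}")
```

With only 100 positions and 2 runs, this should be read as a rough indication rather than a measurement.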
In test positions, the key is hard to find in the tree. As more threads tend to make the tree larger, the engine finds the key faster. That's why I suggested measuring time to depth (in the start position). It reflects normal games better ...
Depth doesn't say anything, and pure NPS won't do either. What matters is the quality of the move found in a given time, and that is decided in games, but there is no way for me to play 500 long TC games. In test suites we know that the solution is the optimal move. The only thing that bothers me is how the shape of the tree changes going from 4 to 8 threads, but I think Houdini 3 TM is already optimized, so the 8-thread tree will not suddenly become better suited to the task than the 4-thread tree; otherwise one could improve Houdini's TM tactical abilities generally, even on one core. I am more worried about the Stockfish results, because it is optimized on games, but I don't think I can do better than testing hard solvable positions, many of them, with long times to solution. The ultimate test is simply playing 500-2000 games at medium-long TC to detect a performance change of 10-20 Elo points.
Laskos wrote:The quality of the move in a given time is important, and that is decided in games, but there is no way for me to play 500 long TC games. In test suites we know that the solution is the optimal move, the only bothering thing is the shape of the tree going from 4 to 8 threads, but I think Houdini 3 TM is already optimized, so that 8 threads tree will not become suddenly more up to task than 4 threads tree, otherwise one could improve Houdini's TM tactical abilities generally, even on one core.
You could also try running the test on positional suites, which tend to be less tied to the growth of the tree than tactical ones.
I think the best test (apart from running games) is to run both tactical and positional test suites and compare the results on average. This should give a somewhat closer look at what can happen in a normal game scenario.
I got a work-around (?) for playing actual games, right now under test. I am curious about Robert's opinion on that. On my comp HT is worth at least 27% (up to 36%). So I put 4-threaded Houdini 3 at a 1 million nodes/move setting against 8-threaded Houdini 3 at a 1.27 million nodes/move setting, which gives games of about 25 seconds. This way I will have a result in several hours. For now it's 15:15, just the beginning.
Ernest: 20 Elo points is an estimation for some 55% draws
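For a rough feel of how the number of games and the draw rate set the Elo resolution of such a match, here is a minimal Python sketch. It is only an illustration under stated assumptions: the two sides are evenly matched (wins equal losses), the score is treated with a normal approximation, and a 95% interval (z = 1.96) is used. The function names are made up, and this is not necessarily the calculation behind the 20 Elo figure above.

```python
import math


def elo_from_score(score):
    """Elo difference implied by an expected score (logistic Elo model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(games, draw_rate, z=1.96):
    """Approximate +/- Elo margin for a match between equal opponents.

    Assumption: wins and losses are equally likely, so the mean score is 0.5.
    """
    win = loss = (1.0 - draw_rate) / 2.0
    mean = win + 0.5 * draw_rate
    # Per-game variance of the score, which takes values 1, 0.5 or 0.
    variance = (win * (1.0 - mean) ** 2
                + draw_rate * (0.5 - mean) ** 2
                + loss * (0.0 - mean) ** 2)
    standard_error = math.sqrt(variance / games)
    return elo_from_score(mean + z * standard_error)


for n in (500, 2000):
    print(f"{n} games, 55% draws: about +/- {elo_margin(n, 0.55):.1f} Elo")
```

Under these assumptions, 500 games at 55% draws resolve roughly 20 Elo and 2000 games roughly 10 Elo, in line with the 500-2000 games for 10-20 Elo mentioned earlier in the thread.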
Hi all
These are my Fritz Benchmark results as promised :-
and with 12 threads ;-
As is obvious, there is more than a 35% increase in kN/s speed (the same as Mr Kai Laskos obtained) when all 12 threads are enabled.
No doubt about it, at least in my mind: Hyperthreading RULEZZZ, at least for my i7 3930K!
I managed to get 600 games with 1.27 million nodes/move on 8 threads against 1.00 million nodes/move on 4 threads. Every 80 or so games I had to restart the engines because they were stalling in LittleBlitzer. The 8-threaded engine went on average 0.40 ply deeper. The result is