Discussion of chess software programming and technical issues.
Moderators: hgm, Rebel, chrisw
-
smatovic
- Posts: 2658
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Post
by smatovic »
Wait you use a bubble sort???
I also tried a bottom-up heapsort, but it was not much better.
--
Srdja
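For readers following along, here is a minimal sketch of what a bottom-up heapsort for move ordering could look like in plain C. It is not the code from Zeta: the `ScoredMove` type, its fields, and the function names are assumptions made up for illustration. The bottom-up variant first walks the hole down to a leaf along the smaller child and only then climbs back up to place the element, which saves comparisons on the way down.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical move record; only the ordering score matters here. */
typedef struct {
    int move;   /* encoded move, opaque in this sketch */
    int score;  /* ordering score, higher means "search earlier" */
} ScoredMove;

/* Restore the min-heap property below `root` in a[0..n-1], bottom-up style. */
static void sift_down_bottom_up(ScoredMove *a, size_t root, size_t n)
{
    ScoredMove x = a[root];
    size_t i = root;

    /* Phase 1: move the hole down to a leaf, always following the
       smaller child, without comparing against x. */
    while (2 * i + 1 < n) {
        size_t c = 2 * i + 1;
        if (c + 1 < n && a[c + 1].score < a[c].score)
            c++;
        a[i] = a[c];
        i = c;
    }

    /* Phase 2: climb back up until the parent is not larger than x. */
    while (i > root) {
        size_t p = (i - 1) / 2;
        if (a[p].score <= x.score)
            break;
        a[i] = a[p];
        i = p;
    }
    a[i] = x;
}

/* Sort moves in place so that the highest score ends up at index 0. */
void sort_moves_desc(ScoredMove *a, size_t n)
{
    assert(a != NULL || n == 0);
    if (n < 2)
        return;

    /* Build a min-heap, starting from the last inner node. */
    for (size_t i = n / 2; i-- > 0; )
        sift_down_bottom_up(a, i, n);

    /* Repeatedly move the current minimum to the end of the array. */
    for (size_t end = n - 1; end > 0; end--) {
        ScoredMove t = a[0];
        a[0] = a[end];
        a[end] = t;
        sift_down_bottom_up(a, 0, end);
    }
}
```

Calling `sort_moves_desc(moves, count)` on a move list leaves the best-scored move first. On a GPU the data-dependent branches in both phases can still diverge between threads, which may be one reason a different sort does not help much.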
-
voyagerOne
- Posts: 154
- Joined: Tue May 17, 2011 8:12 pm
Post
by voyagerOne »
Right, even without move ordering I am still surprised at the low speed...
By the way, nice work! I am sure this was no easy task to program.
-
smatovic
by smatovic »
voyagerOne wrote: Right, even without move ordering I am still surprised at the low speed...
Maybe I should switch to newer hardware... the GTS250 is based on the 8800 from 2006...
voyagerOne wrote: By the way, nice work! I am sure this was no easy task to program.
Thanks, coding on such an architecture is definitely a new experience.
--
Srdja
-
voyagerOne
by voyagerOne »
Yes, I also suggest testing it on newer hardware. Maybe you know somebody with a powerful GPU that you can test it on.
I would go down this path instead of optimizing move ordering, since you know you will only get 100 knps at best.
-
smatovic
by smatovic »
I changed the design from "One SIMD Unit One Board" to "One Thread One Board".
Source Nvidia
https://github.com/smatovic/Zeta/tree/zeta_nvidia_0920
Source AMD
https://github.com/smatovic/Zeta/tree/zeta_amd_0920
One thread now makes about 10,000 nps.
The next topic would be a load balancer for at least 512 threads across 16 SIMD units.
YBWC, with its master/slave relations, seems a bit sophisticated for OpenCL.
Maybe I will try a two-tier system: stack-based inside a SIMD unit and master/slave across SIMD units...
--
Srdja
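To make the two-tier idea above a bit more concrete, here is a rough serial model in plain C, not OpenCL and not anything taken from Zeta. Everything in it is an assumption for illustration: 32 threads per SIMD unit (512 / 16), the stack capacity, the batch size, and the notion that a work item is just an integer id standing in for something like a subtree root. Inside a unit, threads pop from a shared stack; when the stack runs dry, the unit asks a single master for another batch.

```c
#include <assert.h>
#include <stddef.h>

/* All sizes below are assumptions for this sketch. */
enum { NUM_UNITS = 16, THREADS_PER_UNIT = 32, STACK_CAP = 64, BATCH = 8 };

/* Tier 1: one work stack per SIMD unit, shared by its threads. */
typedef struct {
    int items[STACK_CAP];
    int top;
} WorkStack;

/* Tier 2: the master hands out item ids next .. total-1. */
typedef struct {
    int next;
    int total;
} Master;

static int stack_push(WorkStack *s, int item)
{
    if (s->top == STACK_CAP)
        return 0;
    s->items[s->top++] = item;
    return 1;
}

static int stack_pop(WorkStack *s, int *item)
{
    if (s->top == 0)
        return 0;
    *item = s->items[--s->top];
    return 1;
}

/* A unit (slave) requests up to BATCH items from the master. */
static int master_refill(Master *m, WorkStack *s)
{
    int given = 0;
    while (given < BATCH && m->next < m->total && stack_push(s, m->next)) {
        m->next++;
        given++;
    }
    return given;
}

/* Serial simulation: returns how many items were processed in total and
   counts the items handled by each thread in per_thread. */
int run_two_tier(int total_items, int per_thread[NUM_UNITS][THREADS_PER_UNIT])
{
    WorkStack stacks[NUM_UNITS];
    Master master = { 0, total_items };
    int processed = 0;
    int progress = 1;

    for (int u = 0; u < NUM_UNITS; u++) {
        stacks[u].top = 0;
        for (int t = 0; t < THREADS_PER_UNIT; t++)
            per_thread[u][t] = 0;
    }

    while (progress) {
        progress = 0;
        for (int u = 0; u < NUM_UNITS; u++) {
            for (int t = 0; t < THREADS_PER_UNIT; t++) {
                int item;
                if (!stack_pop(&stacks[u], &item)) {
                    /* Local stack empty: fall back to the master. */
                    if (master_refill(&master, &stacks[u]) == 0)
                        break;  /* no work left for this unit */
                    stack_pop(&stacks[u], &item);
                }
                (void)item;     /* a real thread would search here */
                per_thread[u][t]++;
                processed++;
                progress = 1;
            }
        }
    }
    return processed;
}
```

In a real kernel the stack operations would need atomics or barriers in local memory, and the master requests would go through global memory, so this only shows the control flow. For scale: if all 512 threads kept up 10,000 nps each, the total would be about 5.12 million nps, an upper bound that ignores divergence and idle threads.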