AMD Bulldozer Architecture - Chess Performance Predictions

FlavusSnow
Posts: 89
Joined: Thu Apr 01, 2010 5:28 am
Location: Omaha, NE

AMD Bulldozer Architecture - Chess Performance Predictions

Post by FlavusSnow »

AMD is scheduled to release a new chip design (Bulldozer) in the next 6 months. I'm curious to see what everyone thinks about the performance penalties or benefits of the new design.

To reduce a description of the new design to a few lines: a single Bulldozer module consists of one scheduler, two ALUs, and a Flex-FP unit (capable of two 128-bit floating-point operations or a single 256-bit floating-point operation per clock). In essence, a single module runs two threads, and in some cases will increase performance by 80% in ALU-intensive tasks.

Of course, operating frequency is still unknown, but for the sake of discussion let's assume all clocks are equal. Does the general audience here think that the Bulldozer design will scale similarly to normal cores, or more like Intel's hyperthreading cores? E.g., do you think it will be better to run with 4 threads or 8 threads on a 4-module chip?

I'm not really interested in trying to argue Intel vs. AMD, or cost, or anything. I'm interested to know if this new design is good for chess or of no impact or no advancement.

I realize this is speculation, but I find it fun in this case.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: AMD Bulldozer Architecture - Chess Performance Predictio

Post by bob »

FlavusSnow wrote: AMD is scheduled to release a new chip design (Bulldozer) in the next 6 months. I'm curious to see what everyone thinks about the performance penalties or benefits of the new design.

To reduce a description of the new design to a few lines: a single Bulldozer module consists of one scheduler, two ALUs, and a Flex-FP unit (capable of two 128-bit floating-point operations or a single 256-bit floating-point operation per clock). In essence, a single module runs two threads, and in some cases will increase performance by 80% in ALU-intensive tasks.

Of course, operating frequency is still unknown, but for the sake of discussion let's assume all clocks are equal. Does the general audience here think that the Bulldozer design will scale similarly to normal cores, or more like Intel's hyperthreading cores? E.g., do you think it will be better to run with 4 threads or 8 threads on a 4-module chip?

I'm not really interested in trying to argue Intel vs. AMD, or cost, or anything. I'm interested to know if this new design is good for chess or of no impact or no advancement.

I realize this is speculation, but I find it fun in this case.
I ignore this stuff until the chip actually shows up, since it is difficult to know what the final product will look like. But for chess, FP doesn't offer much...
rbarreira
Posts: 900
Joined: Tue Apr 27, 2010 3:48 pm

Re: AMD Bulldozer Architecture - Chess Performance Predictio

Post by rbarreira »

These are my impressions based on very little information:

1- A module contains two cores (it doesn't really count as HT to run two threads on a module).

2- Bulldozer gives more importance to integer performance than FP performance (as you said, the FP hardware is shared between two cores in a module while the ALUs are unique for each core).

3- It's designed for high clock rates, so clock speeds being equal probably won't favor it.
M ANSARI
Posts: 3707
Joined: Thu Mar 16, 2006 7:10 pm

Re: AMD Bulldozer Architecture - Chess Performance Predictio

Post by M ANSARI »

My experience is that you should wait until the product actually comes out and is tested. The latest AMD CPUs are doing very well for chess, but most likely the top-end performance still belongs to Intel. I think Intel still has the higher-performance product, and I would be very happy to see AMD come back and take first place again. But I have to see it to believe it.
jdart
Posts: 4367
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: AMD Bulldozer Architecture - Chess Performance Predictio

Post by jdart »

It sounds good, but Intel has been getting a lot more performance out of their cores than AMD, so even with fewer cores, they are getting better performance in some cases. See for example:

http://www.anandtech.com/show/2978/amd- ... -core-xeon

OTOH Intel's top Xeon and desktop chips are very expensive.
rbarreira
Posts: 900
Joined: Tue Apr 27, 2010 3:48 pm

Re: AMD Bulldozer Architecture - Chess Performance Predictio

Post by rbarreira »

jdart wrote: It sounds good, but Intel has been getting a lot more performance out of their cores than AMD, so even with fewer cores, they are getting better performance in some cases. See for example:

http://www.anandtech.com/show/2978/amd- ... -core-xeon

OTOH Intel's top Xeon and desktop chips are very expensive.
Quite a few of the benchmarks in that article are applications that have significant serial components. Obviously those aren't going to be fair to the 12-core CPU.

In other server benchmarks like this one, it takes an 8-core Xeon (costing almost $4000) to beat the 12-core Opterons, which cost a little more than $1000.

If you're looking for top performance, Intel is definitely the answer now (and has been for a while). If price matters too, AMD is very often the best choice.
FlavusSnow
Posts: 89
Joined: Thu Apr 01, 2010 5:28 am
Location: Omaha, NE

Re: AMD Bulldozer Architecture - Chess Performance Predictio

Post by FlavusSnow »

bob wrote: I ignore this stuff until the chip actually shows up, since it is difficult to know what the final product will look like. But for chess, FP doesn't offer much...
I'll ignore your comment altogether and go ahead and ask another question.

If the chip is released with two ALUs per scheduler (one ALU per thread), do you think this is nearly as good as a completely separate core when it comes to chess?

For example, 3.0 GHz on a quad core (3.6 speedup): 3.0 * 3.6 = 10.8 performance index. 3.0 GHz on an 8-thread Bulldozer, assuming a 6.6 speedup and a 0.8 factor due to inefficiencies of a shared scheduler: 3.0 * 6.6 * 0.8 = 15.8.

(15.8 - 10.8) / 10.8 = 46% performance increase...all hypothetical of course.

Do you think this is outlandish? Intel chips already are about 30% faster clock for clock when it comes to chess...so 46% increase would put AMD with a very slight advantage, assuming nothing else in the system becomes a bottleneck.
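
For anyone who wants to play with the numbers, here is a minimal sketch of the arithmetic above. The function names and the `efficiency` parameter are made up for illustration, and every input (3.6 and 6.6 speedups, the 0.8 factor, the 30% clock-for-clock edge) is a guess from this post, not a measurement.

```python
def performance_index(clock_ghz, parallel_speedup, efficiency=1.0):
    """Clock rate times parallel speedup times a fudge factor.

    `efficiency` is a hypothetical penalty for hardware shared
    inside a module (1.0 means no penalty).
    """
    return clock_ghz * parallel_speedup * efficiency


def relative_gain(new, old):
    """Fractional improvement of `new` over `old`."""
    return (new - old) / old


# Assumed inputs from the post: same 3.0 GHz clock for both chips.
quad_core = performance_index(3.0, 3.6)
bulldozer = performance_index(3.0, 6.6, efficiency=0.8)
gain = relative_gain(bulldozer, quad_core)

# Assumed: the competing chip is 30% faster clock for clock.
# A ratio above 1.0 means the hypothetical Bulldozer comes out ahead.
net_vs_rival = (1.0 + gain) / 1.30

print(f"quad core index: {quad_core:.2f}")
print(f"bulldozer index: {bulldozer:.2f}")
print(f"relative gain:   {gain:.1%}")
print(f"net vs rival:    {net_vs_rival:.2f}x")
```

Changing the 0.8 factor or the 6.6 speedup moves the result a lot, which is really the point: the conclusion is only as good as those guesses.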
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: AMD Bulldozer Architecture - Chess Performance Predictio

Post by bob »

FlavusSnow wrote:
I ignore this stuff until the chip actually shows up, since it is difficult to know what the final product will look like. But for chess, FP doesn't offer much...
I'll ignore your comment altogether and go ahead and ask another question.

If the chip is released with two ALUs per scheduler (one ALU per thread), do you think this is nearly as good as a completely separate core when it comes to chess?
No. Unless the chess engine is poorly optimized. The ideal engine will not have lots of idle time where it is stalled waiting on memory, or pipelines, or other resources...


For example, 3.0 GHz on a quad core (3.6 speedup): 3.0 * 3.6 = 10.8 performance index. 3.0 GHz on an 8-thread Bulldozer, assuming a 6.6 speedup and a 0.8 factor due to inefficiencies of a shared scheduler: 3.0 * 6.6 * 0.8 = 15.8.

(15.8 - 10.8) / 10.8 = 46% performance increase...all hypothetical of course.
Until we physically see 'em, it is difficult to predict how they "might" perform. But in general, I'd always prefer full, independent cores...


Do you think this is outlandish? Intel chips already are about 30% faster clock for clock when it comes to chess...so 46% increase would put AMD with a very slight advantage, assuming nothing else in the system becomes a bottleneck.
Clock speed is not so important. The Pentium IV had impressive clock speeds but was, in reality, a dog for chess...
rbarreira
Posts: 900
Joined: Tue Apr 27, 2010 3:48 pm

Re: AMD Bulldozer Architecture - Chess Performance Predictio

Post by rbarreira »

FlavusSnow wrote:
I ignore this stuff until the chip actually shows up, since it is difficult to know what the final product will look like. But for chess, FP doesn't offer much...
I'll ignore your comment altogether and go ahead and ask another question.

If the chip is released with two ALUs per scheduler (one ALU per thread), do you think this is nearly as good as a completely separate core when it comes to chess?

For example, 3.0 GHz on a quad core (3.6 speedup): 3.0 * 3.6 = 10.8 performance index. 3.0 GHz on an 8-thread Bulldozer, assuming a 6.6 speedup and a 0.8 factor due to inefficiencies of a shared scheduler: 3.0 * 6.6 * 0.8 = 15.8.

(15.8 - 10.8) / 10.8 = 46% performance increase...all hypothetical of course.

Do you think this is outlandish? Intel chips already are about 30% faster clock for clock when it comes to chess...so 46% increase would put AMD with a very slight advantage, assuming nothing else in the system becomes a bottleneck.
There is one integer scheduler per core (i.e. two per module). What is shared between cores in a module is the Fetch, Decode, L2 Cache and FP hardware (including FP scheduler).

I have searched quite a bit and all available sources seem to confirm that. For example:

http://www.anandtech.com/show/3863/amd- ... ips-2010/5

http://hothardware.com/Reviews/Next-Gen ... ve/?page=2

This is what a 2-core chip would look like:

Image
FlavusSnow
Posts: 89
Joined: Thu Apr 01, 2010 5:28 am
Location: Omaha, NE

Re: AMD Bulldozer Architecture - Chess Performance Predictio

Post by FlavusSnow »

So this is even closer to true cores than I originally thought.

Let's just hope that other resources don't begin to limit performance when these chips finally do come out.