Optimizing for AMD - which processors/flags should be used?
-
- Posts: 593
- Joined: Sat Aug 20, 2011 9:43 am
Optimizing for AMD - which processors/flags should be used?
Does anyone know which AMD processors are most widely used for chess, or which flags would give a positive Elo gain on a typical AMD-based chess computer?
http://gcc.gnu.org/onlinedocs/gcc/i386- ... 64-Options
-
- Posts: 1334
- Joined: Sun Jul 17, 2011 11:14 am
Re: Optimizing for AMD - which processors/flags should be us
From what little I know of the AMD processors used by CC, optimising for either K10 or Bulldozer is probably best. Ray Banks has a Phenom II X6, I have an FX-6300, and someone else (I cannot remember your name, sorry) has an FX-8120.
Matthew:out
Some believe in the almighty dollar.
I believe in the almighty printf statement.
-
- Posts: 593
- Joined: Sat Aug 20, 2011 9:43 am
Re: Optimizing for AMD - which processors/flags should be us
If you're using Linux, what's the output of this command on the 6300:
Code: Select all
~$ sudo lshw | grep -A12 "description: CPU"
thanks,
Jesse
-
- Posts: 1334
- Joined: Sun Jul 17, 2011 11:14 am
Re: Optimizing for AMD - which processors/flags should be us
Jesse Gersenson wrote: If you're using Linux, what's the output of this command on the 6300:
Code: Select all
~$ sudo lshw | grep -A12 "description: CPU"
thanks,
Jesse
Code: Select all
[sudo] password for matthew:
Code: Select all
description: CPU
product: (To Be Filled By O.E.M.)
vendor: Advanced Micro Devices [AMD]
physical id: 4
bus info: cpu@0
version: AMD FX(tm)-6300 Six-Core Processor
serial: To Be Filled By O.E.M.
slot: CPUSocket
size: 3600MHz
capacity: 3600MHz
width: 64 bits
clock: 200MHz
capabilities: x86-64 fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold bmi1 cpufreq
Code: Select all
gcc -march=native -E -v - </dev/null 2>&1 | grep cc1
Matthew:out
Some believe in the almighty dollar.
I believe in the almighty printf statement.
-
- Posts: 593
- Joined: Sat Aug 20, 2011 9:43 am
Re: Optimizing for AMD - which processors/flags should be us
Ok, thanks. What's your output for
Code: Select all
gcc -march=native -E -v - </dev/null 2>&1 | grep cc1
AMD users, please chime in with your output. Thanks.
-
- Posts: 1334
- Joined: Sun Jul 17, 2011 11:14 am
Re: Optimizing for AMD - which processors/flags should be us
Code: Select all
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.9.1/cc1 -E -quiet -v - -march=bdver2 -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -msse4a -mcx16 -msahf -mno-movbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mlwp -mfma -mfma4 -mxop -mbmi -mno-bmi2 -mtbm -mavx -mno-avx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mno-rdrnd -mf16c -mno-fsgsbase -mno-rdseed -mprfchw -mno-adx -mfxsr -mxsave -mno-xsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 --param l1-cache-size=16 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=bdver2
Matthew:out
Some believe in the almighty dollar.
I believe in the almighty printf statement.
-
- Posts: 4367
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Optimizing for AMD - which processors/flags should be us
There are many AMD processors.
Modern ones support the POPCNT instruction, so if possible, you should compile with -msse4.2, which will enable that, but you probably also want a non-POPCNT build for older chips. The GCC builtin for 64-bit popcnt is __builtin_popcountll.
I think most other processor-specific flags are going to have a very small effect on performance.
--Jon
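A minimal sketch of the POPCNT / non-POPCNT split described above, assuming GCC or a compatible compiler. It relies on the predefined macro __POPCNT__, which GCC defines when POPCNT code generation is enabled (for example by -mpopcnt or -msse4.2). The function names popcount64, popcount64_soft and popcount64_selfcheck are illustrative only and are not taken from any particular engine.

```c
#include <assert.h>
#include <stdint.h>

/* Portable fallback: classic SWAR bit count, needs no POPCNT instruction. */
static inline int popcount64_soft(uint64_t x)
{
    x = x - ((x >> 1) & 0x5555555555555555ULL);                 /* 2-bit sums  */
    x = (x & 0x3333333333333333ULL) +
        ((x >> 2) & 0x3333333333333333ULL);                     /* 4-bit sums  */
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;                 /* 8-bit sums  */
    return (int)((x * 0x0101010101010101ULL) >> 56);            /* add bytes   */
}

/* Uses the GCC builtin only when the build enables POPCNT code generation;
   otherwise falls back to the software routine above. */
static inline int popcount64(uint64_t x)
{
#if defined(__POPCNT__)
    return __builtin_popcountll(x);
#else
    return popcount64_soft(x);
#endif
}

/* Hypothetical helper: checks that both paths agree on a few inputs. */
void popcount64_selfcheck(void)
{
    assert(popcount64(0) == 0);
    assert(popcount64(0xFFULL) == 8);
    assert(popcount64(~0ULL) == 64);
    assert(popcount64(0x8000000000000001ULL) ==
           popcount64_soft(0x8000000000000001ULL));
}
```

Compiling the same source once with -mpopcnt and once without would give the two builds. Without the flag the builtin itself still compiles, but GCC typically lowers it to a software routine, which is why an explicit fallback is shown here.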
-
- Posts: 411
- Joined: Thu Dec 30, 2010 4:48 am
Re: Optimizing for AMD - which processors/flags should be us
AMD supported POPCNT before it supported SSE4.1/SSE4.2, so if the goal is a POPCNT-enabled build that runs on as many AMD processors as possible, you should turn it on separately with -mpopcnt rather than forcing SSE4 support:
From a Phenom II (K10):
/usr/lib/gcc/x86_64-linux-gnu/4.8/cc1 -E -quiet -v -imultiarch x86_64-linux-gnu - -march=amdfam10 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul -mpopcnt -mabm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -mno-sse4.2 -mno-sse4.1 -mlzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c -mno-fsgsbase -mno-rdseed -mprfchw -mno-adx -mfxsr -mno-xsave -mno-xsaveopt --param l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=512 -mtune=amdfam10 -fstack-protector -Wformat -Wformat-security
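One way to check what a given set of flags actually turns on is to look at the compiler's predefined macros. The sketch below assumes GCC or a compatible compiler and uses the real macros __POPCNT__, __SSE4A__, __SSE4_1__ and __SSE4_2__; the function names are hypothetical. On a K10 target one would expect POPCNT and SSE4a to be reported but not SSE4.1 or SSE4.2.

```c
#include <assert.h>
#include <stdio.h>

/* Each function reports whether the compiler was told it may emit
   the corresponding instructions in this build (1) or not (0). */
static int built_with_popcnt(void)
{
#ifdef __POPCNT__
    return 1;
#else
    return 0;
#endif
}

static int built_with_sse4a(void)
{
#ifdef __SSE4A__
    return 1;
#else
    return 0;
#endif
}

static int built_with_sse41(void)
{
#ifdef __SSE4_1__
    return 1;
#else
    return 0;
#endif
}

static int built_with_sse42(void)
{
#ifdef __SSE4_2__
    return 1;
#else
    return 0;
#endif
}

/* Prints a one-line summary of the ISA extensions enabled at compile time. */
void print_isa_summary(void)
{
    /* SSE4.2 code generation implies SSE4.1 code generation. */
    assert(!built_with_sse42() || built_with_sse41());

    printf("popcnt=%d sse4a=%d sse4.1=%d sse4.2=%d\n",
           built_with_popcnt(), built_with_sse4a(),
           built_with_sse41(), built_with_sse42());
}
```

Note that this reports what the build was compiled for, not what the CPU it runs on supports; a binary built with -mpopcnt will still fault on a chip without the instruction.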
-
- Posts: 3550
- Joined: Thu Jun 07, 2012 11:02 pm
Re: Optimizing for AMD - which processors/flags should be us
There are a lot of Phenom II X4 and X6 machines out there.
The Piledrivers have been very successful too, from a price/performance point of view.
Architecturally these CPUs are very different, so if you want to maximise performance it may be that you need a compile for each.
-
- Posts: 793
- Joined: Sun Aug 03, 2014 4:48 am
- Location: London, UK
Re: Optimizing for AMD - which processors/flags should be us
Jesse Gersenson wrote: Anyone know which amd processors are most widely used for chess, or which flags would have a positive elo gain on a typical amd-based chess computer?
http://gcc.gnu.org/onlinedocs/gcc/i386- ... 64-Options
I would recommend something like
Code: Select all
-march=[some 5 year old architecture] -mtune=[some very recent architecture]
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.