Ryzen 2 and BMI2?

Steve Maughan
Posts: 1074
Joined: Wed Mar 08, 2006 7:28 pm
Location: Florida, USA

Ryzen 2 and BMI2?

Post by Steve Maughan » Sun May 13, 2018 1:26 pm

The first generation of Ryzen processors is extremely slow at executing the BMI2 instruction set. Does anyone know if this has been corrected in Ryzen 2 chips?

- Steve
http://www.chessprogramming.net - Maverick Chess Engine

Gian-Carlo Pascutto
Posts: 1196
Joined: Sat Dec 13, 2008 6:00 pm

Re: Ryzen 2 and BMI2?

Post by Gian-Carlo Pascutto » Tue May 15, 2018 7:04 am

Ryzen 1 was really fast at BMI2, it was just slow at a single instruction, i.e. PEXT.

I wouldn't expect this to change. Nothing uses PEXT, aside from some chess engine movegens.

syzygy
Posts: 4503
Joined: Tue Feb 28, 2012 10:56 pm

Re: Ryzen 2 and BMI2?

Post by syzygy » Tue May 15, 2018 7:15 am

Gian-Carlo Pascutto wrote:
Tue May 15, 2018 7:04 am
Ryzen 1 was really fast at BMI2, it was just slow at a single instruction, i.e. PEXT.

I wouldn't expect this to change. Nothing uses PEXT, aside from some chess engine movegens.
Please make Sjeng use PEXT ;-)

Sesse
Posts: 203
Joined: Mon Apr 30, 2018 9:51 pm

Re: Ryzen 2 and BMI2?

Post by Sesse » Tue May 15, 2018 1:27 pm

I wanted to use PEXT for a branchless UTF-8 parser, but unfortunately, the instruction is too slow for it to be a win over straight-up code. (I know others have tried and come to the same conclusion.)

Gian-Carlo Pascutto
Posts: 1196
Joined: Sat Dec 13, 2008 6:00 pm

Re: Ryzen 2 and BMI2?

Post by Gian-Carlo Pascutto » Tue May 15, 2018 2:10 pm

syzygy wrote:
Tue May 15, 2018 7:15 am
Please make Sjeng use PEXT ;-)
There's no way to have the instruction inferred from pure C code, is there? That would make it annoying to use in a portable benchmark.

Sesse
Posts: 203
Joined: Mon Apr 30, 2018 9:51 pm
Contact:

Re: Ryzen 2 and BMI2?

Post by Sesse » Tue May 15, 2018 3:43 pm

No, you'd have to use an intrinsic or inline assembler. The former is fairly portable across compilers; at least MSVC, GCC, Clang and ICC all tend to support the Intel intrinsic style (_pext_u64 in this case) with some coaxing.
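As a rough illustration of the intrinsic route, here is a minimal sketch that wraps `_pext_u64` behind a compile-time check and falls back to a plain C loop elsewhere. The names `pext64` and `pext_soft` are made up for this example, and the `__BMI2__` test is the macro GCC and Clang define under `-mbmi2`; other compilers may need a different check.

```c
#include <stdint.h>

/* Portable bit-extract: gathers the bits of src selected by mask
   into the low bits of the result, lowest mask bit first. */
static uint64_t pext_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1); /* isolate lowest set mask bit */
        if (src & lowest)
            result |= bit;
        mask &= mask - 1;                     /* clear that mask bit */
    }
    return result;
}

#if defined(__BMI2__) && (defined(__x86_64__) || defined(_M_X64))
#include <immintrin.h>
/* Hardware path: assumes the build enables BMI2 (e.g. -mbmi2). */
static uint64_t pext64(uint64_t src, uint64_t mask)
{
    return _pext_u64(src, mask);
}
#else
/* Fallback path for ARM, Power, or x86 builds without BMI2. */
static uint64_t pext64(uint64_t src, uint64_t mask)
{
    return pext_soft(src, mask);
}
#endif
```

Both paths should return the same values, so only the speed differs between targets.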

Joost Buijs
Posts: 1039
Joined: Thu Jul 16, 2009 8:47 am
Location: Almere, The Netherlands

Re: Ryzen 2 and BMI2?

Post by Joost Buijs » Tue May 15, 2018 5:52 pm

Gian-Carlo Pascutto wrote:
Tue May 15, 2018 7:04 am
Ryzen 1 was really fast at BMI2, it was just slow at a single instruction, i.e. PEXT.

I wouldn't expect this to change. Nothing uses PEXT, aside from some chess engine movegens.
PEXT and its counterpart PDEP are both incredibly slow on AMD Zen hardware because AMD was lazy and implemented these instructions in microcode instead of logic.

On Intel processors you can make very good use of PEXT in your evaluation function, for instance to index pawn patterns (or any other pattern) very quickly. In the pawn evaluator I'm currently working on I use PEXT throughout; with PEXT it runs about twice as fast as anything I can get without it. Unfortunately this doesn't work on AMD processors; on AMD the old way of calculating indices is the only solution.

I'm pretty sure that AMD didn't fix this for Zen+ either; maybe they will fix it next year when Zen2 arrives, who knows? Until this is fixed I won't consider buying an AMD processor because it is unusable for the things I want to do. I'd rather wait for Intel Cascade Lake, which arrives by the end of the year.
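To make the pattern-indexing idea concrete, here is a small sketch of how a pawn bitboard could be turned into a table index with a bit-extract. It is not the evaluator described above: the a1 = bit 0 square mapping, the three-file window, and the names `file_window_mask` and `pawn_pattern_index` are all assumptions for illustration, and the portable `pext_soft` loop stands in for the hardware instruction.

```c
#include <stdint.h>

#define FILE_A     0x0101010101010101ULL /* a-file, assuming a1 = bit 0 */
#define PAWN_RANKS 0x00FFFFFFFFFFFF00ULL /* ranks 2..7 */

/* Portable stand-in for PEXT (same semantics as _pext_u64). */
static uint64_t pext_soft(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    for (uint64_t bit = 1; mask != 0; bit <<= 1) {
        uint64_t lowest = mask & (~mask + 1);
        if (src & lowest)
            result |= bit;
        mask &= mask - 1;
    }
    return result;
}

/* Hypothetical window: the given file (0 = a .. 7 = h) plus its
   neighbours, restricted to the ranks where pawns can stand. */
static uint64_t file_window_mask(int file)
{
    uint64_t m = FILE_A << file;
    if (file > 0) m |= FILE_A << (file - 1);
    if (file < 7) m |= FILE_A << (file + 1);
    return m & PAWN_RANKS;
}

/* Dense index (at most 18 bits) usable as a lookup-table subscript. */
static unsigned pawn_pattern_index(uint64_t pawns, int file)
{
    return (unsigned)pext_soft(pawns, file_window_mask(file));
}
```

With pawns on a2, b2 and c2 (`0x700`) and the window centred on the b-file, the index comes out as 7. The point of the extract is that scattered squares collapse into a compact table index in one step.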

Gian-Carlo Pascutto
Posts: 1196
Joined: Sat Dec 13, 2008 6:00 pm

Re: Ryzen 2 and BMI2?

Post by Gian-Carlo Pascutto » Tue May 15, 2018 8:29 pm

Sesse wrote:
Tue May 15, 2018 3:43 pm
No, you'd have to use an intrinsic or inline assembler.
Right, but that's not doable in a benchmark that also has to run on ARM, Power, etc. and has to be "fair", i.e. what Roland was referring to.

The versions in SPEC don't even use BSF/LZCNT/POPCNT for the same reasons. Although it wouldn't surprise me if Intel C++ generates them anyway, as long as you use the SPEC sources. :D
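For reference, a population count in plain portable C might look like the sketch below. The name `popcount64` is made up here and this is not taken from the SPEC sources. Some optimizing compilers are able to recognize this loop shape and may emit a hardware instruction when the target allows it, but that depends on the compiler and flags.

```c
#include <stdint.h>

/* Counts set bits by clearing the lowest one each iteration,
   so the loop runs once per set bit. */
static int popcount64(uint64_t x)
{
    int n = 0;
    while (x != 0) {
        x &= x - 1; /* clear lowest set bit */
        n++;
    }
    return n;
}
```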

Sesse
Posts: 203
Joined: Mon Apr 30, 2018 9:51 pm

Re: Ryzen 2 and BMI2?

Post by Sesse » Tue May 15, 2018 9:58 pm

Obviously an Intel-specific instruction will not be applicable to PowerPC, indeed.

FWIW, bsf maps fairly well to the ffs() call in POSIX.
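A sketch of what that could look like for a 64-bit bitboard, using only the `int`-sized POSIX `ffs()` from `<strings.h>` and splitting the word into halves. The helper name `lsb_index64` is made up, and the cast assumes a 32-bit two's-complement `int`, which holds on the usual platforms.

```c
#include <stdint.h>
#include <strings.h> /* POSIX ffs() */

/* Index (0..63) of the lowest set bit, or -1 for an empty board.
   ffs() is 1-based and returns 0 for a zero argument. */
static int lsb_index64(uint64_t bb)
{
    uint32_t lo = (uint32_t)bb;
    if (lo != 0)
        return ffs((int)lo) - 1;

    uint32_t hi = (uint32_t)(bb >> 32);
    if (hi != 0)
        return 32 + ffs((int)hi) - 1;

    return -1;
}
```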

syzygy
Posts: 4503
Joined: Tue Feb 28, 2012 10:56 pm

Re: Ryzen 2 and BMI2?

Post by syzygy » Tue May 15, 2018 11:08 pm

Gian-Carlo Pascutto wrote:
Tue May 15, 2018 8:29 pm
Sesse wrote:
Tue May 15, 2018 3:43 pm
No, you'd have to use an intrinsic or inline assembler.
Right, but that's not doable in a benchmark that also has to run on ARM, Power, etc. and has to be "fair", i.e. what Roland was referring to.

The versions in SPEC don't even use BSF/LZCNT/POPCNT for the same reasons. Although it wouldn't surprise me if Intel C++ generates them anyway, as long as you use the SPEC sources. :D
Did Crafty as included in SPEC CPU2000 not use any of those on platforms where they were available? (Probably not, I guess...)

Apparently SPEC CPU2017 not only includes Deep Sjeng but also Leela. Nice :)
