AVX

Gerd Isenberg
Posts: 2128
Joined: Wed Mar 08, 2006 7:47 pm
Location: Hattingen, Germany

AVX

So Sandy Bridge is out, and has sixteen 256-bit YMM registers with AVX; Bulldozer will follow.

So far, they can only be treated as vectors of four 64-bit doubles or eight 32-bit floats, likely to be extended to integer vectors later.
This is similar to SSE with its 128-bit registers, which was soon followed by SSE2 for various integers. The following bitwise operations are available for doubles, as well as for floats:

```
VANDPD  ymm1, ymm2, ymm3/mem256    ymm1 :=  ymm2 & ymm3
VANDNPD ymm1, ymm2, ymm3/mem256    ymm1 := ~ymm2 & ymm3
VORPD   ymm1, ymm2, ymm3/mem256    ymm1 :=  ymm2 | ymm3
VXORPD  ymm1, ymm2, ymm3/mem256    ymm1 :=  ymm2 ^ ymm3
```
For intrinsics, see
http://software.intel.com/sites/product ... /index.htm

My question: what are bitwise operations on floats or doubles used for, if the values are not interpreted as integers or sets? Is there an enriched double arithmetic where modifying bits other than the sign, that is, the mantissa or exponent, makes any sense?

Thanks,
Gerd

Zach Wegner
Posts: 1922
Joined: Wed Mar 08, 2006 11:51 pm
Location: Earth

Re: AVX

Gerd Isenberg wrote: My question: what are bitwise operations on floats or doubles used for, if the values are not interpreted as integers or sets?
I would guess that's for masking with -1/0 values, and OR/XOR would be for manipulating multiple masks.
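
A minimal sketch of that masking idea, written as portable scalar C for a single 64-bit lane rather than as AVX code. The helper names (`to_bits`, `from_bits`, `cmp_mask`, `zero_if_outside`) are made up for this illustration; the comments note which vector instruction each step would roughly correspond to.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Reinterpret the bits of a double as a 64-bit integer (no conversion). */
static uint64_t to_bits(double d) {
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Reinterpret a 64-bit integer as a double (no conversion). */
static double from_bits(uint64_t u) {
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* One lane of a compare result: all ones (-1) if cond holds, else 0. */
static uint64_t cmp_mask(int cond) {
    return (uint64_t)0 - (uint64_t)(cond != 0);
}

/* Hypothetical example: keep x if lo <= x <= hi, otherwise yield +0.0.
   Two compare masks are combined with OR, then applied with AND-NOT. */
static double zero_if_outside(double x, double lo, double hi) {
    uint64_t outside = cmp_mask(x < lo) | cmp_mask(x > hi); /* like VORPD   */
    return from_bits(~outside & to_bits(x));                /* like VANDNPD */
}
```

Because the mask is either all ones or all zeros, the AND-NOT either passes the value through untouched or clears every bit, which happens to be the encoding of +0.0.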

Gerd Isenberg
Posts: 2128
Joined: Wed Mar 08, 2006 7:47 pm
Location: Hattingen, Germany

Re: AVX

Zach Wegner wrote: I would guess that's for masking with -1/0 values, and OR/XOR would be for manipulating multiple masks.
Sure, of course! How embarrassing.
All the branchless stuff with

```
VCMPPD ymm1, ymm2, ymm3/mem256, imm8
```
Wow! Treating bitboards as doubles.
Some problems with NaN values, though.
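
To make the NaN problem concrete, here is a small sketch in portable scalar C, assuming IEEE 754 doubles. The function names are invented for the example. It shows two ways a floating-point compare can disagree with an integer compare when the operands are really bitboards: a pattern whose exponent field is all ones with a nonzero mantissa is a NaN and never compares equal to itself, and the pattern with only bit 63 set is -0.0, which compares equal to the empty board.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Reinterpret a 64-bit integer as a double (no conversion). */
static double from_bits(uint64_t u) {
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* True if the pattern, read as an IEEE 754 double, encodes a NaN:
   exponent field all ones and a nonzero mantissa. */
static int is_nan_pattern(uint64_t bb) {
    const uint64_t exp_mask  = 0x7FF0000000000000ULL;
    const uint64_t mant_mask = 0x000FFFFFFFFFFFFFULL;
    return (bb & exp_mask) == exp_mask && (bb & mant_mask) != 0;
}

/* Compare two bit patterns the way a floating-point equality would. */
static int fp_equal(uint64_t x, uint64_t y) {
    return from_bits(x) == from_bits(y);
}
```

For example, a bitboard with the top 16 bits set (`0xFFFF000000000000`) is a NaN pattern, so `fp_equal(bb, bb)` is false for it, while `fp_equal(0, 0x8000000000000000)` is true even though the bits differ.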

bob
Posts: 20643
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: AVX

Gerd Isenberg wrote: My question: what are bitwise operations on floats or doubles used for, if the values are not interpreted as integers or sets?
Been programming in assembly language for 40+ years now. I have never seen bitwise instructions for FP. Whether those are designed to be used on 64/128/whatever values, I do not know. But using ints is dangerous in true FP mode, since there are invalid configurations of bits that can make a program blow up. I know this from experience: I did it on a Sun SPARC when I first tried to get a version of Cray Blitz up on a local SPARC in 1985. I used a double (double precision in FORTRAN, actually) for the hash signature and got a lot of various types of floating-point exceptions.

This would therefore suggest that the FP regs can be used as integer values. Whether that gets extended to the "pseudo-vector" stuff or not I have no idea at the moment.

Gerd Isenberg
Posts: 2128
Joined: Wed Mar 08, 2006 7:47 pm
Location: Hattingen, Germany

Re: AVX

bob wrote: This would therefore suggest that the FP regs can be used as integer values. Whether that gets extended to the "pseudo-vector" stuff or not I have no idea at the moment.
Yes, as Zach mentioned, with vectors of floats or doubles you will do things branchlessly, for instance max or min and the like, where a mask selects from two sources:

```
; ymm2 = a, ymm3 = b
VXORPD  ymm1, ymm2, ymm3       ; a ^ b
VCMPPD  ymm0, ymm2, ymm3, less ; mask is 0xff..ff if a < b, zero otherwise
VANDPD  ymm0, ymm0, ymm1       ; (a ^ b) & mask
VXORPD  ymm0, ymm0, ymm2       ; result = a ^ ((a ^ b) & mask), i.e. max(a, b)
```
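
The same XOR trick can be sketched for one lane in portable scalar C, which makes it easy to check that the sequence really yields the maximum. This is only an emulation of a single 64-bit lane, not AVX code, and the helper names are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Reinterpret the bits of a double as a 64-bit integer (no conversion). */
static uint64_t to_bits(double d) {
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Reinterpret a 64-bit integer as a double (no conversion). */
static double from_bits(uint64_t u) {
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* Branchless max of one lane: a ^ ((a ^ b) & mask), mask = (a < b) ? -1 : 0. */
static double max_lane(double a, double b) {
    uint64_t ua   = to_bits(a);
    uint64_t diff = ua ^ to_bits(b);                  /* like VXORPD         */
    uint64_t mask = (uint64_t)0 - (uint64_t)(a < b);  /* like VCMPPD "less"  */
    return from_bits(ua ^ (diff & mask));             /* like VANDPD, VXORPD */
}
```

When the mask is all ones, `a ^ (a ^ b)` collapses to `b`; when it is zero, the result stays `a`. Swapping the compare to "greater" would give min instead. If either input is a NaN the compare is false, so this sketch returns `a`.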

smatovic
Posts: 966
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: AVX

I am trying to get a 4*32-bit optimized board representation running on a GPU.

Therefore I thought of vectorizing your QuadBoards into 8*32-bit pieces; maybe this idea could also fit AVX...

--
srdja

Gerd Isenberg
Posts: 2128
Joined: Wed Mar 08, 2006 7:47 pm
Location: Hattingen, Germany

Re: AVX

smatovic wrote: Therefore I thought of vectorizing your QuadBoards into 8*32-bit pieces; maybe this idea could also fit AVX...
Yes, if AVX2 with 256-bit integer vectors becomes available. I need shifts, which are not yet possible with 256-bit float/double vectors.
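
As a rough illustration of the missing operation, here is a scalar C emulation of a lane-wise shift over four 64-bit integers packed into one 256-bit value. The type `Vec4x64` and the function `shift_left_lanes` are hypothetical names for this sketch and do not describe the actual QuadBoard layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 256-bit register holding four 64-bit lanes. */
typedef struct {
    uint64_t lane[4];
} Vec4x64;

/* Shift every lane left by the same count, as a 256-bit integer
   vector shift would. The count must be below 64 to stay well defined. */
static Vec4x64 shift_left_lanes(Vec4x64 v, unsigned n) {
    assert(n < 64);
    for (int i = 0; i < 4; ++i) {
        v.lane[i] <<= n;
    }
    return v;
}
```

With bitboards in the lanes, a shift by 8 would move every set bit up by one rank under the common rank-major square mapping (an assumption here), which is the kind of step the float/double bitwise instructions cannot express.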

smatovic
Posts: 966
Joined: Wed Mar 10, 2010 9:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: AVX

Gerd Isenberg wrote: Yes, if AVX2 with 256-bit integer vectors becomes available. I need shifts, which are not yet possible with 256-bit float/double vectors.
AFAIK, besides AVX, AMD also plans an extended instruction set for Bulldozer, "XOP":

http://en.wikipedia.org/wiki/XOP_instruction_set

This AVX/SSE5, Intel vs AMD story is confusing.

--
srdja