Caissa 1.16 AVX512

Werner
Posts: 2994
Joined: Wed Mar 08, 2006 10:09 pm
Location: Germany
Full name: Werner Schüle

Caissa 1.16 AVX512

Post by Werner »

On an AMD processor (Ryzen 9 7945HX), the AVX512 build is not faster than the bmi2 build. But on Intel (i7-11800H), the AVX512 build is faster.
Do we need a special build for AMD AVX512?
Werner
Magnum
Posts: 195
Joined: Thu Feb 04, 2021 10:24 pm
Full name: Arnold Magnum

Re: Caissa 1.16 AVX512

Post by Magnum »

A special very fast compile for Apple macOS ARM would be great too.
Werewolf
Posts: 2042
Joined: Thu Sep 18, 2008 10:24 pm

Re: Caissa 1.16 AVX512

Post by Werewolf »

Magnum wrote: Fri Jan 12, 2024 9:29 pm A special very fast compile for Apple macOS ARM would be great too.
Apple Silicon doesn't support AVX.
Witek
Posts: 87
Joined: Thu Oct 07, 2021 12:48 am
Location: Warsaw, Poland
Full name: Michal Witanowski

Re: Caissa 1.16 AVX512

Post by Witek »

Werner wrote: Fri Jan 12, 2024 12:22 pm On an AMD processor (Ryzen 9 7945HX), the AVX512 build is not faster than the bmi2 build. But on Intel (i7-11800H), the AVX512 build is faster.
Do we need a special build for AMD AVX512?
The AVX-512 speedup depends heavily on the CPU implementation. AMD CPUs are known to have "fake" AVX-512, where each 512-bit operation is split into two 256-bit operations.
Magnum wrote: Fri Jan 12, 2024 9:29 pm A special very fast compile for Apple macOS ARM would be great too.
I don't have an Apple machine to test this. There is a NEON implementation of the neural net evaluation, so it's just a matter of tweaking CMake to make use of it. Contributions are welcome :)
Author of Caissa Chess Engine: https://github.com/Witek902/Caissa
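
To illustrate the point about separate AVX-512, AVX2 and NEON code paths, here is a minimal sketch of compile-time SIMD selection. It is not taken from Caissa's source; the function name CompiledSimdPath is hypothetical, and only the predefined compiler macros (as set by GCC and Clang) are real.

```cpp
#include <string>

// Hypothetical helper (not from Caissa): reports which SIMD path
// this binary was compiled for, based on predefined compiler macros.
inline std::string CompiledSimdPath()
{
#if defined(__AVX512F__)
    return "avx512";   // 512-bit registers; actual speedup depends on the CPU
#elif defined(__AVX2__)
    return "avx2";     // 256-bit registers
#elif defined(__ARM_NEON) || defined(__aarch64__)
    return "neon";     // 128-bit registers, e.g. Apple Silicon
#else
    return "scalar";   // portable fallback without SIMD
#endif
}
```

Which branch is taken depends only on the compiler flags used for the build (for example -mavx512f or -mavx2 on x86), so an engine built this way needs one binary per instruction set.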
Werner
Posts: 2994
Joined: Wed Mar 08, 2006 10:09 pm
Location: Germany
Full name: Werner Schüle

Re: Caissa 1.16 AVX512

Post by Werner »

Witek wrote: Sat Jan 13, 2024 8:37 pm
Werner wrote: Fri Jan 12, 2024 12:22 pm On an AMD processor (Ryzen 9 7945HX), the AVX512 build is not faster than the bmi2 build. But on Intel (i7-11800H), the AVX512 build is faster.
Do we need a special build for AMD AVX512?
The AVX-512 speedup depends heavily on the CPU implementation. AMD CPUs are known to have "fake" AVX-512, where each 512-bit operation is split into two 256-bit operations.
...thanks for the info. And newer Intel consumer processors no longer have AVX512.
Werner
Dariusz
Posts: 379
Joined: Sat Jun 13, 2015 10:08 am
Location: Poland
Full name: Dariusz Domagała

Re: Caissa 1.16 AVX512

Post by Dariusz »

Yes, NEON neural network implementation for evaluation does a good job ;)
Just look at the results of, for example, the latest Stockfish or RubiChess or Texel on the MCERL rating list.

Witek, it's a pity that there is no fast and easy way to compile Caissa for Macs with Apple Silicon. I don't have the knowledge to do it well enough to keep Caissa from losing strength. Some time ago I managed to compile one of the latest versions of Caissa for the M1, but it was about 150 Elo weaker than the Windows version running on the Mac M1 via Wine.

I, unfortunately, am not a computer scientist and am not very competent at it, but that did not prevent me from finally (!) compiling Caissa 1.16 (the latest version) natively for x64 Macs. The resulting popcnt build already runs seamlessly on Macs with Apple Silicon CPUs (e.g. M1, M2, ...) via the Rosetta layer. And, surprisingly, it gives really good performance!

According to my tests, for example, the x64 popcnt build of Stockfish (compiled on a Mac with an Intel CPU) does not differ significantly in strength from Stockfish compiled for ARCH=apple-silicon; I think the difference is from a few to at most 10 Elo.

Caissa 1.16 for M1, M2, ... CPUs is available for download from my site (Files section, free of course).

Below are the results of some recent chess engines compiled natively for Apple Silicon + Caissa 1.16 (Mac x64 popcnt version).
Image

Witek, Caissa is amazing. It runs like a locomotive pushing forward and climbing higher and higher on the ranking lists. Congratulations!
Regards, Darius
https://chessengeria.eu
petero2
Posts: 729
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Re: Caissa 1.16 AVX512

Post by petero2 »

Witek wrote: Sat Jan 13, 2024 8:37 pm
Werner wrote: Fri Jan 12, 2024 12:22 pm On an AMD processor (Ryzen 9 7945HX), the AVX512 build is not faster than the bmi2 build. But on Intel (i7-11800H), the AVX512 build is faster.
Do we need a special build for AMD AVX512?
The AVX-512 speedup depends heavily on the CPU implementation. AMD CPUs are known to have "fake" AVX-512, where each 512-bit operation is split into two 256-bit operations.
But AVX-512 can still give a speedup on AMD. I tested some engines on both AMD and Intel and measured how much faster the AVX-512 version was compared to the AVX2 version. This is what I got:

engine     AMD     Intel
berserk    -1.5%   3.6%
cheng4     10.4%   0.2%
obsidian    0.7%   9.2%
stockfish  -1.4%   0.04%
texel       7.0%   5.7%
Different engines behave very differently. Cheng4 is special in that it uses floating point for NNUE evaluation.

The AMD CPU was: "AMD Ryzen 9 7950X3D 16-Core Processor" running at 4.4GHz
The Intel CPU was: "Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz" running at 3.5GHz
mar
Posts: 2665
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Caissa 1.16 AVX512

Post by mar »

petero2 wrote: Wed Jan 17, 2024 10:29 pm Different engines behave very differently. Cheng4 is special in that it uses floating point for NNUE evaluation.
Actually, that no longer holds for the dev version. I've switched to fixed point instead, because the roundoff errors caused nondeterminism, which I didn't like.
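
As a general illustration of how roundoff can make a floating-point accumulator depend on its update history while an integer (fixed-point) one does not, here is a minimal sketch. It is not taken from Cheng4; the helper AddThenRemove and the sample values are hypothetical.

```cpp
#include <cstdint>

// Adds a weight to an accumulator and then removes it again, as an
// incrementally updated evaluation might when a feature is switched
// on and off. Illustrative only; not code from any engine.
template <typename T>
T AddThenRemove(T accumulator, T weight)
{
    accumulator += weight;  // float: the result is rounded to the nearest representable value
    accumulator -= weight;  // float: the rounding error from the step above remains
    return accumulator;     // integer: exactly the starting value (absent overflow)
}
```

With IEEE-754 single precision and no fast-math flags, AddThenRemove<float>(1.0f, 1.0e8f) does not return 1.0f, because 1 + 1e8 is not representable in a float. The integer version, AddThenRemove<std::int32_t>(1, 100000000), returns exactly 1, so the result no longer depends on which updates were applied along the way.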
BrendanJNorman
Posts: 2584
Joined: Mon Feb 08, 2016 12:43 am
Full name: Brendan J Norman

Re: Caissa 1.16 AVX512

Post by BrendanJNorman »

Sorry guys, I'm a bit behind on Caissa.

Is it using Stockfish data or a lot of its own ideas/data?

(not making allegations here btw, just very interested)

Looks very strong, would love it if it were totally unique!
Witek
Posts: 87
Joined: Thu Oct 07, 2021 12:48 am
Location: Warsaw, Poland
Full name: Michal Witanowski

Re: Caissa 1.16 AVX512

Post by Witek »

BrendanJNorman wrote: Thu Jan 18, 2024 3:14 pm Sorry guys, I'm a bit behind on Caissa.

Is it using Stockfish data or a lot of its own ideas/data?

(not making allegations here btw, just very interested)

Looks very strong, would love it if it were totally unique!
Caissa's neural net is trained purely on positions from Caissa self-play games. Regarding ideas, it's heavily inspired by other top engines. But let's be honest: all top engines are stealing ideas from each other.
One of the original ideas I introduced was "eval history correction", which was later successfully adopted by Berserk, Seer and Stockfish (and probably more).
Author of Caissa Chess Engine: https://github.com/Witek902/Caissa