if statement and calculation faster than constant

Discussion of chess software programming and technical issues.

Moderators: hgm, Dann Corbit, Harvey Williamson

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
fabianVDW
Posts: 146
Joined: Fri Mar 15, 2019 7:46 pm
Location: Germany
Full name: Fabian von der Warth

Re: if statement and calculation faster than constant

Post by fabianVDW » Fri Apr 10, 2020 2:59 pm

mvanthoor wrote:
Fri Apr 10, 2020 2:35 pm

For my feeling, I've now "proven" that Rust can be as fast as C, even in non-trivial projects like mini-benchmarks, but it does require a bit of special work. I have two arrays backed by a counter, in a struct, which implement the normal vector methods such as push, pop, clear, len, etc, just because it's faster than even a fixed length vector.
I'm going to make a generic out of those two arrays to avoid code duplication, and call it quits on the speed optimizations. I'll start working on the search and UCI-protocol this weekend.
You might also be interested in using the arrayvec crate: https://docs.rs/crate/arrayvec/0.5.1
Rust has a minimal std library, and for other languages such things would surely be contained in std. Of course you can also just use your own data types, but I don't think it is necessary if you don't explicitly want to. That said, I don't think I've ever used this crate, so I am not sure it is as fast as your minimal implementation, but I would expect it to be.
Author of FabChess: https://github.com/fabianvdW/FabChess
A UCI compliant chess engine written in Rust.
FabChessWiki: https://github.com/fabianvdW/FabChess/wiki
fabianvonderwarth@gmail.com

User avatar
leanchess
Posts: 84
Joined: Sun Dec 08, 2019 7:16 pm
Full name: Dmitry Shechtman
Contact:

Re: if statement and calculation faster than constant

Post by leanchess » Fri Apr 10, 2020 3:11 pm

mvanthoor wrote:
Fri Apr 10, 2020 2:35 pm
It's not a dramatic difference, but "to ^ 8" is measurably slower than "if us == WHITE {to - 8} else {to + 8}" construction.
I tried it in my (incomplete) perft, and can confirm that the XOR is slower. I'm using a single if statement at the top level with entirely separate logic for black/white pawns.
As this lookup has "attacker ^ 1" in it, and "to ^ 1" is also much slower than a massive statement, I'm starting to believe the XOR operation is slow; at least in Rust. (I'll see if ! works in this context; as in, !attacker. Could be that this is only for bools though, and side is a u8).
Indeed, it's not just in Rust. I opted for keeping those as precalculated values.
Thanks :) I've switched to the GNU toolchain to test this again. It seems that HGM is right on this; suddenly, the GNU toolchain became much slower than MSVC. Now they are almost equal again, with GNU pulling ahead a little bit.
Are you using clang?
For my feeling, I've now "proven" that Rust can be as fast as C, even in non-trivial projects like mini-benchmarks, but it does require a bit of special work.
I just noticed a mention of a Zobrist key in your output. Does that mean that it's 90 seconds WITH hashing?

fabianVDW
Posts: 146
Joined: Fri Mar 15, 2019 7:46 pm
Location: Germany
Full name: Fabian von der Warth

Re: if statement and calculation faster than constant

Post by fabianVDW » Fri Apr 10, 2020 3:20 pm

An xor is a hardware operation, and independent of programming language. It takes around 1 cycle. What you are really measuring is most likely what HGM describes in his post above. I can recommend this talk: https://www.youtube.com/watch?v=r-TLSBd ... e=youtu.be
Author of FabChess: https://github.com/fabianvdW/FabChess
A UCI compliant chess engine written in Rust.
FabChessWiki: https://github.com/fabianvdW/FabChess/wiki
fabianvonderwarth@gmail.com

D Sceviour
Posts: 568
Joined: Mon Jul 20, 2015 3:06 pm
Contact:

Re: if statement and calculation faster than constant

Post by D Sceviour » Fri Apr 10, 2020 3:32 pm

mvanthoor wrote:
Thu Apr 09, 2020 11:41 pm
A few weeks ago there was a discussion about someone pushing to replace large table-arrays with a calculation in Stockfish where possible, because it was deemed faster.
There was a previous discussion that demonstrated arrays are faster than inline functions, as should be expected. The probable reason for using the inline function is that the variables are currently under testing. The variable constants can be quickly changed while tuning. Once the formula constants are stabilized, the function can be turned into a faster array for a release version of the code.

User avatar
leanchess
Posts: 84
Joined: Sun Dec 08, 2019 7:16 pm
Full name: Dmitry Shechtman
Contact:

Re: if statement and calculation faster than constant

Post by leanchess » Fri Apr 10, 2020 4:33 pm

fabianVDW wrote:
Fri Apr 10, 2020 3:20 pm
An xor is a hardware operation, and independent of programming language. It takes around 1 cycle. What you are really measuring is most likely what HGM describes in his post above. I can recommend this talk: https://www.youtube.com/watch?v=r-TLSBd ... e=youtu.be
Nice presentation, I was unaware of Stabilizer.

However, through a lot of trial and error I did manage to get my perft to about half the speed of qperft, which I find especially funny considering this:
https://peterellisjones.com/posts/generating-legal-chess-moves-efficiently/ wrote:Qperft is a purpose-built perft calculator by H.G.Muller. It’s considered one of the fastest move generators out there so my goal with my engine was to take at most twice as long as qperft.

The author of that article employed some highly sophisticated algorithms including magic bitboards to achieve that, while I didn't do much beyond simple BB LUTs and clang -Ofast to similar results. The only problem is my perft(6) is 119060974...

User avatar
mvanthoor
Posts: 767
Joined: Wed Jul 03, 2019 2:42 pm
Full name: Marcel Vanthoor

Re: if statement and calculation faster than constant

Post by mvanthoor » Fri Apr 10, 2020 5:02 pm

fabianVDW wrote:
Fri Apr 10, 2020 2:59 pm
mvanthoor wrote:
Fri Apr 10, 2020 2:35 pm

For my feeling, I've now "proven" that Rust can be as fast as C, even in non-trivial projects like mini-benchmarks, but it does require a bit of special work. I have two arrays backed by a counter, in a struct, which implement the normal vector methods such as push, pop, clear, len, etc, just because it's faster than even a fixed length vector.
I'm going to make a generic out of those two arrays to avoid code duplication, and call it quits on the speed optimizations. I'll start working on the search and UCI-protocol this weekend.
You might also be interested in using the arrayvec crate: https://docs.rs/crate/arrayvec/0.5.1
Rust has a minimal std library, and for other languages such things would surely be contained in std. Of course you can also just use your own data types, but I don't think it is necessary if you don't explicitly want to. That said, I don't think I've ever used this crate, so I am not sure it is as fast as your minimal implementation, but I would expect it to be.
This is one of my pet peeves.

In the past, I often wrote stuff in "C++", which in my case was basically C with some C++ and Boost stuff thrown in. I'm used to a HUGE standard library. In Rust, I have to go and install a crate for every non-trivial thing I want to do, having to rely on other people's code. For big crates supported by Mozilla or some VERY well known developers, groups, or companies I'm fine with that, but I avoid depending on Crate X from RandomDude.

I love cargo and the easy way Rust stuff builds as compared to writing a makefile for a C/C++ project, but I really, really dislike this "npm-like" situation where half your program comes from "somewhere" on the internet. Who's going to guarantee me that, if I quit chess engine programming for 10 years and then decide to pick it up again, those crates will still exist and be downloadable?

I hope Crates.io will create a function where you can download big important crates and then 'import' them into the standard library.

(PS. I can program object oriented very well, but I think it's often hugely overdone. I much prefer the "struct + functions" approach of Rust; it's what I typically did in C. The only difference in Rust is that I don't have to actually pass the struct to the function every time, but I can just write stuff.function().)
Author of Rustic.
Releases | Code | Docs

Terje
Posts: 302
Joined: Tue Nov 19, 2019 3:34 am
Location: https://github.com/TerjeKir/weiss
Full name: Terje Kirstihagen

Re: if statement and calculation faster than constant

Post by Terje » Fri Apr 10, 2020 5:37 pm

mvanthoor wrote:
Fri Apr 10, 2020 2:35 pm
If you want to see Weiss speed without the eval/move order code you can comment out the lines marked as 'update material', 'update phase', and 'update various piece lists' (this comment needs updating lol), in Add/Clear/MovePiece functions in makemove.c and line 67-75 in movegen.c. On my machine perft 7 drops from 104 to 88 seconds, a ~15% speedup.

I too only count leaf nodes so my perft output is a bit misleading. I'll update this to properly specify that it means leaves not all nodes, as well as leaves/s.
Don't need to do that; I also calculate hashes, piece-table updates and material updates.
Oh, you said the only thing your perft did that wasn't needed was hash updates so I assumed you hadn't added piece value / piece square table value updates, non-pawn or other such counts, or move ordering code.

User avatar
hgm
Posts: 25843
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller
Contact:

Re: if statement and calculation faster than constant

Post by hgm » Fri Apr 10, 2020 5:38 pm

leanchess wrote:
Fri Apr 10, 2020 4:33 pm
The author of that article employed some highly sophisticated algorithms including magic bitboards to achieve that, while I didn't do much beyond simple BB LUTs and clang -Ofast to similar results. The only problem is my perft(6) is 119060974...
(Magic) bitboards aren't an especially fast way to do perft.

User avatar
mvanthoor
Posts: 767
Joined: Wed Jul 03, 2019 2:42 pm
Full name: Marcel Vanthoor

Re: if statement and calculation faster than constant

Post by mvanthoor » Fri Apr 10, 2020 5:41 pm

leanchess wrote:
Fri Apr 10, 2020 3:11 pm
Are you using clang?
No. Eh... yes. Wait. I'll explain :)

For C/C++, there are many compilers: clang/clang++, gcc/g++, msvc/++, and many, many more.
As far as I know, there's only one Rust compiler at the moment: rustc.

Both clang and rustc compile to what is known "LLVM intermediate code", and from there, the LLVM backend takes over. This results in one or more object files, which are then linked into an executable. On Windows, this can be done with GNU's ld.exe (under MinGW/MSYS2), or Microsoft's link.exe, if you have the MSVC toolchain selected.

So yes, I'm using 'clang', in the sense that both Rust (rustc) and C (clang) compile to LLVM.
I just noticed a mention of a Zobrist key in your output. Does that mean that it's 90 seconds WITH hashing?
Yes. My perft 7 runtime is now down to 86.xx seconds on my system (i7-6700K), and this is:

- Without hash/transposition table
- Without bulk counting of leaf nodes (full make/unmake)
- Without special "in check" move generators and such

- WITH full chess implementation (promotions, castling)
- WITH incremental Zobrist hash updates
- WITH incremental piece list updates (location of each piece on the board)
- WITH incremental material balance update (material count for each side)

This is the same as it is in Weiss (edit: without move ordering). They both use magic bitboards, and their make/unmake functions are almost the same. The only difference is that Weiss (by Terje) is written in C, and Rustic (by me) is written in Rust. On this system, the speed of Weiss and Rustic is now within two seconds of one another (edit: with the advantage going to Rustic; probably because it doesn't yet do move ordering.)

(Weiss was based on an older tutorial engine, called VICE and later converted from a piece/square iterator to a bitboard engine; Rustic is built from scratch, using bitboards from the start, but I did follow VICE's tutorials. Therefore Rustic is inevitably a bit Weiss as well. However, Weiss is in development much longer, so it is already quite strong. Rustic is, as of now, only a move generator / perft engine. It will get there, in time.)
Last edited by mvanthoor on Fri Apr 10, 2020 6:03 pm, edited 1 time in total.
Author of Rustic.
Releases | Code | Docs

User avatar
mvanthoor
Posts: 767
Joined: Wed Jul 03, 2019 2:42 pm
Full name: Marcel Vanthoor

Re: if statement and calculation faster than constant

Post by mvanthoor » Fri Apr 10, 2020 5:42 pm

Terje wrote:
Fri Apr 10, 2020 5:37 pm
mvanthoor wrote:
Fri Apr 10, 2020 2:35 pm
If you want to see Weiss speed without the eval/move order code you can comment out the lines marked as 'update material', 'update phase', and 'update various piece lists' (this comment needs updating lol), in Add/Clear/MovePiece functions in makemove.c and line 67-75 in movegen.c. On my machine perft 7 drops from 104 to 88 seconds, a ~15% speedup.

I too only count leaf nodes so my perft output is a bit misleading. I'll update this to properly specify that it means leaves not all nodes, as well as leaves/s.
Don't need to do that; I also calculate hashes, piece-table updates and material updates.
Oh, you said the only thing your perft did that wasn't needed was hash updates so I assumed you hadn't added piece value / piece square table value updates, non-pawn or other such counts, or move ordering code.
See the post above :)

There isn't move ordering yet. That'll be the first thing that I'll be adding after I finish the bare-bones a/b-search and evaluation function.

I've looked at your makemove/takemove code. (Second time I actually looked into another chess engine after FabChess.)

First time after implementing the move list pool, it was almost, but not quite, a direct copy of FabChess' implementation (without me ever having seen it before).

Now, your makemove/takemove, and my make/unmake functions are basically twins, allowing for differences because of mine being Rust and yours being C. Our programming style and even naming seems to be the same as well.

If I have both functions open in VSCode, it's actually possible for me to be looking at yours and it'll take a few seconds for me to realize it :X

If the rest of your engine is as readable as takemove/makemove, then I congratulate you on a job well done. I hope you will take a look at Rustic Alpa 1 when it comes out. It'll be the base-line version; weak, feature-deprived, barely able to play, but written and commented like a literary work. (I hope... because I'm not the one judging this.)
Author of Rustic.
Releases | Code | Docs

Post Reply