I looked into the C version of QBB-Perft, and decided that I don't want to spend the time to port this to Rust. Rust can do everything QBB-Perft does, including the macros, using macro_rules!, procedural macros, or even const functions.
I don't want to spend the time converting all that, just to end up with a program that is effectively compiled by LLVM, just as when I'd compile the C version with Clang. Rest assured that if someone ports this and does it properly, it'll be about as fast as the C version, give or take 1-2% owing to compiler differences. I can do the conversion faster by just using inline functions instead of the macros, but in that case, the C version will have a big advantage because it pre-computes lots of data during the compile step.
To be honest, I'd rather spend the free time I do have on refactoring my engine's code in preparation for the next version.
Why C++ instead of C#?
-
- Posts: 881
- Joined: Sun Dec 27, 2020 2:40 am
- Location: Bremen, Germany
- Full name: Thomas Jahn
Re: Why C++ instead of C#?
R. Tomasi wrote: ↑Mon Sep 20, 2021 3:57 pm Accumulated there are 1 361 558 651 nodes total (assuming I did not mistype on my calculator). Divided by 28.132s that yields 48.398 MNPS and dividing by 16.837s yields 80.867 MNPS. That means the C code is 1.671 times faster than the C# code, or - alternatively formulated - the C# code is 40.1% slower.

Okay, I finally understand where you got the 1.671x from, and I think 16837ms must be wrong. I never made the same changes to the console output in the C version, so I had to add up the ms of the individual tests manually on a calculator. I probably bungled one addition or made a typo when transferring the result to the post.
If you look on page six of this thread I printed the whole listing and it took a bit over 19s to complete (I just added the values printed there up again) and I did no optimizations for the C version so there's really no reason why it should run more than two seconds faster now. If we assume the 19.1s is correct then the speed fits my guesstimate.
I'll run it again later. And I think at least in the C# version I'll add a few lines of code to get rid of the need to "guess" the NPS. Sorry for the confusion!
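A minimal sketch of what those few lines could look like in C, assuming the program already accumulates a total node count and a total elapsed time in milliseconds. The function names are hypothetical and are not taken from QBB-Perft.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: nodes per second from a node count and elapsed ms. */
static double nodes_per_second(uint64_t nodes, uint64_t elapsed_ms)
{
    if (elapsed_ms == 0)
        return 0.0; /* avoid dividing by zero on very short runs */
    return (double)nodes * 1000.0 / (double)elapsed_ms;
}

/* Hypothetical helper: print the accumulated totals after the last test. */
static void print_total(const char *label, uint64_t total_nodes, uint64_t total_ms)
{
    printf("%s: %llu nodes in %llu ms, %.3f MNPS\n", label,
           (unsigned long long)total_nodes,
           (unsigned long long)total_ms,
           nodes_per_second(total_nodes, total_ms) / 1e6);
}
```

Fed with the figures quoted above, 1 361 558 651 nodes in 28132 ms comes out at roughly 48.4 MNPS and the same node count in 16837 ms at roughly 80.9 MNPS, so the calculator step is no longer needed.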
-
- Posts: 307
- Joined: Wed Sep 01, 2021 4:08 pm
- Location: Germany
- Full name: Roland Tomasi
Re: Why C++ instead of C#?
I was thinking about that, too. Effectively, anything that goes through an LLVM backend (where most of the optimizations would happen) should end up with the same speed, plus/minus compiler differences (as you would also get compiling the C code using different versions of Clang).
-
- Posts: 307
- Joined: Wed Sep 01, 2021 4:08 pm
- Location: Germany
- Full name: Roland Tomasi
Re: Why C++ instead of C#?
lithander wrote: ↑Mon Sep 20, 2021 6:36 pm
R. Tomasi wrote: ↑Mon Sep 20, 2021 3:57 pm Accumulated there are 1 361 558 651 nodes total (assuming I did not misstype on my calculator). Divided by 28.132s that yields 48.398 NPS and dividing by 16.837s yields 80.867 NPS. That means the C code is 1.671 times faster than the C# code, or - alternatively formulated - the C# code is 40.1% slower.
Okay, I finally understand where you got the 1.671x from and I think 16837ms must be wrong. I never made the same the changes to the console output in the C version and so I had to add the ms of the individual tests manually on a calculator. I probably bungled one addition or made a typo when transferring the result to the post.
If you look on page six of this thread I printed the whole listing and it took a bit over 19s to complete (I just added the values printed there up again) and I did no optimizations for the C version so there's really no reason why it should run more than two seconds faster now. If we assume the 19.1s is correct then the speed fits my guesstimate.
I'll run it again later. And I think at least in the C# version I'll add a few lines of code to get rid of the need to "guess" the NPS. Sorry for the confusion!

I think we should simply include one or two lines in the code that do the total NPS calculation automatically. Manually calculating stuff like that for every run we do feels very "last century" to me, in any case.
Edit: I see you suggested that already. My bad! I'll add it to the C version, too, when I'm doing the algorithmic changes later tonight.
-
- Posts: 179
- Joined: Tue Jun 15, 2021 8:11 pm
- Full name: Emanuel Torres
Re: Why C++ instead of C#?
R. Tomasi wrote: ↑Mon Sep 20, 2021 4:39 pm Using the source would, in this case, mean that you use the source, compile it, run it on your machine, and post your findings here. I actually would love to see runs on different machines. I really suspect that the performance difference might not be the same across all CPUs.

As I expected, it was a typo. Anyway, on my machine I get the following results, using the optimized v1.4 C# version:
Code: Select all
C: 20959 ms
C#: 33845 ms (1.61x slower)
Java: 37483 ms (1.79x slower)
What do you refer to by this? Whether we use #define or inline in the C version should not be relevant, assuming the compiler actually inlines the function.
-
- Posts: 1784
- Joined: Wed Jul 03, 2019 4:42 pm
- Location: Netherlands
- Full name: Marcel Vanthoor
Re: Why C++ instead of C#?
There is still a difference. If you inline a function, the compiler pastes the code of the called function into the caller, but the code of the called function still gets executed at every run of the encompassing function. When you use a macro, the compiler not only inlines it, but also resolves it to a single value if at all possible.
Rust (or more correctly, LLVM) can do this with macros and const functions too; and sometimes, even with inlined functions IF those functions can be resolved to a simpler value. For example, if you do something like this:
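Code: Select all
function complicated_function(x) {
    y = (do super complex stuff with x here)
    z = (do more super complex stuff with x here)
    return y + z;
}
It could be that LLVM simplifies this, to something like:
Code: Select all
function complicated_function(x) {
    return x + 35;
}
I just don't feel like rewriting the C macros into stuff that can be optimized by LLVM to the same extent in which the C compiler can optimize macros, because a) it's a lot of work, and b) I'm not that good at Rust macros, because they're code that writes other code (to avoid having to write repetitive code yourself), and I just don't need that a lot. Therefore it'll take longer. If I just rewrite the macros into simple functions, the Rust version will be slower than the C version.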
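The simplification in the pseudocode above can be made concrete with a small C sketch. The arithmetic is a made-up stand-in for the "super complex stuff", chosen so that the two intermediate values add up to x + 35. Whether a given compiler actually collapses the first function into the second depends on the compiler and the optimization level, so it is worth checking the generated assembly rather than assuming it.

```c
#include <stdint.h>

/* Hypothetical stand-in for the complicated version: two intermediate
 * values whose sum happens to equal x + 35 (modulo 2^32). */
static inline uint32_t complicated_function(uint32_t x)
{
    uint32_t y = 3u * x + 20u; /* first intermediate value */
    uint32_t z = 15u - 2u * x; /* second intermediate value */
    return y + z;
}

/* What an optimizer may reduce the function above to. */
static inline uint32_t simplified_function(uint32_t x)
{
    return x + 35u;
}
```

Unsigned arithmetic is used so that the two functions agree for every input, including values where the intermediate results wrap around.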
-
- Posts: 307
- Joined: Wed Sep 01, 2021 4:08 pm
- Location: Germany
- Full name: Roland Tomasi
Re: Why C++ instead of C#?
mvanthoor wrote: ↑Mon Sep 20, 2021 11:10 pm There is still a difference. If you inline a function, the compiler pastes the code of the called function into the caller, but the code of the called function still gets executed at every run of the encompassing function. When you use a macro, the compiler not only inlines it, but also resolves it to a single value if at all possible.

Problem is, with modern compilers there is no guarantee an inline function will be inlined at all. The compiler will only consider it as a hint and then do what it thinks is best. If the macro resolves to a compile-time constant, the corresponding function (in C++) would be a constexpr function, which may get evaluated at compile time.
-
- Posts: 323
- Joined: Tue Aug 31, 2021 10:32 pm
- Full name: tcusr
Re: Why C++ instead of C#?
mvanthoor wrote: ↑Mon Sep 20, 2021 11:10 pm There is still a difference. If you inline a function, the compiler pastes the code of the called function into the caller, but the code of the called function still gets executed at every run of the encompassing function. When you use a macro, the compiler not only inlines it, but also resolves it to a single value if at all possible.
Rust (or more correctly, LLVM) can do this with macro's and const functions too; and sometimes, even with inlined functions IF those functions can be resolved to a simpler value. For example, if you do something like this:
Code: Select all
function complicated_function(x) {
    y = (do super complex stuff with x here)
    z = (do more super cmplex stuff with x here)
    return y + z;
}
It could be that LLVM simplifies this, to something like:
Code: Select all
function complicated_function(x) {
    return x + 35;
}
I just don't feel like rewriting the C macro's into stuff that can be optimized by LLVM to the same extent in which the C-compiler can optimize macro's, because a) it's a lot of work, and b) I'm not that good at Rust macro's, because they're code that writes other code (to avoid having to write repetitive code yourself), and I just don't need that a lot. Therefore it'll take longer. If I just rewrite the macro's into simple functions, the Rust version will be slower than the C version.

C macros are just copying and pasting done by the preprocessor before compilation; the compiler only sees the final expression (i guess this is what you mean by 'inlining' the macro). inline functions are as fast as macros.
-
- Posts: 179
- Joined: Tue Jun 15, 2021 8:11 pm
- Full name: Emanuel Torres
Re: Why C++ instead of C#?
mvanthoor wrote: ↑Mon Sep 20, 2021 11:10 pm There is still a difference. If you inline a function, the compiler pastes the code of the called function into the caller, but the code of the called function still gets executed at every run of the encompassing function. When you use a macro, the compiler not only inlines it, but also resolves it to a single value if at all possible.

What specifically in this application would make a difference?
Exactly. I can't speak for Rust, but that's how it is in C / C++. I would expect Rust to do the same.
To prove this, I just replaced all the #defines with inline functions in the C file, and the performance is identical.
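For illustration, this is the kind of replacement meant here, shown on a made-up bitboard helper. The names below are hypothetical and are not the actual macros from the QBB-Perft source. Both forms compute the same value; the function form is type-checked and evaluates its argument exactly once, and an optimizing compiler will typically generate the same code for both, although that is not guaranteed.

```c
#include <stdint.h>

/* Macro form: pure text substitution by the preprocessor. */
#define SQUARE_BIT_MACRO(sq) (1ULL << (sq))

/* Function form: same computation, with a typed parameter. */
static inline uint64_t square_bit(unsigned sq)
{
    return 1ULL << sq; /* sq is expected to be in 0..63 */
}

/* Hypothetical callers that combine two squares, one per form. */
static inline uint64_t two_squares_macro(unsigned a, unsigned b)
{
    return SQUARE_BIT_MACRO(a) | SQUARE_BIT_MACRO(b);
}

static inline uint64_t two_squares_inline(unsigned a, unsigned b)
{
    return square_bit(a) | square_bit(b);
}
```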
-
- Posts: 323
- Joined: Tue Aug 31, 2021 10:32 pm
- Full name: tcusr
Re: Why C++ instead of C#?
R. Tomasi wrote: ↑Mon Sep 20, 2021 11:25 pm
mvanthoor wrote: ↑Mon Sep 20, 2021 11:10 pm There is still a difference. If you inline a function, the compiler pastes the code of the called function into the caller, but the code of the called function still gets executed at every run of the encompassing function. When you use a macro, the compiler not only inlines it, but also resolves it to a single value if at all possible.
Problem is, with modern compilers there is no guarantee an inline function will be inlined at all. The compiler will only consider it as a hint and then do what it thinks is best. If the macro resolves to a compile-time constant, the corresponding function (in C++) would be a constexpr function, which may get evaluated at compile time.

the inline keyword has different meanings in C and C++.
in C++ it basically means 'don't complain if you see multiple definitions of ...', meanwhile in C it is a hint, like you said.
usually only large functions are not inlined, but if you really want to force it you can use GCC's attributes. a rule of thumb is to declare a function inline if it's less than 10 LOC.
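A small sketch of the GCC attribute route, assuming GCC or Clang. The FORCE_INLINE macro and the count_bits helper are made-up names for this example. always_inline is a compiler extension rather than standard C, so the sketch falls back to plain static inline elsewhere.

```c
#include <stdint.h>

/* GCC/Clang extension: ask for inlining even where the compiler's own
 * heuristics would decline. Other compilers get a plain hint. */
#if defined(__GNUC__)
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
#define FORCE_INLINE static inline
#endif

/* Hypothetical helper: count set bits by clearing the lowest one each step. */
FORCE_INLINE int count_bits(uint64_t b)
{
    int n = 0;
    while (b) {
        b &= b - 1; /* clears the lowest set bit */
        n++;
    }
    return n;
}
```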