Java: white ? repeat : repeat;

DustyMonkey
Posts: 61
Joined: Wed Feb 19, 2014 10:11 pm

Re: Java: white ? repeat : repeat;

Post by DustyMonkey »

mar wrote: Wed Nov 28, 2018 5:39 am Go on and embarrass me by making Portfish run as fast in C# as SF3 in C++. Dream on.
You wrote this. That tells me a lot.
mar wrote: Wed Nov 28, 2018 5:39 am You seem to overrate IPO/whole program optimization. How many conditionals in performance-critical loops will be optimized away? None.
Then the programmer manually did it. Get it yet?

I know you do, but you are already in attack mode... past the point of no return.

Somehow you are emotionally wrapped up in forum posts. Pretty sure it's language bigotry now.
mar wrote: Wed Nov 28, 2018 5:39 am All constants are in headers in C++ so the compiler sees them.
..and NEVER passed to a function, eh? That's right, you didn't respond to what was written.
mar wrote: Wed Nov 28, 2018 5:39 am Academic nonsense. Elementary types can "behave like objects" at language level, but you'll find no object-related opcodes in CIL for integers of course.
Academic nonsense? That's what you just did. Words have meaning. In C#, integers are objects, and inherit from object. The only way this is controversial is if you confuse being an object with being a reference. Pretty sure you didn't know there was a difference at this point. While I was reserving judgment before... I am no longer.
mar wrote: Wed Nov 28, 2018 5:39 am Sesse already answered this. And don't use caps-lock in a discussion.
Sesse speaks for you? Gotcha.

While you may enjoy Sesse speaking for you, YOU WON'T GET TO ENJOY TELLING ME WHAT TO DO.

Not only a language bigot... but also a caps lock cop... yeah... you've got real strong arguments. If you reply, it will go ignored by me, because your goals here are not to be instructive or informative, but instead to masturbate your self-esteem. Not sorry that your goals didn't pan out.
DustyMonkey
Posts: 61
Joined: Wed Feb 19, 2014 10:11 pm

Re: Java: white ? repeat : repeat;

Post by DustyMonkey »

Pio wrote: Wed Nov 28, 2018 12:46 am Doing that together with favouring captures a lot (that could be merged in a similar fashion maybe) will lead to much better play where the engine will learn by itself. One nice property will be that generating an opening database will be very easy and you would actually more or less steal the opening databases from the opponents you play against.
Hmm, why should favoring captures help the learning?

Not saying the theory is wrong, just that I don't see the justification.
Pio
Posts: 334
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: Java: white ? repeat : repeat;

Post by Pio »

DustyMonkey wrote: Wed Nov 28, 2018 11:01 am
Pio wrote: Wed Nov 28, 2018 12:46 am Doing that together with favouring captures a lot (that could be merged in a similar fashion maybe) will lead to much better play where the engine will learn by itself. One nice property will be that generating an opening database will be very easy and you would actually more or less steal the opening databases from the opponents you play against.
Hmm, why should favoring captures help the learning?

Not saying the theory is wrong, just that I don't see the justification.
The justification is simple. It will reduce the search space and shorten the games, so more games can be played. Actually, the statistics can be updated without prior knowledge, and the shortening of games can be factored into the end result.
DustyMonkey
Posts: 61
Joined: Wed Feb 19, 2014 10:11 pm

Re: Java: white ? repeat : repeat;

Post by DustyMonkey »

Pio wrote: Wed Nov 28, 2018 12:46 am I did not think it was possible in C# to do it.
(forgot this part)

Intrinsics in .NET have been a long time coming for sure. It is still not official, and I wouldn't be all that surprised if it takes several years for finalization of any of it. That's just how Microsoft rolls.

Once upon a time they were previewing some experimental GPU processing support for .NET, called Microsoft Research Accelerator. Ultimately it was abandoned rather than finalized. Probably still works but also probably requires being limited to the 2.0 framework circa 2007 and the abilities of DX9 pixel shaders. I played around with it at some point when it was "news", and decided that my video card at the time made it pointless.
emadsen
Posts: 434
Joined: Thu Apr 26, 2012 1:51 am
Location: Oak Park, IL, USA
Full name: Erik Madsen

Re: Java: white ? repeat : repeat;

Post by emadsen »

Intrinsics in .NET have been a long time coming for sure. It is still not official, and I wouldn't be all that surprised if it takes several years for finalization of any of it.
Trying this has been on my to-do list for a few weeks now. I finally got around to it tonight. It was surprisingly easy to integrate the intrinsics package into my engine. Of course it helps that my engine is built on .NET Core. It was a simple matter of adding the package (like you said, not from NuGet but from MyGet at https://dotnet.myget.org/F/dotnet-core/ ... index.json) then replacing this code:

Code:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int CountSetBits(ulong Value)
{
    int count = 0;
    while (Value > 0)
    {
        count++;
        Value &= Value - 1ul;
    }
    Debug.Assert((count >= 0) && (count <= _longBits));
    return count;
}


// See https://stackoverflow.com/questions/37083402/fastest-way-to-get-last-significant-bit-position-in-a-ulong-c
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int FindFirstSetBit(ulong Value)
{
    // TODO: Change the return value when no bit is set to Square.Illegal.
    if (Value == 0) return -1;
    return _multiplyDeBruijnBitPosition[((ulong) ((long) Value & -(long) Value) * _deBruijnSequence) >> 58];
}
With the intrinsics:

Code:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int CountSetBits(ulong Value) => (int) Popcnt.PopCount(Value);


[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int FindFirstSetBit(ulong Value) => Value == 0 ? -1 : _longBits - (int) Lzcnt.LeadingZeroCount(Value) - 1;
Took 15 minutes. One important note: Add the 4.5.0-rtm package, which is compatible with .NET Core 2.1. Subsequent packages are for beta versions of .NET Core 2.2.
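
Since the thread started from a Java engine, here is a rough Java counterpart of the two C# helpers above. It is only a sketch: `BitOps` and its method names are made up for illustration, and it relies on the standard `Long.bitCount`, `Long.numberOfTrailingZeros` and `Long.numberOfLeadingZeros` methods, which HotSpot typically turns into single instructions on CPUs that support them. One thing it makes visible: the De Bruijn version isolates `Value & -Value` and so reports the lowest set bit, while the `Lzcnt` expression reports the highest set bit, so the two only agree when exactly one bit is set. Whether that matters depends on how the caller iterates over the bits.

```java
// Hypothetical helper class; the names are illustrative and not taken from any engine.
final class BitOps {
    private BitOps() {}

    // Loop version: clears the lowest set bit until none remain.
    // Java's long is signed, so the test is != 0 rather than > 0.
    static int countSetBitsLoop(long value) {
        int count = 0;
        while (value != 0) {
            count++;
            value &= value - 1;
        }
        return count;
    }

    // Library version of the same population count.
    static int countSetBits(long value) {
        return Long.bitCount(value);
    }

    // Index of the least significant set bit, or -1 for an empty bitboard.
    static int lowestSetBit(long value) {
        return value == 0 ? -1 : Long.numberOfTrailingZeros(value);
    }

    // Index of the most significant set bit, or -1 for an empty bitboard.
    static int highestSetBit(long value) {
        return value == 0 ? -1 : 63 - Long.numberOfLeadingZeros(value);
    }

    public static void main(String[] args) {
        long corners = 0x8100000000000081L; // bits 0, 7, 56 and 63
        System.out.println(countSetBitsLoop(corners) + " " + countSetBits(corners));
        System.out.println(lowestSetBit(corners) + " " + highestSetBit(corners));
    }
}
```

If a move generator pops bits with `value &= value - 1`, the lowest-set-bit variant is the one that pairs with it.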

I ran my engine with and without the CPU intrinsics and got these results when analyzing the WAC test positions:

Without CPU Intrinsics (using De Bruijn instead)

Code:

analyzepositions "C:\Users\Erik\Documents\Chess\Tests\WinAtChess.epd" 3000
Number                                                                     Position  Solution    Expected Moves   Move  Correct    Pct
======  ===========================================================================  ========  ================  =====  =======  =====
     1                      2rr3k/pp3pp1/1nnqbN1p/3pN3/2pP4/2P3Q1/PPB4P/R4RK1 w - -      Best              g3g6   g3g6     True  100.0
     2                                   8/7p/5k2/5p2/p1p2P2/Pr1pPK2/1P1R3P/8 b - -      Best              b3b2   b3b7    False   50.0
     3                      5rk1/1ppb3p/p1pb4/6q1/3P1p1r/2P1R2P/PP1BQ1P1/5RKN w - -      Best              e3g3   e3g3     True   66.7
   ...
   300                              b2b1r1k/3R1ppp/4qP2/4p1PQ/4P3/5B2/4N1K1/8 w - -      Best              g5g6   g5g6     True   90.7

Solved 272 of 300 positions in 906 seconds.
Counted 2,322,005,022 nodes (2,561,711 nodes per second).
With CPU Intrinsics

Code:

analyzepositions "C:\Users\Erik\Documents\Chess\Tests\WinAtChess.epd" 3000
Number                                                                     Position  Solution    Expected Moves   Move  Correct    Pct
======  ===========================================================================  ========  ================  =====  =======  =====
     1                      2rr3k/pp3pp1/1nnqbN1p/3pN3/2pP4/2P3Q1/PPB4P/R4RK1 w - -      Best              g3g6   g3g6     True  100.0
     2                                   8/7p/5k2/5p2/p1p2P2/Pr1pPK2/1P1R3P/8 b - -      Best              b3b2   b3b7    False   50.0
     3                      5rk1/1ppb3p/p1pb4/6q1/3P1p1r/2P1R2P/PP1BQ1P1/5RKN w - -      Best              e3g3   e3g3     True   66.7
   ...
   300                              b2b1r1k/3R1ppp/4qP2/4p1PQ/4P3/5B2/4N1K1/8 w - -      Best              g5g6   g5g6     True   90.7

Solved 272 of 300 positions in 906 seconds.
Counted 2,544,505,012 nodes (2,808,780 nodes per second).
That's roughly a 10% speedup of search performance (on my Ryzen Threadripper 1950X). The move generation speedup is closer to 30%, at least from the start position. But it's search speed that matters.
My C# chess engine: https://www.madchess.net
ker2x
Posts: 17
Joined: Sun Nov 11, 2018 1:28 pm
Full name: Laurent Laborde

Re: Java: white ? repeat : repeat;

Post by ker2x »

AxolotlFever wrote: Mon Nov 26, 2018 9:36 pm Hi all,

I am steadily making my Java engine Axolotl less horrible, and there is one particular bit of code that bloats my engine a heck of a lot.

Code:

if (white){
            knights = board.WHITE_KNIGHTS;
        }
        else {
            knights = board.BLACK_KNIGHTS;
        }

Code:

long myPieces = white ? board.WHITE_BISHOPS : board.BLACK_BISHOPS;
And similar. While this might not slow me down a lot (I am not sure about that, though), it is ugly and I would like to get rid of it.

In your (Java) engine, how do you get around this kind of checking?

Kind regards,
Louis
There isn't anything you can do in this code by itself. This kind of stuff is absurdly fast; no matter how you write it, it will end up the same in both JVM bytecode and x86_64.

I'm sorry, i didn't read much of the thread as it seems it escalated quickly and seemed totally offtopic.
as a side note : ternary operator is evil and doesn't change the compiled code in any way; it's translated to if/then internally and ends up as the same conditional branch in JVM bytecode (and typically the same branch or conditional move on x86_64). (i didn't play with JVM "assembly" in a long time, it was fun. https://github.com/ker2x/MiouLang/blob/ ... mioulang.j )
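
On the readability side of the original question, one option that comes up often is to make the side to move an index rather than a boolean, so the check disappears from the call site instead of being made faster. The sketch below is only an illustration under assumptions: `SideIndexedBoard`, its constants and its `pieces` array are hypothetical and not Axolotl's actual fields, and the example bitboards assume a1 is bit 0.

```java
// Hypothetical board layout; names and constants are illustrative only.
final class SideIndexedBoard {
    static final int WHITE = 0, BLACK = 1;
    static final int KNIGHT = 0, BISHOP = 1;

    // pieces[side][pieceType] holds one bitboard per side and piece type.
    final long[][] pieces = new long[2][2];

    // Branching version, equivalent to the if/else and the ternary in the question.
    long knightsBranching(boolean white) {
        return white ? pieces[WHITE][KNIGHT] : pieces[BLACK][KNIGHT];
    }

    // Indexed version: the side is data, so there is no white/black check to repeat.
    long piecesOf(int side, int pieceType) {
        return pieces[side][pieceType];
    }

    public static void main(String[] args) {
        SideIndexedBoard board = new SideIndexedBoard();
        board.pieces[WHITE][KNIGHT] = 0x42L;               // b1 and g1, assuming a1 = bit 0
        board.pieces[BLACK][KNIGHT] = 0x4200000000000000L; // b8 and g8
        for (int side = WHITE; side <= BLACK; side++) {
            System.out.println(Long.toHexString(board.piecesOf(side, KNIGHT)));
        }
    }
}
```

Whether this is any faster is a separate question that only a measurement in the actual engine can answer; the point is just that the repeated `white ? ... : ...` goes away.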
ker2x
Posts: 17
Joined: Sun Nov 11, 2018 1:28 pm
Full name: Laurent Laborde

Re: Java: white ? repeat : repeat;

Post by ker2x »

i managed to read the whole thread.

With the exception of some very specific cases, like intrinsics for moderately modern CPU instruction sets, the speed at which a single instruction executes doesn't really matter.

In this very specific case of some intrinsic like popcnt :
stockfish_64: 1764951
stockfish_64_popcnt: 1813557
stockfish_64_bmi2: 1886284

But it's an absurdly specific case, since we're talking about replacing a conditional loop by a single instruction. Which leads me to my point :

I suggest you guys read both the AMD and Intel programmer guides.
If it's too much, and it seriously is, i suggest reading this one : https://www.agner.org/optimize/microarchitecture.pdf
"The microarchitecture of Intel, AMD and VIA CPUs -- An optimization guide for assembly programmers and compiler makers"
It's a bible.

What matters in a modern CPU is : memory access, cache misses, branch mispredictions, pipeline optimization.

The good old days of CPUs doing exactly what you wrote in ASM are long gone. Since the Pentium era, CPUs have been using micro-operations and instruction-level parallelism. The x86_64 instruction set is pretty much just an API now.

Trying to trick the compiler and cpu to do something that seems optimized (eg : replacing some math with some bit shifting) may force the compiler to use your optimisation instead of a better optimisation you didn't think about.
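
A small Java illustration of that point, as a sketch only (the class and method names are invented for the example). For `int`, `x * 2` and `x << 1` always give the same result, so writing the shift by hand buys nothing that the compiler could not do itself. `x / 2` and `x >> 1`, on the other hand, differ for negative odd values, because division rounds toward zero and the arithmetic shift rounds down, so the "optimized" form can quietly change the result.

```java
// Illustrative only: compares the plain arithmetic with the hand-written shifts.
final class ShiftVsArithmetic {
    private ShiftVsArithmetic() {}

    static int timesTwo(int x)      { return x * 2; }
    static int shiftLeftOne(int x)  { return x << 1; }
    static int halve(int x)         { return x / 2; }   // rounds toward zero
    static int shiftRightOne(int x) { return x >> 1; }  // rounds toward negative infinity

    public static void main(String[] args) {
        int[] samples = {0, 1, 7, -3, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            System.out.println(x
                    + ": *2 == <<1 ? " + (timesTwo(x) == shiftLeftOne(x))
                    + ", /2 == >>1 ? " + (halve(x) == shiftRightOne(x)));
        }
    }
}
```

Writing the arithmetic you mean and letting the compiler or JIT pick the instructions avoids both the wasted effort and the sign bug.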

I suggest you watch this : https://www.youtube.com/watch?v=bSkpMdDe4g4
CppCon 2017: Matt Godbolt “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”
ker2x
Posts: 17
Joined: Sun Nov 11, 2018 1:28 pm
Full name: Laurent Laborde

Re: Java: white ? repeat : repeat;

Post by ker2x »

feel free to skip to 30:00 if you want to see what i'm talking about when i wrote about bit shifting.
When you multiply by 2 you can do even better than << 2
ker2x
Posts: 17
Joined: Sun Nov 11, 2018 1:28 pm
Full name: Laurent Laborde

Re: Java: white ? repeat : repeat;

Post by ker2x »

emadsen wrote: Fri Nov 30, 2018 6:38 am

Code:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int CountSetBits(ulong Value)
{
    int count = 0;
    while (Value > 0)
    {
        count++;
        Value &= Value - 1ul;
    }
    Debug.Assert((count >= 0) && (count <= _longBits));
    return count;
}


// See https://stackoverflow.com/questions/37083402/fastest-way-to-get-last-significant-bit-position-in-a-ulong-c
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int FindFirstSetBit(ulong Value)
{
    // TODO: Change the return value when no bit is set to Square.Illegal.
    if (Value == 0) return -1;
    return _multiplyDeBruijnBitPosition[((ulong) ((long) Value & -(long) Value) * _deBruijnSequence) >> 58];
}
With the intrinsics:

Code:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int CountSetBits(ulong Value) => (int) Popcnt.PopCount(Value);


[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int FindFirstSetBit(ulong Value) => Value == 0 ? -1 : _longBits - (int) Lzcnt.LeadingZeroCount(Value) - 1;
I have a surprise for you :
Clang notices you're doing a popcnt and replaces the loop with the asm instruction :D

Joost Buijs
Posts: 1563
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: Java: white ? repeat : repeat;

Post by Joost Buijs »

ker2x wrote: Sat Dec 01, 2018 7:50 am feel free to skip to 30:00 if you want to see what i'm talking about when i wrote about bit shifting.
When you multiply by 2 you can do even better than << 2
Haha, I always thought it was << 1, always good to learn something new.