Micro optimization in chess programming
Moderator: Ras
-
Fguy64
- Posts: 814
- Joined: Sat May 09, 2009 4:51 pm
- Location: Toronto
Greetings. My language of choice is Java, which I suppose is not the best if micro-optimization of performance is a consideration. But that's a different story.
Anyway, I suppose that one aspect of chess programming is doing as many relational operations on integers as possible, in as short a period of time as possible. So I'd like to understand exactly how these relational operations work at the bit level in order to make informed decisions.
As an example, there are various ways to represent things like black/white, empty squares, border squares, etc., and various ways to test for these conditions using relational operators, and I suppose some ways are better than others. (A small illustrative sketch follows below.)
Anyway, if anyone has any links on how these relational operations are performed at the bit level, I'd be grateful. I understand that it is a CPU issue rather than a language-dependent issue. As always, other remarks are welcome.
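To make the question concrete, here is a minimal sketch of the kind of thing I mean. The encodings and names are purely hypothetical, just to show a bitwise test next to a relational test for the same sort of question.
Code: Select all
// Purely illustrative piece encodings; the names and values are made up.
class PieceEncodings {

    // Encoding A: 0 = empty, colour in bit 0, piece type in the higher bits.
    static final int EMPTY = 0;
    static final int WHITE_BIT = 1;

    // Bitwise test: the square is occupied and the colour bit is set.
    static boolean isWhiteA(int piece) {
        return piece != EMPTY && (piece & WHITE_BIT) != 0;
    }

    // Encoding B: 0 = empty, white pieces 1..6, black pieces 7..12,
    // and (in a mailbox board) some sentinel such as -1 for border squares.
    static final int BORDER = -1;

    // Relational test: a pair of ordinary integer comparisons.
    static boolean isWhiteB(int piece) {
        return piece >= 1 && piece <= 6;
    }
}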
-
Kempelen
- Posts: 620
- Joined: Fri Feb 08, 2008 10:44 am
- Location: Madrid - Spain
Re: Micro optimization in chess programming
Fguy64 wrote: Greetings. My language of choice is Java, which I suppose is not the best if micro-optimization of performance is a consideration. But that's a different story.
The main problem with Java is not optimization; it is that Java is a (partially) interpreted language, so it takes more time and more low-level operations to do the same things a compiled language does.
-
Fguy64
- Posts: 814
- Joined: Sat May 09, 2009 4:51 pm
- Location: Toronto
Re: Micro optimization in chess programming
Anyway, I guess I am overdoing the optimization thing; it's probably not worth considering, but my question was motivated by the following.
Suppose an integer variable p could have many possible non-negative values. Would it be better, in any way, shape or form, to test for the zero condition using p < 1 or using p == 0? Intuition says p < 1 might save an operation or two.
So I devised the following code, and after several runs I found that the slower code block was whichever one was executed first. LOL
Code: Select all
// Assumes java.util.Date is imported; t1 and t2 need to be declared.
long t1, t2;
int p = 0;
boolean isTrue;

// Time four billion p == 0 comparisons.
t1 = (new Date()).getTime();
for( int i=-2000000000; i<2000000000; i++ ) {
    isTrue = ( p == 0 );
}
t2 = (new Date()).getTime();
System.out.println("" +(t2-t1) +"ms");

// Time four billion p < 1 comparisons.
t1 = (new Date()).getTime();
for( int i=-2000000000; i<2000000000; i++ ) {
    isTrue = ( p < 1 );
}
t2 = (new Date()).getTime();
System.out.println("" +(t2-t1) +"ms");
Fermin, duly noted. I suppose that if one type of low-level operation is faster than another when Java is used, the same advantage would exist for C++, my reasoning being that these types of low-level operations are CPU-dependent, not language-dependent.
P.S. OK, maybe you are saying that making such an evaluation would make more sense in a compiled language, since in a partially interpreted language any difference is dwarfed by other considerations?
-
Nettogrof
Re: Micro optimization in chess programming
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Donald Knuth"
in your code example, I get the same result as you, but if int p =1 then it's not the same result.
I strongly suggest to test your optimization, only on your whole program, like EngineA (with p==0 ) vs EngineB (with p < 1 )
Because with JIT, it's hard to know how java will optimize your code.
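For what it's worth, here is a minimal sketch of a slightly more robust version of those timing loops (only a sketch, with made-up names, and a JIT can still surprise you): let the tested value vary, consume the results so they stay observable, and repeat the timed sections so the first pass acts as JIT warm-up.
Code: Select all
import java.util.Date;
import java.util.Random;

// Hypothetical rework of the timing loops above; all names are made up.
public class CompareSketch {
    public static void main(String[] args) {
        // Non-negative values the JIT cannot predict at compile time.
        Random rnd = new Random(42);
        int[] values = new int[10000000];
        for (int i = 0; i < values.length; i++) values[i] = rnd.nextInt(16);

        int sink = 0;
        // Run each timed loop several times; treat the first pass as JIT warm-up.
        for (int pass = 0; pass < 3; pass++) {
            long t1 = (new Date()).getTime();
            int hitsEq = 0;
            for (int v : values) if (v == 0) hitsEq++;   // the p == 0 test
            long t2 = (new Date()).getTime();

            long t3 = (new Date()).getTime();
            int hitsLt = 0;
            for (int v : values) if (v < 1) hitsLt++;    // the p < 1 test
            long t4 = (new Date()).getTime();

            System.out.println("pass " + pass + ": ==0 " + (t2 - t1)
                    + "ms, <1 " + (t4 - t3) + "ms");
            sink += hitsEq + hitsLt;
        }
        System.out.println("(ignore) " + sink);          // keep the results live
    }
}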
-
Fguy64
- Posts: 814
- Joined: Sat May 09, 2009 4:51 pm
- Location: Toronto
Re: Micro optimization in chess programming
Nettogrof wrote: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." (Donald Knuth)
In your code example I get the same result as you, but if int p = 1 then it's not the same result.
I strongly suggest testing your optimization only on your whole program, like EngineA (with p == 0) vs EngineB (with p < 1), because with the JIT it's hard to know how Java will optimize your code.
Cool.
I am thinking that the < should always be at least as fast or faster, and that the difference should become more pronounced for larger integers, because I suspect that < compares bits from left to right and the larger number is the one with the first 1 bit.
But anyway, I read from your comment that it can lead you astray to consider these sorts of things in isolation from the rest of the program. Makes sense; I'm sure we've all had tests where the results were not at all what we expected.
regards.
-
hgm
- Posts: 28418
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Micro optimization in chess programming
This is not how CPUs work. A comparison between integers of up to 32 bits (or 64 bits if you have an x64 CPU) is always done in a single clock cycle. It does not matter whether you compare for equality, <, or >. If the thing you compare is the result of a previous operation, comparing it with zero might be a little bit faster, because the hardware that performs the operation usually indicates whether the answer is positive, zero or negative as a side effect.
Last edited by hgm on Tue Jan 26, 2010 8:24 pm, edited 1 time in total.
-
Gerd Isenberg
- Posts: 2251
- Joined: Wed Mar 08, 2006 8:47 pm
- Location: Hattingen, Germany
Re: Micro optimization in chess programming
Fguy64 wrote: Suppose an integer variable p could have many possible non-negative values. Would it be better, in any way, shape or form, to test for the zero condition using p < 1 or using p == 0? [...] After several runs I found that the slower code block was whichever one was executed first. LOL
Any reasonable optimizing compiler may recognize that your loop bodies are invariant, and since isTrue is no longer used, will likely throw the whole loops out, reporting zero time used. Even a JVM will likely recognize the invariant term and stop calculating it over and over again. So you are likely not "measuring" the execution speed of the loop bodies at all.
In x86 assembly both comparisons are about equal, assuming p is loaded in eax and isTrue is assigned to the byte register dl:
Code: Select all
cmp eax, 0 ; alternatively test eax, eax
sete dl ; set if equal (same as setz, set if zero)
Code: Select all
cmp eax, 1
setl dl ; set if signed less
-
Dann Corbit
- Posts: 12803
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Micro optimization in chess programming
Did you read Laramee's articles?
http://www.gamedev.net/reference/progra ... es/chess1/
http://www.gamedev.net/reference/progra ... es/chess2/
http://www.gamedev.net/reference/progra ... es/chess3/
http://www.gamedev.net/reference/progra ... es/chess4/
http://www.gamedev.net/reference/progra ... es/chess5/
http://www.gamedev.net/reference/progra ... es/chess6/
-
mathmoi
- Posts: 290
- Joined: Mon Mar 13, 2006 5:23 pm
- Location: Québec
- Full name: Mathieu Pagé
Re: Micro optimization in chess programming
Hello from Québec, Carl, and welcome to CCC!
Mathieu Pagé
mathieu@mathieupage.com
-
jwes
- Posts: 778
- Joined: Sat Jul 01, 2006 7:11 am
Re: Micro optimization in chess programming
Gerd Isenberg wrote: Any reasonable optimizing compiler may recognize that your loop bodies are invariant, and since isTrue is no longer used, will likely throw the whole loops out, reporting zero time used. [...] So you are likely not "measuring" the execution speed of the loop bodies at all.
This is very true. One good reason for knowing at least some assembler is to check whether your benchmark code is measuring what you think it is measuring. I have recently had several surprising benchmarks which were only explained when I looked at the assembly code and realized that the interesting bits had been optimized out.
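Following up on that: if I recall correctly, HotSpot has diagnostic switches for exactly this purpose: -XX:+PrintCompilation shows which methods the JIT compiles, and -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly dumps the generated machine code (the latter needs the hsdis disassembler plugin installed). A trivial harness like the hypothetical sketch below, run with those flags, shows whether the comparison loop was compiled at all and what it compiled to; the class name and loop are made up purely for illustration.
Code: Select all
// Hypothetical harness, just to give the JIT something to compile.
// Possible invocations (from memory; PrintAssembly also needs the hsdis plugin):
//   java -XX:+PrintCompilation JitPeek
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly JitPeek
public class JitPeek {

    // Kept out of main() so it shows up as a separate, easily spotted method.
    static int countZeros(int[] values) {
        int hits = 0;
        for (int v : values) {
            if (v == 0) hits++;   // the comparison we want to inspect
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] values = new int[1000000];   // zero-filled by default; contents barely matter
        int sink = 0;
        for (int i = 0; i < 1000; i++) {
            sink += countZeros(values);    // enough work to trigger JIT compilation
        }
        System.out.println(sink);          // keep the result observable
    }
}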