I'm not very happy with the do {} while() statement in C

Ras · Post by **Ras** » Fri Mar 09, 2018 9:11 pm

Michael Sherwin wrote:my jump table style of chess programming

GCC and Clang support computed goto in C.

However, going for that style before having done a clean implementation AND profiled it AND determined that this is a serious bottleneck only leads to serious maintainability problems in the future.

Ras · Post by **Ras** » Fri Mar 09, 2018 9:14 pm

abulmo2 wrote:What about asmfish ?

That's not a project of its own since it relies on Stockfish. That means that in order to even have asmfish, you need roughly twice the manpower. With constant manpower (which is Michael's situation), you would have much less time for the algorithms.

bob · Post by **bob** » Sat Mar 10, 2018 5:20 am

Rebel wrote:
Michael Sherwin wrote:I don't mean to be a pain but to me it sounds like you are describing the C++ way. I know that C++ objects can be simulated in C using structures and being strict in using function calls accordingly. But then I might as well use C++. In my way of thinking C++ using objects is superior for a team project so team members do not trample all over another team members variable names and function names. For a single person working on a one file source C++ methodology does not seem (as) beneficial. I can write 32 bit assembler with the best of em. However, the nuances of 64 bit assembly is giving me a hard time or I'd be writing this primarily in assembler. My goal is writing code that is as fast as it can be. I have sort of a reputation for that or at least I did when RomiChess first came out and also my perft examples I wrote. My perft example in 32 bit assembler runs at 65 million nodes per second using a single thread on my 3.4GHz i7. Not bragging, just saying I like to stick with my programming style. And learning a new paradigm at my age is not easy. What you are suggesting that I should do by passing a pointer around I understand would not make a very noticeable difference in speed but learning a whole new paradigm at my age would certainly slow my progress. I thank you for the philosophical discussion and anything more that you would like to add. I wonder what others might think about what you are suggesting.

Bob?
I am not Bob but what I did in the past (32 bit of course) is to write stuff in C then look at the ASM code the compiler generated and then manually optimize it. Perhaps it makes sense for 64 bit as well.

I have done that myself. But it is not the optimal solution. Because you start with the framework dictated by the compiler. You might find more efficient instructions or combinations, but you are still working with how the compiler did things. For Cray Blitz, Harry and I started from scratch. Completely from scratch. Now we are able to make register assignments as we choose, and since we primarily write "leaf routines" we don't have to worry about registers getting overwritten when we don't call anything that could do so.

That being said, it is not a particularly efficient way to write software in terms of human development time, but the rewards can be significant. In more than one place, we wrote something from scratch that was 5x to 10x faster than the compiler (we were using Cray's vectorizing Fortran compiler with CB). But we knew things the compiler could not know, e.g. this value can never be negative, it can never be less than 0 or larger than 15, etc... But when you want to make changes... ugh...
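With today's C compilers, some of that "the programmer knows the range" information can be passed along explicitly. A minimal sketch, assuming GCC or Clang (which provide `__builtin_unreachable()`); the `ASSUME` macro and the `popcount4` helper are hypothetical names made up for this illustration, and whether the optimizer actually profits from the hint is not guaranteed:

```c
#include <assert.h>

/* ASSUME is a hypothetical helper macro, not a standard facility.
 * On GCC/Clang it tells the optimizer the condition always holds;
 * elsewhere it falls back to a plain run-time check. */
#if defined(__GNUC__)
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#define ASSUME(cond) assert(cond)
#endif

/* Hypothetical example: count the set bits of a value known to be 0..15. */
unsigned popcount4(int nibble)
{
    static const unsigned char bits[16] = {
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
    };

    /* The caller promises the range; passing anything else is undefined. */
    ASSUME(nibble >= 0 && nibble <= 15);
    return bits[nibble];
}
```

Calling `popcount4` with a value outside 0..15 is undefined behaviour on the GCC/Clang path, which is exactly the kind of promise hand-written assembly makes implicitly.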

I would add that 64 bit code (Intel world) is beyond a messy environment. It was a HUGE kludge (64 bit instructions added) on top of an already HUGE kludge (the x86 instruction set is not exactly the best laid out instruction set I have seen, just like the processor architecture really sucks when compared to much better ones that didn't try to maintain backward compatibility with 8 bit architectures). I've taught assembly language for the IBM 1620, IBM /360, Xerox Sigma series, Data General MV8000/10000, DEC VAX, Sun including the Motorola 680x0 chips and the SPARC processors. Intel is the worst of the group, particularly when factoring in the 64 bit kludges AMD added to make the 64 bit extensions compatible with the 8/16/32 bit instruction formats.

I've programmed a lot of other machines in assembly language, including my all-time most hated machine, the Intel Itanium...

mar · Post by **mar** » Sat Mar 10, 2018 10:21 am

bob wrote:I would add that 64 bit code (Intel world) is beyond a messy environment. It was a HUGE kludge (64 bit instructions added) on top of an already HUGE kludge

Why? I actually think that long mode in x64 (AMD64) is a natural extension to x86. It basically boils down to the REX prefix (sacrificing the 1-byte inc/dec instructions) and defaulting to 64-bit regs for addressing modes. Zeroing the upper 32 bits where the destination is a 32-bit register is also a good thing, so I actually like that.

There's a bit more to it, but those 3 are the most important ones.

So if you know how to write 32-bit assembly then switching to x64 is actually a piece of cake. And you also have twice as many registers.

What I dislike about the x86 instruction set is that I have to use cl to do variable shifts, but I can live with that.

Backwards compatibility depends on POV, you can still run 32-bit apps and for those who write x86 machine code generators, it's also welcome.

Oh and I'm really grateful that x86 arch is little endian.

mar · Post by **mar** » Sat Mar 10, 2018 10:29 am

Forgot to mention that with a REX prefix present the 8-bit registers change: the useless AH, BH etc. are replaced with the least significant byte of various regs, so we can now have sil, dil, bpl or even spl, which is also a good thing.

Ras · Post by **Ras** » Sat Mar 10, 2018 10:58 am

And there is another thing with jump tables. For my project, I am using that for the sorting routine, which is small and self-contained so that I don't lose much readability. But still, I first had the "clean" implementation to get the algorithmic part right, and that is still in there via #ifdef for compilers that don't support computed goto.
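A minimal sketch of that pattern, assuming a hypothetical four-opcode accumulator machine rather than the sorting routine mentioned above. The opcode names, the `run` function and the `NO_COMPUTED_GOTO` switch are invented for the illustration; the `&&label` and `goto *ptr` syntax is the GNU "labels as values" extension that GCC and Clang provide:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes, for illustration only. */
enum { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

/* Computed goto is a GNU extension; define NO_COMPUTED_GOTO to force
 * the portable path, e.g. for a compiler that chokes on it. */
#if defined(__GNUC__) && !defined(NO_COMPUTED_GOTO)
#define USE_COMPUTED_GOTO 1
#else
#define USE_COMPUTED_GOTO 0
#endif

/* Runs the opcode stream until OP_HALT and returns the accumulator. */
int run(const unsigned char *code)
{
    int acc = 0;
    size_t pc = 0;

    assert(code != NULL);

#if USE_COMPUTED_GOTO
    /* Jump table: one label address per opcode, in enum order.
     * This path assumes every byte is a valid opcode. */
    static void *const table[] = { &&do_inc, &&do_dec, &&do_double, &&do_halt };
#define DISPATCH() goto *table[code[pc++]]
    DISPATCH();
do_inc:    acc += 1; DISPATCH();
do_dec:    acc -= 1; DISPATCH();
do_double: acc *= 2; DISPATCH();
do_halt:   return acc;
#undef DISPATCH
#else
    /* "Clean" reference path: same semantics with a portable switch. */
    for (;;) {
        switch (code[pc++]) {
        case OP_INC:    acc += 1; break;
        case OP_DEC:    acc -= 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        return acc; /* unknown opcode: stop */
        }
    }
#endif
}
```

Keeping both paths behind one function makes it easy to test the jump-table version against the clean one, and to switch back when a particular toolchain misbehaves.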

Guess what, I need that "clean" code path with Clang from the Android NDK because the compiler itself crashes when compiling the 64 bit version with the jump tables, both for x86-Android and for ARM-Android although computed goto is actually supported in Clang and does compile for 32 bit.

I filed a bug report for Clang two months ago, but nobody cares. Not that I had expected it otherwise; it's just for my own project documentation why the workaround is necessary.

Rebel · Post by **Rebel** » Sat Mar 10, 2018 10:59 am

bob wrote:
Rebel wrote:
Michael Sherwin wrote:I don't mean to be a pain but to me it sounds like you are describing the C++ way. I know that C++ objects can be simulated in C using structures and being strict in using function calls accordingly. But then I might as well use C++. In my way of thinking C++ using objects is superior for a team project so team members do not trample all over another team members variable names and function names. For a single person working on a one file source C++ methodology does not seem (as) beneficial. I can write 32 bit assembler with the best of em. However, the nuances of 64 bit assembly is giving me a hard time or I'd be writing this primarily in assembler. My goal is writing code that is as fast as it can be. I have sort of a reputation for that or at least I did when RomiChess first came out and also my perft examples I wrote. My perft example in 32 bit assembler runs at 65 million nodes per second using a single thread on my 3.4GHz i7. Not bragging, just saying I like to stick with my programming style. And learning a new paradigm at my age is not easy. What you are suggesting that I should do by passing a pointer around I understand would not make a very noticeable difference in speed but learning a whole new paradigm at my age would certainly slow my progress. I thank you for the philosophical discussion and anything more that you would like to add. I wonder what others might think about what you are suggesting.

Bob?
I am not Bob but what I did in the past (32 bit of course) is to write stuff in C then look at the ASM code the compiler generated and then manually optimize it. Perhaps it makes sense for 64 bit as well.
I have done that myself. But it is not the optimal solution. Because you start with the framework dictated by the compiler.

Well of course you need a good compiler. I had one. And nowadays they are (no doubt) even better. So I don't see a problem there.

bob wrote: You might find more efficient instructions or combinations, but you are still working with how the compiler did things.

Sure, I could still typically improve new stuff by 20-30%. The main work was rearranging the compiler's register use (avoiding memory use); that did the trick, along with some minor stuff.

bob wrote:I would add that 64 bit code (Intel world) is beyond a messy environment. It was a HUGE kludge (64 bit instructions added) on top of an already HUGE kludge (the x86 instruction is not exactly the best laid out instruction set I have seen, just like the processor architecture really suck when compared to much better ones that didn't try to maintain backward compatibility with 8 bit architectures...

No denial here, but as Martin already said, knowing one is knowing all and you just need to learn its specifics and then get the most out of it.

I don't know how comparable AsmFish and Stockfish are but the NPS of the first is significantly higher.

bob · Post by **bob** » Sat Mar 10, 2018 7:08 pm

mar wrote:
bob wrote:I would add that 64 bit code (Intel world) is beyond a messy environment. It was a HUGE kludge (64 bit instructions added) on top of an already HUGE kludge
Why? I actually think that long mode in x64 (AMD 64) is a natural extension to x86, basically boils down to REX prefix (sacrificing 1-byte inc/dec instructions)
and defaulting to 64-bit regs for addressing modes, zeroing upper 32 bits
where destination is 32-bit register is also a good thing so I actually like that.

There's a bit more to it, but those 3 are the most important ones.

So if you know how to write 32-bit assembly then switching to x64 is actually piece of cake. And you also have twice as many registers.

What I dislike about the x86 instruction set is that I have to use cl to do variable shifts, but I can live with that.

Backwards compatibility depends on POV, you can still run 32-bit apps and for those who write x86 machine code generators, it's also welcome.

Oh and I'm really grateful that x86 arch is little endian.

I can think of lots of things that are wrong. Little endian is a retarded design that came due to backward compatibility with the original 8 bit architecture. Many machines have done it right (big endian). The restrictions on which registers can be used for what. General Purpose registers have been around forever. It is simply what you get when you keep hacking on an instruction set to add new features. At some point in time, you have to step up and re-design. Cray did this when the YMP came along. It was already a pretty good architecture, but they moved to the 32 bit addressing model rationally rather than a cobbled/hacked up instruction format.

Yes, you gain 8 new registers. With caveats on what you can and can't do. But the instructions (machine language) get REALLY big, they can be almost anything from 1 byte to 16. Yes the micro-ops inside the machine are pretty clean. But what WE get to use is pretty much crapola...

mar · Post by **mar** » Sat Mar 10, 2018 7:37 pm

bob wrote:Little endian is a retarded design that came due to backward compatibility with the original 8 bit architecture. Many machines have done it right (big endian).

Little endian has one important property: you can fetch a smaller integer from the same address. This is the reason why BE sucks and why LE is actually useful.
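That property can be shown without depending on the host's byte order by laying the bytes out explicitly. A minimal sketch; the helper names `store_le64`, `store_be64`, `load_le` and `load_be` are made up for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write v into buf[0..7], least significant byte first (little endian). */
void store_le64(unsigned char buf[8], uint64_t v)
{
    for (size_t i = 0; i < 8; i++)
        buf[i] = (unsigned char)(v >> (8 * i));
}

/* Write v into buf[0..7], most significant byte first (big endian). */
void store_be64(unsigned char buf[8], uint64_t v)
{
    for (size_t i = 0; i < 8; i++)
        buf[i] = (unsigned char)(v >> (8 * (7 - i)));
}

/* Read the first n bytes at buf as a little-endian unsigned integer. */
uint64_t load_le(const unsigned char *buf, size_t n)
{
    uint64_t v = 0;
    assert(n <= 8);
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)buf[i] << (8 * i);
    return v;
}

/* Read the first n bytes at buf as a big-endian unsigned integer. */
uint64_t load_be(const unsigned char *buf, size_t n)
{
    uint64_t v = 0;
    assert(n <= 8);
    for (size_t i = 0; i < n; i++)
        v = (v << 8) | buf[i];
    return v;
}
```

With a small value such as 42 stored as 64 bits, `load_le` returns 42 for a width of 1, 2, 4 or 8 bytes at the same address. With the big-endian layout only the full 8-byte read returns 42; a narrower read at the same address sees the high-order zero bytes, so the address has to be adjusted by the width difference.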

Ras · Post by **Ras** » Sat Mar 10, 2018 8:16 pm

bob wrote:It is simply what you get when you keep hacking on an instruction set to add new features.

The customers have decided that they don't want to throw away all the software they already paid for. That's why backwards compatibility won, because it addressed the needs of the paying customers.

Funnily enough, you also dislike the Itanic, where Intel actually wanted to make a clean cut (plus vendor lock-in, of course).

As for the ugly assembly: yes, that is there, but those who care are a few programmers, not the majority of the customers. And even then, it's mostly compiler developers because no considerable project is done in assembly these days anyway.

What's quite nice in today's CPUs is the Thumb-2 assembly in Cortex-M. I'm using that also for stuff before main() can come into play.
