prefetch questions
Moderators: hgm, Dann Corbit, Harvey Williamson
-
Rein Halbersma
- Posts: 741
- Joined: Tue May 22, 2007 11:13 am
Re: prefetch questions
petero2 wrote:
> lucasart wrote:
> > Another question related to 64-byte alignment. Actually this is more a C++ question.
> > Suppose I define a struct TTable::Entry that is 64 bytes long. If I do something like
> > Code:
> > TTable::Entry *p = new TTable::Entry[count];
> > Can I assume (void *)p to be divisible by 64? Is this specified by the C++ standard, or compiler specific? If the latter, is there a portable way to make sure? 64-byte alignment is crucial here, not doing it defeats the purpose of prefetching in the first place.
> If you use STL containers you could write a custom allocator that ensures 64-byte alignment. In my current development version of texel I use this: alignedAlloc.hpp. You could then write:
> Code:
> template <typename T> class vector_aligned : public std::vector<T, AlignedAllocator<T> > { };
> vector_aligned<int> someVector;
> Or if you use a new enough C++11 compiler, you could write:
> Code:
> template <typename T> using vector_aligned = std::vector<T, AlignedAllocator<T>>;
> vector_aligned<int> someVector;
> I am not sure my AlignedAllocator class is currently portable because I have only tested it in 64 bit linux so far. However, custom allocators are part of the C++ standard, so it should be quite easy to make it portable if it is not already.
Yes, this is the preferred way to go. Note that with C++11 allocators you can even omit the construct()/destroy() members as well as all the nested typedefs. This works because all STL containers are required to extract default implementations for these omitted functions from std::allocator_traits.
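On the original question of whether plain new[] returns a 64-byte aligned pointer: as far as I know, C++11 only promises alignment suitable for fundamental types, and whether an over-aligned type is honoured by new is left to the implementation. From C++17 on, a type declared alignas(64) is allocated through the aligned form of operator new. The sketch below is only a way to check what a given compiler does. Entry is a hypothetical stand-in for TTable::Entry, and is_aligned and new_array_is_aligned are helper names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 64-byte entry standing in for TTable::Entry.
struct alignas(64) Entry {
    unsigned char bytes[64];
};

static_assert(sizeof(Entry) == 64, "Entry should fill exactly one cache line");
static_assert(alignof(Entry) == 64, "alignas(64) raises the type's alignment");

// True when p is a multiple of `alignment`, which must be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Allocates `count` entries with plain new[] and reports whether the result
// happens to be 64-byte aligned. This is guaranteed only from C++17 on for an
// alignas(64) type; with older standards it depends on the implementation.
inline bool new_array_is_aligned(std::size_t count) {
    Entry* p = new Entry[count];
    const bool ok = is_aligned(p, 64);
    delete[] p;
    return ok;
}
```

Calling new_array_is_aligned(count) at startup and asserting on the result is a cheap way to find out whether the prefetch assumption holds on a particular toolchain.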
-
lucasart
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: prefetch questions
Rein Halbersma wrote:
> Yes, this is the preferred way to go. Note that with C++11 allocators you can even omit the construct()/destroy() members as well as all the nested typedefs. This works because all STL containers are required to extract default implementations for these omitted functions from std::allocator_traits.
This is typical:
C++ programmers share the religious belief that the more complex and "smart" their code is, the better it is. I believe it's the exact opposite, and code should be simple and stupid. It's the only way to write maintainable software, in the long run, IMO.
C++ is the art of making trivial things artificially complicated. If you look into alignedAlloc.hpp, you'll see that behind all the C++ syntactic sugar (incomprehensible code for anyone who isn't an STL guru btw), it does nothing more than what Evert Glebeek suggested: malloc'ing 64 bytes more and shifting the pointer to the right aligned address. So as always, the C++ solution is nothing more than the C solution, with many layers of artificial obfuscation...
The C solution is dead simple: 3 lines of code, a child could understand them!
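For reference, the over-allocate-and-shift idea can be sketched roughly as follows. This is written in C-style C++ rather than plain C, it is not the code from alignedAlloc.hpp, and it is a little longer than three lines because it also stashes the raw pointer so the block can be freed later. The name my_aligned_malloc is taken from the usage line in this post; my_aligned_free is a hypothetical counterpart. Overflow of the size computation is not handled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Returns `size` bytes whose address is a multiple of `alignment`
// (a power of two), or nullptr if malloc fails.
inline void* my_aligned_malloc(std::size_t size, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    // Room for the payload, the worst-case shift, and the stashed raw pointer.
    void* raw = std::malloc(size + alignment + sizeof(void*));
    if (raw == nullptr) return nullptr;
    // Leave space for one pointer, then round up to the next aligned address.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    std::uintptr_t aligned = (start + mask) & ~mask;
    // Remember, just below the aligned block, what must be passed to free().
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Hypothetical counterpart: releases a block from my_aligned_malloc.
inline void my_aligned_free(void* p) {
    if (p != nullptr) std::free(reinterpret_cast<void**>(p)[-1]);
}
```

The one thing the caller has to remember is that the returned pointer must go to my_aligned_free, never directly to free().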
That's for writing the library. Now for using the library, let's compare:
=> C
Code:
buf = my_aligned_malloc(N*64, 64);
=> C++
Code:
buf = new std::vector<TTable::Entry, AlignedAllocator<TTable::Entry> >(N);
Seriously, does anyone find the C++ solution easier? Is there anything that I'm not seeing...? Am I the only one immune to the C++ kool-aid?
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
AlvaroBegue
- Posts: 931
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: prefetch questions
Although I do have some sympathy for your point of view, it's actually not as bad as you portray it.
C++ has lots of problems, but `std::vector<>' is not one of them. If you are writing `new std::vector...' you are probably doing something wrong. The whole point of std::vector is to hold an array in situations where you don't know how long it will have to be. So the code would be more like
Code:
typedef std::vector<TTable::Entry, AlignedAllocator<TTable::Entry> > TranspositionBuffer;
TranspositionBuffer buf(N);
In general, C++ adds lots of bureaucracy to even the simplest tasks, but it allows the main code to look very clean.
-
Rein Halbersma
- Posts: 741
- Joined: Tue May 22, 2007 11:13 am
Re: prefetch questions
lucasart wrote:
> C++ is the art of making trivial things artificially complicated. If you look into alignedAlloc.hpp, you'll see that behind all the C++ syntactic sugar (incomprehensible code for anyone who isn't an STL guru btw), it does nothing more than what Evert Glebeek suggested: malloc'ing 64 bytes more and shifting the pointer to the right aligned address. So as always, the C++ solution is nothing more than the C solution, with many layers of artificial obfuscation...
> And by the way, let's not even talk about the benefit of OOP, and destructor that means there's no free() to call in C++. In this case the size is not known at compile time, so it has to be "new vector", not just "vector" that can free itself when it goes out of scope.
> Seriously, does anyone find the C++ solution easier? Is there anything that I'm not seeing...? Am I the only one immune to the C++ kool-aid?
Your C solution with aligned malloc, either by hand or as a call to a 3rd party library, is nice and dandy, no complaints about that. On the contrary, it seems to me that you are the one who is quite fanatical (or should one say: religious?) about bashing C++ to justify your own C usage. I have no illusions of converting you to my favorite kool-aid. If C works for you, all the power to you, but for me it seriously fails the Pepsi challenge.
However, implementing a TT using an STL vector instead of a raw pointer to malloc-ed memory has several advantages (automatic destructor call when going out of scope, resize()/reserve() convenience when users want a different size, optimized for move semantics and in-place insertion instead of copying, etc.).
To combine the usage of an STL vector with aligned memory, all you need to do is write an allocator and hide that allocator in a typedef. It really doesn't require guru skills to write an allocator in C++11 when you simply forward to malloc. Note that this is all *library* code that is hidden from the calls in the application.
Finally, STL vector has several constructors that take a run-time number of elements, so there is absolutely no need to call new explicitly. If you want to seriously compare both solutions, show best-practice C vs best-practice C++. You'd be surprised...
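To make the allocator point concrete, here is a rough sketch of a minimal C++11-style allocator that forwards to malloc and shifts the pointer to a 64-byte boundary. It is not the AlignedAllocator from alignedAlloc.hpp, only an illustration reusing that name; kCacheLine is an assumed constant, and vector_aligned is the alias suggested earlier in the thread. Note that there is no construct(), destroy(), rebind or pointer typedef: std::allocator_traits fills those in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>

constexpr std::size_t kCacheLine = 64;  // assumed cache-line size

template <typename T>
struct AlignedAllocator {
    using value_type = T;

    AlignedAllocator() noexcept = default;
    template <typename U>
    AlignedAllocator(const AlignedAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        static_assert(alignof(T) <= kCacheLine, "T needs stricter alignment");
        const std::size_t extra = kCacheLine + sizeof(void*);
        if (n > (static_cast<std::size_t>(-1) - extra) / sizeof(T))
            throw std::bad_alloc();
        void* raw = std::malloc(n * sizeof(T) + extra);
        if (raw == nullptr) throw std::bad_alloc();
        // Skip one pointer slot, round up, and stash the raw pointer below.
        std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        std::uintptr_t mask = static_cast<std::uintptr_t>(kCacheLine) - 1;
        std::uintptr_t aligned = (start + mask) & ~mask;
        assert(aligned % kCacheLine == 0);
        reinterpret_cast<void**>(aligned)[-1] = raw;
        return reinterpret_cast<T*>(aligned);
    }

    void deallocate(T* p, std::size_t) noexcept {
        if (p != nullptr) std::free(reinterpret_cast<void**>(p)[-1]);
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const AlignedAllocator<T>&, const AlignedAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const AlignedAllocator<T>&, const AlignedAllocator<U>&) noexcept { return false; }

template <typename T>
using vector_aligned = std::vector<T, AlignedAllocator<T>>;
```

With this in a header, application code only ever sees something like vector_aligned<TTable::Entry> table(N); with N chosen at run time, and the memory is released when table goes out of scope.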
-
wgarvin
- Posts: 838
- Joined: Thu Jul 05, 2007 5:03 pm
- Location: British Columbia, Canada
Re: prefetch questions
Rein Halbersma wrote:
> Your C solution with aligned malloc, either by hand or as a call to a 3rd party library, is nice and dandy, no complaints about that. On the contrary, it seems to me that you are the one who is quite fanatical (or should one say: religious?) about bashing C++ to justify your own C usage. I have no illusions of converting you to my favorite kool-aid. If C works for you, all the power to you, but for me it seriously fails the Pepsi challenge.
> However, implementing a TT using an STL vector instead of a raw pointer to malloc-ed memory has several advantages (automatic destructor call when going out of scope, resize()/reserve() convenience when users want a different size, optimized for move semantics and in-place insertion instead of copying, etc.).
> To combine the usage of an STL vector with aligned memory, all you need to do is write an allocator and hide that allocator in a typedef. It really doesn't require guru skills to write an allocator in C++11 when you simply forward to malloc. Note that this is all *library* code that is hidden from the calls in the application.
> Finally, STL vector has several constructors that take a run-time number of elements, so there is absolutely no need to call new explicitly. If you want to seriously compare both solutions, show best-practice C vs best-practice C++. You'd be surprised...
Rein, I think you're missing Lucas's point.
That stuff you're talking about is complicated. A little more complicated than a chess engine actually needs. A few lines of simple C code that call malloc are good enough to meet the needs of most chess engines for their TT. Sprinkling more C++ in it won't make it any simpler--quite the opposite! The complexity might be hidden away in the template class, but it's still extra complexity that is now part of the program and might fail/need to be debugged/need to be read and understood someday by a maintainer. And it doesn't provide anything that a chess engine actually needs. It's an abstraction hiding away some complex, icky C++ stuff which a simpler program would not contain at all.
Now you appear to enjoy writing a templated allocator class for your STL containers, which is fine. The difference is that you chose this problem to solve -- you chose to spend time grokking STL's allocator stuff and figuring out how to make this stuff work. The code gets a bit more complicated, because it uses "complex" language/library features and the reader has to know some in-depth STL stuff to follow all of the details.
Lucas instead chooses to spend his effort solving a different problem: making his chess engine better. He has learned the hard way that in C++ it's very easy to go down a rabbit hole of solving made-up problems using complicated language features of C++ such as 3 levels of nested templates with partial template specializations here and there, and then a little factory class with private virtual destructor, and then a Visitor pattern thingy that wraps the ... Before you know it, you can find that you're spending all your time trying to dig your program out of the sinkhole of C++ complexity that it has become.
Now some people love to solve those kinds of problems. I used to like that kind of stuff too in the past, but over time I started to prize more and more the simplicity of straight-line procedural code that I can read from top to bottom and understand WTF it actually does. I now really dislike "magical" stuff like templated smart-handle class wrapper thingies that automagically double-buffer renderer state for me so I don't have to do it myself. It's 1000 lines of dense C++ template stuff, and if I ever have to debug it, it's very painful. I make my own share of "clever C++ things" at work sometimes, but I usually end up regretting it eventually. I try hard to curtail my "clever" impulse because in production environments, clever code often ends up being bad code! Especially when other people are going to have to debug it too. Martin Golding said it well: "Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live."
C++ gives you a million ways to make complicated things. But complicated does not necessarily mean "more powerful" or "better". Sometimes it doesn't even meet the rather low bar of "good enough to ship". Simplicity is an elusive property, an extremely hard thing to achieve especially in a largeish codebase, but I've learned the hard way that it is the most valuable property that code can have. My frail human brain is the biggest limiting factor for me when programming. I need code that I can understand too, not just my compiler... I want to be able to read it again 3 months or 2 years from now and still understand it. A lot of complex C++ code doesn't have that property. It's full of obscure things, layers of magical things hiding lots of complexity in unexpected places. There's no denying that this is sometimes very useful, but as Carmack said: if you're willing to restrict the flexibility of your approach, you can nearly always do something better! (And "better" in this context means things like "simple for any programmer to understand" and "simple to refactor, or optimize, or debug".)
-
lucasart
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: prefetch questions
wgarvin wrote:
> Martin Golding said it well: "Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live."
I like that! I would add that the psychopath also has a limited IQ of 80 or so. That means he will have a very limited ability to understand C++ wizardry... Remember to bolt your door.