Intel 48-core CPU scales to 1k cores

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Intel 48-core CPU scales to 1k cores

Post by Don »

diep wrote:
Don wrote: The Monte Carlo approach in Go is an amazing breakthrough, nothing short of fantastic, as it has quickly added the Go equivalent of hundreds of Elo points to Go engines.
With all respect, they were completely ignorant of how to build a search in Go. All Go engines used to hard forward prune in the root, for example.

The discussion of Elo points and doubling in Go is a very difficult one, because Go is a totally different game from chess from a search viewpoint.

In chess a very unexpected move can be the best move in a position; in Go this is nearly impossible. So at every position you can *nearly* hard forward prune about 90% of all moves almost instantly.
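A minimal sketch of what hard forward pruning of roughly 90% of the moves could look like, in Python. The `static_score` callable and the 10% keep rate are assumptions made for illustration; they are not taken from any actual Go engine mentioned in this thread.

```python
def forward_prune(moves, static_score, keep_percent=10, min_keep=1):
    """Keep only the best-scoring fraction of the moves and drop the rest.

    moves:        list of candidate moves (any hashable/sortable-by-key objects)
    static_score: hypothetical cheap heuristic, higher means more promising
    keep_percent: share of moves that survive (10 means 90% are pruned)
    """
    ranked = sorted(moves, key=static_score, reverse=True)
    # Integer ceiling division avoids floating-point rounding surprises.
    keep = max(min_keep, (len(ranked) * keep_percent + 99) // 100)
    return ranked[:keep]


if __name__ == "__main__":
    # 361 points on an empty 19x19 board, scored by a dummy heuristic.
    candidates = list(range(361))
    survivors = forward_prune(candidates, static_score=lambda m: -abs(m - 180))
    print(len(survivors), "of", len(candidates), "moves are searched")
```

The pruned moves are never searched at all, which is exactly why a bad heuristic here cannot be repaired by giving the program more time.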

So actually, a few years ago, giving Go programs more system time made them play WORSE, not better.

I'm not joking.

They simply didn't search at all.
Yes, I agree. In fact I made a lot of posts on the computer-go forum saying that what they were doing was not scalable and was wrong. This was especially before the Monte Carlo explosion took place. A lot of people felt that Go had to be approached without any type of search whatsoever (except the 1-ply search paradigm being used at the time), other than local search to determine whether groups were alive or not. This is a perfect example of idealistic right-brain thinking that looks good on paper but is just a dream with no grounding in reality.


Then evaluation speed. I had designed an incremental evaluation function for Go, which handled board control and kept track of groups incrementally for life & death (basically the most important consideration in Go); it ran at a speed of 3000+ nps. This was after 1 week of programming.

Similar evaluations in other Go programs ran at a speed of 100 evaluations a second on the same hardware.
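To illustrate what "keeping track of groups incrementally" can mean, here is a small Python sketch that updates group bookkeeping when a stone is placed, instead of flood-filling the whole board on every evaluation. It is an assumption-laden toy: the `GroupTracker` class is hypothetical, it uses a union-find structure, and it ignores captures, liberties, and undo, all of which a real evaluation would need.

```python
class GroupTracker:
    """Incrementally tracks how many connected groups each color has."""

    def __init__(self, size=19):
        self.size = size
        self.color = {}    # point (row, col) -> 'B' or 'W'
        self.parent = {}   # union-find parent pointers, one entry per stone
        self.groups = {"B": 0, "W": 0}

    def _find(self, p):
        # Follow parent pointers to the group representative (path halving).
        while self.parent[p] != p:
            self.parent[p] = self.parent[self.parent[p]]
            p = self.parent[p]
        return p

    def _neighbors(self, p):
        r, c = p
        for q in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= q[0] < self.size and 0 <= q[1] < self.size:
                yield q

    def place(self, p, color):
        """Place a stone on an empty point (captures are NOT handled)."""
        self.color[p] = color
        self.parent[p] = p
        self.groups[color] += 1
        # Merge with every adjacent friendly group; only local work is done.
        for q in self._neighbors(p):
            if self.color.get(q) == color:
                a, b = self._find(p), self._find(q)
                if a != b:
                    self.parent[a] = b
                    self.groups[color] -= 1


if __name__ == "__main__":
    board = GroupTracker(size=9)
    for point in [(0, 0), (0, 2), (0, 1)]:
        board.place(point, "B")
    print(board.groups)
```

Each placement touches at most four neighbors, so the cost per move does not grow with the board size, which is the general idea behind a fast incremental evaluation.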

The trick the 'world top' was using, shortly before Monte Carlo arrived, was a fast 'local tactical search'. So again very hard pruning. Your program can't learn from the search in such a case.

So not one of the top programs had anything that would let it play better with more system time, not to mention more cores.

If you move from that to a brute-force approach that, be it semi-random, considers all moves, needless to say you win 1000 Elo from a search viewpoint.
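The "semi-random but considers all moves" idea can be sketched as flat Monte Carlo move selection: every legal move gets a batch of random playouts and the move with the best average result is played. The sketch below uses tic-tac-toe only because it fits in a few lines; the function names, the playout count, and the fixed seed are assumptions for illustration, and real Go programs use far more refined schemes.

```python
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == "."]


def playout(board, to_move, rng):
    """Play uniformly random moves until the game ends; return the winner."""
    board = list(board)
    while True:
        won = winner(board)
        if won:
            return won
        moves = legal_moves(board)
        if not moves:
            return None  # draw
        board[rng.choice(moves)] = to_move
        to_move = "O" if to_move == "X" else "X"


def monte_carlo_move(board, player, playouts=200, seed=0):
    """Pick the move with the best total playout score (win=1, draw=0.5)."""
    rng = random.Random(seed)
    opponent = "O" if player == "X" else "X"
    best_move, best_score = None, -1.0
    for move in legal_moves(board):  # no move is pruned up front
        child = list(board)
        child[move] = player
        score = 0.0
        for _ in range(playouts):
            result = playout(child, opponent, rng)
            score += 1.0 if result == player else 0.5 if result is None else 0.0
        if score > best_score:
            best_move, best_score = move, score
    return best_move


if __name__ == "__main__":
    # X to move with two in a row on the top line.
    print(monte_carlo_move(list("XX.OO...."), "X"))
```

Because every move is sampled, more playouts (more time, more cores) directly sharpen the estimates, which is the scaling property the hard-pruning programs lacked.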

Also please realize that, before they figured out how to search, this basically means the evaluation functions some had written do a much better job than chess evaluations do.

Do you know of any chess program that, at standard time controls, searching 100 nps with hard forward pruning in the root, reaches a claimed 'first-dan' Elo strength?

OK, that would of course also mean I'm first dan, as I can easily beat the Go software, whereas experienced Go players easily beat me.

I only play for life & death against the computer; against a human you stand no chance with that strategy, as they then try to beat you tactically instead of strategically, so they win all the fights everywhere.

So I would argue that the software was really weak and suffered major league from not having a decent (parallel) search, and had an even bigger problem programming for speed.

You get what you pay for, of course; there is no full-time salary to be made with a Go engine, other than a few people selling a few GUIs, and that basically sells to Western Europe and the USA and not to Asia at all. That market is basically completely closed.

Obviously, if a few chess programmers made some effort, they would show up with very sophisticated selective forms of search, at which point you can completely throw away anything that has to do with Monte Carlo.

If I wanted, I would be able to write down a method here that gives you a completely superior form of search in Go, which is going to kick Monte Carlo major league.

Yet an important condition for such a search is that you search 20-30 plies deep at least, so that your evaluation function realizes what happens.

Of course your search is then a lot more selective (way, way more) than in chess, but realize this is POSSIBLE in Go and IMPOSSIBLE to do in the same manner in chess. There is no way you can prune in the root in chess like you can in Go.

Null move also works far better in Go than in chess, of course. So it gives a huge branching-factor improvement there that it won't give in chess in the same manner.
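For readers unfamiliar with null move: the side to move passes, the position is searched to a reduced depth, and if the result is still at or above beta the node is cut off. Below is a hedged Python sketch on a toy game (players alternately take a positive number from a shared pile, and the score is the difference of the sums), chosen only because passing can never help there. The game, the reduction `R = 2`, and all names are assumptions for illustration; this is not a Go or chess search.

```python
import math

R = 2  # null-move depth reduction (an assumed, commonly used value)


def search(remaining, lead, depth, alpha, beta, allow_null=True):
    """Fail-hard negamax with null-move pruning on the toy picking game.

    remaining: tuple of positive integers still available
    lead:      score lead of the side to move (integer)
    """
    if depth <= 0 or not remaining:
        return lead  # static evaluation: current lead of the side to move

    # Null move: give the opponent a free turn, search shallower with a
    # zero-width window around beta. Never tried twice in a row.
    if allow_null and depth > R and beta < math.inf:
        score = -search(remaining, -lead, depth - 1 - R,
                        -beta, -beta + 1, allow_null=False)
        if score >= beta:
            return beta  # still good enough after passing: cut off

    ordered = sorted(remaining, reverse=True)  # try big numbers first
    for i, value in enumerate(ordered):
        rest = tuple(ordered[:i] + ordered[i + 1:])
        score = -search(rest, -(lead + value), depth - 1, -beta, -alpha)
        if score >= beta:
            return beta
        if score > alpha:
            alpha = score
    return alpha


def best_lead(numbers, depth):
    """Search from the root with a full window."""
    return search(tuple(numbers), 0, depth, -math.inf, math.inf)


if __name__ == "__main__":
    print(best_lead([9, 7, 4, 2], depth=4))
```

The cut-off is only safe when passing is never better than the best real move, which is the assumption behind the claim that null move suits Go well; in chess, zugzwang positions break it.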

Already in 1998 my Go program was reaching 10 plies deep, just using null move and no other pruning at all.

Yes, in the opening position.

10 plies in chess simply took longer.

So with some selectivity the branching factor in Go isn't worse than in chess at all, as in Go you can really reduce most moves, except for a handful, which is totally impossible in chess.

In chess, reductions of more than 1 ply are done by some programs and I've experimented with them extensively, but it is really difficult to get them to work.

AFAIK no top chess engine of today is doing this, except maybe Junior.

So a move costs either 1 ply or 2 plies in nearly all programs (and 0 plies when extended, of course).

In Go you can easily say that 90% of the moves have a cost of 8 plies.
So a total reduction of 7 plies is peanuts there, as some moves are such crap that they are never going to make it to best move EVER.
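The effect of charging most moves 8 plies instead of 1 can be checked with a simple leaf-count recurrence: if a node has `full_moves` moves that cost 1 ply and `reduced_moves` moves that cost `reduced_cost` plies, then N(d) = full_moves * N(d - 1) + reduced_moves * N(d - reduced_cost), with N(d) = 1 once the depth budget is used up. The Python sketch below is only that arithmetic, under the simplifying assumptions of a uniform tree and no other pruning; the 36/325 split of 361 moves is an illustrative reading of the "90%" figure.

```python
from functools import lru_cache


def leaf_count(depth, full_moves, reduced_moves, reduced_cost=8):
    """Leaves of a uniform tree where some moves consume several plies."""

    @lru_cache(maxsize=None)
    def count(d):
        if d <= 0:
            return 1  # depth budget exhausted: this node is a leaf
        return (full_moves * count(d - 1)
                + reduced_moves * count(d - reduced_cost))

    return count(depth)


def effective_branching_factor(depth, full_moves, reduced_moves, reduced_cost=8):
    """The b for which b**depth equals the leaf count."""
    return leaf_count(depth, full_moves, reduced_moves, reduced_cost) ** (1.0 / depth)


if __name__ == "__main__":
    depth = 16
    # Assumed split of 361 moves: about 10% at full depth, the rest reduced.
    print("all moves full depth:", effective_branching_factor(depth, 361, 0))
    print("90% cost 8 plies    :", effective_branching_factor(depth, 36, 325))
```

With no reduced moves the recurrence collapses to the familiar full_moves ** depth, so the second figure shows how far the reductions alone pull the effective branching factor down.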

Then add null move and other forms of selectivity on top of that, and you can easily search 30 plies in Go, right from the start, and beat the very best Go players on a 19x19 board.

I see no problems in doing that, however you would invest a part of your life to produce such software and get no salary at all. Not from sales either.

You get what you pay for.

Go is not more complicated than chess. In Go the best go players try to beat the best go players. That's just as hard as the best chess players trying to beat the best chess players.

Computer-go is simpler than computer-chess however; beating the best chess programs is nearly impossible for someone who doesn't come from within that field, whereas in computer-go this is a lot easier.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Intel 48-core CPU scales to 1k cores

Post by diep »

Gian-Carlo Pascutto wrote:
bob wrote: Again, define "doubling". If you mean 32 threads at 2.0 GHz and doubling that to be 32 threads at 4.0 GHz, I can show you at least _one_ program that will gain +70 Elo...
It depends on the game under consideration, which is why directly comparing Elo or %score differences between games and drawing conclusions about parallelization is erroneous.

A tic-tac-toe program wouldn't "scale" in terms of Elo strength when given more cores, even if there is a real and effective parallelization of its tree search.
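For reference, the link between an Elo difference and a %score in the quoted exchange is the standard Elo expectation formula, E = 1 / (1 + 10^(-d/400)). A small Python sketch follows; the function name is an assumption, and the formula says nothing about how a given game converts extra hardware into rating points.

```python
def expected_score(elo_diff):
    """Expected score (0..1) for the side that is elo_diff points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    for diff in (0, 70, 200, 400):
        print(f"+{diff} Elo -> expected score {expected_score(diff):.3f}")
```

By this formula a +70 Elo gain corresponds to an expected score of roughly 60%, which only makes sense to compare within one game and one rating pool.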
It's more than that of course.

If you take a 90s program and throw it at a 1000-processor machine, it will still lose, and keep losing, against some software from the start of the 21st century.

Today's software scales better there.

Vincent
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Intel 48-core CPU scales to 1k cores

Post by diep »

Milos wrote:
Gian-Carlo Pascutto wrote: The experiment is a cheat? So, the 48 core chip doesn't really exist? Or what are you claiming?
Are you unable to understand what's written or just playing stupid?
The cheat is to use a '94 Pentium as a core and claim 1000 cores scaling without problem. Dissipation is a problem without a solution. Ignoring it just doesn't make it go away. If you don't understand that, maybe you should go back to basics.
Of course not. Those cores are designed to achieve outstanding single-thread performance, rather than the most optimal size/dissipation/nr cores/core performance product.
This is just BS. What kind of outstanding performance on the level of '94 Pentium are you talking about???
The cores are not designed for anything specific. They just took the most advanced core that would still allow them to pack 48 cores without burning the package.
There exist beta CPUs without cache coherency from Intel.

I also filled in a form to get one, some time ago.

Can't remember what happened after that.

Without cache coherency, CPUs easily scale further to thousands of cores.

GPUs are a good example. AMD's Radeon 5970 card has 2 GPUs with 3200 stream cores in total, clocked at nearly 1 GHz in the Sapphire edition.

So it has kick-butt potential, yet programming for it is really complex.

Vincent