Evidence That NNs Work Best With Multiple Modules

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

towforce
Posts: 12708
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Evidence That NNs Work Best With Multiple Modules

Post by towforce »

Researchers at Fujitsu and MIT have shown that, for understanding images, NNs work best when broken into multiple modules, each of which does a separate part of the work.

https://www.techradar.com/news/boffins- ... like-we-do

It seems obvious to me that the same will apply to chess: one NN that does everything is clearly not the way. This partially explains why NNs trained on several orders of magnitude more games than any human will ever see are unable to beat top humans at ply 1.

(post first published on Ed's ProDeo forum earlier today).
Human chess is partly about tactics and strategy, but mostly about memory
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Evidence That NNs Work Best With Multiple Modules

Post by Milos »

towforce wrote: Sun Dec 12, 2021 1:26 pm Researchers at Fujitsu and MIT have shown that, for understanding images, NNs work best when broken into multiple modules, each of which does a separate part of the work.

https://www.techradar.com/news/boffins- ... like-we-do

It seems obvious to me that the same will apply to chess: one NN that does everything is clearly not the way. This partially explains why NNs trained on several orders of magnitude more games than any human will ever see are unable to beat top humans at ply 1.

(post first published on Ed's ProDeo forum earlier today).
As usual, your layman intuition doesn't at all apply.
What they did is not groundbreaking in any way (again, a layman tech journalist reporting), which can clearly be seen from the category of their paper at NeurIPS (just a regular poster, not chosen as best poster, let alone best paper). There are plenty of research groups working on the same idea.
However, that's not even the point. The point is that this has absolutely nothing to do with chess and is totally inapplicable to chess.
Human-based (classical) evaluation tends to separate categories (piece types, colors, ranks) because hand-coding the actual dependence between categories (features) in an efficient way is difficult, if not impossible. OTOH, NN(UE)-based evaluation tries to extract features that contain the interdependencies between categories that are crucial for accurate evaluation. Separating NNs doesn't help in any way, because the goal of an NN in chess is not classification, which is the main task of the approach you cited.
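To make the contrast concrete, here is a toy Python sketch (invented terms and weights, not any engine's actual evaluation): a classical eval sums independent hand-coded terms, while a learned feature can couple categories, e.g. weighting king exposure by whether queens are still on the board.

```python
# Toy illustration (invented weights, not a real engine's eval):
# a classical evaluation sums hand-coded terms independently,
# while a learned feature can encode an interaction between
# categories that hand-coding tends to miss.

def classical_eval(material, king_exposure):
    # Independent, hand-tuned terms: no cross-dependence.
    return material - 10 * king_exposure

def learned_eval(material, king_exposure, queens_on):
    # A learned feature couples the categories: exposure is
    # penalized far more heavily while queens are on the board.
    penalty = 30 if queens_on else 5
    return material - penalty * king_exposure
```

With material=100 and king_exposure=2, the classical form always gives 80, while the coupled form gives 40 with queens on and 90 with queens off: the interaction, not the categories, carries the information.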
dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Evidence That NNs Work Best With Multiple Modules

Post by dkappe »

Milos wrote: Sun Dec 12, 2021 5:06 pm However, that's not even the point. The point is that this has absolutely nothing to do with chess and is totally inapplicable to chess.
Human-based (classical) evaluation tends to separate categories (piece types, colors, ranks) because hand-coding the actual dependence between categories (features) in an efficient way is difficult, if not impossible. OTOH, NN(UE)-based evaluation tries to extract features that contain the interdependencies between categories that are crucial for accurate evaluation. Separating NNs doesn't help in any way, because the goal of an NN in chess is not classification, which is the main task of the approach you cited.
Milos,

I guess you can be forgiven for sleeping through the AlphaZero revolution. Like Athena emerging from the forehead of Zeus, the NN techniques used in AlphaZero were adapted from work on image classification and recognition. Even the latest NNUEs used by Stockfish have sub-modules corresponding to PSQTs and game phase.
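For the curious, the rough shape of that design can be sketched in Python (illustrative names and numbers only, not Stockfish's actual NNUE code): a material/PSQT module plus a small network module selected by game phase.

```python
# Toy sketch of an NNUE-style split (illustrative, not Stockfish source):
# the output is a PSQT-like linear term plus one of several small
# "layer stacks", selected by a game-phase bucket (piece count).

PSQT = {'P': 100, 'N': 320, 'B': 330, 'R': 500, 'Q': 900, 'K': 0}

def psqt_term(pieces):
    # pieces: list of (piece_type, sign), sign +1 for us, -1 for them.
    return sum(sign * PSQT[pt] for pt, sign in pieces)

def phase_bucket(pieces, n_buckets=8):
    # Pick a sub-network by how many pieces remain on the board.
    return min((len(pieces) - 1) // 4, n_buckets - 1)

def layer_stack(bucket_id, features):
    # Stand-in for the bucket's small fully-connected network:
    # here just a per-bucket weighted sum of the features.
    return (1.0 + 0.1 * bucket_id) * sum(features)

def evaluate(pieces, features):
    # Final eval = PSQT module + phase-selected network module.
    return psqt_term(pieces) + layer_stack(phase_bucket(pieces), features)
```

The design point is that the cheap linear PSQT part carries the bulk of the material signal, so the bucketed networks only have to learn the residual positional part for their slice of the game.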

Eagerly awaiting your next angry response. :D
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Evidence That NNs Work Best With Multiple Modules

Post by Milos »

dkappe wrote: Sun Dec 12, 2021 5:31 pm
Milos wrote: Sun Dec 12, 2021 5:06 pm However, that's not even the point. The point is that this has absolutely nothing to do with chess and is totally inapplicable to chess.
Human-based (classical) evaluation tends to separate categories (piece types, colors, ranks) because hand-coding the actual dependence between categories (features) in an efficient way is difficult, if not impossible. OTOH, NN(UE)-based evaluation tries to extract features that contain the interdependencies between categories that are crucial for accurate evaluation. Separating NNs doesn't help in any way, because the goal of an NN in chess is not classification, which is the main task of the approach you cited.
Milos,

I guess you can be forgiven for sleeping through the AlphaZero revolution. Like Athena emerging from the forehead of Zeus, the NN techniques used in AlphaZero were adapted from work on image classification and recognition. Even the latest NNUEs used by Stockfish have sub-modules corresponding to PSQTs and game phase.

Eagerly awaiting your next angry response. :D
Why would I be angry because of someone's layman understanding? :lol:
I guess since a ResNet was used for determining policy and value, it must be that categorization-based image classification is also very useful for chess. Gee. I get it, you are just a hobbyist, but it's not too much to expect some basic understanding.
Thinking that different PSQTs based on the game phase have something to do with what the OP quoted is a bit clueless.
P.S. I analyzed the particular AlphaGo ResNet architecture when you didn't even know what deep learning was. :wink:
dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Evidence That NNs Work Best With Multiple Modules

Post by dkappe »

Milos wrote: Sun Dec 12, 2021 8:07 pm Why would I be angry because of someone's layman understanding? :lol:
I guess since a ResNet was used for determining policy and value, it must be that categorization-based image classification is also very useful for chess. Gee. I get it, you are just a hobbyist, but it's not too much to expect some basic understanding.
Thinking that different PSQTs based on the game phase have something to do with what the OP quoted is a bit clueless.
P.S. I analyzed the particular AlphaGo ResNet architecture when you didn't even know what deep learning was. :wink:
And you didn’t disappoint. Angry response, right on time. :D Since I was using ResNets for image classification in medical devices long before AlphaGo, when exactly were you analyzing that AlphaGo ResNet?

Also, basic reading comprehension: sub-modules corresponding to PSQTs are not the same thing as different PSQTs based on the game phase. Think before you write.

Eagerly awaiting your next angry response. :D
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Evidence That NNs Work Best With Multiple Modules

Post by Milos »

dkappe wrote: Sun Dec 12, 2021 8:57 pm And you didn’t disappoint. Angry response, right on time. :D Since I was using ResNets for image classification in medical devices long before AlphaGo, when exactly were you analyzing that AlphaGo ResNet?
Late 2015, when the first version of DeepMind's Nature paper appeared on arXiv (they later removed it in order to get it published in Nature).
Even though AlexNet had already won ILSVRC 2012, the first ResNets only appeared in 2015.
For your information, I have papers at ECCV and MICCAI on ML for medical images, so I kind of know the field. So excuse me for not believing you, but I can smell BS from a mile away, and you are kind of known for it.
So when were you using those ResNets for classification? What kind of classification? Which medical images? Which exact ResNet?
dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Evidence That NNs Work Best With Multiple Modules

Post by dkappe »

Milos,

since ResNets are a specialized kind of highway network, I suppose you are correct that I technically wasn’t using ResNets in early 2015. But don’t tell me the first time you heard of a ResNet was in a paper.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Evidence That NNs Work Best With Multiple Modules

Post by Milos »

dkappe wrote: Sun Dec 12, 2021 11:51 pm Milos,

since ResNets are a specialized kind of highway network, I suppose you are correct that I technically wasn’t using ResNets in early 2015. But don’t tell me the first time you heard of a ResNet was in a paper.
Ofc, but how else?
I remember the discussion at that time. CNNs had been in for a while (AlexNet, GoogLeNet), but they were simple: a couple of convolutional layers with ReLUs and a fully-connected layer at the end. And they all suffered from saturation once you started adding more layers (usually more than 10 hidden layers).
There was a famous pre-print of the ResNet paper published on arXiv (and later in CVPR) that has a ridiculous number of citations (one of the most cited ML papers ever), claiming that introducing residual connections solves the issue of accuracy saturating during training. If I remember correctly, the discussion at the time was whether DeepMind had used residual connections in the policy and value nets of AlphaGo, because that one was first trained with supervised learning and there was no saturation in the training graphs in the paper. They were quite vague about the policy and value network architecture (just checked: the paper doesn't even mention BN or pooling layers), and they got away with it even in the Nature publication. Later, with the AlphaZero paper, it was confirmed that they had indeed used residual blocks.
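The effect of those residual connections is easy to show with a toy Python sketch (made-up numbers, not a real net): with y = x + f(x), each block contributes a factor of 1 + f'(x) to the gradient, so the identity path keeps it from shrinking to nothing as depth grows.

```python
# Toy illustration of gradient flow through a deep stack of layers.
# Each layer's local derivative is a small constant g; the chain
# rule multiplies these factors across the depth of the network.

def plain_gradient(g, depth):
    # Plain stack: dy/dx = g * g * ... * g, vanishing with depth.
    grad = 1.0
    for _ in range(depth):
        grad *= g
    return grad

def residual_gradient(g, depth):
    # Residual stack y = x + f(x): each block contributes (1 + g),
    # so the identity path keeps the gradient from vanishing.
    grad = 1.0
    for _ in range(depth):
        grad *= 1.0 + g
    return grad

print(plain_gradient(0.1, 20))     # ~1e-20: vanished
print(residual_gradient(0.1, 20))  # ~6.7: still usable
```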

And if you want to listen, I can explain in more detail what the problem is with the approach in the paper the OP cited and why it can't be applied to chess. First, this approach of having multiple categorized networks assumes that the objects to be classified can be more or less cleanly divided into categories that don't overlap. In practice that is of very little use, since many features in images usually overlap, and having networks trained separately for shape, color and size, for example, doesn't work that well. It's even more complicated with chess, because you can't train one separate network for each piece type, or for each side, and seriously expect any meaningful result. There are no cleanly separable categories that make sense. Even some that might sound interesting on paper were tried in Lc0 (like separate nets for pawns and other pieces, or PSQT for the material) and failed miserably.
dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: Evidence That NNs Work Best With Multiple Modules

Post by dkappe »

Well, what were later termed “highway nets” had been around for a while (2013?) in my academic circles. It was just a matter of time before someone came up with a practical application such as ResNets. We employed them (highway nets) in a ridiculous application that used a smartphone to read the level of discharge in a graduated container normally placed under beds. You always remember the silly ones.

So, I’m curious what your take on that paper is. Let’s wipe the slate clean. All ears (or eyes).
Sopel
Posts: 391
Joined: Tue Oct 08, 2019 11:39 pm
Full name: Tomasz Sobczyk

Re: Evidence That NNs Work Best With Multiple Modules

Post by Sopel »

Went through the slides (https://neurips.cc/media/neurips-2021/Slides/26740.pdf). I agree with Milos, this is useless for chess.
dangi12012 wrote:No one wants to touch anything you have posted. That proves you now have negative reputations since everyone knows already you are a forum troll.

Maybe you copied your stockfish commits from someone else too?
I will look into that.