This might be a bad idea - just interested if anyone has any comments.
It's maddening that in nature, animals with tiny brains can generate a wide range of skills and behaviours, while ANNs seem to need to be huge.
One of the problems is probably that in ANNs, the "bits of knowledge" may be a long way away from each other. Chess obviously has a large number of dimensions, but imagine a problem with only 2 dimensions and lots of nodes in the NN. If you mapped the NN model onto a sheet of A4 paper, you probably wouldn't be able to see enough detail; to make that detail clear, you'd have to draw it on a very large piece of paper.
Also, the NN is probably much more complicated than it really needs to be (based on comparing small animal brains with today's ANNs).
So how about forcibly reducing the size?
After the first layer, the nodes get their initial value (before applying the activation function) by summing the input connection values multiplied by the weights. So, to keep the above "paper map" small, how about...
1. The weights are integers in the range 0-1000 (or 0-1023 if you want a good binary fit: 2^10 = 1024)
2. The arithmetic (both the multiplying and the adding) is modular. So, for example, working modulo 1024:
1020 + 5 = 1
512 * 2 = 0
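A minimal sketch of what this might look like, assuming the modulus is 1024 (the 2^10 option above) and taking the modulus after every multiply and add; the function name and inputs are just for illustration:

```python
MOD = 1024  # 2^10 - weights and sums wrap around in the range 0..1023

def modular_preactivation(inputs, weights):
    """Pre-activation of one neuron: sum of input*weight, all modulo 1024."""
    total = 0
    for x, w in zip(inputs, weights):
        total = (total + (x * w) % MOD) % MOD
    return total

# The two worked examples from the post, modulo 1024:
print((1020 + 5) % MOD)  # -> 1
print((512 * 2) % MOD)   # -> 0

# A small made-up neuron: inputs [3, 4], weights [5, 6]
print(modular_preactivation([3, 4], [5, 6]))  # -> 39 (3*5 + 4*6, no wrap needed)
```

Because every intermediate value stays in 0..1023, the whole network's state fits a fixed, small "paper map" no matter how large the sums would otherwise grow.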
This network would be more difficult to train, but it could result in more knowledge being encoded into fewer neurons, because the paper-map distance between the parts of the knowledge would be smaller. You would likely get helpful knowledge overlap - or, to put it another way, deeper patterns.
Idea For ANNs
Moderator: Ras
-
- Posts: 12511
- Joined: Thu Mar 09, 2006 12:57 am
- Location: Birmingham UK
- Full name: Graham Laight
Idea For ANNs
Human chess is partly about tactics and strategy, but mostly about memory
Re: Idea For ANNs
Plan B, of course, is to come up with a better way to train the ANN than throwing billions of chess positions at it.

Re: Idea For ANNs
Amazingly, there is an academic paper about using modular arithmetic in ANNs.
It's about securing the ANN model against theft rather than actually improving the model - though, to my surprise, it only makes the model slightly worse.
I seem to be the only person thinking in terms of how far apart (in centimetres) different pieces of knowledge would be if you drew them on a piece of paper. That's because I'm thinking in terms of encoding the knowledge as a generative function rather than an NN; when constructing the generative function, the parts of knowledge being far apart makes it an order of magnitude more difficult to create the function (though that might be a problem I can solve).
The academic paper about using modular arithmetic in an NN: link.
