Don wrote: The changes would be almost random (with only slight pressure to improve) and you would have to run a lot of games for each single change.

Definitely not random - it's just genetics applied to algorithms. It's that same "slight pressure to improve" that makes us different from yeast cells. Admittedly the testing period for life was 3.5 billion years, but evolution also deals with quite long strands of DNA rather than just a couple of hundred weighting parameters for chess.
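To make the analogy concrete, here's a minimal hill-climbing sketch of that kind of tuning: one almost-random change at a time, kept only if it tests better. The `fitness` function is a stand-in assumption - in a real engine it would be the score from a long batch of test games, and `TARGET` is an invented set of "true" weights for the demo.

```python
import random

random.seed(0)

N_PARAMS = 8  # a couple of hundred in a real engine; a handful here
TARGET = [100, 320, 330, 500, 900, 10, 25, 5]  # hidden "true" weights (made up)

def fitness(weights):
    # Stand-in for match results: closer to TARGET scores higher.
    # A real fitness would be won/drawn/lost games, which is far noisier.
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))

def mutate(weights, step=20):
    # One small, almost-random change - like a point mutation in one gene.
    child = list(weights)
    i = random.randrange(len(child))
    child[i] += random.randint(-step, step)
    return child

def evolve(generations=2000):
    current = [300] * N_PARAMS  # arbitrary starting guess
    for _ in range(generations):
        child = mutate(current)
        # The "slight pressure to improve": keep the child only if it tests
        # at least as well as the parent.
        if fitness(child) >= fitness(current):
            current = child
    return current

weights = evolve()
```

The expensive part in practice is that each `fitness` call means many games, which is exactly the objection being answered here.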
Don wrote: I think what you might like is some variation of simulated annealing. You basically make lots of changes at one time, test, do again.

I'm not sure about simulated annealing - I haven't tried it. I suppose that if all the changes made together work well, you'll get a good positive jump; but if any part of the change is bad, it will destroy the benefit of the rest. I'm not sure whether evolutionary mutations work by altering multiple genes at the same time or one gene at a time. If simulated annealing isn't used in nature, then I suspect it won't benefit chess programs either.
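For comparison, here is a rough sketch of what the simulated-annealing variant would look like: several weights are perturbed at once, and a worse result is sometimes accepted early on while the "temperature" is high. As before, `fitness` and `TARGET` are placeholder assumptions standing in for real match results.

```python
import math
import random

random.seed(0)

TARGET = [100, 320, 330, 500, 900]  # hidden "true" weights (made up)

def fitness(weights):
    # Placeholder for the score of a batch of test games.
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))

def anneal(start, steps=5000, t0=1000.0):
    current, best = list(start), list(start)
    for k in range(steps):
        temp = t0 * (1 - k / steps) + 1e-9  # simple linear cooling schedule
        # Unlike one-gene-at-a-time mutation, perturb every weight at once.
        cand = [w + random.randint(-30, 30) for w in current]
        delta = fitness(cand) - fitness(current)
        # Always accept improvements; occasionally accept a worse candidate
        # while the temperature is high, to escape local optima.
        if delta >= 0 or random.random() < math.exp(delta / temp):
            current = cand
        if fitness(current) > fitness(best):
            best = list(current)
    return best

best = anneal([300] * 5)
```

The worry raised above shows up directly in the code: one bad component of `cand` can outweigh four good ones in a single step, so progress depends on the acceptance rule letting the search back out again.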
Don wrote: Learning could be improved, I think, if a human put limits on the values each parameter could take - i.e. we know that it would be unreasonable for a knight to be less than 200 or more than 500, and that most positional terms should not be more than half a pawn - and any that are could be identified.

That might work if you knew exactly what the right limits are - the problem is we don't. I play chess and have always assumed that knights and bishops are worth roughly 3 pawns, but that's because I read it in a book and it seems to work over the board. Methods like this actually test those assumptions and check that they are correct. If a rook is really worth 5.5 pawns, then it will find that out (obviously it isn't as simple as this).
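Mechanically, the hand-set-limits idea is easy to bolt onto either search: give each parameter a `[lo, hi]` range and clamp mutations to it. The parameter names and bounds below are purely illustrative assumptions - the whole argument above is that we don't actually know the right numbers.

```python
import random

random.seed(0)

# Hand-chosen bounds in centipawns - illustrative guesses, not tested values.
BOUNDS = {
    "knight":   (200, 500),  # "no less than 200, no more than 500"
    "rook":     (400, 700),
    "mobility": (0, 50),     # positional terms capped well under a pawn
}

def clamp(value, lo, hi):
    return max(lo, min(hi, value))

def mutate_bounded(weights, step=25):
    # Mutate one parameter, then force it back inside its allowed range.
    child = dict(weights)
    name = random.choice(list(child))
    lo, hi = BOUNDS[name]
    child[name] = clamp(child[name] + random.randint(-step, step), lo, hi)
    return child

w = {"knight": 320, "rook": 500, "mobility": 10}
w2 = mutate_bounded(w)
```

The trade-off is visible in `BOUNDS` itself: if a rook really is worth 5.5 pawns, a cap of 700 centipawns means the tuner can never discover it.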
I'll admit it's very different from ordinary programming methods, but sometimes it's worth trying something new. The other benefit is that it effectively tests all your code as well. I have switches in my code to turn alpha-beta on and off, change the size of extensions, and so on, and I can check whether that code works just by looking at the weighting values that control it - if a weight starts to fade to zero or go negative, that tells you there's something wrong with that part of your code.
Does it really matter if it takes a month to run enough test games? You could probably speed it up by having a pool of programs playing each other. At the end of the day you'll have tested all your algorithms and adjusted your weights better than you could ever do by hand. If it uncovers a few bugs then that's an added benefit.
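The pool idea can be sketched as a simple round-robin: every candidate plays every other, and total score replaces a long sequence of games against one fixed opponent. `play_game` here is a toy stand-in (the candidate "weight" closer to an invented target wins), not a real engine match.

```python
import itertools

# Toy stand-in for an engine game between two candidate weight settings.
TARGET = 500  # invented "true" value for the demo

def play_game(w_a, w_b):
    # The side whose weight is strictly closer to TARGET scores the point.
    return 1 if abs(w_a - TARGET) < abs(w_b - TARGET) else 0

def pool_scores(pool):
    # Round-robin: ordered pairs, so each matchup is played from both sides.
    scores = {w: 0 for w in pool}
    for a, b in itertools.permutations(pool, 2):
        scores[a] += play_game(a, b)
    return scores

pool = [300, 450, 490, 530, 650]
scores = pool_scores(pool)
best = max(scores, key=scores.get)  # 490 is closest to TARGET here
```

With a real pool the games run in parallel across machines, which is where the month of testing gets compressed.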