Linear Programming Will See MUCH Deeper Than Alpha-Beta

diep · Post by **diep** » Mon Jan 05, 2009 1:49 am

to Paul Gift: Chess is not a lineair programming problem.

And yes navigation is similar to a chess search, just it has less nodes and no opponent.

Chess is a lot harder in that sense.

We cannot continue the discussion however about navigation algorithms as that's where i work.

Didn't Dijkstra release an algorithm half a century ago for that?

Note that this is also not a typical example of a lineair programming algorithm. Simplex is for example.

In chess with 10^43 possible positions you won't get very far with lineair programming, as you don't have yet a machine with 10^43 positions.

If you do have such a box that can calculate that much, you can however solve chess in much simpler manner than using a slow lineair programming algorithm (as they are publicly known). In checkers especially EGTBs get used for the last plies to solve it. In chess we have those also.

Say it is about factor 200 overhead to create them additional to the number of positions you want to store it in.

Vincent

diep · Post by **diep** » Mon Jan 05, 2009 2:16 am

towforce wrote:For linear problems (of which chess is an example) integer linear programming (ILP) enables optimal values to be found relatively quickly in large search spaces. For example, the very largest proven TSP (travelling salesman problem) solutions uses variants of this method: the record is 85,900 towns - a search space of 85900!=9.6e386526. One man has even claimed to have found a way to solve TSP in polynomial time (click here) - though AFAIK this claim hasn't been peer-reviewed yet. ILP solvers are improving rapidly at this time. At the German ILP problem library, you'll see that in 2003 they had to introduce a new set of much more difficult problems, because newer solvers were able to solve the old ones too quickly!

I gave an example of applying ILP to a river-crossing problem which, like chess, is a series of consecutive states, here. The first thing you'll notice is that, to solve a simple combinatorial problem, I created a 90-line model. The corresponding model I wrote for a 20-man river crossing problem was 1, 500 lines long. Representing chess would require an absolutely huge model.

That's the easy part of the problem!!!

The medium part of the problem is getting the model to converge to a good optimum. The chess tree is full of symmetries, which would have to be broken. There's a paper on breaking ILP symmetries here. Getting large ILP models to converge is often difficult anyway. There are various methods available - the cutting-plane method, branch and bound, branch and cut, and delayed column generation for example - it would, of course, be a case of finding out what combination of known methods works best for a chess tree.

The hard part of the problem is that before getting started at all, it will be necessary to take an ILP solver, and rewrite it to allow not one objective function (the expression that linear programming tries to optimise), but two. I can see no mathematical reason why this should be a show-stopper - but it will be necessary because of the existence of an opponent - whose "objective function" will be approximately the opposite of what our side's will be. As an example, our objective function might be:

* maximise white king safety

* minimise black king safety

* maximise white piece mobility

* minimise black piece mobility

* maximise white piece value

* minimise black piece value

To get the opponent's objective function, simply take the above objective function and reverse each term!

If I were to use the above objective function for white, and somehow got the model to converge in a solver to a good solution over, say, 5 moves, the solution would simply consist of 5 "good" white moves and 5 obviously bad black moves. What is actually required, though, is to find the "best line" - a sequence of moves optimised for both the black and the white pieces. One way might be to take an open-source ILP program and make a fork of it that can take two objective functions - one to be applied to half the expressions in the model and the other to be applied to the other half. Doing this will take both programming and mathematical know-how.

A big job then.

The reward, though, will be a chess program that can see much more deeply into the chess tree than anyone today would imagine to be possible.

Paul to give you a simple example where all the lineair optimizers go dead wrong when optimizing parameters from a chess evaluation (not to confuse with my previous posting which is about SEARCH, this is about the EVALUATION of a chessposition).

Most stronger engines have some thousands of parameters to get tuned.
We assume we start tuning from what we have right now. We're not interested in how fast your algorithm finds what, we are just interested in BETTER values than we have now.

If we now just look to the material values and let's assume that we have these values:

pawn : 100
knight: 300
bishop: 300
rook: 500
queen: 900

These are the values given by chess worldchampion Max Euwe in his very popular and very good books to learn chess.

Suppose now you fix the rook at a new value you want to try which is let's say for example 600.

600 actually is a value used in many programs for the rook.

Now the big problem is the relationship that the other values have.

The experience learns that optimizers from the lineair programming corner are not capable of finding the relationship that if you change this value that another COMBINATION of values is needed to get others to work.

All the values in short are total dependant upon each other.

It won't be able to conclude that it needs to fix then also the queen to 1200, the knight to 350+, the bishop to 350+.

It is that COMBINATION it has to find out, or it will just play like total garbage in games, after which the tuner logically concludes that fixing this value to 600 produced a horrible result.

Vincent

Pradu · Post by **Pradu** » Mon Jan 05, 2009 3:11 am

diep wrote:to Paul Gift: Chess is not a lineair programming problem

As far as I can see, chess hasn't been posed explicitly as a linear problem yet, however there's no reason why it can't be. There are some documents on the web on applying linear programming to two-player zero-sum games. Mostly they talk about matrix games where both players make a decision simultaneously but some papers do mention chess and seem to suggest that chess could be posed as a linear programming problem. Paul might figure out a way to (if he does it "for free" as a hobby

). However I qualitatively doubt that it will massively outsearch current search algorithms. The search will be considerably different if it takes an approach of solving subproblems to generate nodes that are potentially promising so we probably can't compare the search depth to that of a regular program, instead we should probably compare playing strength.

The first thing to do is probably to look at why a linear-programming formulation and solution methods work well for ... for example TSP and see if those underlying features exist for chess. If it can be shown that it may improve computer chess search, then there will be considerable interest in formulating and solving chess as a linear-programming problem.

Pradu · Post by **Pradu** » Mon Jan 05, 2009 3:31 am

diep wrote:The experience learns that optimizers from the lineair programming corner are not capable of finding the relationship that if you change this value that another COMBINATION of values is needed to get others to work.

Optimization algorithms generally do tune combinations of values to find the optimum, otherwise it wouldn't be an optimization algorithm. If you only tune one coordinate, then you are performing a line-search in that coordinate. If you cyclically do line-searches for individual coordinates, you are doing a cyclic coordinate descent (which will reach the solution where the combination of values work well together). However, evaluation tuning is probably best expressed as a integer nonlinear problem (depending on how you pose the problem), for which you may still use sequential linear programming which is probably not readily applicable to evaluation but useful for some other non-linear problems...

diep · Post by **diep** » Mon Jan 05, 2009 10:33 am

Pradu wrote:
diep wrote:The experience learns that optimizers from the lineair programming corner are not capable of finding the relationship that if you change this value that another COMBINATION of values is needed to get others to work.
Optimization algorithms generally do tune combinations of values to find the optimum, otherwise it wouldn't be an optimization algorithm. If you only tune one coordinate, then you are performing a line-search in that coordinate. If you cyclically do line-searches for individual coordinates, you are doing a cyclic coordinate descent (which will reach the solution where the combination of values work well together). However, evaluation tuning is probably best expressed as a integer nonlinear problem (depending on how you pose the problem), for which you may still use sequential linear programming which is probably not readily applicable to evaluation but useful for some other non-linear problems...

You don't have the system time to do lineair optimizations in the manner as they get done now.

Note i'm busy with a project there together with Renze Steenhuisen (Technical University Delft) with respect to parameter tuning and the things i formulated seem interesting to tune well. The project runs at a slow speed though as it doesn't bring money.

I hope to achieve exactly what i expressed above as the problem that current algorithms are not capable of optimizing. In artificial intelligence, things like parameter optimization are still in the stone age so to speak, so there is a lot to invent there.

Vincent

towforce · Post by **towforce** » Mon Jan 05, 2009 1:03 pm

diep wrote:And yes navigation is similar to a chess search, just it has less nodes and no opponent.

If you are talking about road navigation, then this is the shortest path problem - see http://en.wikipedia.org/wiki/Shortest_path . Most road navigation problems probably could be solved to optimum - but in-car navigators and web-sites want to give people quick solutions, so the route that they give is a "reasonable" route, not an "optimised" one. However - I have personally noticed that newer sat-navs are giving much better routes than the older ones from 10 years ago.

As I have previously shown in this thread, to solve the TSP for 85900 towns (which has been done to a proven optimum), the search space is the 85900! permutations of towns, which is the equivalent of the search space in chess to depth of many tens of thousands of moves.

For anyone who hasn't realised yet, ILP can prove that a particular permutation of something is optimal without having to visit all the nodes. From a practical point of view, it can usually find optimum values relatively quickly - my personal experience is that, roughly speaking, an ILP will spend 5% of its time finding the optimal solution, and then 95% of its time proving that this solution is optimal.

Regarding the piece values - you are quite right. A good chess program would adjust them during the game - and this would be a non-starter for ILP. Let's say that we are looking 40 plies (20 moves) ahead: the objective function (LP's equivalent of an evaluation function) would be constant throughout the whole of the search. This is why I suggested that it put more emphasis on piece mobility and king safety, and less emphasis on piece values. But in general, one would be relying on sheer depth of search to overcome weaknesses in evaluation of positional nuances - so Vincent is absolutely right to be concerned about this issue. My opinion is that in a battle of depth-of-search vs positional nuances, depth of search would win handsomely.

You don't have the system time to do lineair optimizations in the manner as they get done now.

We would need to find out what works in chess. As I stated in my previous post, an LP model to play chess would be very large, which implies the usage of the delayed column generation technique - probably combined with symmetry breaking techniques (see OP in this thread for link). I remain confident that an ILP search could productively search a much larger portion of the chess tree than today's node-generation techniques.

Pradu · Post by **Pradu** » Mon Jan 05, 2009 1:52 pm

towforce wrote:my personal experience is that, roughly speaking, an ILP will spend 5% of its time finding the optimal solution, and then 95% of its time proving that this solution is optimal.

Have you taken a look at the Georgia Tech website for the 85,000 VLSI solution history? :
http://www.tsp.gatech.edu/pla85900/compute/compute.htm

Pradu · Post by **Pradu** » Mon Jan 05, 2009 1:53 pm

towforce wrote:I remain confident that an ILP search could productively search a much larger portion of the chess tree than today's node-generation techniques.

Why?

towforce · Post by **towforce** » Mon Jan 05, 2009 2:06 pm

Pradu wrote:
towforce wrote:I remain confident that an ILP search could productively search a much larger portion of the chess tree than today's node-generation techniques.
Why?

Because, whilst there is no denying that there is an absolutely massive overhead involved in constructing an LP model for a chess position and its next n moves (in many positions, current chess computers would have found the best move before the ILP model has even been written - still less parsed - and all that has to be done before solving can even begin!), once we have a working model, and we can get it to converge (!!!), the "magic" of linear programming will kick in and move towards the optimal value. Generating nodes, unless you prune them furiously (which runs the risk of missing something important) will simply result in a geometric expansion of the amount of work that must be done for every extra ply of search. Although generating nodes and evaluating them is many orders of magnitude quicker than generating and running an ILP model, the geometric expansion of their numbers will ultimately allow the ILP model to first overtake, and then move far, far ahead.

towforce · Post by **towforce** » Mon Jan 05, 2009 2:11 pm

Pradu wrote:
towforce wrote:my personal experience is that, roughly speaking, an ILP will spend 5% of its time finding the optimal solution, and then 95% of its time proving that this solution is optimal.
Have you taken a look at the Georgia Tech website for the 85,000 VLSI solution history? :
http://www.tsp.gatech.edu/pla85900/compute/compute.htm

Very interesting - thanks for the link! They used ILP with the branch and cut technique.

Linear Programming Will See MUCH Deeper Than Alpha-Beta

Re: Linear Programming Will See MUCH Deeper Than Alpha-Beta

Re: Linear Programming Will See MUCH Deeper Than Alpha-Beta

Re: Linear Programming Will See MUCH Deeper Than Alpha-Beta

Re: Linear Programming Will See MUCH Deeper Than Alpha-Beta

Re: Linear Programming Will See MUCH Deeper Than Alpha-Beta

Re: Linear Programming Will See MUCH Deeper Than Alpha-Beta

Re: Linear Programming Will See MUCH Deeper Than Alpha-Beta

Re: Linear Programming Will See MUCH Deeper Than Alpha-Beta

Re: Linear Programming Will See MUCH Deeper Than Alpha-Beta

Re: Linear Programming Will See MUCH Deeper Than Alpha-Beta