LudiBuda wrote:
I couldn't agree less with you on this.
Evaluation of the engine is of almost no importance for the ELO strength. Just try to modify Ivanhoe by putting random weights for the evaluation terms. You will still have a super strong engine.
Evaluation has great influence on the playing style, but let's not kid ourselves. Most people care about ELO and ELO only.
Don wrote:
I could hardly fail to disagree with you less.
Graham Banks wrote:
I wonder which engine Alex is the author of.

He should team up with Fernando to help improve the Moron engine by adding random weights to all zero features.
When will we see HOUDINI in official tournaments?
Moderator: Ras
-
Don
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: When will we see HOUDINI in official tournaments?
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
-
Uri Blass
- Posts: 11209
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: When will we see HOUDINI in official tournaments?
LudiBuda wrote:
I couldn't agree less with you on this.
Evaluation of the engine is of almost no importance for the ELO strength. Just try to modify Ivanhoe by putting random weights for the evaluation terms. You will still have a super strong engine.
Evaluation has great influence on the playing style, but let's not kid ourselves. Most people care about ELO and ELO only.

The question is how you define a super strong engine.
Suppose you get an engine that is X Elo weaker than Ivanhoe through evaluation modification, for some X (X may be 20, 40, 100 or 200).
Would you still call it a super strong engine in some of those cases, and if so, up to what value of X?
I also wonder what the correlation between gambitTiger and Tiger14.0 is, and whether their similarity is more than 60%.
I remember that the programmer said he was surprised that gambitTiger was so strong, and that he learned it is possible to make significant changes to the evaluation that change the playing style without reducing the playing strength much.
Uri
-
Don
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: When will we see HOUDINI in official tournaments?
LudiBuda wrote:
I couldn't agree less with you on this.
Evaluation of the engine is of almost no importance for the ELO strength. Just try to modify Ivanhoe by putting random weights for the evaluation terms. You will still have a super strong engine.
Evaluation has great influence on the playing style, but let's not kid ourselves. Most people care about ELO and ELO only.
Uri Blass wrote:
The question is how you define a super strong engine.
Suppose you get an engine that is X Elo weaker than Ivanhoe through evaluation modification, for some X (X may be 20, 40, 100 or 200).
Would you still call it a super strong engine in some of those cases, and if so, up to what value of X?
I also wonder what the correlation between gambitTiger and Tiger14.0 is, and whether their similarity is more than 60%.
I remember that the programmer said he was surprised that gambitTiger was so strong, and that he learned it is possible to make significant changes to the evaluation that change the playing style without reducing the playing strength much.
Uri

None of the anecdotes resonate much with me because random changes hurt Komodo's playing strength. We have changed the style somewhat with carefully engineered changes to many terms in the past - but nothing random.
Very few people really understand how to measure program strength anyway; this forum is littered with posts where people have made some change and run 50 game matches to "prove" it.
Don
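To put a rough number on the 50-game problem, here is a minimal Python sketch (not anyone's actual test harness) that turns a match result into an Elo estimate with an approximate 95% interval. It assumes the usual logistic Elo model and a normal approximation for the score; the function names and the sample results are made up for illustration.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction under the logistic model."""
    # Clamp so that 0% or 100% results do not divide by zero or take log(0).
    score = min(max(score, 1e-6), 1 - 1e-6)
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, estimate, high) Elo for a match, using a normal approximation."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game is worth 1, 0.5 or 0 points.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(score - z * se),
            elo_from_score(score),
            elo_from_score(score + z * se))


# Hypothetical results: the same 55% score over 50 games and over 5000 games.
small = elo_interval(20, 15, 15)
large = elo_interval(2000, 1500, 1500)
print("50 games:   %.0f .. %.0f .. %.0f" % small)
print("5000 games: %.0f .. %.0f .. %.0f" % large)
```

With 20 wins, 15 draws and 15 losses the interval spans zero, so such a match cannot tell an improvement from a regression; the same score ratio over 5000 games gives an interval that stays above zero.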
-
Uri Blass
- Posts: 11209
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: When will we see HOUDINI in official tournaments?
I believe that random changes hurt Komodo's playing strength, and the same goes for other top programs. The question is how much, and whether people can start with something that is 50 or 100 Elo weaker than Ivanhoe, obtained only by random modifications of Ivanhoe's evaluation, and escape the similarity test.
-
Don
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: When will we see HOUDINI in official tournaments?
Uri Blass wrote:
I believe that random changes hurt Komodo's playing strength, and the same goes for other top programs. The question is how much, and whether people can start with something that is 50 or 100 Elo weaker than Ivanhoe, obtained only by random modifications of Ivanhoe's evaluation, and escape the similarity test.

If they have to give up 50 - 100 ELO it is a disincentive to cheat, but a cheater by his very nature doesn't want to give up any ELO. I should qualify that. Some cheat just because they cannot program and will be happy with a pretty weak program, but the more typical case I call the "Rosie Ruiz" style cheater.
But I don't think this weakening can easily fool the test, because from Doch through Komodo we gained well over 50 ELO from evaluation improvements alone, and that did not fool the test. This was a lot of weight changes and added terms.
It's surprising to me how resilient that test is to trickery - but I don't think we yet understand its limits. It seems likely to me that it will have some weaknesses.
-
LudiBuda
- Posts: 76
- Joined: Sat Mar 03, 2012 7:53 pm
Re: When will we see HOUDINI in official tournaments?
Hope you are not making an excuse to use Ivanhoe search 'ideas', because 'evaluation is what matters'.
-
Don
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: When will we see HOUDINI in official tournaments?
LudiBuda wrote:
Hope you are not making an excuse to use Ivanhoe search 'ideas', because 'evaluation is what matters'.

Please do not start this worn out rant again.
-
Uri Blass
- Posts: 11209
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: When will we see HOUDINI in official tournaments?
Uri Blass wrote:
I believe that random changes hurt Komodo's playing strength, and the same goes for other top programs. The question is how much, and whether people can start with something that is 50 or 100 Elo weaker than Ivanhoe, obtained only by random modifications of Ivanhoe's evaluation, and escape the similarity test.
Don wrote:
If they have to give up 50 - 100 ELO it is a disincentive to cheat, but a cheater by his very nature doesn't want to give up any ELO. I should qualify that. Some cheat just because they cannot program and will be happy with a pretty weak program, but the more typical case I call the "Rosie Ruiz" style cheater.
But I don't think this weakening can easily fool the test, because from Doch through Komodo we gained well over 50 ELO from evaluation improvements alone, and that did not fool the test. This was a lot of weight changes and added terms.
It's surprising to me how resilient that test is to trickery - but I don't think we yet understand its limits. It seems likely to me that it will have some weaknesses.

I think that one of the tricks to fool the test may be to change the move generator.
Imagine that Be2 and Bc2 have exactly the same score at every depth and are the two best moves.
The choice of whether to play Be2 or Bc2 may then depend on the move generator.
If the move generator generates Be2 first, then Be2 may become the PV, and if it generates Bc2 first, then Bc2 may become the PV.
Programmers who do not want to change the move generator could instead make the program change its mind at depth 1 when a later move has at least the same score, rather than only when it has a better score, while leaving the search at greater depths unchanged.
I wonder if you tested changes in the move generator to see whether they help Komodo or not.
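The Be2/Bc2 case above can be sketched in a few lines of Python. This is only a flat "pick the best root move" loop under assumed scores, not a real alpha-beta search and not code from any engine mentioned here; pick_best, accept_ties and the score table are hypothetical names used for illustration.

```python
def pick_best(moves, score_of, accept_ties=False):
    """Return the chosen move from `moves`, examined in generation order.

    With accept_ties=False a later move must score strictly better (>),
    so the first generated move wins a tie. With accept_ties=True an
    equal score (>=) is enough, so the last generated move wins a tie.
    """
    best_move, best_score = None, float("-inf")
    for move in moves:
        score = score_of(move)
        if score > best_score or (accept_ties and score == best_score):
            best_move, best_score = move, score
    return best_move


# Assumed scores in centipawns: Be2 and Bc2 are tied for best.
scores = {"Be2": 30, "Bc2": 30, "a3": 10}

print(pick_best(["Be2", "Bc2", "a3"], scores.get))                    # Be2
print(pick_best(["Bc2", "Be2", "a3"], scores.get))                    # Bc2
print(pick_best(["Be2", "Bc2", "a3"], scores.get, accept_ties=True))  # Bc2
```

Either reordering the generated moves or flipping the comparison changes the move played without changing any score, which is why such a change could alter a move-matching statistic while costing no strength.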
-
CRoberson
- Posts: 2096
- Joined: Mon Mar 13, 2006 2:31 am
- Location: North Carolina, USA
Re: When will we see HOUDINI in official tournaments?
LudiBuda wrote:
I couldn't agree less with you on this.
Evaluation of the engine is of almost no importance for the ELO strength. Just try to modify Ivanhoe by putting random weights for the evaluation terms. You will still have a super strong engine.
Evaluation has great influence on the playing style, but let's not kid ourselves. Most people care about ELO and ELO only.

Wow, talk about not understanding your experiment.
About 3 or 4 years ago, I devised a test to reveal the importance of modern eval vs search. The results were drastic and I reported them in the tech forum here. Bob Hyatt independently verified the results with a greater number of games for accuracy and the results were very similar.
In the end, an excellent eval vs only piece counting is worth 700 to 1000 Elo or more. The piece counting program (all else identical) scored only 1% vs the full version of the program. The reduced eval program gained 3 ply in search. So, it gained Elo. Add that gain to the 700 performance difference and you see that eval is paramount.
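As a sanity check on the 1% figure, here is a small Python sketch that converts a score fraction into an Elo difference. It assumes the standard logistic Elo model; the function name is made up, and at scores this lopsided the result is only a rough indication.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction (0 < score < 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


# A 1% score against the full program, as reported in the post above.
print(round(elo_from_score(0.01)))  # roughly -798
```

A 1% score therefore corresponds to a deficit of about 800 Elo under this model, which sits inside the 700 to 1000 range quoted above.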
-
LudiBuda
- Posts: 76
- Joined: Sat Mar 03, 2012 7:53 pm
Re: When will we see HOUDINI in official tournaments?
What are you talking about? Did you read my post at all?
What I am suggesting is to take Ivanhoe, run the test games against 10 opponents, then put a random weight between, let's say, 0 and 20 on each eval term and run the test again.
Your experiment is bogus. Try doing the same test for the search. Have a brute force search with the state of the art eval and see what you get.
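The weight randomization proposed above might look something like this Python sketch. It assumes a simple linear evaluation (a sum of weight times feature value); the term names, the tuned weights and the feature values are hypothetical and are not Ivanhoe's actual terms, and the post does not say whether material values would be randomized too.

```python
import random


def randomize_weights(weights, low=0, high=20, seed=None):
    """Return a copy of `weights` with every term set to a random integer in [low, high]."""
    rng = random.Random(seed)  # seeded so that a test run can be repeated
    return {name: rng.randint(low, high) for name in weights}


def evaluate(features, weights):
    """Linear evaluation: sum of weight * feature value over all terms."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical tuned weights and the feature values of one position.
tuned = {"mobility": 4, "king_safety": 12, "passed_pawn": 18}
features = {"mobility": 7, "king_safety": -2, "passed_pawn": 1}

scrambled = randomize_weights(tuned, seed=1)
print(evaluate(features, tuned), evaluate(features, scrambled))
```

Running the same set of test games with tuned and then with scrambled weights, search unchanged, would give the two results being compared.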