Blitz monster Twisted Logic

Mike S. · Post by **Mike S.** » Thu Oct 29, 2009 7:12 pm

1m+1s, D945/3.4 GHz
128 MB hash tables each
four short opening variations(*)
Arena 2.0.1, Vista 32 Bit
3/4- and R-R-5-piece Nalimovs
(for the opponents; 128 MB tbs cache)

Twisted Logic 090922   - Colossus 2008b           6.5 - 1.5    +5/-0/=3    81.25%
Twisted Logic 090922   - Fruit 051103             6.5 - 1.5    +6/-1/=1    81.25%
Twisted Logic 090922   - Ruffian 1.0.5            6.0 - 2.0    +5/-1/=2    75.00%
Twisted Logic 090922   - Spike 1.2                5.0 - 3.0    +4/-2/=2    62.50%

If I compare with CCRL 40/4m (singlecore/32-bit) ratings, this was a performance at Fritz 11 level! T.L. had only four losses in 32 games against good opponents.
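As a rough cross-check of the table above, the four match results can be pooled and turned into an Elo gap with the standard logistic formula. This is only a sketch: rating lists compute their numbers with their own tools and opponent ratings, and 32 games leave a wide error margin. The function names are illustrative.

```python
import math

# (wins, losses, draws) for Twisted Logic in each 8-game match listed above
matches = {
    "Colossus 2008b": (5, 0, 3),
    "Fruit 051103": (6, 1, 1),
    "Ruffian 1.0.5": (5, 1, 2),
    "Spike 1.2": (4, 2, 2),
}

def score_fraction(wins, losses, draws):
    """Fraction of available points scored (win = 1, draw = 0.5)."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games

def elo_difference(fraction):
    """Elo gap implied by a score fraction under the logistic model."""
    return -400.0 * math.log10(1.0 / fraction - 1.0)

total_w = sum(w for w, _, _ in matches.values())
total_l = sum(l for _, l, _ in matches.values())
total_d = sum(d for _, _, d in matches.values())
overall = score_fraction(total_w, total_l, total_d)

print(f"losses: {total_l}, score: {overall:.2%}, "
      f"Elo gap vs. average opponent: {elo_difference(overall):+.0f}")
```

The pooled score is 24 out of 32, i.e. roughly 190 Elo above the average of the four opponents under this model.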

A good engine which we should keep in mind.

*) the openings:

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "E15"]
[PlyCount "7"]

1. d4 Nf6 2. c4 e6 3. Nf3 b6 4. g3 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "B54"]
[PlyCount "7"]

1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "E61"]
[PlyCount "8"]

1. Nf3 Nf6 2. d4 g6 3. c4 Bg7 4. Nc3 O-O *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "C77"]
[PlyCount "8"]

1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 *

Mike S. · Post by **Mike S.** » Thu Oct 29, 2009 11:34 pm

An attempt to verify the good performance against stronger opponents:

Twisted Logic 090922   - Rybka 2.3.2a             0.5 - 7.5    +0/-7/=1    6.25%
Twisted Logic 090922   - Stockfish 1.51-si        0.5 - 7.5    +0/-7/=1    6.25%
Twisted Logic 090922   - Rybka v1.2n.w32          4.5 - 3.5    +4/-3/=1    56.25%
Twisted Logic 090922   - Bright 0.4a-si           3.5 - 4.5    +2/-3/=3    43.75%

(same conditions as above)

Relative to CCRL ratings, this second performance is much lower. Combining both parts, I come to the conclusion that T.L. performed at about the blitz strength level of Bright 0.4a (single core). That would not be bad either... but I see that Bright is rated 90 Elo higher in the UEL.

Anyway, I think Twisted Logic is one of the strongest freeware engines (top-10).

Edsel Apostol · Post by **Edsel Apostol** » Fri Oct 30, 2009 5:41 am

Hi Mike,

It is even stronger on ultrafast 10 sec + 0.1 sec time control, stronger than Rybka 1.2f and Naum 3 on our tests, especially the 64 bit version. Though the performance in longer time control is not that good.

The ultimate nemesis of TL is Glaurung/Stockfish. It simply clobbered TL.

Kempelen · Post by **Kempelen** » Fri Oct 30, 2009 8:49 am

Edsel Apostol wrote:Hi Mike,

It is even stronger on ultrafast 10 sec + 0.1 sec time control, stronger than Rybka 1.2f and Naum 3 on our tests, especially the 64 bit version. Though the performance in longer time control is not that good.

The ultimate nemesis of TL is Glaurung/Stockfish. It simply clobbered TL.

Hi Edsel,

How do you configure games with time units smaller than 1 second? I have been looking at the WinBoard help, and the smallest time unit there is 1 second. What GUI are you using?

thx

Edsel Apostol · Post by **Edsel Apostol** » Fri Oct 30, 2009 9:01 am

Kempelen wrote:
Edsel Apostol wrote:Hi Mike,

It is even stronger on ultrafast 10 sec + 0.1 sec time control, stronger than Rybka 1.2f and Naum 3 on our tests, especially the 64 bit version. Though the performance in longer time control is not that good.

The ultimate nemesis of TL is Glaurung/Stockfish. It simply clobbered TL.
Hi Edsel,

How do you configure games with time units smaller than 1 second? I have been looking at the WinBoard help, and the smallest time unit there is 1 second. What GUI are you using?

thx

Hi Fermin,

I'm using cutechess-cli, a great tool for testing. It is a command-line tool; its GUI is still under development.

http://talkchess.com/forum/viewtopic.php?t=27024
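For anyone wanting to try the 10 sec + 0.1 sec time control mentioned earlier, here is a minimal sketch that builds a cutechess-cli command line from Python. The option names (`-engine`, `-each`, `proto=`, `tc=`, `-rounds`, `-pgnout`) follow recent cutechess-cli releases and may differ from the 2009 build, so check the tool's help output first. The engine paths and the output file name are placeholders, not the setup used in the tests above.

```python
import shlex

def cutechess_command(engine_a, engine_b, tc="10+0.1", rounds=100,
                      pgn_out="games.pgn"):
    """Build an argument list for a two-engine cutechess-cli match.

    tc uses the "seconds+increment" form, so "10+0.1" means
    10 seconds per game plus 0.1 seconds per move.
    """
    return [
        "cutechess-cli",
        "-engine", f"cmd={engine_a}",
        "-engine", f"cmd={engine_b}",
        "-each", "proto=uci", f"tc={tc}",
        "-rounds", str(rounds),
        "-pgnout", pgn_out,
    ]

# Placeholder engine paths; replace with the real binaries.
cmd = cutechess_command("./twistedlogic", "./stockfish")
print(shlex.join(cmd))
# To actually run the match: subprocess.run(cmd, check=True)
```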

mcostalba · Post by **mcostalba** » Fri Oct 30, 2009 9:42 am

Edsel Apostol wrote:Hi Mike,

It is even stronger on ultrafast 10 sec + 0.1 sec time control, stronger than Rybka 1.2f and Naum 3 on our tests, especially the 64 bit version. Though the performance in longer time control is not that good.

The ultimate nemesis of TL is Glaurung/Stockfish. It simply clobbered TL.

Being strong at blitz is a necessary evil, a consequence of how engines are tested during development. Unfortunately, most of us don't have the hardware resources to properly test each change at medium or long time controls, so we use fast ones more often than not. This can bias the engine toward being stronger at blitz than at long time controls.

Stockfish is clearly better at blitz than at long time controls, as you can see from the CEGT list. This is something I _really_ would like to fix, but it is difficult because I am forced to test at short time controls during development, and you need a lot of experience to recognize that a change which performs only so-so at short time controls could still be worth having because it is strong at long ones.

For a very few engines, like Zappa, the opposite is true. That's the reason I find them very interesting; a pity I don't have access to the sources.

Regarding TL, a possible reason for the results against SF _could_ be that there are similarities of ideas; in that case the stronger engine outperforms the weaker one. It is like testing the old and the new version of the same engine: the new version outperforms the old one by a much wider margin than it shows against other engines. Another example is a patch we introduced before 1.5.1 that was very good against Rybka, but not so much against other engines. Now, reading the Ippolit sources, I have seen that a very similar idea was already implemented in Rybka, so perhaps we fixed a hole.

I already said I consider TL one of the strongest free engines and one of the most interesting in the long term, mainly because I have read some of the author's posts here and there and have made up my mind that he is a talented guy.

Edsel Apostol · Post by **Edsel Apostol** » Fri Oct 30, 2009 11:23 am

mcostalba wrote:
Edsel Apostol wrote:Hi Mike,

It is even stronger on ultrafast 10 sec + 0.1 sec time control, stronger than Rybka 1.2f and Naum 3 on our tests, especially the 64 bit version. Though the performance in longer time control is not that good.

The ultimate nemesis of TL is Glaurung/Stockfish. It simply clobbered TL.
Being strong at blitz is a necessary evil, a consequence of how engines are tested during development. Unfortunately, most of us don't have the hardware resources to properly test each change at medium or long time controls, so we use fast ones more often than not. This can bias the engine toward being stronger at blitz than at long time controls.

Stockfish is clearly better at blitz than at long time controls, as you can see from the CEGT list. This is something I _really_ would like to fix, but it is difficult because I am forced to test at short time controls during development, and you need a lot of experience to recognize that a change which performs only so-so at short time controls could still be worth having because it is strong at long ones.

I agree. The latest public version of TL has been optimized with ultrafast testing using cutechess-cli, and it is very strong at that time control. However, I noticed that some search and eval settings are strong at fast time controls and weak at longer ones, and vice versa. I don't know what to do about that problem for now. I think I will have to find a compromise time control that is fast enough to allow more games but slow enough that the tuning is not biased toward any particular time control.

For a very few engines, like Zappa, the opposite is true. That's the reason I find them very interesting; a pity I don't have access to the sources.

The 20090105 version of TL also is stronger in longer time control than in blitz. This is based on the CEGT list. This version has less conservative pruning and reduction than the latest version. I think that already is a clue.

Regarding TL, a possible reason for the results against SF _could_ be that there are similarities of ideas; in that case the stronger engine outperforms the weaker one. It is like testing the old and the new version of the same engine: the new version outperforms the old one by a much wider margin than it shows against other engines. Another example is a patch we introduced before 1.5.1 that was very good against Rybka, but not so much against other engines. Now, reading the Ippolit sources, I have seen that a very similar idea was already implemented in Rybka, so perhaps we fixed a hole.

You are right that some of the ideas in Stockfish are similar to those in TL. The biggest Stockfish influence on TL is the king attacks implementation.

I already said I consider TL one of the strongest free engines and one of the most interesting in the long term, mainly because I have read some of the author's posts here and there and have made up my mind that he is a talented guy.

Thanks Marco. I also have the same opinion of the Stockfish team.

mcostalba · Post by **mcostalba** » Fri Oct 30, 2009 11:40 am

Edsel Apostol wrote: The 20090105 version of TL also is stronger in longer time control than in blitz. This is based on the CEGT list. This version has less conservative pruning and reduction than the latest version. I think that already is a clue.

Normally evaluation terms are quite immune to time control, but not always: king safety and passed pawns, for instance, seem to be very time-control dependent.

Futility pruning remains a big mystery to me.

We have tried countless tweaks and never got anything better than what we have now. We don't believe that our futility parameters are already balanced; we simply can't find a way to tune them correctly.

The problem I have been seeing in recent weeks is that you end up with circular strength comparisons between versions after a tournament.

For instance, suppose you have three versions A, B and C, each with a slightly different futility setting. It is unfortunately common to end up, after a tournament, with A > B > C > A, so that you get nowhere.
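The A > B > C > A situation can be spotted mechanically from a table of head-to-head scores. The sketch below is only an illustration: the function name and the scores are invented, and it says nothing about whether a given cycle is real or just statistical noise from too few games.

```python
from itertools import permutations

def find_cycles(scores):
    """Return triples (a, b, c) where a beat b, b beat c, and c beat a.

    scores[(x, y)] is x's score fraction in the x-vs-y match; each pairing
    needs to be stored only once, in either order.
    """
    def beats(x, y):
        if (x, y) in scores:
            return scores[(x, y)] > 0.5
        if (y, x) in scores:
            return scores[(y, x)] < 0.5
        return False

    players = sorted({p for pair in scores for p in pair})
    cycles = []
    for a, b, c in permutations(players, 3):
        # Report each cycle once, starting from its alphabetically first member.
        if a == min(a, b, c) and beats(a, b) and beats(b, c) and beats(c, a):
            cycles.append((a, b, c))
    return cycles

# Hypothetical results for three versions, invented for illustration.
example = {("A", "B"): 0.53, ("B", "C"): 0.52, ("C", "A"): 0.51}
print(find_cycles(example))
```

With margins this small, a cycle is exactly what random variation would be expected to produce now and then, which is one way to read the "you get nowhere" complaint.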

Uri Blass · Post by **Uri Blass** » Fri Oct 30, 2009 4:55 pm

mcostalba wrote:
Edsel Apostol wrote:Hi Mike,

It is even stronger on ultrafast 10 sec + 0.1 sec time control, stronger than Rybka 1.2f and Naum 3 on our tests, especially the 64 bit version. Though the performance in longer time control is not that good.

The ultimate nemesis of TL is Glaurung/Stockfish. It simply clobbered TL.
Being strong at blitz is a necessary evil, a consequence of how engines are tested during development. Unfortunately, most of us don't have the hardware resources to properly test each change at medium or long time controls, so we use fast ones more often than not. This can bias the engine toward being stronger at blitz than at long time controls.

Stockfish is clearly better at blitz than at long time controls, as you can see from the CEGT list. This is something I _really_ would like to fix, but it is difficult because I am forced to test at short time controls during development, and you need a lot of experience to recognize that a change which performs only so-so at short time controls could still be worth having because it is strong at long ones.

For a very few engines, like Zappa, the opposite is true. That's the reason I find them very interesting; a pity I don't have access to the sources.

Movei is also relatively stronger at long time controls, based on the CCRL list, so maybe it would be interesting for you to see Movei's code. (I did not offer to send you the code because the program is relatively weak and the code is relatively ugly, with many irrelevant comments that I wrote at some point and did not delete later when they stopped being relevant.)

Maybe the reason Movei is relatively stronger at long time controls is that I never tested it seriously in games, but I did not expect it to be stronger there, because I guess its move ordering is relatively poor and I think better move ordering is important at long time controls.

Note that I do not believe Movei is stronger at long time controls relative to Stockfish, in spite of the smaller gap in the rating list; my guess is that diminishing returns are the reason for the smaller gap.

In other words, I do not believe Movei would need a smaller time handicap to score 50% against Stockfish at long time controls. (I remember that I tried this in the past against Rybka and found that Movei probably needs a bigger time advantage to score 50% against Rybka 2.3.2a at long time controls.)

Uri

Edsel Apostol · Post by **Edsel Apostol** » Fri Oct 30, 2009 5:24 pm

mcostalba wrote:
Edsel Apostol wrote: The 20090105 version of TL also is stronger in longer time control than in blitz. This is based on the CEGT list. This version has less conservative pruning and reduction than the latest version. I think that already is a clue.
Normally evaluation terms are quite immune to time control, but not always: king safety and passed pawns, for instance, seem to be very time-control dependent.

Futility pruning remains a big mystery to me. We have tried countless tweaks and never got anything better than what we have now. We don't believe that our futility parameters are already balanced; we simply can't find a way to tune them correctly.

The problem I have been seeing in recent weeks is that you end up with circular strength comparisons between versions after a tournament.

For instance, suppose you have three versions A, B and C, each with a slightly different futility setting. It is unfortunately common to end up, after a tournament, with A > B > C > A, so that you get nowhere.

I can vouch for king safety being time-control dependent. TL has settings like King Attacks Aggression Level, whose default is 4. When set to a higher value, 11 for example, it performs better than the default at ultrafast time controls but is not that effective at longer ones.
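Assuming TL exposes this setting as a UCI option under exactly the name quoted above (an assumption; the engine's reply to the `uci` command lists the real option names), the value could be changed with a standard `setoption` line. The helper below is hypothetical and only formats that line.

```python
def uci_setoption(name, value):
    """Format a UCI 'setoption' command line for the given option."""
    return f"setoption name {name} value {value}"

# Option name taken from the post; whether the engine spells it this way
# is an assumption.
print(uci_setoption("King Attacks Aggression Level", 11))
```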

Futility pruning in TL is very aggressive, as it prunes up to 9 plies. It is also not well tuned, due to lack of testing resources. I think futility pruning has a limit: the algorithm can only achieve so much, and once you have reached that stage, no tweak will make it any better.
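For readers unfamiliar with the technique, here is a generic textbook-style sketch of a futility pruning test. It is not TL's or Stockfish's actual code: only the 9-ply limit comes from the post, while the margin formula and the function names are made up for illustration.

```python
MAX_FUTILITY_DEPTH = 9  # the post says TL prunes up to 9 plies

def futility_margin(depth):
    """Hypothetical margin in centipawns, growing with remaining depth."""
    return 100 + 75 * depth

def can_futility_prune(depth, static_eval, alpha, in_check, is_tactical_move):
    """True if a quiet move may be skipped because it is unlikely to raise alpha."""
    # Never prune when in check or when the move is a capture, promotion
    # or check: those can change the evaluation by far more than the margin.
    if in_check or is_tactical_move:
        return False
    # Too far from the leaves: the static evaluation is not trusted here.
    if depth > MAX_FUTILITY_DEPTH:
        return False
    # Even with an optimistic bonus the position stays at or below alpha.
    return static_eval + futility_margin(depth) <= alpha

# A position 4 pawns below alpha at depth 2 is pruned; one 1 pawn below is not.
print(can_futility_prune(2, -400, 0, False, False))
print(can_futility_prune(2, -100, 0, False, False))
```

Tuning in this picture means choosing the depth limit and the margin per depth, which are the kind of parameters the posts above describe as hard to balance.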

About the circular strength comparison: I think it is not a good idea to play the versions against each other when tuning search settings. In my opinion, that is only effective when tuning eval settings.
