Nebula 2.0 v Djinn 0.979- How Far Has Nebula Actually Come?

Dragan
Posts: 108
Joined: Mon Aug 06, 2012 1:55 pm

Re: Djinn v Nebula- A Pre-Get-Some-Sleep 31 Game Update!

Post by Dragan »

I am so intrigued by Djinn's much better performance here compared to bullet that I just started a 100-game Silver suite match at 40/5min. It should be done in about 6 hours, since I run 5 concurrent games on my 6-core machine. I will post the result here for comparison.
Dragan
Posts: 108
Joined: Mon Aug 06, 2012 1:55 pm

Re: Djinn v Nebula- A Pre-Get-Some-Sleep 31 Game Update!

Post by Dragan »

Ok. My test is finished and the results are as hinted by Tom.

i7-3930 @ 4.2 GHz , Hash 128MB , Silver suite (100 games)

40/5min : +49 -16 =35 +119 ELO for Nebula 2.0
Average depth reached in middlegame was 13.17 Nebula vs 14.09 Djinn and average depth in endgame was 14.10 Nebula vs 14.38 Djinn

40/5sec : +68 -11 =21 +225 ELO for Nebula 2.0
Average depth in middlegame was 8.63 Nebula vs 8.17 Djinn and in endgame 11.09 Nebula vs 10.46 Djinn
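
For anyone who wants to check the two Elo figures above, they follow from the usual logistic formula applied to the match score. This is only a minimal sketch; the function name `elo_diff` is made up for illustration and is not taken from any tool used in this thread.

```python
import math


def elo_diff(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a match score (logistic model).

    score = (wins + draws / 2) / games
    diff  = 400 * log10(score / (1 - score))
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        # A 0% or 100% score has no finite Elo difference.
        raise ValueError("Elo difference is undefined for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))


print(round(elo_diff(49, 16, 35)))  # 40/5min result -> 119
print(round(elo_diff(68, 11, 21)))  # 40/5sec result -> 225
```

Both values match the numbers reported for Nebula 2.0.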

This shows that Djinn doesn't handle bullet well. Because xboard doesn't send millisecond time info to Djinn (the protocol reports clock times in centiseconds), it gets less thinking time in bullet, so it basically gives Nebula time odds.
It could also be that Djinn isn't polling input frequently enough during bullet games. Just something for Tom to look at.
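
To put rough numbers on the time problem: the xboard `time N` command gives the engine's clock in centiseconds, and at 40/5sec there are only about 125 ms per move to begin with. The sketch below is not Djinn's actual time management; the helper names and the fixed `overhead_ms` allowance (standing in for GUI lag and input-polling latency) are assumptions made only for illustration.

```python
def parse_time_command(line: str) -> int:
    """Convert an xboard 'time N' line (N in centiseconds) to milliseconds."""
    command, value = line.split()
    if command != "time":
        raise ValueError(f"not a time command: {line!r}")
    return int(value) * 10


def move_budget_ms(remaining_ms: int, moves_to_go: int, overhead_ms: int = 20) -> int:
    """Split the remaining clock evenly over the moves left in the session.

    overhead_ms is a hypothetical per-move allowance for communication and
    polling delays; it is reserved up front so the engine does not flag.
    """
    usable = max(remaining_ms - overhead_ms * moves_to_go, 0)
    return usable // moves_to_go


remaining = parse_time_command("time 500")           # 5 seconds on the clock
print(move_budget_ms(remaining, 40, overhead_ms=0))  # 125 ms per move
print(move_budget_ms(remaining, 40))                 # 105 ms per move
```

Under these assumptions a 20 ms per-move allowance already costs 16% of the thinking time at 40/5sec, while at 40/5min the same allowance is well under 1%.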

Cheers, Dragan
Dragan
Posts: 108
Joined: Mon Aug 06, 2012 1:55 pm

Re: Djinn v Nebula- A Pre-Get-Some-Sleep 31 Game Update!

Post by Dragan »

I used the wrong parameters in my utility that calculates average depths. The correct values are:

Nebula vs Djinn avg depths
40/5min MG avg depths 13.24 vs 15.17 and EG avg depths 16.41 vs 17.57

40/5sec MG avg depths 8.63 vs 8.23 and EG avg depths 11.91 vs 11.67

So the difference in depths reached by Djinn is even more extreme.
Tom Likens
Posts: 303
Joined: Sat Apr 28, 2012 6:18 pm
Location: Austin, TX

Re: Djinn v Nebula- A Pre-Get-Some-Sleep 31 Game Update!

Post by Tom Likens »

Dragan,

Thanks for running this. I really need to address the bullet game time issue. I have some thoughts on it, but I haven't tried anything yet.

I really think the new Nebula is too strong for Djinn (at least for now). My testing shows Nebula 2.0 to be at about the same strength level as Fruit 2.1, which is about 40 Elo better than the best Djinn I've been able to create so far. I think it's something you can be really proud of; it plays an interesting game.

regards,
--tom
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: Djinn v Nebula- A Pre-Get-Some-Sleep 31 Game Update!

Post by geots »

Each to his own, Dragan. But personally, I would never test at milliseconds, because I feel the results are worse than lacking in quality for most engines. But I understand, because we are in the middle of this "30,000 games or shut up" mentality. Maybe it is the position of the moon...
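
For context on the sample-size argument, here is a rough sketch of how wide the error bars on a 100-game match are. It assumes the games are independent and uses a normal approximation on the per-game score, which is only approximately true for a suite played with both colors; the function names are made up for illustration.

```python
import math


def score_to_elo(score: float) -> float:
    """Logistic conversion from expected score to Elo difference."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_interval(wins: int, losses: int, draws: int, z: float = 1.96):
    """Approximate confidence interval (95% for z=1.96) on the Elo difference.

    Assumes independent games; the interval is built on the mean score
    and then converted to Elo at both ends.
    """
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the score (1, 0.5 or 0) around the mean.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    se = math.sqrt(var / n)
    # Clamp so the logistic conversion stays finite.
    lo = min(max(mean - z * se, 1e-6), 1.0 - 1e-6)
    hi = min(max(mean + z * se, 1e-6), 1.0 - 1e-6)
    return score_to_elo(lo), score_to_elo(hi)


low, high = elo_interval(49, 16, 35)  # the 40/5min result
print(f"{low:.0f} to {high:.0f} Elo")
```

For the 40/5min result the interval spans well over 100 Elo, so a 100-game match shows who is ahead but not by how much.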


Best,
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: Djinn v Nebula- A Pre-Get-Some-Sleep 31 Game Update!

Post by geots »

Tom, here is a unique way to address the bullet game issue: trash it. Just a thought...


Personally, I think your thoughts on the relative strength of Djinn as compared to Nebula are way off. Way, way off. You have done great work, and Dragan has done great work. But this huge difference in strength you come up with... I just don't see it.


Best,
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Nebula v Djinn- And Now We Have A 60-Game UPDATE!

Post by geots »

60 games down, and it has turned into one hell of a battle, as was to be expected. 40 games to go, and the way these 2 engines are playing- I wish it was 140. I am enjoying this one!




Inspiron 620 Intel i5-4 True Cores
Fritz 11 gui
1CPU/64-bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
Perfect 2012b.ctg w/12-move limit
40/11 Repeating (Benched to adapt to 40/20)
Match=100 games




Nebula 2.0 pcnt x64    +17/-17/=15
Djinn 0.979 x64        +17/-17/=15


At this point, Vegas would have the odds at "pick-em".



We shall be back,

george
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: Nebula v Djinn- Please Note Error In My Update!

Post by geots »

Please excuse the error in my 60-game update above!! The draw number should be 26, not 15, so both engines stand at +17/-17/=26. I am very, very sorry.


george
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Djinn v Nebula- What A Difference 9 Games Can Make!

Post by geots »

Now we are through 69 games, and as you can see the tables have turned a bit. Djinn has come forward and taken its first lead of the match. This is my kinda match for sure!





Inspiron 620 Intel i5-4 True Cores
Fritz 11 gui
1CPU/64-bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
Perfect 2012b.ctg w/12-move limit
40/11 Repeating (Benched to adapt to 40/20)
Match=100 games




Djinn 0.979 x64       +21/-19/=29
Nebula 2.0 pcnt x64   +19/-21/=29 


And still some great chess left to play!





george
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Another Pre-Get-Rest Update At the 71-Game Mark!

Post by geots »

Heading to get some rest, and we have completed 71 games, at which point Nebula has come back to tie things up again! Unbelievable!




Inspiron 620 Intel i5-4 True Cores
Fritz 11 gui
1CPU/64-bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
Perfect 2012b.ctg w/12-move limit
40/11 Repeating (Benched to adapt to 40/20)
Match=100 games




Djinn and Nebula- +21/-21/=29


Damn if I am not worn out like it is me playing and I need the rest. What can I say- I have just gotten caught up in this one.



Later,

george