I understand that there is only one position duplicated between our two test suites. Maybe for a larger test suite we could officially combine them, with one additional position added. There is a big philosophical difference between our two test sets, however, so maybe people could benefit from the difference more than from just having more of the same from either of us. Just a thought! Let me know.

Albert Silver wrote:
It's cool. Bear in mind it could have been some weird fluke. The new suite is really only a modification/revision of my first version, and over thousands and thousands of gauntlets I have come to view it as very reliable and comprehensive. I do plan to try to make an expanded 100-position version in the near future, specifically for these large testing situations, but the new suite has shown itself to be very solid so far.

BubbaTough wrote:
OK,

Albert Silver wrote:
It's a deal. Just so you know, I won't limit the test to one opponent. For a 50 Elo diff, I would test against at least 2, for both the before and after, since there is nothing uncommon about gaining a couple of % against one opponent, only to lose it against another.
I couldn't find a version that had been directly trained, but I identified a version a few iterations upstream from that, which I figured would have enough in common with the trained version to perform much stronger on the Silver suite than on other positions. I then ran 100 games each with 4 different programs. I also ran 150 other positions using the same time control and the same programs for comparison.
The result: I failed to show any difference. I was mildly surprised by this, and I still believe 50 positions is not enough to prevent over-training, but I must admit I have no evidence for this assertion on hand (besides memory, which for me is always a bit risky to rely on). When I get around to rewriting an automated tuning system for Hannibal, I may run another experiment.
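For anyone who wants to run the same kind of comparison, here is a rough sketch of how two sets of results could be checked against each other. It is not the procedure used above. It assumes the usual logistic Elo formula and a simple normal approximation for the error bars, and the function names and the win/draw/loss counts in the usage lines are made up purely for illustration.

```python
import math


def elo_diff(score):
    """Convert a score fraction (0 < score < 1) into an Elo difference,
    assuming the standard logistic rating model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_stats(wins, draws, losses):
    """Return (mean score, standard error of the mean) for one set of games."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    return mean, math.sqrt(var / n)


def differs(a, b, z=1.96):
    """a and b are (wins, draws, losses) tuples.
    True if the two mean scores differ by more than z combined standard
    errors (normal approximation, roughly 95% for z=1.96)."""
    mean_a, se_a = score_stats(*a)
    mean_b, se_b = score_stats(*b)
    return abs(mean_a - mean_b) > z * math.hypot(se_a, se_b)


if __name__ == "__main__":
    # Hypothetical counts, not results from this thread.
    suite_games = (40, 20, 40)
    other_games = (42, 20, 38)
    print(differs(suite_games, other_games))
    print(round(elo_diff(0.57), 1))
```

With only a hundred games per set the error bars are several percent wide, so a gain of a couple of percent against a single opponent will not register as a difference in a check like this.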
So I guess you are under no obligation to review the beta version of Hannibal. Nevertheless, if you want a copy, feel free to PM me with an email address.
-Sam
As to Hannibal, PM is sent.
Fire 1.3 Burns Rybka 4
- Full name: Michael Sherwin
Re: Fire 1.3 Burns Rybka 4
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through