A00 - Irregular Openings / Orangutan-Sokolsky

ozziejoe
Posts: 811
Joined: Wed Mar 08, 2006 10:07 pm

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by ozziejoe »

Hi Alexander,

Don't be too put off by the harsh tone of some of the criticisms. I think some people on this forum don't always follow the most polite conventions (though what people say, even harshly, is often of use). I for one would be glad to see how your site progresses and what new developments you come up with.

If you do think there are ways to improve your database (e.g., the headers) but you don't have the resources, one thing you could do is offer a few people on this forum a free membership in exchange for doing the research and improving the database. (I am not asking for this for myself, since I don't have that kind of database knowledge.)

I think Rybka was a nice example of a collaborative effort. Rybka 3 was driven by the author of Rybka (Vas, focusing on search), IM Larry Kaufman (who probably added 50 Elo by improving evaluation), and the numerous testers. If Vas had worked on Rybka 3 by himself, I think you might have seen a 30 to 50 point Elo improvement, instead of the 100+ improvement we are actually seeing.


best
Joseph
Rolf
Posts: 6081
Joined: Fri Mar 10, 2006 11:14 pm
Location: Munster, Nuremberg, Princeton

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by Rolf »

Just now you can see Walter Eigenmann writing here. He's a database man for computer chess in the German-speaking regions. Just ask him about his experiences when he talks about his products.

Another aspect for me is the clear difference between the very serious and humble presentation Vasik gave right from the start and the somewhat contradictory presentation Alexander gave here, going from a somewhat "shy" guy to someone who then criticised his critics, who had only given him their feedback about his databases. Note also that Alexander is already offering the same bases for a fee. Somehow the process is different from what Vasik did. The whole topic interested me because the OM bases are supposed to be linked with the Aquarium GUI of Rybka already. And since I already had the weak bases in the R. 2.3.2a package, I tried to prevent another deception. I simply don't want to believe that Vas knows what is going on with these databases.
-Popper and Lakatos are good but I'm stuck on Leibowitz
budfit

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by budfit »

Hi Joseph,
Thank you for your suggestion. And yes, there are many members with many characteristics who write many things. This is OK; we don't give up, and we try to answer all posts in a professional manner. Regarding the headers, we will think about it. From the very beginning we mentioned three things: each of our games has headers (i.e. there are no games without a header), the headers are independent of the quality of the analysis, and all 5.2 million games are deduplicated (i.e. you won't find one game twice).

Now, whether the headers are in perfect shape, "normalized" with first name, last name, perfect date or whatever (e.g. Vladimir Kramnik is written as V.Kramnik or Vlad. Kramnik or just Kramnik), doesn't matter for us. The analysis of the body of the games does not depend on the headers. And that is what we've been trying to explain to Rolf and other members. We focus on completeness of data (the body of the game) and on quality, e.g. there is a body, there are more than 8 moves, etc. All these attributes are of higher value than the "perfect name".
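For readers wondering what "normalizing" the name variants above could look like, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how OpeningMaster or any commercial database actually does it: the helper names `name_key` and `same_player` are hypothetical, and the rule (surname plus first initial) is deliberately crude and would confuse two different players who share a surname and initial.

```python
import re

def name_key(raw: str) -> tuple[str, str]:
    """Reduce a player name to (surname, first initial); initial is '' if unknown.

    Hypothetical helper: assumes either "Surname, Given" or "Given Surname" order.
    """
    name = raw.strip()
    if "," in name:
        # "Kramnik, Vladimir" style: the surname comes first
        surname, _, given = name.partition(",")
    else:
        # "V.Kramnik", "Vlad. Kramnik", "Vladimir Kramnik", "Kramnik":
        # treat the last token as the surname
        parts = [p for p in re.split(r"[.\s]+", name) if p]
        surname = parts[-1] if parts else ""
        given = parts[0] if len(parts) > 1 else ""
    return surname.strip().lower(), given.strip()[:1].lower()

def same_player(a: str, b: str) -> bool:
    """Surnames must agree; initials must agree only when both are known."""
    surname_a, initial_a = name_key(a)
    surname_b, initial_b = name_key(b)
    return surname_a == surname_b and (
        not initial_a or not initial_b or initial_a == initial_b
    )
```

With this rule, "V.Kramnik", "Vlad. Kramnik", "Vladimir Kramnik" and plain "Kramnik" all compare as the same player, while "A.Kramnik" does not match "V.Kramnik".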

BUT (there is always a but) if we see chess players from different countries demanding perfect headers as well, we will think of a way to implement it. What we don't want is to end up spending more time on headers than on the actual games and analysis. For us the ultimate value add is a complete game, classified according to ECO code, so that programs like ChessBase or Chess Assistant can use it and advise us (the players) on the best way to play.

Best Regards
Alexander Horvath SIM ICCF
http://www.openingmaster.com

PS Yesterday we booked the trip to Bonn, Germany for the World Championship match between Kramnik and Anand. It will be a great event. So perhaps we will see all of you there and can discuss in person what the contemporary chess player demands from his/her database.
jdart
Posts: 4367
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: A00 - Irregular Openings / Orangutan-Sokolsky - FACTS on

Post by jdart »

Well, Rolf is a bit confrontational. That is just how he is. But I found two games with wrong header info just browsing the database. I didn't spend very long on it at all. And if I can find two games with bad info that quickly there are probably many, many more. I care about it being right. So I'm not going to be a customer for this database, sorry.

Re similarity and de-duping: I agree this is hard. I personally haven't tried to measure how many dups you might have so I don't have an opinion on that. But if you go by exact match of moves you will miss many near-similar games that might be identical. Some people might not care if there are dups and near-dups in the database, but I am more concerned about it.
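To illustrate the point about exact matching, here is a rough Python sketch. It assumes that a "near-dup" means one record is a truncated copy of the other (same moves, but a few plies missing at the end); other kinds of near-duplicates, such as transposed or mistyped moves, are not covered. The function names and the `max_extra_plies` cutoff are made up for the example and do not describe any vendor's actual procedure.

```python
import hashlib
import re

RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}

def san_moves(movetext: str) -> list[str]:
    """Strip {comments}, move numbers and the result from PGN movetext."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)   # drop {comments}
    text = re.sub(r"\d+\.(\.\.)?", " ", text)     # drop "12." and "12..."
    return [tok for tok in text.split() if tok not in RESULTS]

def exact_key(moves: list[str]) -> str:
    """Hash of the full move list: equal keys mean exact duplicates."""
    return hashlib.sha1(" ".join(moves).encode("utf-8")).hexdigest()

def is_near_duplicate(a: list[str], b: list[str], max_extra_plies: int = 4) -> bool:
    """True if the shorter game is a prefix of the longer one and the
    longer one has at most max_extra_plies additional plies."""
    shorter, longer = sorted((a, b), key=len)
    if not shorter or len(longer) - len(shorter) > max_extra_plies:
        return False
    return longer[:len(shorter)] == shorter

# Made-up sample records of the same game, one cut off a ply earlier
g1 = san_moves("1.b4 e5 2.Bb2 Bxb4 3.Bxe5 Nf6 1-0")
g2 = san_moves("1. b4 e5 2. Bb2 Bxb4 3. Bxe5 Nf6 4. c4 {score sheet ends here} 1-0")
```

Here `exact_key(g1)` and `exact_key(g2)` differ, so an exact-match de-dupe keeps both records, while `is_near_duplicate(g1, g2)` flags them as the same game.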
budfit

Re: A00 - Irregular Openings / Orangutan-Sokolsky - FACTS on

Post by budfit »

You have every right to do so. As written in the earlier posts, headers are headers, and the bodies of the games are a different thing.

Let's close this post with a difference of opinion...

Best Regards
Alexander Horvath SIM ICCF
http://www.openingmaster.com
Rolf
Posts: 6081
Joined: Fri Mar 10, 2006 11:14 pm
Location: Munster, Nuremberg, Princeton

Re: A00 - Irregular Openings / Orangutan-Sokolsky - FACTS on

Post by Rolf »

Where did you get the authority to make such almost PayPal declarations, Alexander? BTW, I agree with you that the body is more important than the header. I have it from a feeling.
budfit

Re: A00 - Irregular Openings / Orangutan-Sokolsky - FACTS on

Post by budfit »

... because we love PayPal ;-))

PS One thing still bothers me, though: John mentions in his last post that he found mistakes and that he won't bother with the database anymore. He has every right to do so. He mentioned two headers and some dups or near-dups.

Could we ask John to provide specifics, so we can correct them "if necessary"? Appreciated.

Best Regards,
Alexander
Rolf
Posts: 6081
Joined: Fri Mar 10, 2006 11:14 pm
Location: Munster, Nuremberg, Princeton

Re: A00 - Irregular Openings / Orangutan-Sokolsky - FACTS on

Post by Rolf »

Why should anyone help you in your business for free?
budfit

Re: A00 - Irregular Openings / Orangutan-Sokolsky - FACTS on

Post by budfit »

Rolf,
I was not asking you... we know you don't do anything for free. I was referring to John, as he was the one who found the "mistakes". So unless this was a "general statement", we wanted to hear some specifics. This is not something to do for "free"; it is something to add as evidence of his claim. We made the analysis ourselves, so we know the result.

Quote from Joseph: If you do think there are some ways to improve your database (e.g., the headers), but you don't have the resources, maybe one thing you can do is offer a few people on this forum a free membership, in exchange for them doing the research and improving the database (I am not asking for this for myself, since I don't have that kind of database knowledge). I think Rybka was a nice example of a collaborative effort. Rybka 3 was driven by the author of Rybka (Vas, focusing on search), IM Larry Kaufman (who probably added 50 Elo by improving evaluation), and the numerous testers. If Vas had worked on Rybka 3 by himself, I think you might have seen a 30 to 50 point Elo improvement, instead of the 100+ improvement we are actually seeing.


I think I owe Joseph one more clarification, as my previous post was too broad. It concerns Vasik Rajlich, his Rybka, and cooperation on OM.

The cooperation between Vasik / Rybka and his correspondents is on a commercial basis; it is not friendship aimed at increasing the Elo of his engine. The construction of the Rybka II library also rests on a controversial basis. Using Internet portal games as the basis for statistical analysis and construction of the database is uncommon, if not wrong. Vasik deals with optimization of search and cutting of bad variations. This was quite successful in Rybka 3.

If you are talking about community and cooperation on OM, we see it in the near future, when we plan to release the first annotated Open Source database - FOR FREE. You will be able to post and download annotated games without copyright conflicts (before posting their games, all members will agree that they may be freely distributed among other members of the free community). That is the plan, and everything goes in phases. Free annotated games without copyright conflicts can move the entire chess community a few levels up. All that is needed is coordination of the quality of the database. We are not against the idea of having things for free, but it needs input from all interested parties.

Best Regards,
Alexander Horvath, SIM ICCF
http://www.openingmaster.com
James Constance
Posts: 358
Joined: Wed Mar 08, 2006 8:36 pm
Location: UK

Re: A00 - Irregular Openings / Orangutan-Sokolsky - FACTS on

Post by James Constance »

Alexander

It is good that you offer this database as a sample and explain how you dedupe - that means no one should be disappointed.

However, doing a default double search in ChessBase, including similar names, finds 1258 doubles. Having the same game with a small difference in the name (perhaps only 1 letter) undermines the quality claim a little.
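For anyone without ChessBase, the general idea of a doubles search that tolerates similar names can be sketched in a few lines of Python. This is not ChessBase's actual algorithm, which I don't know; it simply groups games with identical moves and then compares player names with the standard library's `difflib.SequenceMatcher`. The 0.85 threshold, the `find_doubles` helper and the sample records are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

def names_similar(a: str, b: str, threshold: float = 0.85) -> bool:
    """Case-insensitive similarity ratio between two name strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

def find_doubles(games: list[dict], threshold: float = 0.85) -> list[tuple[int, int]]:
    """Return index pairs of games with identical moves and similar player names.

    Each game is a dict with 'white', 'black' and 'moves' (sequence of SAN moves).
    """
    by_moves: dict[tuple, list[int]] = {}
    for idx, game in enumerate(games):
        by_moves.setdefault(tuple(game["moves"]), []).append(idx)

    doubles = []
    for indices in by_moves.values():
        for i, j in combinations(indices, 2):
            if names_similar(games[i]["white"], games[j]["white"], threshold) and \
               names_similar(games[i]["black"], games[j]["black"], threshold):
                doubles.append((i, j))
    return doubles

# Made-up records: 0 and 1 differ by one letter in White's name,
# 2 has the same moves but different players
sample = [
    {"white": "Smith, John", "black": "Doe, Jane", "moves": ["b4", "e5", "Bb2"]},
    {"white": "Smyth, John", "black": "Doe, Jane", "moves": ["b4", "e5", "Bb2"]},
    {"white": "Brown, Alan", "black": "Doe, Jane", "moves": ["b4", "e5", "Bb2"]},
]
doubles = find_doubles(sample)
```

On the sample above only the pair (0, 1) is reported. A lower threshold catches more spelling variants but also starts merging genuinely different players, which is the trade-off behind any "similar names" option.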

Also the base seems to include correspondence games - so quantity-wise shouldn't it be compared to megabase + correspondence 2009?

best

James