A00 - Irregular Openings / Orangutan-Sokolsky

Norm Pollock
Posts: 1056
Joined: Thu Mar 09, 2006 4:15 pm
Location: Long Island, NY, USA

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by Norm Pollock »

Rolf wrote:
Norm Pollock wrote:
Dann Corbit wrote:You've got a few thousand errors in the headers.
Most of the problems are bad escaped quotes or quotes that should have been escaped but were not.

Missing closing quote in [Event "It (open) "]
Missing closing quote in [Event "It (open) "]
Missing closing quote in [Event "It (open) "]

etc, etc, ...

[/quote]

Besides doing other things, my utility "cleanTag" solves the "escape+quote at the end of a tag" problem by removing any and all "\" characters directly before the ending quote. The utility only works on PGN files.

"cleanTag" can be downloaded as part of my "40H" package of 30 utilities at 

http://www.hoflink.com/~npollock/chess.html[/quote]

That's good enough for low-interest people, but normally the entries shouldn't be deleted in the first place, so a tool like cleartag looks somewhat strange. I see a possible advantage if you want to obfuscate the origin of your databases.

Another question: in the nineties I had a similar tool in DOS. Did you port that to Java? I had it from the collection at Pitt.[/quote]

I don't think someone who wants to clean up a messy PGN game description and get rid of the "\" escape character before the terminating quotation mark of a tag value should be classified as a "low interest person".

I have no idea what utility at Pitt you are talking about. More information would be needed to see what similarities it has to "cleanTag". What did it do? Did it work on PGN files? For the record, the code in "cleanTag" (not "clearTag") is 100% original, as are all the programs in 40H, and no code was taken from any outside source.

Here is an overview of the major things "cleanTag" does:

"cleanTag" removes many excess tags, but keeps 13 important tags.
Kept tags are: "Event", "Site", "Date", "Round", "White", "Black",
"Result", "WhiteElo", "BlackElo", "ECO", "TimeControl", "SetUp" and
"FEN". Tags are arranged in the order listed.
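
For illustration only, the tag filtering described above could be sketched in Python roughly as follows. This is not cleanTag's actual code (its source is not shown in this thread). The names KEPT_TAGS, TAG_LINE and filter_tags are hypothetical, and the sketch assumes that each tag pair sits on its own line and that tags missing from the input are simply left out.

```python
import re

# The 13 tags listed above, in the order they should be written out.
KEPT_TAGS = ["Event", "Site", "Date", "Round", "White", "Black",
             "Result", "WhiteElo", "BlackElo", "ECO", "TimeControl",
             "SetUp", "FEN"]

# Matches one tag pair per line, e.g. [Event "Hastings"].
# Group 1 is the tag name, group 2 is the text between the outer quotes.
TAG_LINE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def filter_tags(header_lines):
    """Return only the kept tags, rearranged into the KEPT_TAGS order."""
    found = {}
    for line in header_lines:
        match = TAG_LINE.match(line)
        if match and match.group(1) in KEPT_TAGS:
            # Keep the first occurrence if a tag appears more than once.
            found.setdefault(match.group(1), match.group(2))
    return [f'[{name} "{found[name]}"]' for name in KEPT_TAGS if name in found]
```

Fed a header that also contains tags such as PlyCount, EventDate or Source, this returns only the kept tags, with ECO moved behind WhiteElo and BlackElo.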
Rolf
Posts: 6081
Joined: Fri Mar 10, 2006 11:14 pm
Location: Munster, Nuremberg, Princeton

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by Rolf »



You are right, if you read my message that way. But I meant the opposite. Deleting such remnants of former entries, like these brackets, is good and necessary, but with regard to the promoted database from OM (Alexander) it is direct proof of nonexistent quality if you need to run such a process with your (good) tools. Because, to repeat it, a truly dedicated player or lover of databases doesn't get rid of header entries; he wants to have them. And the main criticism from me and Dann was that the OM database lacked enough entries. The errors come as additional damage. No need to be worried by what I said. Your tool collection is indispensable for bad PGN collections of games.

For a comparison of your tools, please take a look here

http://www.enpassant.dk/chess/softeng.htm  or here

ftp://ftp.pitt.edu/group/student-activities/chess/CONV/

but this one is the real highlight, with endless game utility tools:

ftp://ftp.pitt.edu/group/student-activities/chess/UTIL/

This is mainly what I had in mind from over 10 years ago. Also with wonderful game collections in the different directories. Sensational work of that university.
-Popper and Lakatos are good but I'm stuck on Leibowitz
smirobth
Posts: 2307
Joined: Wed Mar 08, 2006 8:41 pm
Location: Brownsville Texas USA

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by smirobth »

Hi Jon,

Here is a third source for that same game, Tim Harding's Megacorr 4 CD. The date and names agree with Chessbase, but the game score is even shorter than in both Chessbase and Opening Master. I suspect Megacorr is probably correct. It is quite common for published games to be longer than what occurred in the actual game, since they sometimes show unplayed analysis to explain why a player resigned. This is perhaps even more true of correspondence games, where resignations often occur earlier than they might in OTB.

[Event "USSR ch-08 6768"]
[Site "corr"]
[Date "1967.??.??"]
[Round "?"]
[White "Sokolsky, Aleksey Pavlovich"]
[Black "Zagorovsky, Mikhail Pavlovich"]
[Result "0-1"]
[ECO "A00"]
[PlyCount "56"]
[EventDate "1967.??.??"]
[EventType "corr"]
[EventCountry "URS"]
[Source "Chess Mail Ltd"]
[SourceDate "2005.07.27"]

1. b4 e5 2. Bb2 f6 3. e4 Bxb4 4. Bc4 Nc6 5. f4 d6 6. c3 Ba5 7. Ne2 Qe7 8. O-O
Bb6+ 9. Kh1 Bd7 10. d4 O-O-O 11. Nd2 Nh6 12. Bd5 Na5 13. a4 f5 14. Nc4 exd4 15.
cxd4 fxe4 16. Nxa5 Bxa5 17. Bc3 Bxc3 18. Nxc3 c6 19. Bxe4 d5 20. Bf3 Rhe8 21.
a5 Nf5 22. Rb1 Ne3 23. Qb3 Bf5 24. Rbe1 Qf6 25. a6 b6 26. Rc1 Nxf1 27. Nxd5
Rxd5 28. Qxd5 Ng3+ 0-1
- Robin Smith
Norm Pollock
Posts: 1056
Joined: Thu Mar 09, 2006 4:15 pm
Location: Long Island, NY, USA

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by Norm Pollock »

Just want to add one thing. Technically it is not an error or a violation of the PGN protocol to have an escape character "\" just before the terminating quotation mark of a tag value. But it does cause major problems when the PGN is later used by a Windows/DOS program.

If you want to add that game to a database manager program, or use it in building an opening book, for example, the terminating quotation mark will not be seen as a terminator. That is because the escape character in Windows/DOS is used to insert a quotation mark into a string. For example, if I wanted the string value to be "Open", with the quotation marks, instead of plain Open, I would use:

[Event "\"Open\""]
as opposed to
[Event "Open"]

So obviously putting an escape character as the last character of a tag value will cause problems as in:

[Event "Open\"]
because the terminator quotation mark will not be seen by other programs as a terminator, but instead as an inserted quotation mark into the string.
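
As a rough illustration of the fix discussed in this thread (removing any and all backslashes directly before the closing quote of a tag value), here is a minimal Python sketch. It is not the code of "cleanTag"; TAG_LINE and strip_trailing_escapes are hypothetical names, and it assumes one tag pair per line, passed in without its line ending. As far as the PGN standard goes, a backslash followed by a quote is itself defined there as an escaped quote inside a string, so a strict PGN reader would be expected to stumble over such a tag as well, not only Windows/DOS programs.

```python
import re

# Matches one tag pair per line, e.g. [Event "Open\"].
# Group 1 is the tag name; group 2 is everything between the first
# quote and the last quote on the line (greedy match).
TAG_LINE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def strip_trailing_escapes(line):
    """Drop all backslashes that sit directly before the closing quote."""
    match = TAG_LINE.match(line)
    if match is None:
        return line  # not a tag pair line (e.g. move text): leave untouched
    name, value = match.group(1), match.group(2)
    # Note: this also removes an escaped backslash (two backslashes) at
    # the end of the value, matching the "any and all" rule stated above.
    cleaned = value.rstrip("\\")
    return f'[{name} "{cleaned}"]'


if __name__ == "__main__":
    for tag in ['[Event "Open\\"]', '[Event "\\"Open\\""]', '[Event "Open"]']:
        print(tag, "->", strip_trailing_escapes(tag))
```

Only the first of the three sample tags is changed; the properly escaped "\"Open\"" value ends in a quote rather than a backslash, so it is left alone.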
jdart
Posts: 4367
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by jdart »

You are right about correspondence games but I am also concerned about the player and date info. Here is another error: this game was played in 1895 (not 2003!):

[Event "Hastings"]
[Site "?"]
[Date "2003.??.??"]
[Round "?"]
[White "Blackburne J"]
[Black "Schlechter C"]
[Result "1/2-1/2"]
[ECO "A00"]
[WhiteElo "2570"]
[BlackElo "2600"]
[PlyCount "40"]
[EventDate "2003.12.28"]
[Source "Opening Master"]
[SourceDate "2008.09.09"]

1. d3 d5 2. g3 e5 3. Bg2 c6 4. Nc3 Be6 5. e4 dxe4 6. Nxe4 Nf6 7. Ne2 Nxe4 8.
Bxe4 Bd5 9. O-O Bxe4 10. dxe4 Qxd1 11. Rxd1 Na6 12. Be3 Bc5 13. Bxc5 Nxc5 14.
f3 Ke7 15. Kf2 Rhd8 16. Ke3 Rxd1 17. Rxd1 Rd8 18. Rxd8 Kxd8 19. Nc1 Ke7 20. Nd3
Nd7 1/2-1/2

The ratings are also bogus (no FIDE rating system was in effect in 1895).
Dann Corbit
Posts: 12542
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by Dann Corbit »

budfit wrote:Hi Dann,

thank you for your post, appreciate your interest. Let me first clarify a few things with you before we conclude on this topic:

1) Which program did you use to examine our database? We are talking here about the uploaded A00 database, right? (not the "replay games", which are also available on our web site but are for educational purposes - here we don't guarantee the quality as it's really for replaying games only)
It couldn't have been ChessBase, you would never get these results out. Our suspicion is that you used some unqualified freeware program which produced these mistakes and mixed things up!
[/quote]
Chess Assistant 9.1
Scid
Pgn-Extract by Barnes

The mistakes are in your database.
budfit wrote:2) You mention a PGN file, but honestly we didn't use this format at all, so I don't know where the PGN comes from. Again, if you used some freeware program to convert it from .cbv into PGN, you can mix things up pretty badly, and you see the results right away.
[/quote]
I unloaded the data as PGN from ChessBase so that it could be processed by other programs.
budfit wrote:3) The sources from the INTERNET are what they are - not perfect. We guarantee the body is there, the result is there and the heading should be fair. But it would be impossible to correct hundreds of thousands of headings. (but still refer to answer #2, as such a mess couldn't be created in ChessBase)

4) The claim of missing results is NONSENSE. We would advise you to use normal / commercial programs for analysis, which would immediately provide you with the answers / results to see you are wrong. One of the foundation stones of the databases we produce = always with results! Therefore we would like to ask which program and parameters you used to perform your analysis.
[/quote]
The problems are the problems. I examined many of them and they are bugs.
budfit wrote:5) We have no idea what you mean by A00p, q, r, s; can you please clarify your statement? The list of A00 openings was given in the post above.
[/quote]
[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00p"]
[Opening "Polish: 1. b4 e5 2. Bb2 Bxb4 3. Bxe5 Nf6 4. c4 O-O 5. e3 *"]

1. b4 e5 2. Bb2 Bxb4 3. Bxe5 Nf6 4. c4 O-O 5. e3 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00p"]
[Opening "Polish: 1. b4 e5 2. Bb2 Bxb4 3. Bxe5 Nf6 4. c4 O-O 5. e3 *"]

1. b4 e5 2. Bb2 Bxb4 3. Bxe5 Nf6 4. c4 O-O 5. e3 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00p"]
[Opening "Polish:"]
[Variation "Tuebingen variation"]

1. b4 Nh6 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00p"]
[Opening "Polish: 1. b4 d5 2. Bb2 Bf5 *"]

1. b4 d5 2. Bb2 Bf5 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00p"]
[Variation "Polish: (Sokolsky; Orang-Utan)"]

1. b4 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00p"]
[Variation "Polish: Birmingham Gambit"]

1. b4 c5 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00p"]
[Variation "Polish: 1...Nf6"]

1. b4 Nf6 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00p"]
[Variation "Polish: 1...c6"]

1. b4 c6 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00p"]
[Variation "Polish: Schühler Gambit"]

1. b4 c6 2. Bb2 a5 3. b5 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00q"]
[Variation "Polish: 1...d5"]

1. b4 d5 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00r"]
[Variation "Polish: 1...e5"]

1. b4 e5 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00r"]
[Variation "Polish: Bugayev Attack"]

1. b4 e5 2. a3 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00r"]
[Variation "Polish: 1...e5 2.Bb2"]

1. b4 e5 2. Bb2 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00r"]
[Variation "Polish: Wolfertz Gambit"]

1. b4 e5 2. Bb2 c5 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00r"]
[Variation "Polish: Tartakower Gambit"]

1. b4 e5 2. Bb2 f6 3. e4 Bxb4 *

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "*"]
[ECO "A00s"]
[Variation "Polish: 2...Bxb4"]

1. b4 e5 2. Bb2 Bxb4 *
budfit wrote:6) Regarding the de-duplication from 52.000 to 43.000: again, please confirm which program and parameters you used; our ChessBase strongly confirms 52.000 deduplicated games!
[/quote]
ChessBase is wrong. I examined the dups and they definitely were dups.
I used Scid, ChessAssistant, and PGN-Extract to remove the dups.
budfit wrote:The whole analysis therefore needs clarification from your side before accusing some party of not doing their homework. To all other readers, it looks like scientific work that you performed, and you actually meant well, and we do appreciate that you performed it, as IF there are any small bugs found (not those you mentioned, as they do not count yet) we would like to hear from you. But we are afraid you didn't use the right tools for the analysis, and therefore your results could be misleading IF NOT WRONG.

Best regards,
Alexander Horvath SIM ICCF
http://www.openingmaster.com
[/quote]
I did not accuse you of not doing your homework. I was simply pointing out that there were a few warts in the data and that it could use a little cleanup.
budfit

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by budfit »

Dann, John, Rolf

where to start... Firstly, I would like to thank you all for your deep analysis and posts on this subject. A special note to Rolf: I do accept criticism, actually I love it, as it moves things forward and you learn from it. I am getting used to your psychological point of view by now (it took me some time, but I won't give up on you, don't worry). You misunderstood my message to John, in which I was asking for clarifications to support his statements. As you know, without these clarifications it is simply an accusation, and there were too many questions in the air (perhaps I used a more aggressive tone, for which I apologize). One thing he did well: he analyzed it and made an effort to write feedback on it (even though it was negative and hurt our quality claims - but this is why we are all here, to prove or reject theories by SUPPORTING FACTS - note the capital letters).

One quote from Rolf:
Rolf wrote:So again here you claim a quality games collection, but it's not. Dann made the relevant arguments. The worst you could do is announce the collection as Orang Utan when in truth you present the whole INFORMATOR content of A00. Alexander, this is also a big mistake, because you are not aware of the tradition in INF, which is absolutely fine in itself but causes confusion if you lose the context of the INF key. Someone who is familiar with that key has no problem orienting himself, but without that context the key separation is confusing.
[/quote]
Yes, we and I still claim that Opening Master has the highest quality standards but, as we just learnt, only for certain types of players. The subject of the post is A00 - Irregular Openings / Orangutan-Sokolsky. As I wrote already, I announced all of A00, including the "eye catcher" Orangutan-Sokolsky, which we all know is a bit controversial, so I know the INFORMATOR content and the entire A00 very well. On the ChessBase usage: well, I can only say that I have been using it for more than 6 years and that I can spot the difference between most of the various setups, if not all. And yes, I agree: if you don't know what you are doing with the thing, then rather don't do it, because you may come up with different answers every time you run it (with a slight change in your settings).
OK, it seems like we are going into defense mode, but I just felt I had to explain a few general things.

Now let's get back to the FACTS (a bit of offensive play). If you use any DOS program, or export/extract to PGN and then wash it through some 3rd-party program, sorry, but here I have the right to say that we don't guarantee the results anymore. By doing this you can mess things up very badly. For simple statements like "CB is wrong", I would appreciate some proof of the duplicates, so we can see whether CB is making mistakes in running the checks or where the error could be (and yes, we can admit mistakes on our side, but only once we see some tangible proof).

Talking about the beloved headings: as I think I mentioned before, it all depends on what you want to use the database for:

1) analysis of chess games from the beginning until the end with clear results. (hopefully the end is not in the first 8 moves as this can only happen when an amateur is playing with the GM)

2) library with nice headings and collection of good looking names with perfect dates and games without bodies (only results)

we chose 1) and therefore we still consider any of the incorrect headers as irrelevant to the point of ANALYSIS of the games. But as you see, there are different players in the world and not always all share our pragmatic approach....

Now I see big fans of ChessBase Mega. That is just fine; I don't say they are bad, they just use a different strategy than us. We went for the analysis of the games, i.e. option 1)...

Just for the sake of comparison, I would really recommend that Dann and John look at CB Mega and its A00 with 69.740 games:

1/ games without bodies = 43.718. Now you can analyze this
2/ games of 1-7 moves = only 947, but you can deeply analyze these games
3/ loads of "BYE" games. Have you lost to "BYE" recently? Well, according to Mega many players did.
4/ thus, leaving out the games from 1/ and 2/, the comparable games in Mega 08 = 25.075 in total (69.740 - 43.718 - 947)

This is just a sample from one of the highest quality databases, but of course the rest of the ECOs could be in perfect shape. (I won't comment on this; with all respect for the effort the guys behind CB have put in throughout the years, they have our full admiration, as we know how painful it is to collect things, but somebody should also clean up the house.)

OK, it's 1 AM here, I wish I could stay longer; tomorrow I will provide more analysis to confirm our statements. And again, it is much appreciated that all of you guys raise your comments here and challenge our statements of quality. Rolf must know better - when I see his avatar - it's all a matter of relativity...

Best Regards
Alexander Horvath
SIM ICCF
http://www.openingmaster.com
Dann Corbit
Posts: 12542
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by Dann Corbit »

I am sorry I offended you, I was just trying to be helpful.

Try the following experiment yourself:
1. Unload the games as PGN using ChessBase itself, or convert the archive to a decompressed CB file.
2. Load the data into ChessAssistant 9.1
3. Run the check for duplicates function.
4. Look at the duplicate games. There will be a huge pile of them but if you examine them one by one you will see very clearly that they are duplicates.

The header problems are header problems. Fix them or don't -- I don't care.
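
For readers without ChessAssistant, the idea behind such a duplicate check can be sketched in a few lines of Python. How ChessAssistant, Scid or pgn-extract actually decide that two games are duplicates is not described in this thread; this sketch simply assumes that two games are duplicates when their move sequences are identical, ignoring headers, move numbers, brace comments and the result. The names move_key and find_duplicates are hypothetical.

```python
import re
from collections import defaultdict

MOVE_NUMBER = re.compile(r'\d+\.(\.\.)?')   # "1." as well as "1..."
BRACE_COMMENT = re.compile(r'\{[^}]*\}')    # {comments} inside move text
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}


def move_key(movetext):
    """Reduce move text to a tuple of SAN moves for comparison."""
    text = BRACE_COMMENT.sub(" ", movetext)
    text = MOVE_NUMBER.sub(" ", text)
    return tuple(token for token in text.split() if token not in RESULTS)


def find_duplicates(movetexts):
    """Return groups of indices whose games share the same move sequence."""
    groups = defaultdict(list)
    for index, movetext in enumerate(movetexts):
        key = move_key(movetext)
        if key:  # games without a body are skipped, not lumped together
            groups[key].append(index)
    return [indices for indices in groups.values() if len(indices) > 1]
```

A real tool would probably also look at players, dates and truncated scores (one game being a prefix of another), which this sketch deliberately leaves out.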
budfit

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by budfit »

ok thanks for the hint Dann, I will do it tomorrow and reply with the findings...

PS: and don't worry, no offense taken on my side... we all must take it professionally enough, but sometimes we all use a bit more aggressive tone.

Best Regards
Alexander Horvath
SIM ICCF
http://www.openingmaster.com
Rolf
Posts: 6081
Joined: Fri Mar 10, 2006 11:14 pm
Location: Munster, Nuremberg, Princeton

Re: A00 - Irregular Openings / Orangutan-Sokolsky

Post by Rolf »

At first I gave you feedback, also because of the connection with Rybka. By nature, if I detect crap I make a harsh negative comment. BTW it was me who told you to do your homework.

Now all this was in public because your promotion isn't simply based on facts. So this is a) feedback for you in the business and b) for potential customers.

Given the high level of your quality claims, I can well argue that it's a false claim if I have found proof of fundamentally low quality.

Then another problem appears. From what you write about quality I can see that you don't even know what high quality really is. If I explained it now, I would already be your advisor for free. No. I told you to review the whole data because it's faulty, but you resisted. Now this is a simple story. This is your business, not mine, and I think I have already said enough.

I have many more arguments proving that your databases are low quality. And we could well debate it, but then I would expect you not to keep appearing in the clothes of the chief of a company, because if I then tell you that something is crap, I could risk damaging your business, although you caused it in the first place.

So please, if you want to discuss here, leave out the repetition of commercial PR.

The best I could do is give Vas information about the low quality of your databases. Rybka is the number one program, also for GMs and journalists. They must have complete documentation of the tournaments, which is destroyed in your collections. Also, perhaps for a corr player BYEs or won unplayed games are irrelevant. But not for people who follow chess as a sport over the internet. Period.

And lastly, good luck in spite of the amount of work ahead of you. Give it a try, otherwise the customers won't pay for your product.

I mean, I will buy Rybka anyway and then throw the databases into the bin. Sorry.
-Popper and Lakatos are good but I'm stuck on Leibowitz