Normalizing the eval


Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: Normalizing the eval

Post by Edmund »

I don't think PGN is the correct format for that purpose. It would be better to use some other database format, as that would be far easier for computers to access.

Furthermore, what do you need these millions of games for? Gigantic opening books, for example? Anyway, I think you should look for a format that best suits your aims.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Normalizing the eval

Post by bob »

Codeman wrote: I don't think PGN is the correct format for that purpose. It would be better to use some other database format, as that would be far easier for computers to access.

Furthermore, what do you need these millions of games for? Gigantic opening books, for example? Anyway, I think you should look for a format that best suits your aims.
That's what I have so far, which is mainly a large set of BayesElo results for tests as I run them. I save some of the PGN collections while a test is ongoing, but once I move on to the next topic, I generally remove them.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Normalizing the eval

Post by Don »

bob wrote:
jhaglund wrote:
I save all the results from BayesElo, yes. However, there is so much data that it is very difficult to look at over time. I couldn't even guess how many games I've played in my testing. The fast games go at around 32K per hour. The more usual tests take 2-3 hours per 32K. Divide 24 by 3 and you get 8, so 8x32K games every day, for a couple of years. :)
I have been a PGN whore since 1997 lol

I have many gigs, scattered among HDDs, etc.

No program I've tried can handle a database that large, or work at the speed I want. I think I need a supercomputer.
I can store terabytes on our current cluster. And if our new cluster happens next year, it will have petabytes of storage. Saving the PGN isn't the problem; the problem is organizing it so that it is clear what each PGN collection represents.

I probably need to start from scratch and find some rational way of naming the PGN directories, so that I can easily find the results from a specific test in the past when needed...
Here is a common solution to this type of problem: use an SQL database. You can store all the PGN files in the database, or just use it to organize the PGN files that are stored on disk. You can keep all the relevant data that you may someday need to access, such as dates, reason for the test, players, hardware, level, versions, etc. SQLite would be a good choice for a private database like this, as it's trivial to manage and easy to use, but any database would do fine of course.

Then you don't really need to think too much about how to organize the data because you can query it any way you need to when the time comes. All you have to think about is what it is you want to capture about each game sequence.
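Something along these lines would be enough - a rough sketch in Python using the standard sqlite3 module, with the table layout, column names and sample values all invented purely for illustration:

Code: Select all

import sqlite3

# One row per test run; the PGN itself stays on disk and is only referenced by path.
conn = sqlite3.connect("testruns.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS tests (
        id        INTEGER PRIMARY KEY,
        started   TEXT,   -- date/time the run was launched
        version   TEXT,   -- engine version being tested
        players   TEXT,   -- opponents / gauntlet used
        hardware  TEXT,
        level     TEXT,   -- time control
        comment   TEXT,   -- free-form note on what is being tested and why
        pgn_path  TEXT    -- directory or file holding the games
    )
""")
conn.execute(
    "INSERT INTO tests (started, version, players, hardware, level, comment, pgn_path) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("2010-06-01 14:00", "23.1", "gauntlet of 5", "cluster node", "10s+0.1s",
     "retune passed pawn scores", "/archive/pgn/23.1-pp-retune"))
conn.commit()
conn.close()

Since SQLite keeps the whole database in a single file, there is nothing to administer either.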

If you are like me, however, it's probably not worth it unless you really think you would need to reference the data. I have only occasionally wished I had saved results from discarded test data, and then it was very recently discarded data.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Normalizing the eval

Post by bob »

Don wrote:
Here is a common solution to this type of problem: use an SQL database. You can store all the PGN files in the database, or just use it to organize the PGN files that are stored on disk. You can keep all the relevant data that you may someday need to access, such as dates, reason for the test, players, hardware, level, versions, etc. SQLite would be a good choice for a private database like this, as it's trivial to manage and easy to use, but any database would do fine of course.

Then you don't really need to think too much about how to organize the data because you can query it any way you need to when the time comes. All you have to think about is what it is you want to capture about each game sequence.

If you are like me, however, it's probably not worth it unless you really think you would need to reference the data. I have only occasionally wished I had saved results from discarded test data, and then it was very recently discarded data.
The problem is, to use any sort of database, SQL or whatever, you need some sort of naming/identifying convention so that you can figure out what feature was being tested in a specific PGN game. That's the thing I have not done, and it is the very thing I need to do. I've not given it any thought since I really don't care much about looking at old results. I run tests to tune something, and then move on and tune something else. Most likely the old tests are no longer valid now anyway, since other changes have been made.
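If I ever did do it, even something as simple as a fixed directory-name layout would probably be enough - say date_version_keyword - which a script can pull apart again later. A sketch (the convention, paths and names here are all invented):

Code: Select all

import re
from pathlib import Path

# Hypothetical convention: /archive/pgn/20100601_23.1_lmr-retune/
NAME_RE = re.compile(r"(\d{8})_([\w.]+)_(.+)")

def describe(pgn_dir):
    """Recover date, version and test keyword from a directory name."""
    m = NAME_RE.match(Path(pgn_dir).name)
    if m:
        date, version, keyword = m.groups()
        return {"date": date, "version": version, "keyword": keyword}
    return None

# Later: list every old test that touched LMR
for d in sorted(Path("/archive/pgn").iterdir()):
    info = describe(d)
    if info and "lmr" in info["keyword"]:
        print(d.name, info["date"], info["version"])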
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Normalizing the eval

Post by Don »

bob wrote:
The problem is, to use any sort of database, SQL or whatever, you need some sort of naming/identifying convention so that you can figure out what feature was being tested in a specific PGN game. That's the thing I have not done, and it is the very thing I need to do. I've not given it any thought since I really don't care much about looking at old results. I run tests to tune something, and then move on and tune something else. Most likely the old tests are no longer valid now anyway, since other changes have been made.
This would have to be by keyword search. I have looked at Crafty's code and your comments are good - they would be appropriate for this. For instance, if you remembered testing a hash table change a year ago, you would have a field (or I should say a column) in a table that describes what you are testing, in a comment-like style. Just like a change log in git, CVS, Subversion, or whatever you use for version control. So in SQL you can search for the string "hash" in your comment field.
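Against a table like the one I sketched earlier in the thread, that lookup really is a one-liner (same invented table and column names as before):

Code: Select all

import sqlite3

conn = sqlite3.connect("testruns.db")
# Every archived test whose description mentions "hash"
for started, version, comment, pgn_path in conn.execute(
        "SELECT started, version, comment, pgn_path FROM tests "
        "WHERE comment LIKE ?", ("%hash%",)):
    print(started, version, comment, pgn_path)
conn.close()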

It's not worth the trouble unless you can make it automatic and painless. If I took a little time I could build this kind of support into my own tester, since almost all this information is hanging around somewhere when I start a test. My tester builds a web page for viewing the results, which has the reason for the test, what is being tested, etc. So I'm practically there. Presumably you already have that information lying around too. If I did this, I would be a little more conscientious about making sure each test had a good description with appropriate keywords in it.
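The "automatic" part is then nothing more than a single insert at the moment the test is launched - something like this sketch, again with invented names:

Code: Select all

import sqlite3
import datetime

def record_test(db, version, players, hardware, level, comment, pgn_path):
    """Called by the tester right before the first game starts."""
    conn = sqlite3.connect(db)
    conn.execute(
        "INSERT INTO tests (started, version, players, hardware, level, comment, pgn_path) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (datetime.datetime.now().isoformat(sep=" ", timespec="minutes"),
         version, players, hardware, level, comment, pgn_path))
    conn.commit()
    conn.close()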

Anyway, just a crazy idea. I know for you, and probably for me too, it's not worth the time and trouble, since I don't think I would look back very often.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Normalizing the eval

Post by bob »

Don wrote:
This would have to be by keyword search. I have looked at Crafty's code and your comments are good - they would be appropriate for this. For instance, if you remembered testing a hash table change a year ago, you would have a field (or I should say a column) in a table that describes what you are testing, in a comment-like style. Just like a change log in git, CVS, Subversion, or whatever you use for version control. So in SQL you can search for the string "hash" in your comment field.

It's not worth the trouble unless you can make it automatic and painless. If I took a little time I could build this kind of support into my own tester, since almost all this information is hanging around somewhere when I start a test. My tester builds a web page for viewing the results, which has the reason for the test, what is being tested, etc. So I'm practically there. Presumably you already have that information lying around too. If I did this, I would be a little more conscientious about making sure each test had a good description with appropriate keywords in it.

Anyway, just a crazy idea. I know for you, and probably for me too, it's not worth the time and trouble, since I don't think I would look back very often.
The reason it sits in the back of my mind is that the testing I have done represents a _lot_ of data, much of which I have not even extracted. For example, I test and tune A, and then B, and then C. But I don't have any "memory" of the A results and how they might have interacted with B or C, even if I later come back and twiddle with A again. Whether there is anything interesting there or not is one issue, but at present it is not even possible to do this at all. Once a particular set of tests is done, that information is consigned to the toilet at some point in time. It would be nice to go back and ask, "OK, from version 22.0 to version 23.0, what was the effect on overall search depth compared to overall Elo improvement?" The information is not all present, because I do not record depths and such (yet) in the PGN, but I could. And the information that might be extracted could be useful.
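If the referee wrote the depth into each move comment - say cutechess-style "{score/depth time}" annotations - then pulling the average depth per version back out of the PGN would be a short script. A sketch, with the comment format and file names only assumed:

Code: Select all

import re

# Assumes move comments like {+0.31/17 2.1s}: score, then "/", then depth.
DEPTH_RE = re.compile(r"\{[^}]*?/(\d+)[^}]*\}")

def average_depth(pgn_file):
    """Average search depth over every annotated move in one PGN file."""
    with open(pgn_file) as f:
        depths = [int(m.group(1)) for m in DEPTH_RE.finditer(f.read())]
    return sum(depths) / len(depths) if depths else 0.0

# Hypothetical per-version collections; the Elo side of the comparison
# comes straight out of the matching BayesElo run for the same games.
for version, pgn in [("22.0", "crafty-22.0.pgn"), ("23.0", "crafty-23.0.pgn")]:
    print(version, round(average_depth(pgn), 1))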
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Normalizing the eval

Post by Don »

bob wrote:
The reason it sits in the back of my mind is that the testing I have done represents a _lot_ of data, much of which I have not even extracted. For example, I test and tune A, and then B, and then C. But I don't have any "memory" of the A results and how they might have interacted with B or C, even if I later come back and twiddle with A again. Whether there is anything interesting there or not is one issue, but at present it is not even possible to do this at all. Once a particular set of tests is done, that information is consigned to the toilet at some point in time. It would be nice to go back and ask, "OK, from version 22.0 to version 23.0, what was the effect on overall search depth compared to overall Elo improvement?" The information is not all present, because I do not record depths and such (yet) in the PGN, but I could. And the information that might be extracted could be useful.
Implementing this, or something like it, would not be difficult at all. I honestly think it would "earn" its keep, now that I think about it more. On more than one occasion I have tested something I had already tried before, perhaps many months earlier. And there have been times when I really wished I had kept some result. Even though this is usually in the short term, it's still a real issue.

Also, I've had the experience of trying to explain some test I did months or even years ago, but with no data to back me up. It would be great to be able to say, "oh yes, I tested that a couple of years ago, here are the results ..."