bayeselo games limit ?
-
- Posts: 4186
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: bayeselo games limit ?
I guess I could try that. One of the processors calculates Elo every three minutes, which means it has to read all the PGNs and join them every time. I never had this problem in 40/1" TC tests (I rarely lose a game there). But when I tried ultra-fast games (say depth=5) for evaluation tuning, it began to lose hundreds of games. I guess that when two instances try to write to the same file at the same time, one of them gets an "access denied" message (it doesn't busy-wait), so that game is lost.
-
- Posts: 438
- Joined: Mon Apr 24, 2006 8:06 pm
Re: bayeselo games limit ?
Daniel Shawul wrote: I guess I could try that. One of the processors calculates Elo every three minutes, which means it has to read all the PGNs and join them every time. I never had this problem in 40/1" TC tests (I rarely lose a game there). But when I tried ultra-fast games (say depth=5) for evaluation tuning, it began to lose hundreds of games. I guess that when two instances try to write to the same file at the same time, one of them gets an "access denied" message (it doesn't busy-wait), so that game is lost.
There is no need to join the PGNs. Bayeselo can read several separate PGNs. If you use more than one "readpgn" command, it will accumulate data.
Rémi
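A minimal sketch of the suggestion above: instead of joining the PGNs, generate one "readpgn" line per file and feed the result to bayeselo. The `games/*.pgn` layout and the function name are hypothetical, and the sketch assumes the bayeselo binary reads its commands from standard input; the trailing `elo`, `mm`, `exactdist`, `ratings`, `x` sequence is just one common way to get a rating list.

```python
import glob


def bayeselo_script(pgn_paths):
    """Build a bayeselo command script that loads each PGN separately."""
    # One readpgn per file; bayeselo accumulates the games across commands.
    lines = [f"readpgn {path}" for path in pgn_paths]
    # Enter the rating interface, fit, print the ratings, then leave both levels.
    lines += ["elo", "mm", "exactdist", "ratings", "x", "x"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical layout: one PGN per cutechess-cli instance under ./games/
    print(bayeselo_script(sorted(glob.glob("games/*.pgn"))), end="")
```

The output could then be piped into the program, for example `python make_script.py | bayeselo` (the script name is made up for illustration). Paths containing spaces are not handled here.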
-
- Posts: 4186
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: bayeselo games limit ?
Thanks Remi. I will certainly do that.
Daniel
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: bayeselo games limit ?
FrancoisK wrote: I currently use it for millions of games, but they are divided into several PGNs (up to 512000 games per PGN, if I am not mistaken). I have never found any limit so far, but it is slow of course, as it has to parse the PGNs (between half an hour and one hour to read them all). I would also be very interested if it could take result grids as input instead of full PGNs.
bob wrote: It only takes about 30 seconds to parse a million games on our cluster "head" node during my testing...
FrancoisK wrote: Well, I guess it means my hard disk is getting really, really old and fragmented (and my computer very slow). Do you also have extended PGN info (in the WinBoard sense)? I have.
Install Linux
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: bayeselo games limit ?
FrancoisK wrote: I currently use it for millions of games, but they are divided into several PGNs (up to 512000 games per PGN, if I am not mistaken). I have never found any limit so far, but it is slow of course, as it has to parse the PGNs (between half an hour and one hour to read them all). I would also be very interested if it could take result grids as input instead of full PGNs.
bob wrote: It only takes about 30 seconds to parse a million games on our cluster "head" node during my testing...
FrancoisK wrote: Well, I guess it means my hard disk is getting really, really old and fragmented (and my computer very slow). Do you also have extended PGN info (in the WinBoard sense)? I have.
Yes and no to the extended PGN info. Not in my cluster data, as I produce so many games that the extra crap would burn too much disk space. But I have used bayeselo on files with heavily nested comments and such...
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: bayeselo games limit ?
Daniel Shawul wrote: Pgn sent. I am using multiple instances (processes) of cutechess-cli which write their games to the same PGN file. At fast time controls this sometimes loses a couple of games and probably screws up the format too.
Are you running multiple instances of the program that is writing to the PGN file? If so, you are going to have to go to some sort of locking mechanism, as you can't allow two processes to open and write to the same file at the same time. It will end up with corrupted data, most likely with extra EOF characters scattered around. Fix that and the problem will go away.
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: bayeselo games limit ?
Daniel Shawul wrote: Pgn sent. I am using multiple instances (processes) of cutechess-cli which write their games to the same PGN file. At fast time controls this sometimes loses a couple of games and probably screws up the format too.
bob wrote: Are you running multiple instances of the program that is writing to the PGN file? If so, you are going to have to go to some sort of locking mechanism, as you can't allow two processes to open and write to the same file at the same time. It will end up with corrupted data, most likely with extra EOF characters scattered around. Fix that and the problem will go away.
Bob,
I think POSIX defines (FILE *) operations to be atomic, so I believe that if Daniel is on a POSIX-compliant Linux system, it will work if he writes to the file in a single operation.
You probably know more about this than I do and can verify or debunk this.
In C you would build your PGN record, then write it (in append mode) to a stream using a single fprintf or fwrite call. For obvious reasons you cannot write the record a line at a time using several calls, or it would not be atomic.
I'm not familiar with cutechess-cli, so I don't know if this applies in Daniel's situation.
I have no clue about how Windows would work with this.
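A rough sketch of the "build the whole record, then append it in one call" idea, written in Python rather than C and using the unbuffered `os.write` on a descriptor opened with `O_APPEND`. The function name is hypothetical. It relies on the POSIX rule that, with `O_APPEND` set, the offset is moved to end-of-file before each write with no intervening modification; it is not a claim that this is safe everywhere. In particular, the Linux `open(2)` manual warns that `O_APPEND` can corrupt files on NFS when several processes append at once, and nothing here says how cutechess-cli itself writes its PGN output.

```python
import os


def append_game(path, pgn_text):
    """Append one complete PGN record with a single write() call."""
    # Build the full record first so it goes out in one piece.
    data = pgn_text.rstrip("\n").encode("utf-8") + b"\n\n"
    # O_APPEND: the kernel positions the offset at end-of-file before each write.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        written = os.write(fd, data)
        if written != len(data):
            # A short write would leave a partial record in the file.
            raise OSError("short write: record may be truncated")
    finally:
        os.close(fd)
```

Going through the raw descriptor avoids the case where a buffered stream splits one record across several underlying writes.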
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: bayeselo games limit ?
Daniel Shawul wrote: Pgn sent. I am using multiple instances (processes) of cutechess-cli which write their games to the same PGN file. At fast time controls this sometimes loses a couple of games and probably screws up the format too.
bob wrote: Are you running multiple instances of the program that is writing to the PGN file? If so, you are going to have to go to some sort of locking mechanism, as you can't allow two processes to open and write to the same file at the same time. It will end up with corrupted data, most likely with extra EOF characters scattered around. Fix that and the problem will go away.
Don wrote: Bob, I think POSIX defines (FILE *) operations to be atomic, so I believe that if Daniel is on a POSIX-compliant Linux system, it will work if he writes to the file in a single operation. You probably know more about this than I do and can verify or debunk this. In C you would build your PGN record, then write it (in append mode) to a stream using a single fprintf or fwrite call. For obvious reasons you cannot write the record a line at a time using several calls, or it would not be atomic. I'm not familiar with cutechess-cli, so I don't know if this applies in Daniel's situation. I have no clue about how Windows would work with this.
Doesn't work like that. Two processes can open the same file; the first writes out 100 bytes, the second then writes out what it thinks is only a 50-byte chunk. Those 50 bytes overwrite the first 50 bytes of the first write, leaving 50 bytes from one write and 50 bytes from the other.
POSIX guarantees that the _file_ won't be corrupted. It doesn't care a hoot about whether the data is valid or not, however. Simplest solution is to avoid sequential writes, and use record-locking to prevent two interlaced writes. In my cluster stuff, I started off doing this but decided to just write a ton of different PGN files and then collect them together after a run has finished. Locking a file introduces delays I wanted to avoid, particularly when I can play so many games at one time. If we ramp up to use both clusters and run 100K games per hour, the delays add up drastically, so different files solves that.
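For the locking route, a minimal sketch might look like the following. It uses `fcntl.flock`, which is an advisory whole-file lock on Unix-like systems rather than true record locking, so it is a stand-in for the idea and not the exact mechanism described above. The function name is hypothetical, the lock only helps if every writer takes it, and it says nothing about Windows or about network filesystems.

```python
import fcntl


def append_game_locked(path, pgn_text):
    """Append one PGN record while holding an exclusive advisory lock."""
    with open(path, "a", encoding="utf-8") as f:
        # Blocks until no other cooperating process holds the lock.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.write(pgn_text.rstrip("\n") + "\n\n")
            # Push the record out before releasing the lock.
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Every append now waits for the lock, which is the delay that becomes noticeable when many games finish at once.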
But a good rule of thumb is to _never_ do sequential writes to the same file from two different processes. You will almost never end up with what you expect to get. Windows uses an EOF character that we don't see in Linux, and with interleaved writes, you can end up with EOF characters scattered throughout the file, and the first one you encounter will terminate BayesElo's reading.
For Linux, the headache is that if two processes open the same file, you get two different descriptors, with two different file pointers, etc. So there is no coordination. When we were using the old IBRIX (a sorry mechanism if ever there was one) we ran into a lower-level issue. Since IBRIX is a sort of parallel NFS mechanism, I was opening a file, appending a game, closing the file and repeating. We could lose games that way due to IBRIX and we eventually sh**-canned the thing and went with normal NFS, which is reliable, if slower. Bottom line is, as I said, one process, one file, and all is well. More than one process on the same file requires programming safeguards. And may still fail if you happen to try IBRIX.
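The "one process, one file" rule, with collection after the run, could be sketched like this. The file naming scheme (`games-<pid>.pgn`) and both function names are assumptions for illustration; on a cluster the process ID alone is not unique across nodes, so a host name would need to be added to the file name.

```python
import glob
import os


def per_process_pgn(directory, prefix="games"):
    """Return a PGN path unique to the calling process on this machine."""
    return os.path.join(directory, f"{prefix}-{os.getpid()}.pgn")


def merge_pgns(directory, out_path, prefix="games"):
    """Concatenate all per-process PGN files once the run has finished."""
    parts = sorted(glob.glob(os.path.join(directory, f"{prefix}-*.pgn")))
    # Binary mode: copy bytes as-is, whatever encoding the PGNs use.
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as f:
                data = f.read()
            if data.strip():
                # Keep a blank line between the last game of one file
                # and the first game of the next.
                out.write(data.rstrip(b"\n") + b"\n\n")
    return len(parts)
```

The output name should not match the `prefix-*.pgn` pattern, otherwise a second merge would pick up the merged file itself. With bayeselo the merge step is optional anyway, since several "readpgn" commands accumulate data.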
-
- Posts: 12790
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: bayeselo games limit ?
I just completed an Elo calculation using mm for junkbase (more than 10 million games, of which more than 9.5 million had a recognizable termination).
It used 1 GB of RAM and took several hours.