Question about files

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Question about files

Post by hgm »

Well, the way I had in mind would not need to store any explicit info on games that still have to be done. That would be implied from the tournament parameters. Of course the tournament file could also be organized as a play list, but I think that would needlessly drive up the size, because you would have to repeat engine names for each game, etc.

But reserving the space for all games, and indicating the unplayed games by spaces, is not a bad idea. So what I was thinking of is something like this:

Code:

-processes "A C      "
-participants {fruit
fairymax
glaurung
crafty
}
-tourneyType 0
-tourneyCycles 1
-gamesPerPairing 4
-loadGameFile "Nunn.pgn" 
-loadGameIndex -2
-saveGameFile "RR1293.pgn"
-results "+=--++=+==-+-=A+C       "
The string for tourney results measures 24 characters here (4x3/2 pairings in a four-player round-robin, times 4 games per pairing, times 1 cycle). Tourney type 0 would be round-robin, type = 1 would be gauntlet (in this case fruit against the 3 others), type > 1 would be multi-gauntlet (e.g. type = 2 would be fruit + fairymax against all others).

The -processes string reserves a single character position for each busy process; starting a new game process on the same tourney file would look in the string for the first un-assigned position (in this case B), take on that identity, (writing the B to reserve it), and grab the first space from the -results string (also marking it as B).

OTOH, having the length of the -results string not fixed could have some advantages as well. E.g. you might want to add another cycle to the tourney, perhaps with another -loadGameFile, and it would be good if you could do that by just editing the file to change the -tourneyCycles number, without having to count out spaces to add to the -results string (which might be hundreds...).
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Question about files

Post by bob »

hgm wrote:Well, the way I had in mind would not need to store any explicit info on games that still have to be done. That would be implied from the tournament parameters. Of course the tournament file could also be organized as a play list, but I think that would needlessly drive up the size, because you would have to repeat engine names for each game, etc.

But reserving the space for all games, and indicating the unplayed games by spaces, is not a bad idea. So what I was thinking of is something like this:

Code:

-processes "A C      "
-participants {fruit
fairymax
glaurung
crafty
}
-tourneyType 0
-tourneyCycles 1
-gamesPerPairing 4
-loadGameFile "Nunn.pgn" 
-loadGameIndex -2
-saveGameFile "RR1293.pgn"
-results "+=--++=+==-+-=A+C       "
The string for tourney results measures 24 characters here (4x3/2 pairings in a four-player round-robin, times 4 games per pairing, times 1 cycle). Tourney type 0 would be round-robin, type = 1 would be gauntlet (in this case fruit against the 3 others), type > 1 would be multi-gauntlet (e.g. type = 2 would be fruit + fairymax against all others).

The -processes string reserves a single character position for each busy process; starting a new game process on the same tourney file would look in the string for the first un-assigned position (in this case B), take on that identity, (writing the B to reserve it), and grab the first space from the -results string (also marking it as B).

OTOH, having the length of the -results string not fixed could have some advantages as well. E.g. you might want to add another cycle to the tourney, perhaps with another -loadGameFile, and it would be good if you could do that by just editing the file to change the -tourneyCycles number, without having to count out spaces to add to the -results string (which might be hundreds...).
I see two reasonable choices...

1. use a separate file for each referee or whatever you call it.

2. Use a common file, but give each referee a distinct piece of the file space to work in, and then carefully fseek() before you start writing to make sure that each process writes into its specific area. Each area should be a multiple of the blocksize of the filesystem for obvious reasons. Really, though, this is probably an idea best left alone: there is a lot of room for bugs, and future filesystem changes could break such a piece of code. If you do that plus use flock() and fflush(), you have a prayer. But only a prayer. :)
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Question about files

Post by hgm »

It would be great if this would actually work: :wink:

[image]
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Question about files

Post by Dann Corbit »

hgm wrote:If two independent processes, which opened the same file for writing independently, both write...

will data be lost, or does the OS guarantee that data will always be appended at the end?
It depends on the opening modes, on the locking modes, etc.

There is no ANSI C way to do this efficiently.

If you want this capability, then a database file is the best answer.
stevenaaus
Posts: 608
Joined: Wed Oct 13, 2010 9:44 am
Location: Australia

Re: Question about files

Post by stevenaaus »

One solution would be a separate third process whose job is to receive messages from the clients then write them to the file.

You could use sockets for interprocess communication. I'm familiar with Tcl sockets, but I don't know which other socket implementations are cross-platform.
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Question about files

Post by Dann Corbit »

stevenaaus wrote:One solution would be a separate third process whose job is to receive messages from the clients then write them to the file.

You could use sockets for interprocess communication. I'm familiar with Tcl sockets, but I don't know which other socket implementations are cross-platform.
I don't see how that helps. You can easily have hundreds or thousands of simultaneous connections using TCP/IP.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Question about files

Post by sje »

Dann Corbit wrote:
stevenaaus wrote:One solution would be a separate third process whose job is to receive messages from the clients then write them to the file.

You could use sockets for interprocess communication. I'm familiar with Tcl sockets, but I don't know which other socket implementations are cross-platform.
I don't see how that helps. You can easily have hundreds or thousands of simultaneous connections using TCP/IP.
Actually, having a client/server model as described is the way to do this. Well, at least on Unix/Linux Posix-compliant systems.

A server can have hundreds of active connections and can easily serialize access to a shared resource such as writing to a single file. One server thread listens for connections, another queues client requests onto a FIFO queue guarded by a pthread lock, and a third removes requests from that FIFO queue one at a time. There must be some open codebases on the net with examples. The daemon syslogd does pretty much what is needed, although I think that's line oriented and would have to be changed to handle multiple text lines atomically.

Bob is right about fooling around with flock() and fflush(). They might work on some platforms. For a while. Maybe.
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Question about files

Post by Dann Corbit »

sje wrote:
Dann Corbit wrote:
stevenaaus wrote:One solution would be a separate third process whose job is to receive messages from the clients then write them to the file.

You could use sockets for interprocess communication. I'm familiar with Tcl sockets, but I don't know which other socket implementations are cross-platform.
I don't see how that helps. You can easily have hundreds or thousands of simultaneous connections using TCP/IP.
Actually, having a client/server model as described is the way to do this. Well, at least on Unix/Linux Posix-compliant systems.

A server can have hundreds of active connections and can easily serialize access to a shared resource such as writing to a single file. One server thread listens for connections, another queues client requests onto a FIFO queue guarded by a pthread lock, and a third removes requests from that FIFO queue one at a time. There must be some open codebases on the net with examples. The daemon syslogd does pretty much what is needed, although I think that's line oriented and would have to be changed to handle multiple text lines atomically.

Bob is right about fooling around with flock() and fflush(). They might work on some platforms. For a while. Maybe.
That will work, but it will also serialize all the threads through the write operation. If the writes were to a database, thousands could happen simultaneously.
Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: Question about files

Post by Aleks Peshkov »

AFAIK writing to a file in append mode is atomic in all modern OSes. Otherwise the widespread practice of shared log files would be impossible.
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Question about files

Post by sje »

Dann Corbit wrote:That will work, but it will also serialize all the threads through the write operation. If the writes were to a database, thousands can happen simultaneously.
A difficulty here is that the term "database" can mean several different things, some of which may be useful in this context and others neither useful nor relevant. Without knowing the exact definition in use, it's hard to make concrete statements about its utility.