Aleks Peshkov wrote:AFAIK writing to a file in APPEND mode is atomic in all modern OSes. Otherwise widely used log-files practice would be impossible.
Well, some of those log file practices are going through a server application that performs the necessary serialization. Relying solely on append mode will not work, in part because the application run-time support, separate from the OS, may buffer data writes, and the data record bounds may not match the buffer length. And it's possible for the buffer length to vary depending on the native block size of the filesystem used for the output stream. Not an easy bug to find!
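One way to sidestep the stdio buffering problem described here is to build each record completely in memory and hand it to the OS in a single write() on a descriptor opened with O_APPEND. The sketch below assumes a POSIX system and a local filesystem; append_record is a hypothetical helper name, not something from the thread. POSIX specifies that with O_APPEND the offset is moved to the end of the file as part of each write, but a short write is still possible, so the helper reports that as an error rather than pretending the record went out whole.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: append one complete record with a single write().
 * Returns 0 on success, -1 on error or on a short write. */
int append_record(const char *path, const char *record)
{
    size_t len = strlen(record);

    /* O_APPEND: the kernel positions every write at the current end of file */
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    /* One system call per record, so no library buffer can split it */
    ssize_t n = write(fd, record, len);
    int rc = (n == (ssize_t)len) ? 0 : -1;

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Because no FILE stream is involved, the record boundary and the write boundary coincide, which is exactly what the buffered case cannot promise.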
Question about files
Moderators: hgm, Rebel, chrisw
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Re: Question about files
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Question about files
sje wrote:Well, some of those log file practices are going through a server application that performs the necessary serialization. Relying solely on append mode will not work in part because the application run time support, separate from the OS, may buffer data writes and the data record bounds may not match the buffer length. And it's possible for the buffer length to vary depending on the native block size of the filesystem used for the output stream. Not an easy bug to find!
I don't know of any "logger" that uses threads. Even the Linux kernel's /var/log/messages does not do this. It grabs stuff from the kernel logging ring buffer (into which all kinds of things insert data via a system call) and then writes it out using a single process. The common advice found in parallel programming books is "multiple threads should never read or write the same file or descriptor".
Shoot, even on a threaded process crash, you get N core files for that very reason.
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Re: Question about files
bob wrote:I don't know of any "logger" that uses threads.
I've written one that does; I needed something that had microsecond-resolution timestamped event logging. And it was efficient and portable across POSIX systems.
bob wrote:The common advice found in parallel programming books is "multiple threads should never read or write the same file or descriptor"
A better phrasing would be "multiple threads should never access the same writable stream unless that stream is guarded by a lock".
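As an illustration of the "guarded by a lock" phrasing, here is a minimal sketch using a POSIX mutex. The names log_lock and log_line are hypothetical, and this is not the logger mentioned in the post. The mutex ensures that formatting and flushing one line happen as a unit, so lines from different threads cannot interleave.

```c
#include <pthread.h>
#include <stdio.h>

/* One mutex guards the shared stream (hypothetical name) */
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: write one complete log line while holding the lock.
 * Returns 0 on success, -1 if the write or the flush failed. */
int log_line(FILE *out, const char *tag, const char *msg)
{
    int rc = 0;

    pthread_mutex_lock(&log_lock);
    if (fprintf(out, "[%s] %s\n", tag, msg) < 0)
        rc = -1;
    if (fflush(out) != 0)      /* push the line out before releasing the lock */
        rc = -1;
    pthread_mutex_unlock(&log_lock);

    return rc;
}
```

Every thread that writes to the stream has to go through log_line for this to help; a single unguarded fprintf elsewhere defeats it.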
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Question about files
sje wrote:I've written one that does; I needed something that had microsecond resolution timestamped event logging. And it was efficient and portable across Posix systems.
A better phrasing would be "multiple threads should never access the same writable stream unless that stream is guarded by a lock".
The problem with the lock is latency. Lots of threads trying to update a single file is a way to have a bunch of threads blocked waiting to acquire the lock.
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Question about files
Well, latency (or efficiency in general) is not really much of an issue for the application I have in mind. But reliability is. I guess in 99.9% of the cases it would work properly without locks anyway, but I want to get rid of that hypothetical 0.1%...
Just to make sure I got it right:
what I have done in XBoard is this:
Code: Select all
void SaveGameOnFile()
{
    FILE *f = fopen(SAVEFILE, "a");
    flock(fileno(f), LOCK_EX);   // blocks while another instance holds the lock
    SaveGameAsPGN(f);
    fclose(f);                   // flushes the stream and closes the descriptor
}
The lock is taken off the file when I fclose the latter, right? At that point (one of the) other XBoard instances that might have wanted to start adding its own PGN to the SAVEFILE, but is blocked in its flock call, will resume the saving, until finally all are done.
Or would I have to fseek for the end of the file after the flock, before writing? (I thought that in "a" files this was automatic.)
In the tournament-manager function I have another critical section for reserving a game to play:
Code: Select all
int ReserveGame()
{
    FILE *f = fopen(TOURNEYFILE, "r+");
    flock(fileno(f), LOCK_EX);   // flock takes a descriptor, not a FILE *
    ReadEntireFile(f);
    // decide which game to play depending on file contents,
    // and determine offset of that game in the file
    ...
    fseek(f, NrToOffset(gameNr), SEEK_SET);   // offset counted from the start of the file
    fprintf(f, "%s", RESERVED_MARKER);
    fclose(f);
    return gameNr;
}
Both these code sections are only executed once per game, and I was not planning to use this XBoard tournament function for sub-second games. That should make the lock overhead bearable if you have, say, eight XBoard instances playing 1-min games and all writing them to the same PGN file, right?
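The reservation step is a read-modify-write cycle under an exclusive lock. A minimal, self-contained sketch of that pattern follows. It assumes a made-up file layout (one byte per game, '.' for free and 'R' for reserved), which is not XBoard's actual tourney-file format, and reserve_slot is a hypothetical name. It works on the raw descriptor with pread()/pwrite(), so no stdio buffer can hold data that another process has since changed.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical sketch: reserve the first free slot in a file where each
 * byte is '.' (free) or 'R' (reserved). This layout is an assumption made
 * for illustration only.
 * Returns the reserved byte offset, or -1 if nothing could be reserved. */
int reserve_slot(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* Critical section starts: only one process at a time gets past here */
    if (flock(fd, LOCK_EX) == -1) {
        close(fd);
        return -1;
    }

    int slot = -1;
    char c;
    off_t pos = 0;
    while (pread(fd, &c, 1, pos) == 1) {       /* read the current state */
        if (c == '.') {
            if (pwrite(fd, "R", 1, pos) == 1)  /* mark the slot in place */
                slot = (int)pos;
            break;
        }
        pos++;
    }

    /* Closing the only descriptor for this open also drops the flock */
    close(fd);
    return slot;
}
```

Reading and marking happen under the same lock, so two instances cannot both see the same slot as free.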
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Question about files
hgm wrote:Well, latency (or efficiency in general) is not really much of an issue for the application I have in mind. But reliability is. I guess in 99.9% of the cases it would work properly without locks anyway, but I want to get rid of that hypothetical 0.1%... [...]
You certainly would want to check for some error cases which can occur in real scenarios, or in case of programming errors.
fopen() may return NULL on error. flock() may return -1 on error. fseek() may return a nonzero value on error.
Adding these checks is always cheap.
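A minimal sketch of what those checks might look like on the save routine is below. To keep it self-contained it takes the path and the PGN text as parameters instead of using SAVEFILE and SaveGameAsPGN(), and save_game_checked is a hypothetical name. It also checks fputs() and fclose(), since with buffered output a failed write often only shows up when the stream is flushed.

```c
#include <stdio.h>
#include <sys/file.h>

/* Hypothetical variant of the save routine with the error checks added.
 * Returns 0 on success, -1 on any failure. */
int save_game_checked(const char *path, const char *pgn_text)
{
    FILE *f = fopen(path, "a");
    if (f == NULL)                           /* fopen() returns NULL on error */
        return -1;

    if (flock(fileno(f), LOCK_EX) == -1) {   /* flock() returns -1 on error */
        fclose(f);
        return -1;
    }

    int ok = 1;
    if (fseek(f, 0, SEEK_END) != 0)          /* fseek() returns nonzero on error */
        ok = 0;
    if (ok && fputs(pgn_text, f) == EOF)
        ok = 0;

    /* fclose() flushes the buffer, so write errors can surface here too */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```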
Regarding your question about using fseek(), I have never tested it together with flock(). In the simple case without locking you are right, fopen(..., "a") ensures that each write operation appends to the end of the file. However, I don't know whether simultaneous access to the same file by several processes will change things. You'll have to test it. If the write operation does not perform an internal seek to the current end of file then something gets overwritten, which can certainly be verified in a debug version somehow (new file length smaller than expected, for instance).
Sven
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Question about files
hgm wrote:The lock is taken off the file when I fclose the latter, right?
RE: close. Yes, as far as I know. You can also call flock() with LOCK_UN to remove the lock explicitly. So long as all your code uses that same approach, it should work. Note that flock() does not work on remote files, if that is what you are doing. You can use fcntl() instead to lock local or remote files, if that is important.
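For the fcntl() alternative, a whole-file lock might be taken as sketched below. lock_whole_file and unlock_whole_file are hypothetical helper names. One caveat worth knowing: fcntl() locks belong to the process, and closing any descriptor the process has open on that file releases them, so they behave differently from flock() locks and the two should not be mixed on the same file.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: take an exclusive lock on the whole file, waiting
 * if another process holds it. The descriptor must be open for writing.
 * Returns 0 on success, -1 on error. */
int lock_whole_file(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;      /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;             /* 0 means "to end of file", however far it grows */
    return fcntl(fd, F_SETLKW, &fl) == -1 ? -1 : 0;
}

/* Hypothetical helper: release the lock taken by lock_whole_file(). */
int unlock_whole_file(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    return fcntl(fd, F_SETLK, &fl) == -1 ? -1 : 0;
}
```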
hgm wrote:Or would I have to fseek for the end of the file after the flock, before writing? (I thought that in "a" files this was automatic.)
That I have not done (append mode). What I did when using this was to fopen, then flock, then use fseek(fid, 0, SEEK_END), which will seek to the end of the file. I'd be concerned about the window between fopen() and flock(), where some other process could append to the file after you open it and lead to corruption. If you lock, then seek to the end, I _think_ you will be safe. I am still not certain about that "window". When I open the file, the library might determine the size at that instant, so that the fseek() might not be to the right byte. If I were going to do this, I would beat the crap out of it with a bunch of programs that write recognizable strings out, where I can tell if there are duplicates or missing strings.
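A stress test along those lines could look like the sketch below: several processes append tagged lines using the fopen, flock, fseek sequence, and the parent then checks that every expected line is present exactly once and unmangled. The names locked_append and stress_append and the line format are made up for this illustration, and a clean run only shows that the scheme held up on that machine and filesystem, not that it is guaranteed everywhere.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/file.h>
#include <sys/wait.h>
#include <unistd.h>

#define LINE_FMT "writer %d line %d end\n"   /* recognizable test string */

/* Append one line using the fopen / flock / fseek-to-end sequence. */
static int locked_append(const char *path, const char *line)
{
    FILE *f = fopen(path, "a");
    if (f == NULL)
        return -1;
    if (flock(fileno(f), LOCK_EX) == -1) {
        fclose(f);
        return -1;
    }
    int rc = 0;
    if (fseek(f, 0, SEEK_END) != 0 || fputs(line, f) == EOF)
        rc = -1;
    if (fclose(f) != 0)          /* flush, close, and thereby unlock */
        rc = -1;
    return rc;
}

/* Hypothetical test driver: nproc child processes each append nlines lines.
 * Returns the number of intact, non-duplicated lines found, or -1 on failure. */
int stress_append(const char *path, int nproc, int nlines)
{
    int started = 0, failed = 0;

    remove(path);
    for (int p = 0; p < nproc; p++) {
        pid_t pid = fork();
        if (pid < 0) { failed = 1; break; }
        if (pid == 0) {                          /* child: write and exit */
            for (int i = 0; i < nlines; i++) {
                char line[64];
                snprintf(line, sizeof line, LINE_FMT, p, i);
                if (locked_append(path, line) != 0)
                    _exit(1);
            }
            _exit(0);
        }
        started++;
    }
    for (int k = 0; k < started; k++) {          /* wait for every writer */
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }
    if (failed)
        return -1;

    unsigned char *seen = calloc((size_t)nproc * (size_t)nlines, 1);
    FILE *f = fopen(path, "r");
    if (seen == NULL || f == NULL) {
        free(seen);
        if (f != NULL) fclose(f);
        return -1;
    }

    int intact = 0, p, i;
    char line[64], expect[64];
    while (fgets(line, sizeof line, f) != NULL) {
        if (sscanf(line, "writer %d line %d", &p, &i) != 2)
            continue;                            /* mangled line */
        if (p < 0 || p >= nproc || i < 0 || i >= nlines)
            continue;
        snprintf(expect, sizeof expect, LINE_FMT, p, i);
        if (strcmp(line, expect) == 0 && !seen[p * nlines + i]) {
            seen[p * nlines + i] = 1;            /* first intact copy only */
            intact++;
        }
    }
    fclose(f);
    free(seen);
    return intact;
}
```

A result smaller than nproc * nlines would mean lines were lost, duplicated, or overwritten. Removing the flock() call from locked_append gives a quick way to see whether the test can detect a problem at all.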
hgm wrote:Both these code sections are only executed once per game, and I was not planning to use this XBoard tournament function for sub-second games. That should make the lock overhead bearable if you have, say eight XBoard instances playing 1-min games and all writing them to the same PGN file, right?
I don't think the overhead will hurt. You write everything at once, and not move by move? Move by move might introduce some lag...
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Question about files
Sven Schüle wrote:You certainly would want to check for some error cases which can occur in real scenarios, or in case of programming errors.
fopen() may return NULL on error. flock() may return -1 on error. fseek() may return a nonzero value on error.
Sure, I left out such details. At least I test for fopen to succeed. Not sure how the fseek could ever fail, though. (Or what I should do when it does.)
Sven Schüle wrote:You'll have to test it. If the write operation does not perform an internal seek to the current end of file then something gets overwritten, which can certainly be verified in a debug version somehow (new file length smaller than expected, for instance).
Well, testing this is not so easy, as the chance that two processes would write at the same time is virtually zero. I guess I could make special test versions that intentionally stall during writing, e.g. do a sleep(1) between each move of the game (and flush the move before that).
I guess it would not hurt to put in an extra fseek(f, 0, SEEK_END) before writing. Or even a plain lseek on the underlying fileno(f) file descriptor, to avoid problems when the stream routines try to be smart, and cache things without realizing they might be invalidated by other processes. Just to be safe.
I got the impression the stream would always seek, because in the specs of "a+" files, it says you can seek to somewhere internal to the file, to read there, and when you then write, it would still go at the end, and not where you last had been reading. (That would happen with "r+".) So it must seek in that case, and I figured the simplest implementation of that would be to always seek to the end before issuing a write call to flush the buffer. But of course there is no guarantee that "a" is treated the same way as "a+", so better safe than sorry...
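The append-mode behaviour is easy to check directly. As far as I read the C standard, every mode string that starts with "a" forces writes to the then-current end of file regardless of intervening fseek() calls, so "a" and "a+" should behave alike in this respect. The sketch below, with the hypothetical name append_ignores_seek, seeks to the start of an existing file and then writes; it only shows what one particular C library does, and says nothing about concurrent writers.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical demo: open an existing file in the given append mode
 * ("a" or "a+"), seek to offset 0, write, and see where the data lands.
 * Returns 1 if the write went to the end, 0 if not, -1 on I/O error. */
int append_ignores_seek(const char *path, const char *mode)
{
    char buf[32] = {0};

    FILE *f = fopen(path, "w");          /* start from known contents */
    if (f == NULL)
        return -1;
    fputs("first\n", f);
    fclose(f);

    f = fopen(path, mode);
    if (f == NULL)
        return -1;
    fseek(f, 0, SEEK_SET);               /* try to move away from the end */
    fputs("second\n", f);                /* should still be appended */
    fclose(f);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    fread(buf, 1, sizeof buf - 1, f);
    fclose(f);

    return strcmp(buf, "first\nsecond\n") == 0;
}
```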
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Question about files
hgm wrote:Sure, I left out such details. At least I test for fopen to succeed. Not sure how the fseek could ever fail, though. (Or what I should do when it does.)
Most common failure is a bad file descriptor being passed in. Second is a bogus option telling it where to seek from (beginning, end, or current position). If you are careful, none of those happen, and the test for the error return is not needed and can muddle the code a bit.
-
- Posts: 892
- Joined: Sun Nov 19, 2006 9:16 pm
- Location: Russia
Re: Question about files
Code: Select all
$fp = fopen($filename, 'c+b');
if ($fp) {
if (flock($fp, LOCK_EX)) {
$result = file_get_contents($filename);
flock($fp, LOCK_UN);
}
fclose($fp);
}