fsync Pitfalls
This is a non-comprehensive list of the pitfalls of the fsync syscall.
Linux man 2 fsync
fsync() transfers ("flushes") all modified in-core data of (i.e., modified buffer cache pages for) the file referred to by the file descriptor fd to the disk device (or other permanent storage device) so that all changed information can be retrieved even if the system crashes or is rebooted. This includes writing through or flushing a disk cache if present. The call blocks until the device reports that the transfer has completed.
As well as flushing the file data, fsync() also flushes the metadata information associated with the file (see inode(7)).
Calling fsync() does not necessarily ensure that the entry in the directory containing the file has also reached disk. For that an explicit fsync() on a file descriptor for the directory is also needed.
fdatasync() is similar to fsync(), but does not flush modified metadata unless that metadata is needed in order to allow a subsequent data retrieval to be correctly handled. For example, changes to st_atime or st_mtime (respectively, time of last access and time of last modification; see inode(7)) do not require flushing because they are not necessary for a subsequent data read to be handled correctly. On the other hand, a change to the file size (st_size, as made by say ftruncate(2)), would require a metadata flush.
The aim of fdatasync() is to reduce disk activity for applications that do not require all metadata to be synchronized with the disk.
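To make the fsync() versus fdatasync() distinction from the manpage excerpt a bit more concrete, here is a minimal sketch in Python, whose os module wraps both syscalls. The helper names (write_all, append_record) and the need_all_metadata flag are made up for illustration, and the fallback to os.fsync is my assumption for platforms that do not expose fdatasync.

```python
import os

# Assumption: os.fdatasync exists on Linux but not on every platform,
# so this sketch falls back to os.fsync where it is missing.
_fdatasync = getattr(os, "fdatasync", os.fsync)


def write_all(fd: int, data: bytes) -> None:
    """Call os.write until every byte has been handed to the kernel."""
    view = memoryview(data)
    while view:
        view = view[os.write(fd, view):]


def append_record(fd: int, record: bytes, need_all_metadata: bool = False) -> None:
    """Append a record and sync it (hypothetical helper for illustration)."""
    write_all(fd, record)
    if need_all_metadata:
        # Flushes the data and all metadata, timestamps included.
        os.fsync(fd)
    else:
        # Flushes the data and only the metadata needed to read it back
        # (for example a changed file size).
        _fdatasync(fd)
```

Appending changes the file size, so even the fdatasync() branch has to flush that piece of metadata. The saving only shows up when nothing but timestamps changed.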
I will expand this list as I have more questions about all the questionable filesystems used and created by operating system enthusiasts.
fsync does not ensure that a fsync'd file is visible in its parent directory
From the manpage:
Calling fsync() does not necessarily ensure that the entry in the directory containing the file has also reached disk. For that an explicit fsync() on a file descriptor for the directory is also needed.
This means that you cannot rely on a file still being in its directory after a crash if you only fsync the file itself. You have to fsync the directory too.
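A minimal sketch of that two-step sync, assuming a POSIX-like system where a directory can be opened with O_DIRECTORY and fsync'd through a read-only descriptor (true on Linux, not guaranteed everywhere). The function name create_durably is hypothetical.

```python
import os


def create_durably(path: str, data: bytes) -> None:
    """Write data to path, then fsync the file and its parent directory."""
    # Step 1: write the contents and flush the file's data and inode metadata.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            view = view[os.write(fd, view):]
        os.fsync(fd)
    finally:
        os.close(fd)

    # Step 2: the directory entry lives in the parent directory,
    # so that has to be fsync'd separately.
    parent = os.path.dirname(os.path.abspath(path))
    dir_fd = os.open(parent, os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling create_durably("notes.txt", b"hello") would sync notes.txt first and then the current working directory.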
Speaking of fsyncing a directory:
fsync on a directory does not ensure children are fsync'd
From the manpage:
Calling fsync() does not necessarily ensure that the entry in the directory containing the file has also reached disk. For that an explicit fsync() on a file descriptor for the directory is also needed.
The assumption that fsyncing a directory will also fsync the files inside it is wrong too. You can imagine a directory as a file containing a list of children, where the list is just pointers to inodes. So fsyncing a directory will just write the list of pointers to disk, not the contents of the files they point to.
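So if you want both the children and the list of pointers on disk, you have to sync each of them yourself. Here is a rough sketch, again assuming Linux-like behaviour where fsync works on a read-only descriptor. The helper names are hypothetical, and it only handles regular files directly inside the directory, with no recursion.

```python
import os


def fsync_path(path: str, flags: int = os.O_RDONLY) -> None:
    """Open path, fsync it, and close it again."""
    fd = os.open(path, flags)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def fsync_dir_and_children(directory: str) -> list[str]:
    """fsync every regular file directly inside directory, then the directory.

    Returns the sorted names of the files that were synced.
    """
    synced = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                # Each child needs its own fsync for its data to be flushed.
                fsync_path(entry.path)
                synced.append(entry.name)
    # Only now sync the "list of pointers", i.e. the directory itself.
    fsync_path(directory, os.O_RDONLY | os.O_DIRECTORY)
    return sorted(synced)
```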
More reading on fsync and other things related to files
- (danluu) Fsyncgate: Errors on fsync are unrecoverable
- (danluu) Files are hard
- (puzpuzpuz) The secret life of fsync
- (stackoverflow) Difference between syncfs (Linux only) and fsync (POSIX) (TL;DR: syncfs is "pretty please" fsync and doesn't block until the operation is done)
- (transactional.blog) Userland Disk I/O
- (LWN) Featherstitch: Killing fsync softly
- (stackoverflow) Your Program ---fflush---> Your OS ---fsync---> Your Disk
- (despairlabs) fsync() after open() is an elaborate no-op
- (Postgres Wiki) fsync errors
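The program, OS, disk chain from the stackoverflow link in the list above maps onto Python file objects fairly directly. Here is a minimal sketch, with write_text_durably being a made-up name. It does not sync the parent directory, so the first pitfall above still applies.

```python
import os


def write_text_durably(path: str, text: str) -> None:
    """Push text through both buffering layers: program -> OS -> disk."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)          # data may still sit in the program's own buffer
        f.flush()              # program buffer -> OS page cache (like fflush)
        os.fsync(f.fileno())   # OS page cache -> disk (the fsync syscall)
```

Skipping the flush() is the classic mistake here: fsync only sees what the OS already has, so bytes still sitting in the userspace buffer would not be written by it.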