fsync Pitfalls

This is a non-comprehensive list of the pitfalls of the fsync syscall.

Linux man 2 fsync

fsync() transfers ("flushes") all modified in-core data of (i.e., modified buffer cache pages for) the file referred to by the file descriptor fd to the disk device (or other permanent storage device) so that all changed information can be retrieved even if the system crashes or is rebooted. This includes writing through or flushing a disk cache if present. The call blocks until the device reports that the transfer has completed.

As well as flushing the file data, fsync() also flushes the metadata information associated with the file (see inode(7)).

Calling fsync() does not necessarily ensure that the entry in the directory containing the file has also reached disk. For that an explicit fsync() on a file descriptor for the directory is also needed.

fdatasync() is similar to fsync(), but does not flush modified metadata unless that metadata is needed in order to allow a subsequent data retrieval to be correctly handled. For example, changes to st_atime or st_mtime (respectively, time of last access and time of last modification; see inode(7)) do not require flushing because they are not necessary for a subsequent data read to be handled correctly. On the other hand, a change to the file size (st_size, as made by say ftruncate(2)), would require a metadata flush.

The aim of fdatasync() is to reduce disk activity for applications that do not require all metadata to be synchronized with the disk.
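
To make the fsync()/fdatasync() distinction above a bit more concrete, here is a minimal sketch in Python, assuming Linux (os.fdatasync is not available everywhere). The function name append_record is my own invention for illustration. Since appending changes the file size, the manpage text above implies fdatasync() still has to flush that piece of metadata; only things like the timestamps can be skipped.

```python
import os

def append_record(path: str, record: bytes) -> int:
    """Append record to path and flush it with fdatasync().

    Returns the file size after the append. Hypothetical helper, Linux assumed.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        view = memoryview(record)
        while view:
            # os.write may write fewer bytes than asked, so loop until done.
            view = view[os.write(fd, view):]
        # Flushes the data plus any metadata needed to read it back
        # (such as the new size), but not necessarily st_atime/st_mtime.
        os.fdatasync(fd)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)
```

Note that this only covers the file itself. If the file was newly created, its directory entry is a separate matter.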

I will expand this list as I have more questions about all the questionable filesystems used and created by operating system enthusiasts.

fsync does not ensure that a fsync'd file is visible in its parent directory

From the manpage:

Calling fsync() does not necessarily ensure that the entry in the directory containing the file has also reached disk. For that an explicit fsync() on a file descriptor for the directory is also needed.

This means that you cannot rely on a file being visible in its directory after a crash if you only fsynced the file itself. You have to fsync the directory too.
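
The file-then-directory sequence described above can be sketched roughly like this in Python, assuming Linux (os.O_DIRECTORY and fsync on a directory descriptor opened read-only). The name durable_create is hypothetical, and this is an illustration rather than a complete recipe: it does not handle things like a failed fsync or fsyncing ancestors further up the tree.

```python
import os

def durable_create(path: str, data: bytes) -> None:
    """Write data to path, then fsync the file and its parent directory."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than asked, so loop until done.
            view = view[os.write(fd, view):]
        # Step 1: flush the file's data and metadata.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Step 2: flush the directory so the new entry itself reaches disk.
    parent = os.path.dirname(os.path.abspath(path))
    dir_fd = os.open(parent, os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Skipping step 2 is the pitfall: the file's contents may be safely on disk while the name that points at them is not.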

Speaking of fsyncing a directory:

fsync on a directory does not ensure children are fsync'd

From the manpage:

Calling fsync() does not necessarily ensure that the entry in the directory containing the file has also reached disk. For that an explicit fsync() on a file descriptor for the directory is also needed.

The assumption that fsyncing a directory will also fsync the files inside it is wrong too. You can imagine a directory as a file containing a list of children, where the list is just names pointing to inodes. So fsyncing a directory will just write that list to disk, not the contents of the files it points to.
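
So if you want both the children and the directory on disk, you have to do it by hand. A minimal sketch in Python, assuming Linux and that fsync on a read-only descriptor is acceptable there; the function name fsync_dir_and_children is made up for this example, and it only looks at regular files directly inside the directory (no recursion).

```python
import os

def fsync_dir_and_children(dir_path: str) -> int:
    """fsync each regular file directly inside dir_path, then dir_path itself.

    Returns the number of files that were fsynced.
    """
    synced = 0
    with os.scandir(dir_path) as entries:
        for entry in entries:
            # Skip subdirectories, symlinks, and other non-regular entries.
            if not entry.is_file(follow_symlinks=False):
                continue
            fd = os.open(entry.path, os.O_RDONLY)
            try:
                os.fsync(fd)  # flushes this one file, nothing else
            finally:
                os.close(fd)
            synced += 1

    # Finally flush the directory, i.e. the list of entries.
    dir_fd = os.open(dir_path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return synced
```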

RAMBLINGS OF A MADMAN