Sparse Files Howto

Unix file systems such as ext3/4 can store partly empty files more efficiently by not storing blocks that contain only zeros. Such files are called sparse files. Reading them works as normal (the holes simply read back as zeros), but “all zero” blocks don’t waste space on the drive.

This can be useful for various applications. For example, a database can create a large file for random access without using that space on the drive up front; the actual size on disk grows with every block that is used. Another example is raw disk images for virtualization such as KVM: you can create a 10GB disk image that uses almost no space and grows only as it is used.
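The growth described above can be observed with a small experiment. This is only a sketch, assuming GNU coreutils (truncate, stat -c, dd with status=none) and a file system that supports sparse files; the file name disk.img.XXXXXX is a made-up placeholder.

```shell
#!/bin/sh
# Sketch: create a sparse "disk image" and watch its on-disk usage grow.
# Assumes GNU coreutils; the file name is a hypothetical placeholder.
set -eu

img=$(mktemp ./disk.img.XXXXXX)

# Set the apparent size to 10 GiB without writing any data.
truncate -s 10G "$img"

apparent=$(stat -c %s "$img")   # apparent size in bytes
before=$(stat -c %b "$img")     # allocated blocks (usually 512-byte units)

# Write 1 MiB of non-zero data at offset 100 MiB.
# conv=notrunc keeps the 10 GiB size instead of cutting the file short.
dd if=/dev/urandom of="$img" bs=1M count=1 seek=100 \
   conv=notrunc iflag=fullblock status=none
sync   # flush, so the allocation is visible on file systems that delay it

after=$(stat -c %b "$img")

echo "apparent size:          $apparent bytes"
echo "allocated blocks before: $before"
echo "allocated blocks after:  $after"

rm -f "$img"
```

On most file systems the block count before the write is zero or close to it, and afterwards it covers roughly the 1 MiB that was written, while the apparent size stays at 10 GiB throughout.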

Useful sysadmin commands:

"du -h --apparent-size FILE" shows the full file size including sparse areas
"du -h FILE" shows the actual space used on the file system
"ls -lh FILE" shows the full file size including sparse areas
"ls -sh FILE" shows the actual space used on the file system
"fallocate -d FILE" makes an existing file sparse in place by "digging holes" where blocks contain only zeros
"rsync -S ..." the -S option makes rsync sparse file aware and produces sparse files at the receiver
"truncate -s 1G FILE" makes a sparse file with an apparent size of 1GB (1 GiB) that uses no file system space
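
A few of the commands above can be tried together in one small script. This is a sketch only, assuming GNU coreutils and the fallocate tool from util-linux; the file name zeros.XXXXXX is a made-up placeholder, and hole punching may not be supported on every file system, so the script tolerates a failing fallocate.

```shell
#!/bin/sh
# Sketch: turn a file full of real zero bytes into a sparse file.
# Assumes GNU coreutils and util-linux fallocate; file name is hypothetical.
set -eu

f=$(mktemp ./zeros.XXXXXX)

# Write 8 MiB of actual zero bytes; these normally occupy blocks on disk.
dd if=/dev/zero of="$f" bs=1M count=8 status=none
sync

size_before=$(stat -c %s "$f")     # apparent size in bytes
blocks_before=$(stat -c %b "$f")   # allocated blocks

# Dig holes for the all-zero blocks. Not every file system supports this,
# so a failure is reported instead of aborting the script.
if ! fallocate -d "$f"; then
    echo "fallocate -d is not supported here; file left unchanged" >&2
fi
sync

size_after=$(stat -c %s "$f")
blocks_after=$(stat -c %b "$f")

echo "apparent size: $size_before -> $size_after bytes"
echo "blocks in use: $blocks_before -> $blocks_after"

# The same comparison with the commands from the list above.
du -h --apparent-size "$f"
du -h "$f"

rm -f "$f"
```

The apparent size should be the same before and after, while the number of blocks in use should drop where hole punching is supported. File systems with transparent compression may report few blocks even before the fallocate call.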