Software RAID under Linux

Selecting chunk size

A chunk is the "atomic" unit of data written to a device. With 4K chunks and two disks in a RAID, chunks 0 and 2 are written to the first disk and chunks 1 and 3 to the second. Large chunks lower the overhead for large files, while small files benefit from smaller chunks. The chunk size is specified in /etc/raidtab, in kilobytes.
  • Linear RAID: chunk size must be specified but is not used
  • RAID-0: chunk-size bytes are written to each disk in turn, in parallel. The load becomes unbalanced if the filesystem writes more data to every N-th chunk, where N is the number of disks, because all of those writes fall on the same disk. Ext2 writes more at the beginning of each block group; block group size is 32K. A chunk size of 32K solves the problem.
  • RAID-1: chunk size has no effect on writes; for reads, at least one chunk is read from the disk
  • RAID-5: chunk size affects both data and parity chunks. For reads, chunk size has the same effect as for RAID-0.
    For writes, the optimal chunk size depends heavily on the type of disk activity (linear or scattered writes, small or large files) and usually lies between 32K and 128K.
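
The chunk placement described above (two disks, with chunks alternating between them) can be sketched in shell. This only illustrates simple round-robin striping as in RAID-0; the helper name chunk_to_disk and the zero-based disk numbering (disk 0 is the "first disk") are assumptions made for this example and are not part of the RAID tools.

```shell
#!/bin/sh
# Sketch of round-robin striping: given a chunk number and the number of
# disks, report which disk the chunk lands on and its slot on that disk.
# chunk_to_disk is a hypothetical helper, used here for illustration only.
chunk_to_disk() {
    ctd_chunk=$1     # zero-based chunk number within the array
    ctd_ndisks=$2    # number of disks in the stripe set
    echo "disk $((ctd_chunk % ctd_ndisks)) slot $((ctd_chunk / ctd_ndisks))"
}

# Two disks, chunks 0..3, as in the example in the text.
for c in 0 1 2 3; do
    echo "chunk $c -> $(chunk_to_disk "$c" 2)"
done
```

With two disks, chunks 0 and 2 map to disk 0 and chunks 1 and 3 to disk 1. It also shows why writes that hit every N-th chunk on an N-disk array all end up on one disk.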

RAID, chunk size, and ext2 filesystem

The performance of RAID-5 (and RAID-4) with an ext2 filesystem can improve significantly if the filesystem knows how many ext2 blocks fit into one chunk. This information is passed to mke2fs with the option -R stride=nn. With a chunk size of 32K, 8 blocks of 4K each fit into one chunk:
raidbox# mke2fs -b 4096 -R stride=8 /dev/md0
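
The stride arithmetic above (chunk size divided by block size) can be written as a small shell sketch. The helper name stride_for is hypothetical, and the script only prints the mke2fs command rather than running it. It assumes the chunk size is given in kilobytes, as in /etc/raidtab, and the block size in bytes, as for mke2fs -b.

```shell
#!/bin/sh
# Sketch: compute the ext2 stride (blocks per chunk) from the chunk size
# in kilobytes and the filesystem block size in bytes.
# stride_for is a hypothetical helper, not part of e2fsprogs or raidtools.
stride_for() {
    sf_chunk_kb=$1
    sf_block_bytes=$2
    sf_chunk_bytes=$((sf_chunk_kb * 1024))
    # Refuse combinations where a chunk is not a whole number of blocks.
    if [ "$sf_block_bytes" -le 0 ] || [ $((sf_chunk_bytes % sf_block_bytes)) -ne 0 ]; then
        echo "chunk size is not a whole number of blocks" >&2
        return 1
    fi
    echo $((sf_chunk_bytes / sf_block_bytes))
}

# 32K chunks with 4K blocks: print (do not run) the matching command.
stride=$(stride_for 32 4096) && echo "mke2fs -b 4096 -R stride=$stride /dev/md0"
```

For the values used in the text, this prints the same command as shown above, with stride=8.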