Embedded Linux - Managing Flash Memory
Most embedded Linux systems lack a traditional
PC hard drive. Instead, the Linux kernel, associated programs
and user data reside on flash devices. The Memory Technology
Device (MTD) project provides flash support for Linux. This
document answers some common questions about the device
driver, application and file system aspects of flash devices
in embedded Linux.
See http://www.linux-mtd.infradead.org/faq/general.html
for the official MTD Frequently Asked Questions.
Common Questions about Embedded Linux flash drivers (MTD):
How is the flash partition layout specified?
What filesystem types are suitable for flash?
What is the purpose of the MTD character device?
How do I access the flash from user-space applications?
How do I upgrade the boot loader or kernel or root filesystem?
How much redundancy or error recovery do I need for my upgrade procedure?
How should I create an empty JFFS2 partition?
How much overhead space does JFFS2 use for its own book-keeping?
Should I provide any options when I mount a JFFS2 filesystem?
Is flash chip X supported?
How do I add support for my platform?
How is the flash partition layout specified?
Most programmers are used to dealing with hard disks. The
hard disk partition table resides in a reserved sector on
the hard disk. In an IBM-PC, all system software reads the
partition layout from the partition table on the disk.
If a flash device emulates a hard disk, the partition table
method may still be used. For example, CompactFlash flash
modules look like an IDE hard drive to the Linux kernel.
In most embedded systems, the flash device is used directly
with no hard disk emulation. The MTD mapping driver provides
accessor functions to read and write flash memory. The mapping
driver can either specify a hard-coded partition layout,
read the partition layout from the kernel command line passed
in from the boot loader (e.g. U-Boot), or read the partition
layout from flash storage (e.g. as written by the RedBoot boot loader).
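As an illustration of the command-line method, the kernel's command-line partition parser accepts a parameter of the form mtdparts=<mtd-id>:<size>(<name>),... The sketch below only splits such a string into its partition specs. The mtd-id physmap-flash.0, the sizes, and the partition names are hypothetical values, and print_mtdparts is a helper invented for this example, not part of MTD.

```shell
#!/bin/sh
# print_mtdparts CMDLINE: print one "size(name)" partition spec per line.
print_mtdparts() {
    for arg in $1; do
        case "$arg" in
            mtdparts=*)
                specs=${arg#mtdparts=}   # drop the "mtdparts=" key
                specs=${specs#*:}        # drop the "<mtd-id>:" prefix
                printf '%s\n' "$specs" | tr ',' '\n'
                ;;
        esac
    done
}

# Hypothetical command line; "-" means "the rest of the chip".
print_mtdparts "console=ttyS0,115200 root=/dev/mtdblock2 mtdparts=physmap-flash.0:256k(bootloader),2m(kernel),-(rootfs)"
```

On a running target the same helper could be fed the contents of /proc/cmdline.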
Even without partition support, the MTD layer provides
access to the entire flash chip as an MTD device. With partition
support, each MTD partition will be exported as a separate
MTD device. Each device has a descriptive name which can be
viewed using the following command:
cat /proc/mtd
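The output of this command lists, for each MTD device, its size and erase block size in hexadecimal followed by the quoted name. A minimal sketch of turning that into a readable summary follows; the sample layout is hypothetical and list_mtd is a helper name invented for this example.

```shell
#!/bin/sh
# list_mtd: read /proc/mtd-format text on stdin and print one summary
# line per MTD device, with sizes converted from hex bytes to KiB.
list_mtd() {
    while read -r dev size erasesize name; do
        case "$dev" in
            mtd[0-9]*:) ;;        # a device line such as "mtd0:"
            *) continue ;;        # skip the header line
        esac
        name=${name#\"}           # strip the surrounding quotes
        name=${name%\"}
        printf '%s %s %d KiB (erase block %d KiB)\n' \
            "${dev%:}" "$name" $(( 0x$size / 1024 )) $(( 0x$erasesize / 1024 ))
    done
}

# Hypothetical sample data; on a target, run: list_mtd < /proc/mtd
list_mtd <<'EOF'
dev:    size   erasesize  name
mtd0: 00040000 00020000 "bootloader"
mtd1: 00200000 00020000 "kernel"
mtd2: 00dc0000 00020000 "rootfs"
EOF
```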
What filesystem types are suitable for flash?
For a conventional NOR flash, the MTD block device provides
a crude block device similar to a hard disk. Traditional
hard drives use a 512-byte sector. The MTD block device
emulates this sector layout, but there is a severe performance
penalty for writes. Since most flash device sectors are
64 KiB or larger, updating a 512-byte sector requires
a read-modify-write sequence for the entire flash sector.
This kind of write is slow and causes many extra erase cycles
on the flash. A flash sector is typically rated for 100,000
to 1 million erase cycles over the device lifetime,
so it is wise to limit erase cycles.
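A rough worst-case calculation shows the scale of the problem. It assumes a 64 KiB erase block, the low end of the endurance range quoted above, and that every 512-byte update costs one full erase of the containing block; caching in a real driver may reduce the number of erases, so treat the result as an upper bound rather than a measurement.

```shell
#!/bin/sh
erase_block=$(( 64 * 1024 ))   # bytes; assumed erase block size
sector=512                     # bytes; emulated hard-disk sector
rated_cycles=100000            # low end of the rated erase cycles

# Number of emulated sectors that share a single erase block.
sectors_per_block=$(( erase_block / sector ))

# At one sector update per second, whole hours until the block has
# consumed its rated erase cycles (3600 seconds per hour).
hours_to_wear_out=$(( rated_cycles / 3600 ))

echo "512-byte sectors sharing one erase block: $sectors_per_block"
echo "bytes rewritten per 512-byte update: $erase_block"
echo "hours to reach rated cycles at 1 update/s: $hours_to_wear_out"
```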
Because of this write performance issue, the MTD block
device is suitable only for read-only filesystems. Some typical
read-only filesystems for embedded use are CRAMFS and ROMFS.
CRAMFS has the advantage of compressing each 4 KiB cluster,
typically giving about 2:1 compression. Read-write capability
is possible using flash-oriented filesystems such as JFFS2.
JFFS2 is a journaling flash filesystem (hence the name)
– the ‘2’ distinguishes JFFS2 from the
JFFS filesystem, a largely defunct predecessor. JFFS2 bypasses
the block device layer (with its associated buffer cache)
and writes directly to the underlying flash device. Naturally,
JFFS2 supports both read-write and read-only modes of operation.
For a NAND flash, the filesystem MUST be NAND-aware because
both reads and writes must implement ECC error detection/correction.
The JFFS2 and YAFFS filesystems are NAND-compatible.
What is the purpose of the MTD character device?
The MTD devices come in two flavors: MTD block device drivers
and MTD character device drivers. The block devices provide
a 512 bytes-per-sector layout for use by the filesystems
(see "What filesystem types are suitable for flash?").
The character device provides a linear view of an MTD device
or an MTD partition. You can read this device as you would
any file. Standard UNIX utilities may be used to read the
flash. Assuming MTD device 0 is the entire flash, the following
command will dump the entire flash image to a file:
cat /dev/mtdchar0 > /tmp/flash.bin
Writing the flash is different. What happens if you run
the following commands on a flash partition that already
contains valid data?
cat new.bin > /dev/mtdchar0
cmp /dev/mtdchar0 new.bin
/dev/mtdchar0 new.bin differ: char n, line x
The MTD character device will write the data to the flash,
but it will not perform a flash erase command. On a NOR
flash device, the write command can only change 1 bits into
0 bits. To change a bit from 0 to 1 requires an erase command.
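The mismatch reported by cmp can be modeled with a bitwise AND: programming without an erase can only clear bits, so each cell ends up holding the old value ANDed with the new one. This is a simplified model of NOR behavior, not a description of any particular chip, and nor_program is a helper name invented for the example.

```shell
#!/bin/sh
# nor_program OLD NEW: print the byte left in a NOR cell after NEW is
# programmed over OLD without an intervening erase (bits only go 1 -> 0).
nor_program() {
    printf '0x%02X\n' $(( $1 & $2 ))
}

nor_program 0xFF 0x5A   # erased cell (all ones): prints 0x5A, data intact
nor_program 0x0F 0x5A   # stale data in the cell: prints 0x0A, not 0x5A
```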
The MTD character device provides ioctl commands to facilitate
erasing. The flash sector geometry may be determined approximately
using the MEMGETINFO command: it returns the ‘least
common denominator’ erase size (usually 64KiB or 128KiB),
ignoring the smaller boot blocks if present. The exact flash
layout may be determined using the MEMGETREGIONCOUNT and
MEMGETREGIONINFO commands. Once the flash sector geometry
is determined, the MEMERASE command may be issued to erase
the desired blocks.
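An erase request is expressed as a start offset and a length, and both need to fall on erase block boundaries. The sketch below shows the rounding arithmetic for a device with a uniform erase size, which is assumed to have been obtained from MEMGETINFO; boot-block regions of a different size are ignored here. erase_range is a hypothetical helper, not part of MTD.

```shell
#!/bin/sh
# erase_range OFFSET LENGTH ERASESIZE: print the block-aligned
# "start length" pair (in hex) that covers the requested byte range.
erase_range() {
    offset=$(( $1 )); length=$(( $2 )); erasesize=$(( $3 ))
    # Round the start down and the end up to erase block boundaries.
    start=$(( offset / erasesize * erasesize ))
    end=$(( (offset + length + erasesize - 1) / erasesize * erasesize ))
    printf '0x%X 0x%X\n' "$start" $(( end - start ))
}

erase_range 0x21000 0x1000 0x20000   # inside one 128 KiB block
erase_range 0x1F000 0x2000 0x20000   # straddles two 128 KiB blocks
```

The first call prints 0x20000 0x20000 and the second prints 0x0 0x40000: a 4 KiB update still costs a whole block, and a range crossing a boundary costs two.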
MTD provides user-space applications to automate the erasing
process. The following commands will correctly write the
new image to flash. We assume that the flash does not support
locking, or the sectors are already unlocked; otherwise
the flash_unlock utility can be used to unlock the appropriate
sectors first.
flash_eraseall /dev/mtdchar0
cat new.bin > /dev/mtdchar0
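After writing, it is worth reading the data back. A plain cmp of the partition against the image usually reports an early end-of-file on the image, because the partition is larger than the file, so the sketch below compares only the first bytes of the device. verify_image is a hypothetical helper, and head -c is assumed to be available on the target.

```shell
#!/bin/sh
# verify_image DEV IMG: succeed if the first bytes of DEV are identical
# to IMG, ignoring whatever follows the image in the partition.
verify_image() {
    size=$(( $(wc -c < "$2") ))          # image length in bytes
    head -c "$size" "$1" | cmp -s - "$2"
}

# Typical use on a target (device name as in the commands above):
#   verify_image /dev/mtdchar0 new.bin && echo "flash matches new.bin"
```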
Table 1: MTD IOCTL commands
Name               Description                               Argument
MEMGETINFO         Get layout and capabilities               struct mtd_info_user *
MEMERASE           Erase flash blocks                        struct erase_info_user *
MEMLOCK            Lock flash blocks to disallow changes     struct erase_info_user *
MEMUNLOCK          Unlock flash to allow changes             struct erase_info_user *
MEMGETREGIONCOUNT  Return number of erase block regions      int *
MEMGETREGIONINFO   Get layout of one erase block region      struct region_info_user *
MEMWRITEOOB        NAND only: write out-of-band info (ECC)   struct mtd_oob_buf *
MEMREADOOB         NAND only: read out-of-band info (ECC)    struct mtd_oob_buf *
MEMSETOOBSEL       NAND only: set default OOB info           struct nand_oobinfo *
How do I access the flash from user-space applications?
If the flash is mounted as a filesystem, the normal open/close/read/write
system calls will work (obviously write() will not function
on a read-only filesystem).
Otherwise, the flash may be accessed using the MTD character
device (see "What is the purpose of the MTD character device?").
How do I upgrade the boot loader or kernel or root filesystem?
Each component (boot loader, kernel, root filesystem) usually
has its own MTD device partition, which can be accessed
by the MTD character device. Usually the kernel is executing
instructions from RAM – although some handheld computers
do execute in flash, a.k.a. XIP (execute-in-place). When
the kernel is executing from RAM, the kernel flash partition
may be updated freely. The root filesystem is a special
case: if files in the root filesystem (e.g. executables)
are open during the update, running programs may crash or
misbehave. Even without open files, a mounted root JFFS2
filesystem would get its internal data structures out of
sync with the flash contents if its partition were rewritten
underneath it. Upgrading the root filesystem is therefore
usually done on a file-by-file basis. Sometimes it is
convenient to package the upgrade as a .tar or .tar.gz archive.
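A file-by-file upgrade from an archive might be sketched as follows. The helper name apply_upgrade and the check-then-extract order are choices made for this example, not a procedure prescribed by MTD; the archive path and target directory are supplied by the caller.

```shell
#!/bin/sh
# apply_upgrade ARCHIVE ROOT: unpack a .tar.gz upgrade over the tree at
# ROOT, but only if the whole archive can be read first.
apply_upgrade() {
    # Listing the archive decompresses it end to end, so a truncated or
    # corrupt download fails here, before any file is touched.
    tar -tzf "$1" > /dev/null || return 1
    tar -xzf "$1" -C "$2"
}

# Typical use on a target (hypothetical archive path):
#   apply_upgrade /tmp/upgrade.tar.gz / || echo "upgrade rejected"
```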
How much redundancy or error recovery do I need for my upgrade procedure?
Most redundancy schemes require some support from your
boot-loader. At a minimum, you should store the image in
RAM or a ramdisk and verify the image before writing it
into flash. The amount of redundancy needed depends on your
application reliability and cost requirements. Many inexpensive
Linux devices, such as the Linksys WRT54G, do not have redundant
images due to cost concerns.
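The minimum check described above, verifying the image while it is still in RAM, can be as simple as comparing a checksum and length. The sketch uses the POSIX cksum utility; the expected values are assumed to be shipped alongside the image, and image_ok is a hypothetical helper. A CRC catches accidental corruption only, not deliberate tampering.

```shell
#!/bin/sh
# image_ok IMG CRC SIZE: succeed if IMG has the expected cksum CRC and
# byte count. Reading from stdin makes cksum print just "CRC SIZE".
image_ok() {
    actual=$(cksum < "$1") || return 1
    [ "$actual" = "$2 $3" ]
}

# Typical use before touching the flash (hypothetical path and values):
#   image_ok /tmp/new.bin "$expected_crc" "$expected_size" || exit 1
```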
How should I create an empty JFFS2 partition?
As noted in the JFFS2 FAQ, the JFFS2 filesystem marks
erased blocks with ‘cleanmarkers’. The cleanmarker
was introduced to address the scenario where the device
powers down during a flash block erase. If the cleanmarker
or another node type is not present in the block, JFFS2
will redo the erase operation and write the cleanmarker
at the beginning of the block.
The ‘-j’ option to the flash_eraseall command
inserts the cleanmarker at the beginning of each block,
so that JFFS2 won’t redo the erase operation.
How much overhead space does JFFS2 use for its own book-keeping?
JFFS2 requires five spare erase blocks to implement garbage
collection. On a two-bit-per-cell device such as Intel StrataFlash
or Spansion MirrorBit, the erase block size is 128KiB, so
the reserved space is 640KiB, more than half a megabyte.
The spare erase block requirements are defined in fs/jffs2/nodelist.h.
The JFFS2_RESERVED_BLOCK_BASE parameter is 3 by default.
If you change this value to 1, you’ll save two erase
blocks. If you change this value, you should do some stress
testing to verify nothing was broken – the default
has been left at 3 to maximize reliability.
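The space cost is a simple product of the erase block size and the number of spare blocks. The sketch below takes both as inputs, using the block counts from the text (five by default, three after saving two); jffs2_reserved_kib is a helper name invented for the example and does not read anything from the kernel.

```shell
#!/bin/sh
# jffs2_reserved_kib ERASE_KIB [BLOCKS]: KiB held back for garbage
# collection, given the erase block size and spare block count
# (BLOCKS defaults to 5).
jffs2_reserved_kib() {
    echo $(( $1 * ${2:-5} ))
}

jffs2_reserved_kib 128      # five 128 KiB blocks:  prints 640
jffs2_reserved_kib 128 3    # three 128 KiB blocks: prints 384
jffs2_reserved_kib 64       # five 64 KiB blocks:   prints 320
```

On a small partition the difference matters: two saved 128 KiB blocks are 256 KiB of extra usable space.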
Should I provide any options when I mount a JFFS2 filesystem?
A useful mount option for a read-write JFFS2 filesystem
is ‘noatime’. The ‘noatime’ option
turns off the updating of file access times, which would
cause a flash write every time a file is read. If the filesystem
is the root filesystem, the option can be supplied one of
two ways:
1) Pass the following parameter to the kernel command line:
rootflags=noatime
2) Remount the root filesystem with the noatime option:
mount -t jffs2 -o remount,noatime /dev/mtdblock3 /
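Either way, it is easy to confirm that the option took effect by inspecting the mount table. The sketch below parses text in /proc/mounts format (device, mount point, type, options); has_noatime is a hypothetical helper, and the last entry for a mount point is taken as the one in effect.

```shell
#!/bin/sh
# has_noatime MOUNTPOINT: read /proc/mounts-format lines on stdin and
# succeed if the last entry for MOUNTPOINT lists the noatime option.
has_noatime() {
    found=1
    while read -r dev mnt fstype opts rest; do
        [ "$mnt" = "$1" ] || continue
        case ",$opts," in
            *,noatime,*) found=0 ;;
            *) found=1 ;;
        esac
    done
    return "$found"
}

# Typical use on a target:
#   has_noatime / < /proc/mounts && echo "root is mounted noatime"
```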
Is flash chip X supported?
Most NOR flash chips are supported. In the old JEDEC drivers,
you had to add an entry for each new flash to specify the
sector layout and programming algorithms. The entry was
indexed by the Manufacturer and Device ID numbers.
The MTD CFI driver uses the Common Flash Interface (CFI).
The following description of CFI is excerpted from the latest
CFI 2.0 standard:
"The Common Flash Interface (CFI) specification outlines
device and host system software interrogation handshake that
allows specific vendor-specified software algorithms to be
used for entire families of devices. This allows
device-independent, JEDEC ID-independent, and forward- and
backward-compatible software support for the specified flash
device families. It allows flash vendors to standardize their
existing interfaces for long-term compatibility."
The MTD CFI support probes the hardware for the CFI data.
The CFI data includes the chip ID, command set ID, flash
geometry, and supported command types. The MTD CFI code
supports three command sets: Intel (0001), AMD (0002), and
ST Advanced Architecture (0020). You can even compile in
support for all of these command sets in the same kernel.
Once the command set support is present, you can use any
CFI-compliant chip, assuming your low-level chip select
timings and address range are compatible with the new flash
device.
How do I add support for my platform?
In the kernel source, the drivers/mtd/maps directory contains
the mapping drivers. You may be able to use the generic
physmap.c driver. Specify the base address, chip size, and
bus width in the kernel configuration, and the physmap.c
driver will probe the flash type. Generic memory accesses
are used to read the flash. The physmap.c driver can even
handle several flash chips in a contiguous memory area.
Some flashes do not use straightforward memory mappings,
due to external bus addressing limitations. Or you may have
more than one flash with non-adjacent memory mappings. In
this case, you should write your own mapping driver. You
can use physmap.c as a reference.