An SSD stores data on semiconductor fabric. Currently, the most popular technology for SSDs is NAND Flash, a non-volatile memory (NVM) component. Built on a single floating-gate transistor per cell, NAND comes in a few variations:

  1. The single-level cell (SLC), which stores one bit per cell
  2. The multi-level cell (MLC), which stores two bits per cell
  3. The triple-level cell (TLC), which stores three bits per cell

NAND Flash can store more than three bits per cell, but with each additional bit data access slows and reliability is compromised.

Today, NAND Flash dominates the SSD landscape, with forecast growth from 100 exabytes (EB) in 2016 to 750 EB in 2020. The SSD business includes five companies manufacturing NAND Flash memories, including Samsung, SK Hynix, and Micron; 19 companies developing SSD application-specific integrated circuit (ASIC) controllers; and more than 160 companies integrating the controllers in SSD devices.

Unlike in the HDD business, the barrier to entry for a new SSD business is low: it amounts to assembling all the parts and selling a complete product. Third-party integrators buy off-the-shelf components; that is, they purchase a controller from one of the 19 companies developing them and combine it with Flash from one of the big five. They then develop software or firmware on top, encapsulate the whole into a package, and sell it as a finished storage device.

SSD technology

An SSD has no moving mechanical parts, a fundamental difference from the traditional electromechanical storage paradigm. Instead, an array of NAND Flash memory cells makes up the storage media. Since there are no spinning mechanical parts, the SSD operates much faster than traditional HDD devices and is virtually free of the data access latencies present in electromechanical storage devices. Additionally, solid-state storage consumes less power, produces less heat, and runs quietly with no vibrations during operation. SSDs are more resistant to physical shock, and data is not erased by proximity to magnetic sources. Taken together, these characteristics make SSDs significantly more reliable than HDDs. The typical mean time between failures (MTBF) of an HDD hovers at around one million hours, compared to well over two million hours for an SSD.
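To put the MTBF figures above in more familiar terms, they can be converted into an approximate annualised failure rate. The sketch below assumes a constant failure rate (an exponential lifetime model) and continuous operation; both are simplifying assumptions, and the function name is purely illustrative.

```python
import math

HOURS_PER_YEAR = 8760  # 24 hours x 365 days, assuming the drive is always powered on


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Probability of failing within one year, assuming a constant failure rate."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


for label, mtbf in (("HDD", 1_000_000), ("SSD", 2_000_000)):
    print(f"{label}: {annualized_failure_rate(mtbf):.2%}")
# prints "HDD: 0.87%" then "SSD: 0.44%"
```

Under these assumptions, doubling the MTBF roughly halves the chance that a given drive fails in any one year.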

There are, however, a few downsides to SSD technology. Although there are no moving parts inside an SSD, each memory cell has a finite life expectancy: a limit on the number of times it can be programmed and erased before it stops working. This is due to the physics of the writing/erasing mechanism of the semiconductor cells. Logic and firmware built into the drives dynamically manage the SSD operations to minimise problems and extend their life.

The programming process for a Flash cell requires the use of high voltage to charge or discharge the floating gate, which sits between the control gate and the transistor channel and is insulated by an oxide layer. Charging the floating gate is referred to as programming the cell and equates to a logic "0." Discharging it is referred to as erasing the cell and equates to a logic "1." Over time, the high-voltage program/erase cycles stress the oxide layer and shorten the cell life. This leads to a finite number of erase/write cycles of the NAND Flash.
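The program/erase asymmetry described above can be illustrated with a toy model: programming can only move bits from "1" to "0," and only an erase returns them to "1." The class below is a hypothetical sketch, not a model of any real device. On real NAND the erase applies to a whole block; it is applied to a single page here only for brevity.

```python
class NandPage:
    """Toy NAND page: programming can only flip bits from 1 to 0."""

    def __init__(self, num_bytes: int) -> None:
        # An erased page reads back as all ones (0xFF in every byte).
        self.data = bytearray([0xFF] * num_bytes)

    def program(self, new_data: bytes) -> None:
        if len(new_data) != len(self.data):
            raise ValueError("data must match the page size")
        for i, byte in enumerate(new_data):
            # Bitwise AND: a bit already at 0 stays 0 whatever is written.
            self.data[i] &= byte

    def erase(self) -> None:
        # Discharging the floating gates returns every bit to 1.
        # (Assumption: real devices erase per block, not per page.)
        self.data = bytearray([0xFF] * len(self.data))


page = NandPage(2)
page.program(bytes([0b11110000, 0b10101010]))
page.program(bytes([0b00111100, 0b11111111]))  # overwrite without erasing first
print([f"{b:08b}" for b in page.data])
# prints ['00110000', '10101010']
```

The second write does not land intact: the first byte ends up as the AND of both writes, which is why a cell must be erased before it can hold new data.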

Further, SSD devices have higher per-gigabyte prices than electromechanical storage devices and generally support smaller capacities.

Data stored on a NAND Flash fabric is organised in a hierarchical structure. From the bottom up, the memory cells are organised in strings, pages, and blocks. Strings typically comprise 32 or 64 NAND cells and provide the minimum readable units. Multiple strings make up a page, which typically includes 64K or 128K cells and is referred to by its size: 2 kilobytes (KB), 4 KB, 8 KB, etc. Pages are the minimum programmable units, and multiple pages form a block, which is the minimum erasable unit. Currently, the maximum number of pages per block is approaching 512, and block sizes are reaching 8 megabytes (MB).
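The page and block figures above can be checked with simple arithmetic. The sketch below assumes a 16 KB page (a value chosen here because 512 such pages give the 8 MB block quoted above) and a purely linear layout. Real drives remap addresses in firmware, so the `locate` helper is illustrative only.

```python
from typing import Tuple

KIB = 1024
PAGE_SIZE = 16 * KIB     # assumed page size in bytes
PAGES_PER_BLOCK = 512    # upper figure quoted in the text


def block_size_bytes(page_size: int, pages_per_block: int) -> int:
    """A block is the erase unit: pages per block times page size."""
    return page_size * pages_per_block


def locate(byte_offset: int, page_size: int, pages_per_block: int) -> Tuple[int, int, int]:
    """Map a linear byte offset to (block, page within block, offset within page)."""
    page_index, offset_in_page = divmod(byte_offset, page_size)
    block_index, page_in_block = divmod(page_index, pages_per_block)
    return block_index, page_in_block, offset_in_page


print(block_size_bytes(PAGE_SIZE, PAGES_PER_BLOCK) // (KIB * KIB), "MiB per block")
# prints "8 MiB per block"
print(locate(10_000_000, PAGE_SIZE, PAGES_PER_BLOCK))
# prints (1, 98, 5760)
```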

SSD devices possess a few characteristics that set them apart from HDD devices. Among the most important are write amplification, wear levelling, garbage collection, and performance degradation over time. Other issues such as overprovisioning and Self-Monitoring, Analysis, and Reporting Technology (SMART) add complexity to the drive.

Write Amplification: In an SSD, NAND Flash cells must be erased before they can be rewritten. Because erasure works on whole blocks while writes work on pages, any still-valid data in a block must be copied elsewhere before the block is erased. The net result is that data is rewritten multiple times, increasing the number of program/erase cycles over the life of the device. This write amplification shortens the lifespan of the device and hurts performance.
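Write amplification is commonly quantified as the ratio of data physically written to the NAND to data the host asked to write. The sketch below computes that ratio for an illustrative worst case in which updating one page forces a whole 64-page block to be rewritten. The 4 KB page and 64-page block are assumed values, and the function name is hypothetical.

```python
def write_amplification(host_bytes: int, nand_bytes: int) -> float:
    """Ratio of bytes written to NAND over bytes written by the host."""
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    return nand_bytes / host_bytes


PAGE_SIZE = 4096        # assumed page size in bytes
PAGES_PER_BLOCK = 64    # assumed pages per block

# Worst case: the host updates one page, but the controller must rewrite
# the 63 untouched pages of the same block along with the new one.
host_written = 1 * PAGE_SIZE
nand_written = PAGES_PER_BLOCK * PAGE_SIZE

print(write_amplification(host_written, nand_written))
# prints 64.0
```

A factor of 1.0 would mean the NAND absorbs exactly what the host writes; anything above that consumes extra program/erase cycles.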

Wear Levelling: Unlike in an HDD, where data can be written over and over in the same location on the disk surface without causing any problem, new data in an SSD is written to different NAND cells for the purpose of wear levelling. The physics governing the programming mechanism of a NAND cell requires erasing the cell prior to writing new data. Since the smallest erasable unit is a block, typically made up of 32 to 64 pages, erasing a block is time consuming. Due to the limited number of erase/write cycles inherent to the technology, also known as endurance, erasing and writing one block repeatedly without involving other blocks would wear out that block before all the others, prematurely ending the life of the SSD. To avoid this, SSD controllers use a technique called wear levelling that distributes writes as evenly as possible across all blocks in the SSD. The technique is based on complex algorithms implemented in firmware that require exhaustive testing in the development stage.
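The idea of spreading erase/write cycles evenly can be sketched with a very simple policy: always direct the next write to the block with the lowest erase count. This is a deliberately minimal illustration, not how any particular controller works. Production firmware also has to account for static data, bad blocks, and address mapping, and the class name below is hypothetical.

```python
from typing import List


class WearLeveler:
    """Toy wear leveller: each write goes to the least-erased block."""

    def __init__(self, num_blocks: int) -> None:
        self.erase_counts: List[int] = [0] * num_blocks

    def pick_block(self) -> int:
        # Least-worn block; ties are broken by the lowest block index.
        return min(range(len(self.erase_counts)), key=self.erase_counts.__getitem__)

    def write(self) -> int:
        """Erase and rewrite the chosen block, returning its index."""
        block = self.pick_block()
        self.erase_counts[block] += 1
        return block


leveler = WearLeveler(num_blocks=4)
for _ in range(10):
    leveler.write()
print(leveler.erase_counts)
# prints [3, 3, 2, 2]
```

Without levelling, ten rewrites of the same location would put all ten erase cycles on a single block; here no block is more than one cycle ahead of any other.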

Garbage Collection: Another difference between an HDD and an SSD concerns the deleting of a file on the host computer.

In an HDD, deleting a file in the OS running on the host leaves the file's bits on the hard drive until they are overwritten. In an SSD, writing a file to empty blocks is faster than writing to previously written blocks, so the TRIM command lets the OS tell the drive which data is no longer in use as soon as a file is deleted, allowing the controller to erase those blocks ahead of time. As the SSD fills up, fewer and fewer empty blocks remain available. In their place are partially filled blocks. The SSD cannot write new data directly onto these partially filled blocks, since doing so would require erasing the existing data. Instead, the SSD reads the data of the block into its cache, merges the new data with the old, and then writes the result back.

Garbage Collection is the process of reclaiming previously written blocks of data so they can be rewritten with new data. It is implemented in firmware in the SSD controller and is critical to the SSD's operations.
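One way to picture block reclamation is a greedy policy: choose the block with the fewest still-valid pages, copy those pages into a spare erased block, then erase the victim. The sketch below assumes this greedy policy and a simple in-memory layout; both are illustrative choices, not a description of any vendor's firmware, and all names are hypothetical.

```python
from typing import Dict, List, Optional

# Each block is a list of pages; a page holds data, or None if it is stale.
Block = List[Optional[str]]


def garbage_collect(blocks: Dict[int, Block], spare_id: int) -> int:
    """Reclaim one block and return the id of the block that was erased."""
    candidates = [b for b in blocks if b != spare_id]
    # Greedy choice: the fewest valid pages means the least copying.
    victim = min(candidates, key=lambda b: sum(p is not None for p in blocks[b]))
    # Relocate the still-valid pages into the spare (already erased) block.
    blocks[spare_id] = [p for p in blocks[victim] if p is not None]
    # Erase the victim; it becomes the new spare.
    blocks[victim] = []
    return victim


blocks: Dict[int, Block] = {
    0: ["a", None, "b", None],   # two valid pages
    1: [None, None, None, "c"],  # one valid page, so the cheapest to reclaim
    2: [],                       # spare, already erased
}
erased = garbage_collect(blocks, spare_id=2)
print(erased, blocks)
# prints 1 {0: ['a', None, 'b', None], 1: [], 2: ['c']}
```

The copying step is itself a source of extra NAND writes, which is why the choice of victim matters for both performance and endurance.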

Performance Degradation: In a brand-new SSD device, none of the NAND Flash cells has ever been written, and their floating gates have never been charged. In other words, they all hold a logic "1." When the device is deployed, the erase/write process begins. At this stage, the performance measured in input/output operations per second (IOPS) is at its maximum value.

Figure 1: The degradation curves for eight types of SSD devices are obvious from the chart. Source: Tom's IT Pro

As the workload increases, the SSD controller is forced into an erase/write cycle for every pending write operation. Due to this process, the SSD performance gradually drops until it finally settles into a steady state. Figure 1 shows the degradation curves for eight types of SSD devices. Steady-state IOPS is typically less than 50%, and can be as low as 5%, of the maximum value measured when the device is new.

SSD controller

The conceptual simplicity of the NAND Flash fabric belies the complexity of the SSD operations. The burden of managing the SSD's peculiarities falls on the SSD controller.

Figure 2 shows the block diagram of an SSD controller. It encompasses six sections: host controller interface; controller System-on-Chip (SoC); NAND array; DDR RAM for caching both user data and internal SSD metadata; the most critical component, system firmware stored in large SRAMs; and a NOR Flash.

Figure 2: The SSD controller includes a host controller interface, controller SoC, NAND Array, DDR RAM, system firmware, and NOR Flash. Source: Mentor Graphics

Host interface: The host interface of an SSD controller is typically based on one of the industry-standard interface specifications, the most popular being SATA/SAS and PCIe. Since the host interface is the performance bottleneck, the preferred standard is PCIe. An emerging standard called Non-Volatile Memory Express (NVMe), which runs over PCIe, provides the fastest performance.

Controller SoC: The controller SoC is built around a CPU/RISC processor complex that may include multiple processors. All the processors communicate with each other but perform different tasks, such as managing the PCIe traffic, read and write caching, encryption, error correction, wear levelling, and garbage collection, to name a few.

Firmware: The firmware is the most complex part of the controller. The embedded microcode manages all the operations that set the SSD apart from the traditional HDD. It implements the algorithms that perform garbage collection, wear levelling, and several other tasks. If not properly designed, the algorithms may affect the efficiency, reliability, and performance of the SSD. Firmware bugs are a major cause of data loss in SSDs.

The firmware is a challenge faced by third-party integrators. Attracted by the growing business opportunities, they purchase a few off-the-shelf components, assemble them on a PCB, add firmware, and assume the drive is ready to be sold, only to discover that the controller does not work as expected or does not work at all.

Compared with the complexity of the firmware, the SSD hardware is not particularly large relative to designs in other market segments such as networking or processor/graphics. An enterprise-level controller design with loads of functionality implemented in hardware may reach 60 to 100 million gates. Meanwhile, a client-level controller in which most of the functionality is realised in firmware may have a gate count of 20 million.
