Part 1 of 2.
Flash memory, in the form of Secure Digital cards, USB drives and other non-volatile memory, has been in the marketplace for a long time. These forms of storage have been around since the early days of the personal computer. As technology rapidly evolves year after year, it was only a matter of time before the speed and responsiveness of that hardware found its way into solid-state drives (SSDs). While disk storage vendors began adding Flash drives to their existing enclosure portfolios, several new vendors went much further and brought the All Flash Array (AFA) to market.
Historically speaking, storage processor-based storage vendors introduced Flash disk to the market by using it to accelerate their arrays. These enclosures often contained lesser performing storage, such as SATA disks. Adding Flash disk to the acceleration plane helped to keep the cost of the array down without sacrificing performance. In these designs, Flash was installed only to enhance performance and did not hold any end-user data. 3PAR (pre-HP) and Sun are two array vendors that come to mind when I think about the early integrators of this type of Flash storage for acceleration. Other vendors, such as EMC, began integrating Flash disk into their arrays to actually hold end-user data. Within storage tiers, Enterprise Flash Drives (EFDs) provided the highest level of performance in a mixture of Fibre Channel/SAS connected drives and SATA/Nearline (NL) SAS connected drives. The amount of Flash utilized in these pools was typically small, in an effort to keep the costs of enterprise storage enclosures manageable. This practice continues today in nearly every storage processor-based array: Flash for ultra-performance, SAS for general performance, and NL-SAS for data that doesn’t typically get accessed frequently.
The downside of having Flash drives as targets for data in storage processor-based arrays is that SSDs are not designed to deal with every type of workload. For instance, workloads that contain sequential writes typically perform worse on Flash disks than on spinning drives. In other words, in places where you would expect Flash to substantially enhance performance, such as SQL or Oracle database workloads, the performance is actually worse, or at least less than you expected. This performance problem exists for two reasons: first, the architecture of the drives that are often used for data in storage processor-based arrays; and second, the fact that many storage processor-based arrays have been adapted for SSDs rather than built from the ground up for them. This presents a challenge for storage architects and administrators who are trying to get the most out of their new storage array, which is often a substantial investment out of the IT budget. In practice, we at LPS get the occasional performance-based support issue where we find that Flash is being utilized to sustain a workload where it will not perform well. We have also spent a substantial number of hours with customers helping to redesign RAID groups and pools to support the workloads that Flash is designed for. This costs money not only in hours of engineers’ time, but also in the hours that full-time storage administrators spend with us to create a new layout, work out a migration plan and then implement the plan.
As the check-writers in organizations continue to press for getting the most out of their investments, storage vendors have also integrated data-reduction features into their arrays. New words were invented as well, like deduplication (two years ago my spell checker would have flagged this word!). Nearly all of the big players in the storage market have added thin provisioning (allocating whatever you want in storage, while only consuming disk space for in-use data), deduplication (storing only single instances of duplicate data), and compression (encoding a dataset so that it occupies less space).
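To make those three features a little more concrete, here is a minimal sketch in Python of how they can fit together. It is a toy model for illustration only and does not represent how any particular vendor's array works. The names ToyVolume and BLOCK_SIZE, the 4 KB block size, and the choice of SHA-256 fingerprints and zlib compression are all my own assumptions.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # hypothetical block size in bytes (an assumption)


class ToyVolume:
    """Toy thin-provisioned volume with block-level dedup and compression."""

    def __init__(self, provisioned_blocks):
        # Thin provisioning: we promise this many blocks to the host,
        # but allocate nothing until data is actually written.
        self.provisioned_blocks = provisioned_blocks
        self.block_map = {}  # logical block address -> content fingerprint
        self.store = {}      # fingerprint -> compressed block, stored once

    def write(self, lba, data):
        if not 0 <= lba < self.provisioned_blocks:
            raise IndexError("write beyond provisioned size")
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        fingerprint = hashlib.sha256(data).hexdigest()
        # Deduplication: identical blocks share a single stored copy.
        if fingerprint not in self.store:
            # Compression: the single copy is stored in compressed form.
            self.store[fingerprint] = zlib.compress(data)
        # Note: this toy never reclaims blocks that are no longer referenced.
        self.block_map[lba] = fingerprint

    def read(self, lba):
        fingerprint = self.block_map.get(lba)
        if fingerprint is None:
            return bytes(BLOCK_SIZE)  # never-written blocks read back as zeros
        return zlib.decompress(self.store[fingerprint])

    def provisioned_bytes(self):
        return self.provisioned_blocks * BLOCK_SIZE

    def consumed_bytes(self):
        return sum(len(chunk) for chunk in self.store.values())


if __name__ == "__main__":
    vol = ToyVolume(provisioned_blocks=1000)
    block = b"A" * BLOCK_SIZE
    for lba in range(10):
        vol.write(lba, block)  # ten identical blocks
    print("provisioned bytes:", vol.provisioned_bytes())
    print("unique blocks stored:", len(vol.store))
    print("consumed bytes:", vol.consumed_bytes())
```

In this sketch the work happens at write time: every write is fingerprinted, looked up and possibly compressed before it is acknowledged. A real array has to decide whether to do that work as data arrives or afterwards, and that choice has a direct effect on performance.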
Thin provisioning on many storage processor-based arrays typically causes a reduction in performance, and sometimes that performance reduction is significant. In our experience, a heavily utilized storage pool full of thin-provisioned storage has been known to go offline unexpectedly. This happens because the array has to work extra hard to turn that thinly-provisioned data into content that is readable by the resource requesting it. Deduplication and compression in storage processor-based arrays are almost always done after data has been written, either on a schedule or triggered by an event. Running these processes after data has been written consumes resources that are often required to get the expected performance out of the array.
The aforementioned issues are problems that the AFA has solved, making the life of the storage architect, storage administrator or storage integrator much easier. For the most part, the AFA does not care what type of workload you are writing to it. The AFA delivers an absolutely insane amount of performance, exceeding what a storage processor-based array the size of a couple of rows in a data center can provide. Data reduction services like thin provisioning, deduplication and compression are all included and operate very efficiently. In some cases, even the footprint is smaller, saving often-precious data center rack space.
In my next blog, I’ll get deeper into the systems available on the market that are changing the way we design, implement and consume disk storage in Information Technology.
Brian Ethington, Engineer