A discussion with the California Institute of Technology's (Caltech's) Infrared Processing and Analysis Center (IPAC) evokes memories of Carl Sagan. While Caltech's data profile is unusual and won't map directly to most CIOs' environments, organizations facing enormous growth can still learn a few things from the Caltech case example.
At IPAC, capacity growth is not the main challenge, even though data arrives in large 100TB chunks that present some non-trivial issues of their own. Rather, it is the sheer number of files (many billions and even trillions) that creates the primary challenge for the organization: reliable recovery.
As such, Caltech architected its infrastructure around recovery ahead of other requirements. This was not an obvious strategy given the initial requirement to house and analyze massive amounts of telescope mission data. But after thinking through the types of data and the data flows, Caltech realized the main challenge was not how to scale for capacity but how quickly it could recover from a data access problem.
Managing many small files places unique requirements on IT, especially with respect to recovery. This is a primary reason Caltech chose not to deploy a large centralized SAN and instead architected a series of mini storage nodes using a building-block approach built around Sun servers, ZFS, QLogic switches, and Nexsan SATABeast arrays.
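The recovery logic behind that choice is easy to sketch. The Python snippet below is purely illustrative: the file counts, scan rate, and node count are assumptions for the sake of the example, not IPAC figures. It shows how confining a failure to a single mini node shrinks the number of files in scope for any recovery or verification pass, compared with a single centralized archive.

```python
# Illustrative sketch of the recovery-scope argument for a building-block
# storage design. All figures (file counts, scan rate, node count) are
# hypothetical assumptions, not Caltech/IPAC data.

def recovery_hours(total_files: float, files_per_hour: float) -> float:
    """Time to walk and verify a file population at a given scan rate."""
    return total_files / files_per_hour

TOTAL_FILES = 2e12   # assume ~2 trillion files in the archive
SCAN_RATE = 50e6     # assume ~50 million files verified per hour
NODES = 100          # assume the archive is split across 100 mini nodes

# Centralized archive: a fault puts the entire file population in scope.
centralized = recovery_hours(TOTAL_FILES, SCAN_RATE)

# Building-block archive: a fault is confined to one node's share of files.
per_node = recovery_hours(TOTAL_FILES / NODES, SCAN_RATE)

print(f"Centralized recovery scope: {centralized:,.0f} hours")
print(f"Per-node recovery scope:    {per_node:,.0f} hours")
```

Under these assumed numbers, a fault in the centralized design exposes the full two trillion files, while a fault in a building-block design exposes only one node's slice, which is the essence of why file count, not raw capacity, drove the architecture.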
The lesson here is that requirements go through many revisions as they are established. IT organizations (ITOs) are rightly concerned with scope creep, but it is imperative that they not treat the initial requirements as gospel. Organizations need to decode requirements and think through the implications of their architectural choices. After doing just that, Caltech chose a Lego-style building-block approach. While this may not be appropriate for every use case, for Caltech it dramatically simplified recovery and reduced the risk of losing access to a centralized storage archive.
Action Item: The number of files or volumes, more than the amount of capacity, will often be the gating factor in managing storage growth. Especially in exceedingly high-growth environments, organizations must understand how data is created, processed, accessed, and protected in order to truly meet business requirements. Taking a building-block approach, where standardized server, switch, and storage components are used, can simplify infrastructure and create commonality across applications, which lowers both risk and cost.