How does Caltech build the server and storage infrastructure for a highly customized infrared astronomy research application with literally trillions (yes, trillions) of files? We heard from Eugean Hacopians, Senior Systems Engineer at the California Institute of Technology, on the September 29, 2009 edition of the Wikibon Peer Incite. Eugean described his cookie-cutter approach to building out server and storage infrastructure for that application.
So that its very small IT staff can provide server and storage resources for a large number of scientists and their huge volume of infrared astronomy research data, Caltech uses a standard configuration of servers and “SAN in a box” storage infrastructure. It builds and replicates each block of server and storage infrastructure using identical configurations, both to give users a consistent experience and to standardize its own IT maintenance and training processes.
Caltech builds redundancy into its Sun Solaris servers, using the ZFS file system, QLogic Fibre Channel host bus adapters (HBAs), QLogic 5602 Fibre Channel switches, and Nexsan Technologies SATABeast storage systems. The ZFS file system handles the huge number of files required, and the SATABeast storage handles the volume of data. Each block of server and storage infrastructure in the server farm has redundant components and is designed for high availability and easy component replacement in the relatively rare event of a failure. As far as possible, the blocks are configured identically, including RAID configurations and storage connectivity. In addition, Caltech keeps some spare switches and other components on standby.
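The idea of replicating identical blocks can be illustrated with a minimal sketch. This is not Caltech's tooling: the `BuildingBlock` class, the `stamp_out` and `config_drift` helpers, the block names, the component counts, and the RAID level are all hypothetical placeholders, since the discussion did not state those specifics. The sketch only shows how a single template, copied unchanged, makes any deviation easy to detect.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BuildingBlock:
    """One replicated unit of server plus storage (all values illustrative)."""
    name: str
    server_os: str = "Solaris"
    file_system: str = "ZFS"
    hba_count: int = 2            # assumed: two HBAs for redundant paths
    switch_count: int = 2         # assumed: two Fibre Channel switches
    storage_array: str = "SATABeast"
    raid_level: str = "RAID 6"    # placeholder; the source does not name a RAID level


def stamp_out(template, names):
    """Copy the template once per name, changing nothing but the name."""
    return [replace(template, name=n) for n in names]


def config_drift(blocks):
    """Return the names of blocks whose settings differ from the first block."""
    if not blocks:
        return []
    reference = replace(blocks[0], name="")
    return [b.name for b in blocks if replace(b, name="") != reference]


template = BuildingBlock(name="template")
farm = stamp_out(template, ["block-01", "block-02", "block-03"])
print(config_drift(farm))  # an empty list means every block matches the template
```

Because every block is a copy of one template, a maintenance procedure or spare part validated on one block should apply to all of them, which is the practical payoff of the cookie-cutter approach.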
Caltech also keeps its equipment for several years to get the most out of its project funding. When equipment outlives the project that originally required it, Caltech redeploys it to other projects that lack the funding to meet their own storage needs. Because it keeps equipment for so long, Caltech has built up its own supply of spare parts, sometimes becoming a second source for the original supplier after the production run of a particular model has ended.
Action Item: Sometimes the best approach for a custom application is to provide simple, cookie-cutter server and storage configurations, so that the focus can remain on doing the work of the business and less on the IT infrastructure. In so doing, IT can maximize the use of its people, budgets, and equipment.