Highlights
When your business depends on fast download, storage, and analysis of large amounts of data, where do you turn? For GrayHair Software, a New Jersey-based, highly leveraged direct mail service house with about 20 employees that numbers several large banks among its clients, the answer was 3PAR virtualized storage arrays.
GrayHair tracks the progress of large direct mail campaigns through the USPS system to their final destinations, working closely with the letter shops that bar code, address, and mail the pieces. It provides the companies behind these campaigns with the data they need to refine the arrival dates of their repetitive mailings for maximum impact. A typical mailing might have 100 million pieces, each of which has a bar code printed on it that is scanned when it enters the USPS system and every time it is sorted until it is delivered. Thus a mailing can generate 400 million records or more, all of which GrayHair has to download and load into its SQL Server database. And that is just one mailing. The company is usually tracking 30-40 such mailings at a time for as many customers, and its business is expanding.
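The arithmetic behind these volumes can be sketched as follows. This is only a back-of-the-envelope illustration: it uses the "typical" figures quoted above and assumes, purely for illustration, that every mailing tracked at once is of that typical size. The variable and function names are hypothetical.

```python
# Figures quoted in the text; these are "typical" values, not measured data.
PIECES_PER_MAILING = 100_000_000      # pieces in a typical mailing
RECORDS_PER_MAILING = 400_000_000     # scan records a mailing can generate
CONCURRENT_MAILINGS = (30, 40)        # mailings tracked at any one time


def scans_per_piece(records: int, pieces: int) -> float:
    """Average number of scan records implied for each mail piece."""
    return records / pieces


def total_records(records_per_mailing: int, mailings: int) -> int:
    """Records in flight, ASSUMING every mailing is of typical size."""
    return records_per_mailing * mailings


avg_scans = scans_per_piece(RECORDS_PER_MAILING, PIECES_PER_MAILING)
low, high = (total_records(RECORDS_PER_MAILING, n) for n in CONCURRENT_MAILINGS)

print(f"Implied scans per piece: {avg_scans:.0f}")
print(f"Records across tracked mailings: {low:,} to {high:,}")
```

Under that assumption the quoted figures imply about four scans per piece and on the order of 12 to 16 billion records being tracked at once.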
The business issue, explains IT Director Tom Howard, is that in exchange for a special third-class mail rate the mailer gives up the fine control over delivery timing that comes with first-class mail. Whether these mailings are promotions from a retailer or monthly statements from a large financial institution or utility, the mailer often wants to time the arrival of its mailing at homes carefully. For instance, a national retailer planning a sale wants its mailing to arrive at the optimal time – far enough ahead of the sale to give shoppers warning but not so far ahead that they lose the flier and forget the event.
The customer wants to see when and where each batch is mailed and then how quickly the pieces move through the system to reach their destinations. This data shows the customer the performance of its vendors and of the various routings, which helps it predict when mail will arrive and improve its strategy for the next mailing. GrayHair provides the reports that show this information, with drill-down to the raw data. It also identifies undelivered pieces and scrubs the customer's mailing list to eliminate or correct bad addresses, improving the percentage of mail delivered on the next mailing. GrayHair also reconciles the customer's postage bill with the corresponding mailing.
To provide customers with the latest data, GrayHair downloads data from the USPS hourly, 24x7, for each of the 30-40 mailings it tracks at any given time. This data is loaded into the main database (10 terabytes and growing rapidly) and then analyzed. The results of the analysis are maintained in separate summary databases, which are also growing rapidly.
Original Systems and Pain Points
GrayHair's customers want the results available online within 15 minutes of the download. The technical storage challenge is throughput in an unpredictable environment that is difficult to tune; GrayHair cannot predict what data will be downloaded or what analysis will be done when by which customers. The storage technology requirement, Howard says, is simple: blazing throughput performance that is self-tuning.
The business challenge was ensuring that IT in general and storage in particular did not get in the way of growing the business. “Our president is a tech guy, and he recognized that we needed a system that could scale quickly and easily as we grow the business, so we do not become a bottleneck,” Howard says. Customer service is GrayHair's primary market differentiator.
GrayHair originally had devices from several suppliers, and they could not keep up with the requirement to provide the latest data within 15 minutes of the download. GrayHair therefore had to find a better solution before customers started complaining or, worse, taking their business to a competitor.
In 2004, GrayHair tested several systems from leading suppliers, including 3PAR, Hitachi, and IBM, using Iometer to measure the peak database-loading performance each could deliver. The 3PAR system provided two to three times the performance and throughput of the other systems tested in handling the large data updates GrayHair receives hourly. The reason for this advantage was 3PAR's virtualization architecture, which spread the data across multiple disk drives and dynamically self-tuned performance to maintain optimal throughput. Although today there are alternative solutions that could match the throughput, 3PAR still has a significant edge in the quality and automation of its virtualization self-tuning management system.
Solution strategy
As a result, the company purchased its first 15 terabyte 3PAR system in 2004. In December 2007 it took possession of a second, 35 terabyte 3PAR array, bringing its total storage capacity to 50 terabytes, of which 60%-70% is allocated. Howard expects to update again in mid-2009. This aggressive growth rate is driven in part by the huge volumes of data GrayHair downloads and also by business expansion that includes the introduction of a new product suite and growth of its sales group to sell into new vertical industries this year.
In November 2007 GrayHair migrated its hotchpotch of servers to 50-60 virtual machines using VMware in its main location in New Jersey. Notably, this consolidation eliminated a rack of servers, making room for the new 3PAR array. These servers handle the internal business applications, the FTP servers that deliver data to its customers, and an Exchange server. Interestingly, the main SQL databases that hold the USPS data and summaries are not thinly provisioned because they are easy to manage manually and don’t warrant the expense of thin provisioning software. However, GrayHair plans to use thin provisioning for the smaller applications running on the VMware virtual systems that access data in a partition on its 3PAR array.
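As a rough illustration of what these capacity figures imply, the allocated and unallocated ranges can be worked out directly. This is a minimal sketch using only the numbers quoted above; the names are hypothetical, and it treats the 60%-70% allocation as applying to the combined 50 terabytes.

```python
# Capacity figures quoted in the text.
FIRST_ARRAY_TB = 15             # first 3PAR system, purchased 2004
SECOND_ARRAY_TB = 35            # second array, delivered December 2007
ALLOCATED_PERCENT = (60, 70)    # share of total capacity allocated


def allocated_tb(total_tb: int, percent: int) -> float:
    """Terabytes allocated at a given percentage of total capacity."""
    return total_tb * percent / 100


total_tb = FIRST_ARRAY_TB + SECOND_ARRAY_TB
alloc_low, alloc_high = (allocated_tb(total_tb, p) for p in ALLOCATED_PERCENT)
free_high, free_low = total_tb - alloc_low, total_tb - alloc_high

print(f"Total capacity: {total_tb} TB")
print(f"Allocated: {alloc_low:.0f} to {alloc_high:.0f} TB")
print(f"Unallocated: {free_low:.0f} to {free_high:.0f} TB")
```

On these figures, roughly 30 to 35 terabytes are allocated, leaving about 15 to 20 terabytes of headroom.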
Howard says adoption issues were minor, with no unexpected or unusual events, and that 3PAR has provided excellent service. Asked whether he would consider other vendors for the next system expansion, he said that unless someone offers some incredible new technology he anticipated staying with 3PAR. “They have been good to work with, the devices are easy to manage, I can allocate storage to different machines easily, and it has a good management interface,” he says. “The only decision with 3PAR is deciding between RAID 1 versus RAID 5. So overall, 3PAR gets an A+ from us.”
GrayHair is now planning a relocation of its DR site closer to its main location in New Jersey, in part using servers replaced by the VMware installation. When it has this set up, it will look into data replication solutions. Howard plans to look at all the possible candidates for this and is not sure if he will use 3PAR replication technology.
Conclusions
“We never know when a customer may want to look at the latest data on his mailing, or exactly how he will want to analyze that data,” says Howard. “Analysis is pretty much ad hoc. So yes, the 3PAR solution is expensive, but the lower-cost, slower storage systems of other vendors simply do not cut it in our environment. If we bought a lot more less-expensive storage and spread the data over twice the spindles, we still couldn't get close to the performance and throughput 3PAR gives us.”
Action Item:
Footnotes: Legal: © Wikibon 2009. This document is copyright protected by Wikibon and does not fall under the GNU general license terms for Wikibon.org. Links to this article from external sources are allowed, however any other re-distribution of this content for commercial purposes is strictly prohibited. Please contact Wikibon for more information. Wikibon case studies are developed independently and their development is not initiated for or funded by any single company. Wikibon reports actual customer experiences and results with no attempt to emphasize any one vendor’s strengths or weaknesses. Read the full disclaimer.