#memeconnect #emc
The term “big data” has become all the rage in the storage, database, and analytics industries. But what exactly does it mean? At Monday's EMC Corp. analyst briefing, which preceded EMC's U.S. announcements on Tuesday and European announcements on Wednesday, Wikibon.org Co-Founder and CEO David Vellante put that question to two EMC experts on SiliconAngle.TV.
EMC Chairman, President, and CEO Joe Tucci suggested that big data is best defined by example. “Big data would be the mass of seismic data an oil company accumulates when exploring for new sources of oil,” he said. “It would be the imaging data that a health care provider generates with multiple MRIs and other medical imaging techniques. It is the data that supports the rendering of video in 3D movies.” These, he said, are only a few of the many examples of big data in different industries and applications.
“The important thing is that this is petabyte scale from the start and grows in huge chunks to multi-multi-multi petabytes. How do you handle that, manage that, and store that more economically, more efficiently?” It is data that does not come from transactions and is not measured in I/Os per second.
Sujal Patel, president and founder of Isilon, which EMC has purchased for an estimated $2 billion to add to its growing stable of home-grown and purchased big data technologies, offered a more general, operational definition. He defines “big data” as data blocks that require a new storage architecture because of their size, performance constraints, distribution constraints, or presentation requirements.
One major reason it demands a new architecture, he says, is that the overall data growth rate is far outpacing the growth of disk-drive storage density, forcing companies to spread the data over much more hardware than before. At the same time, “performance of Intel servers is exceeding Moore's Law. Applications are leveraging that performance with technologies like virtualization and newer applications based on clustered computing,” which is spreading applications over entire data centers.
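Mr. Patel's point about growth rates can be made concrete with a little arithmetic. The sketch below is only an illustration: the function name and the growth rates in the usage note are hypothetical and do not come from EMC or Isilon. It assumes both data volume and per-drive capacity compound at a fixed annual rate.

```python
def drives_needed(initial_drives, data_growth, density_growth, years):
    """Estimate how many drives are needed after a number of years.

    data_growth and density_growth are annual rates (0.5 means 50% per year).
    Both values are illustrative assumptions, not vendor figures.
    """
    # Each year the data grows by (1 + data_growth) while a single drive
    # holds (1 + density_growth) times more, so the drive count scales
    # by the ratio of the two factors.
    yearly_ratio = (1 + data_growth) / (1 + density_growth)
    return initial_drives * yearly_ratio ** years


if __name__ == "__main__":
    # Hypothetical case: data doubles every year, drive density does not improve.
    print(drives_needed(100, 1.0, 0.0, 3))
```

Whenever the data growth rate exceeds the density growth rate, the ratio is above 1 and the drive count compounds year after year, which is the hardware sprawl described above.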
The result, he says, is a need for scale and performance levels that exceed the capabilities of traditional architectures. “You've got to move to what we call scale-out architectures.”
Isilon is one of the pioneers developing these new architectures. Mr. Patel's basic breakthrough, which grew out of his five years of experience with streaming media pioneer Real Networks, was to create a single global name space and clustered file system across multiple storage servers. A user can store a billion big data files across that single file system or allocate it in large chunks, for instance 100 Tbytes each, to different users or applications.
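The idea of a single name space spanning several storage servers can be sketched in a few lines. This is a toy model only, not Isilon's actual design: the class name, the node names, and the hash-based placement are all assumptions made for illustration. It shows one way that callers could use ordinary paths while the files themselves are spread across nodes.

```python
import hashlib


class GlobalNamespace:
    """Toy single name space over several storage nodes (illustrative only)."""

    def __init__(self, node_names):
        if not node_names:
            raise ValueError("at least one storage node is required")
        self._order = list(node_names)
        # Each node is modeled as a plain dict of path -> data.
        self.nodes = {name: {} for name in self._order}

    def _node_for(self, path):
        # Assumed placement rule: hash the path and pick a node from the digest.
        digest = hashlib.sha256(path.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._order)
        return self._order[index]

    def write(self, path, data):
        self.nodes[self._node_for(path)][path] = data

    def read(self, path):
        return self.nodes[self._node_for(path)][path]

    def listdir(self, prefix):
        # One listing across every node: the caller never sees node boundaries.
        return sorted(
            p for files in self.nodes.values() for p in files if p.startswith(prefix)
        )


if __name__ == "__main__":
    ns = GlobalNamespace(["node-a", "node-b", "node-c"])
    ns.write("/seismic/survey1.dat", b"...")
    ns.write("/seismic/survey2.dat", b"...")
    print(ns.listdir("/seismic/"))
```

A production clustered file system would also have to handle redundancy, rebalancing when nodes are added, and per-user capacity allocation, none of which this sketch attempts.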
Whatever definition is used, says Mr. Vellante, big data is definitely an important part of EMC's growth strategy for the coming decade. At the same time it has added Isilon to its stable, EMC is signaling that it will move its home-grown Atmos storage technology into greater prominence. Atmos is a Hadoop-style big-data architecture in which analysis applications are moved to the data rather than the traditional approach of “pushing the data through a pipe into a data temple, a big box.” Last year EMC bought Greenplum, a big data analysis system developer, and of course it purchased an 80% share in VMware, making EMC the largest player in the booming application virtualization market.
Mr. Tucci says that with Isilon and Atmos, EMC now has “the best two technologies to address big data.”
Mr. Vellante sees this as an important growth strategy, supplementing the VMware purchase. Since Mr. Tucci joined EMC in 2002, the company has grown from a $5.2 billion revenue base to close to $17 billion. Today EMC faces the challenge of transitioning to the next computing generation, cloud computing, and the strategic acquisitions Mr. Tucci has made and continues to make are designed to position the company for that transition.
“We are moving into a new wave of technology with cloud computing,” says Mr. Vellante. “Each of those waves has brought in new players, new leaders, and he [Mr. Tucci] intends to make EMC and VMware one of them.”