Last week, O’Reilly Media held the inaugural Strata “Making Data Work” Conference. The buzz around the conference extended well beyond the 1,400 people on-site (attendance was originally capped at 1,000). SiliconAngle and Wikibon covered the event with a live video broadcast featuring over 30 guests (see the videos here). The big data wave has strong ties to the other big technology trends of the day: cloud computing, mobile and social. Big data is still nascent, so the definitions and boundaries of what is and isn’t big data are up for debate. What is clear is that a lot of people and companies believe that we can turn the challenges of explosive data growth into opportunities by creating new products and services.
Remember when a gigabyte was a lot of data? Now even consumers have terabytes of information, big companies are storing petabytes and the industry is measuring things in exabytes and zettabytes. So, what is the difference between a lot of data and “big data”? David Floyer has outlined a definition of Enterprise Big Data, and we welcome the community to help us refine this definition (just hit edit or add a comment). The inflection point is that big data requires companies to adopt new architectures to capture, manage and analyze data in order to realize business value. The data may be real-time, tends to be loosely structured and is often distributed. Unlike traditional enterprise data, the uses of this data may not be fully known at the time of collection and could change over time. We are at the point where machines, such as mobile phones and sensors, are creating more data than people. Cloud computing comes into the picture by allowing 100-1000x faster processing of data compared to a typical data center. This large amount of information and processing allows for what Scott Yara of Greenplum referred to as data heroics. An example of this is Google Flu Trends, a data visualization that analyzes a large amount of data to create a mashup that is very easy to consume. Here’s a short video with Gary Orenstein of The Cloud Computing Show discussing what big data is and why now:
The rockstars of the Strata Conference were the data scientists. They are the ones helping to create the new products and services. Just as computer science was the popular major during the Web 2.0 trend, statistics could be the new “sexy” field of study. Analytics and harnessing the power of information should have an impact across many sectors. Government agencies have massive amounts of data, and with initiatives such as Apps for Democracy, there is an opportunity for communities to find innovative ways of putting publicly available information to new uses. Healthcare is another candidate to leverage information to improve decision making, but privacy and competition are impediments. Companies like bit.ly (interview with Chief Scientist Hilary Mason) are working on creating offerings to harness the real-time flow of data on the social web. Enterprise companies should also be looking for new ways to convert information into business value. As more companies leverage mobile, web and social data, consumer privacy is a concern, so don’t let Dogbert set your policies.
While the Strata Conference had a lot of startups and academics, many large companies including Microsoft, Google and Amazon presented keynotes. Analytics companies have become a hot commodity; last year, IBM acquired Netezza for $1.7B and EMC acquired Greenplum (price not disclosed). Oracle is a primary competitor in this space. HP has been a big Oracle partner, but as its new CEO Leo Apotheker looks to grow the software portfolio, perhaps a big data related acquisition would make sense (is Teradata too big at a $7B market cap?). For enterprise and cloud storage companies, the discussion goes beyond storing the information and helping to extract data from it.
The energy and excitement from the Strata Conference were palpable. Big Data builds on the waves of Web 2.0 and cloud computing, both of which had plenty of hype. It is exciting to think about how companies and industries could be transformed by unlocking the power of information. The big data revolution will be broadcast (and covered) by SiliconAngle and Wikibon. We are always interested in feedback and stories from the community – let us know your thoughts.
The future belongs to those who understand how to collect and use their data successfully.