Big Data in 2012: Hadoop, Big Data Apps, Data Science Tools, Cloud Collision and More

For sure, Big Data was a Big Deal in 2011. With the New Year fast approaching, here’s Wikibon’s take on what to expect in 2012.

    • 2012 Will Be the Year of Big Data Applications. Thanks to the intense competition between The Big Three distribution vendors, Hadoop developed rapidly in 2011 and is, by most accounts, enterprise-ready (there are always areas for improvement, of course, notably around Hadoop’s single point of failure). This, along with readily available capital, will result in significant innovation from both existing and new start-up Big Data Application vendors now confident that Hadoop is for real. Expect to see new vertical Hadoop-based Big Data Applications for healthcare, retail, financial services and manufacturing in the year ahead, as well as horizontal applications focused on human capital management and enterprise resource planning. Adoption will start slowly, but for traditional enterprises, Big Data Applications are the key to realizing impactful business value from Hadoop. 2012 should be a good year on this front.

 

    • Analytic Platform Vendors Add Improved Functionality, Social Capabilities for Data Scientists. Analytic Platforms are to Data Scientists what playgrounds are to five-year-olds: They’re both areas where exploration and socialization should occur. That means a useful analytic platform is one that a) allows Data Scientists to manipulate data with the tools of their choice and b) makes it easy for Data Scientists to collaborate with colleagues and share findings. As it stands now, most analytic platforms don’t meet these requirements, meaning Data Scientists often must rely on multiple tools to perform data analysis and manual efforts to share and collaborate with each other. In 2012, analytic platform vendors will take steps to rectify these shortcomings by adding support for more data analysis tools and languages, such as R and RapidMiner, as well as social collaboration and visualization capabilities.

 

    • The Cloud and Big Data Collide. A key concept behind Hadoop is that it is more efficient to process Big Data where it lives (distributed data processing) than to collect and process the data in a central location, such as an enterprise data warehouse. It follows then that the cloud is an ideal environment for Big Data processing and analytics, as much of the social media and other unstructured data that enterprises want to mine for insights lives in the cloud. Instead of bringing all that data into on-premises data centers, enterprises that lack the experienced engineers, developers and Data Scientists needed to deploy and exploit Hadoop will increasingly take advantage of Big Data Analytics in the cloud. In 2012, expect to see increasing interest from traditional enterprises in cloud-based Hadoop deployments and Big-Data-as-a-Service application offerings, as well as the emergence of start-ups offering such services. An important caveat is that there are still significant data movement challenges that need to be addressed, namely the difficulty associated with moving large volumes of data from internal data centers to external cloud-based Big Data environments and back.

 

    • Big Data Appliances Gain Steam. With a notable exception or two, the general consensus around Hadoop and MPP data warehouses is that the two Big Data approaches are complementary, not mutually exclusive. As such, preconfigured Big Data appliances that pull the two together into a single environment usually deployed on commodity hardware are likely to enjoy a surge of interest in 2012. The main benefit of Big Data Appliances is that they make it easier to move data between Hadoop and MPP data warehouses, allowing users to perform analysis on both structured and unstructured data. Their single SKU, Big-Data-in-a-Box approach also makes Big Data Appliances easier to deploy than roll-your-own Big Data environments. On the downside, Big Data Appliances from the mega-vendors are not cheap, especially those that run on proprietary hardware, nor are they easy to customize. Still, thanks to their ease of deployment and ability to bring together structured and unstructured data, expect to see more Big Data Appliances hit the market in 2012 from the mega-vendors, as well as appliances that result from partnerships between hardware and Big Data software vendors.

 

    • Industry Responds to Big Data Skills Gap with Training and Education Resources. Perhaps the biggest obstacle standing between Big Data and traditional enterprises is the lack of skilled Big Data practitioners. This includes engineers and developers with experience deploying and managing enterprise-scale Hadoop clusters, as well as Data Scientists with the skills needed to analyze and derive impactful insights from Big Data. Vendors and industry groups started addressing this issue with the establishment of a handful of Big Data education and training courses in 2011, but expect to see a much more concerted effort in 2012 to permanently close the Big Data skills gap. Specifically, expect to see the vendor community come together to establish vendor-neutral Big Data training and education resources through partnerships with one another, with industry associations, and with colleges and universities.

 

    • The Big Data Privacy Discussion Begins in Earnest. In 2012, Big Data will make its way not just into mainstream enterprises, but also into the mainstream public’s consciousness. As Facebook’s recent privacy dispute with the European Union illustrates, the more the public understands about who is collecting their personal data, the more questions they have about how it is being used. While we in the industry tend to focus on all the fantastic new products and services Big Data enables, the public at large is equally concerned with issues like identity theft, being bombarded with ads, and even the potential for insurers and financial institutions to consider a person’s social media data when making decisions about insurance policies and home loans. Expect to see even more media coverage on the privacy implications of Big Data in 2012. The industry must be proactive about addressing these issues in the coming year to prevent a Big Data Backlash.

 

So what do you think? Let us know if you think our predictions are spot on or way off. And share your thoughts on the year ahead in Big Data. You can leave comments below, post on our Facebook page, take the conversation to Twitter (@wikibon) or even write your own Big Data predictions post on our wiki.

With interest in Hadoop and all things Big Data about to explode, rest assured Wikibon, SiliconANGLE and theCUBE will have it all covered in 2012. Happy Holidays.
