Big Data

From Wikibon

Revision as of 20:20, 15 June 2012 by Stu

Curated Original Big Data Analysis From Wikibon

An archive of big data analysis, coverage, and content marketing news and information from the team at Wikibon and SiliconANGLE.

The Definition of Enterprise Big Data

Big Data is emerging from the realm of data science projects to help companies understand their customers and target markets, make decisions, and act in real time to serve them better. The IT techniques and tools for big data processing are new, important, and exciting.


A Big Data Manifesto from the Wikibon Community

Providing effective business analytics tools and technologies to the enterprise is a top priority of CIOs. Effective business analytics allows organizations to extract insights from corporate data to deliver higher levels of efficiency and profitability to the enterprise. Underlying every business analytics practice is data.


Big Data Market Size and Vendor Revenues

The Big Data market is on the verge of a rapid growth spurt that will see it top the $50 billion mark worldwide within the next five years. The analysis in this article highlights Wikibon’s five-year forecast for the Big Data market as a whole.


Comparison of Big Data MPP Solution & Data Warehouse Appliance

Wikibon research defines big data, differentiates big data projects from traditional data warehousing projects, and looks at the technical requirements. This Wikibon article looks at the business case for big data projects and compares them with traditional data warehouse approaches.


Hadoop: From Innovative Up-Start to Enterprise-Grade Big Data Platform

Consider the apocryphal saying, “Nobody ever got fired for buying IBM,” but replace IBM with Hadoop. Sounds almost fanciful, no? But making that statement a truism is now the mission of every Hadoop distribution vendor.


Infographics on Big Data from The Wikibon Project

Taming Big Data | A Big Data Infographic

Big Data can be a beast. Data volumes are growing exponentially. The types of data being created are likewise proliferating. And the speed at which data is being created – and the need to analyze it in near real-time to derive value from it – is increasing with each passing hour.


Data Science & The Role of the Data Scientist

While the concept of data science has been around for decades, the data scientist has become an in-demand career, giving rise to a new generation of data scientists. Social media platforms such as Facebook depend on data science to create innovative, interactive features that draw users in and keep them engaged. Here is a visual walk-through of this concept and the role of the data scientist.


The Rapid Growth in Unstructured Data

What’s critical to realize is that 35% more digital information is created today than there is capacity to store, and this figure will jump to over 60% over the next several years. Most of these gigabytes of data will pass through the servers, network, or routers of an enterprise, which becomes responsible at that moment for managing that content, protecting user privacy, watching over account information, and protecting copyright.


Big Data Video from theCUBE and SiliconANGLE.tv


Boston: Big Data's Beast in the East

Boston is quickly emerging as the Hub of the Big Data Universe, and we've got the goods to back up the claim. Wikibon's Dave Vellante, Jeff Kelly and Stu Miniman recently paid a visit to Atlas HQ to talk with start-ups, VCs and Cowen and Company's Peter Goldmacher about the burgeoning Boston Big Data scene, all captured live inside theCUBE.
Watch the full video here.


Exclusive In-Depth Interview with O'Reilly Media Founder and CEO Tim O'Reilly

O’Reilly Media Founder and CEO Tim O’Reilly goes live inside theCUBE with Wikibon’s Dave Vellante and SiliconANGLE’s John Furrier at the Strata Conference 2012 to discuss the developing Big Data ecosystem, the role of data in society and more.
Watch the full video on SiliconANGLE.tv or on YouTube.


Jeff Hammerbacher on What It Takes to be a Data Scientist

Companies are starting to differentiate themselves via new and innovative analytics and applications built on top of Hadoop, said Jeff Hammerbacher, Cloudera Co-Founder and Chief Scientist. Hammerbacher also discusses how he coined the term Data Scientist and what the role involves, live inside theCUBE from Hadoop World 2011.
Watch the full video here.


Bit.ly Data Scientist Hilary Mason on Big Data Innovation

Hilary Mason, Chief Scientist of Bit.ly, goes live inside theCUBE at Strata Conference 2011 to chat with John Furrier of SiliconANGLE and Dave Vellante of Wikibon about the opportunities of leveraging Big Data for analysis to create new products. She also discusses the evolving role of the Data Scientist.
Watch the full interview here.


Twitter’s Marz: Twitter Balances Storm and Hadoop

Nathan Marz, lead engineer at Twitter, goes live inside theCUBE at Strata Conference with Wikibon’s Dave Vellante and SiliconANGLE’s John Furrier to discuss Twitter’s use of both Hadoop, for historical Big Data analysis, and Storm, which Marz authored, for real-time streaming Big Data analysis.
Watch the full video here.


HP Vertica's Mahony on Flexible, Columnar MPP Data Warehousing

Because it was built from scratch, Vertica’s columnar, MPP data warehouse is more flexible than competing offerings based on the open source relational database Postgres, said Colin Mahony, VP of Products and Business Development at Vertica, now an HP company. Mahony, now Vice President and General Manager of HP Vertica, spoke live from HP Discover 2011.
Watch the full interview here.


Hortonworks' Murthy on Next Generation MapReduce

Hortonworks' Arun Murthy goes inside theCUBE at Hadoop World to discuss the improvements in Next Generation MapReduce, part of Hadoop-0.23. The result is more flexible data processing capabilities.
Watch the full video here.


Cloudera’s Olson: Pace of Big Data Market Growth “Overwhelming”

The next two years are going to be transformative as to how enterprises collect and use data, said Cloudera CEO Mike Olson. Speaking live inside theCUBE with Wikibon’s Dave Vellante and SiliconANGLE’s John Furrier, Olson also talked about the real-world business problems Cloudera’s customers are solving today with the help of Hadoop and complementary Big Data tools.
Watch the full video here.


Hadoop Creator Cutting and Tresata’s Mehta go inside theCUBE at Hadoop World 2011

Hadoop creator and Cloudera Chief Architect Doug Cutting goes live inside theCUBE to discuss the evolution of his brainchild. He’s joined by Abhi Mehta, founder of Big-Data-as-a-Service provider Tresata, who talks about Hadoop in financial services.
Watch the full video here.


Big Data Presentations and Full Event Archive

Big Data? No. Big Decisions are What You Need. From Interop Las Vegas 2012

A look at what Big Data is, the market opportunity, some customer use cases, and how users should think about taking advantage of the opportunities. The presentation is aimed especially at infrastructure practitioners, explaining that the design requirements differ from those of traditional data center technologies.
View or download the full presentation here.

Additional Big Data Analysis

Big Data Blog Posts from the Wikibon Community

ServicesANGLE Coverage on Big Data

Additional Links and Resources

A Big Data Definition

Big data has the following characteristics:

  • Very large, distributed aggregations of loosely structured data – often incomplete and inaccessible:
    • Petabytes/exabytes of data,
    • Millions/billions of people,
    • Billions/trillions of records,
    • Loosely-structured and often distributed data,
    • Flat schemas with few complex interrelationships,
    • Often involving time-stamped events,
    • Often made up of incomplete data,
    • Often including connections between data elements that must be probabilistically inferred,
  • Applications that involve Big-data can be:
    • Transactional (e.g., Facebook, PhotoBox), or,
    • Analytic (e.g., ClickFox, Merced Applications).

Components of Big-data Processing

Big-data projects have a number of different layers of abstraction, from abstraction of the data through to running analytics against the abstracted data. Figure 1 shows the common components of analytical Big-data and their relationship to each other. The higher-level components help make big data projects easier and more productive. Hadoop is often at the center of Big-data projects, but it is not a prerequisite.
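
As a rough illustration of the processing at the bottom of such a stack, the following is a minimal, single-machine sketch of the map/shuffle/reduce pattern associated with Hadoop MapReduce, applied to loosely structured, time-stamped event records. It is not Hadoop code and uses no Hadoop API. The record layout, the field names (`user`, `ts`, `action`), and all function names are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical, loosely structured event records; some are incomplete.
EVENTS = [
    {"user": "alice", "ts": "2012-06-15T20:20:00", "action": "view"},
    {"user": "bob", "ts": "2012-06-15T20:21:00", "action": "click"},
    {"user": "alice", "ts": "2012-06-15T20:22:00"},   # no action recorded
    {"ts": "2012-06-15T20:23:00", "action": "view"},  # no user: skipped
]


def map_event(event):
    """Map step: emit a (user, 1) pair, skipping records with no user."""
    user = event.get("user")
    if user is not None:
        yield user, 1


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: collapse one key's values into a single total."""
    return key, sum(values)


def count_events_per_user(events):
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = (pair for event in events for pair in map_event(event))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(count_events_per_user(EVENTS))
```

On a real cluster the map and reduce steps would run in parallel across many machines, with the framework handling the shuffle between them; the sketch only shows the data flow.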
