Metadata For Big Data And The Cloud

Last Update: Nov 05, 2010 | 09:38

Originating Author: David Floyer

You are driving with the family on a long journey at 7 p.m. Your dashboard computer displays a selection of restaurants and hotels that match your budget, culinary preferences, and location, with a special offer for a family room. There is an attractive offer from a hotel if you drive another 20 miles. This display is derived from a large amount of data put into context, and the only way to provide such information cost-effectively is to use metadata inferences in real or near-real time.

Metadata, the data that describes data, becomes imperative in the world of “Big Data” and the cloud. As more data is distributed in the cloud and across the enterprise, the model of holding central databases becomes less relevant, especially for unstructured and semi-structured data. Moving vast amounts of data from one place to another, within or outside the enterprise, is not economically viable. It is faster and more efficient to select the data locally by shipping the code to the data, as in the Hadoop model. Good metadata is a key enabler of this approach.
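As a rough illustration of shipping code to the data, the Python sketch below uses metadata alone to pick which storage nodes to visit and then runs a small function where the records live, so that only the result travels back. The `Node` class, the metadata keys, and the sample records are all hypothetical and are not taken from Hadoop or from any real system.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical storage node: a small metadata summary plus local records."""
    name: str
    metadata: dict   # cheap to inspect, e.g. {"kind": "hotel", "region": "north"}
    records: list    # the (potentially large) data that stays on the node

    def run(self, task):
        # The task executes next to the data; only its result leaves the node.
        return task(self.records)


def select_nodes(nodes, **wanted):
    """Choose nodes by metadata only, without touching their records."""
    return [n for n in nodes
            if all(n.metadata.get(k) == v for k, v in wanted.items())]


def ship_and_collect(nodes, task, **wanted):
    """Send the task to every matching node and gather the small results."""
    return {n.name: n.run(task) for n in select_nodes(nodes, **wanted)}


def cheap_offers(records, budget=120):
    """Example task: names of offers at or under the budget."""
    return [r["name"] for r in records if r["price"] <= budget]


nodes = [
    Node("node-a", {"kind": "hotel", "region": "north"},
         [{"name": "Inn One", "price": 95}, {"name": "Grand", "price": 240}]),
    Node("node-b", {"kind": "restaurant", "region": "north"},
         [{"name": "Cafe", "price": 30}]),
    Node("node-c", {"kind": "hotel", "region": "south"},
         [{"name": "Lodge", "price": 80}]),
]

results = ship_and_collect(nodes, cheap_offers, kind="hotel", region="north")
print(results)  # {'node-a': ['Inn One']}
```

In this sketch the quality of the metadata decides how many nodes are visited at all: node-b and node-c are skipped without their records ever being read.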

Some metadata is already in place: files carry creation and modification dates and a file size, JPEGs carry data about the camera settings and location, and there are many other examples. But metadata standards are fragmented and incomplete, and cracking open files to investigate properties requires too much compute and elapsed time.
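The basic file metadata mentioned above can be read from the filesystem without opening the file's contents. The following is a minimal Python sketch using only the standard library; the helper name `basic_file_metadata` is an assumption made for illustration, and it deliberately reports only size and modification time because creation time is not exposed uniformly across platforms. Reading JPEG camera settings would need an image library and is not shown.

```python
import os
import tempfile
from datetime import datetime, timezone


def basic_file_metadata(path):
    """Return size and modification time without reading the file's contents."""
    st = os.stat(path)
    return {
        "size_bytes": st.st_size,
        "modified": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),
    }


# Demonstration on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello metadata")
    demo_path = f.name

info = basic_file_metadata(demo_path)
os.remove(demo_path)

print(info["size_bytes"])  # 14
```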

A paper by Tom Coughlin and Mike Alvarado entitled "Angels in our Midst: Associative Metadata in Cloud Storage" is an interesting attempt to put a framework model (Figure 1) in place for metadata. The authors have taken an OSI-like layered model, split into two major components:

Basic Data Levels – four layers that focus on traditional metadata
Meaning Data Levels – three layers that focus on meaning and context

Figure 1 – Metadata Layer Model
Source: "Angels in our Midst: Associative Metadata in Cloud Storage," Tom Coughlin and Mike Alvarado, downloaded 11/3/2010 from http://tinyurl.com/28t2fq3

IT organizations and vendors should recognize that completely new business models are evolving, enabled by an effective metadata model that has industry acceptance. Within IT, metadata can assist in deleting data and enable more effective use of the data's value. Current methods of inferring metadata retrospectively are inadequate.

Metadata should be captured as close as possible to the time that the data is created or accessed, and capture must be automated, with the ability to override it immediately. To meet national and international concerns about privacy, metadata must include strong access and security controls, with an emphasis on user override (what Coughlin and Alvarado describe as a “Guardian Angel”).
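One possible shape for these requirements is sketched below in Python: metadata is stamped automatically at capture time, the owner can override any automatically inferred value, and reads are checked against an access list. The class and field names are hypothetical and are not drawn from Coughlin and Alvarado's paper; the sketch only shows how automatic capture, user override, and access control could fit together.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MetadataRecord:
    """Hypothetical record: automatic values plus owner overrides and a reader list."""
    owner: str
    captured_at: datetime
    auto: dict                                      # inferred at capture time
    overrides: dict = field(default_factory=dict)   # owner's corrections always win
    readers: set = field(default_factory=set)       # users allowed to read

    def effective(self):
        # Overrides take precedence over automatically captured values.
        return {**self.auto, **self.overrides}

    def override(self, user, key, value):
        if user != self.owner:
            raise PermissionError(f"{user} may not override this metadata")
        self.overrides[key] = value

    def read(self, user):
        if user != self.owner and user not in self.readers:
            raise PermissionError(f"{user} may not read this metadata")
        return self.effective()


def capture(owner, auto_values):
    """Stamp metadata at the moment the data is created or accessed."""
    return MetadataRecord(owner=owner,
                          captured_at=datetime.now(timezone.utc),
                          auto=dict(auto_values))


rec = capture("alice", {"kind": "photo", "location": "51.5,-0.1"})
rec.override("alice", "location", None)   # the owner hides the location at once
print(rec.read("alice"))  # {'kind': 'photo', 'location': None}
```

Keeping the automatic values and the overrides in separate dictionaries means the original capture is never lost, while every read sees the owner's choice first.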

Action Item: There should be strong cross-industry support from ISVs, hardware vendors, and users for the creation of metadata models and standards. Apple, Google, Microsoft, and other developers of software that creates semi-structured data have a particular responsibility to create open metadata standards.

Comments on 'Metadata for Big Data and the Cloud'

David, Excellent piece. This really puts the issue of metadata in perspective for non-technical people like myself.

Posted by: Bert Latamore | Thu Nov 04, 2010 09:46
