To use a term from Robin Harris and Chuck Hollis, I’ve been ‘squinting’ through the EMC SourceOne announcement. Michael Brown’s Chalk Talk on the architecture is worth a look to see what’s new here.
It looks to me like this announcement includes lots of catch-up and plenty of vision, with the implied promise that SourceOne will deliver. EMC’s done some good work at integrating multiple piece parts, but this still appears to be a shove-everything-into-a-central-archive approach. And as my colleagues and I have been saying on Wikibon, this won’t solve the problem of managing information risk, which is the main driver of email archiving. Let’s face it, legal is steering this bus right now, not IT. And while maybe you can take a centralized approach to solve email archiving problems, files and content distributed throughout the organization present more pressing challenges.
With this announcement, EMC is resurrecting its ILM vision, which is cool. I’ve always liked the concept but it’s never been actionable. I’m not sure SourceOne makes it so, but it’s a step in the right direction. What the SourceOne announcement seems to do is fix well-known problems with EmailXtender (by throwing it out) and introduce a new architecture that, from what I can tell, breaks up the task of archiving emails and parcels the work out to different resources so the system can perform better and scale. SourceOne also uses concepts like stubbing, .PST ingestion, single instancing, de-dupe and the like. All good things, but nothing really that radically new or exciting. So you’re left with a new architecture that coordinates work across multiple resource components but still appears to be a centralize-everything approach.
Symantec has predictably responded by soliciting EmailXtender customers to join the Enterprise Vault bandwagon. The open letter to EmailXtender users basically says: why shove everything into a new version 1.0 centralize-everything archive from EMC when you can shove it all into a mature centralize-everything architecture like EV from Symantec? My advice is that there are better approaches coming, so look before you leap. My colleague Gary MacFadden and I are pretty charged up about some new innovators like Digital Reef and Rational Retention that are taking what we see as more sensible approaches to unstructured ESI. Yes, they’re newbies and don’t have the baggage, but we think their auto-categorization, search and metadata approaches point the way of the future.
Missing in the SourceOne discussion are clear descriptions of things like auto-categorization and automated policy management and smokin’ enterprise search and the ability to manage unstructured content other than emails. And I’d like to see less talk about retention and more tools to get rid of stuff (GRS). I can’t tell if it’s in here. EMC certainly alludes to some of these capabilities but they don’t seem to be a centerpiece of this announcement.
Let’s look at it another way. For true ILM for Unstructured ESI you need four components with regard to content:
1. Find it – search and categorize (in sync with policies, hopefully) – and create a structured metadata layer.
2. Analyze it – i.e. leverage the content and metadata layer to gain insight.
3. ‘Rule’ it – apply information policies that define what can and can’t be done, in an automated layer that directs an execution engine.
4. Execute it – after finding and analyzing and understanding the policy you have to do something with it, like copy, delete, freeze, alert, etc.
This is by no means an architecture but it lays out the pieces that are important to ILM.
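To make the four pieces a little more concrete, here is a minimal Python sketch of how they might chain together. Everything in it is an assumption for illustration: the `Item` record, the keyword lists standing in for a real auto-categorization engine, and the retention numbers in `POLICIES` are all made up, and none of it reflects how SourceOne or any other product actually works.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Item:
    """One piece of unstructured content plus its metadata layer (hypothetical)."""
    path: str
    text: str
    age_days: int
    metadata: dict = field(default_factory=dict)


# Hypothetical keyword lists standing in for a real auto-categorization engine.
KEYWORDS = {
    "legal": ("contract", "litigation", "subpoena"),
    "finance": ("invoice", "budget", "forecast"),
}

# Hypothetical policies: category -> (max age in days, action once exceeded).
POLICIES = {
    "legal": (2555, "freeze"),
    "finance": (365, "delete"),
    "uncategorized": (90, "alert"),
}


def find(items):
    """Step 1: categorize each item and record the result as structured metadata."""
    for item in items:
        lowered = item.text.lower()
        item.metadata["category"] = next(
            (cat for cat, words in KEYWORDS.items() if any(w in lowered for w in words)),
            "uncategorized",
        )
    return items


def analyze(items):
    """Step 2: use the metadata layer for insight, here a count per category."""
    return Counter(item.metadata["category"] for item in items)


def rule(item):
    """Step 3: turn policy into an action the execution engine can act on."""
    max_age, action = POLICIES[item.metadata["category"]]
    return action if item.age_days > max_age else "keep"


def execute(items):
    """Step 4: carry out the action. This sketch only reports what it would do."""
    return [(item.path, rule(item)) for item in items]


items = find([
    Item("laptop/q3_budget.xls", "Q3 budget forecast", age_days=400),
    Item("wiki/vendor_contract.txt", "Signed contract with vendor", age_days=3000),
    Item("desktop/notes.txt", "lunch plans", age_days=10),
])
summary = analyze(items)
actions = execute(items)
```

Note that nothing here requires the content to sit in one central repository: the items carry their own paths, and only the metadata and the policy table need to be shared.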
Typically vendors have taken the approach of jamming as much info as possible into a central repository to control it. It’s easier this way. Maybe this is possible for email (although what about email attachments saved locally?), but how do you do this for files that are distributed across laptops and desktops and BlackBerrys and wikis? Centralizing those is not practical.
Even if you can control it, how do you automate ILM without auto-categorization? This is the only hope we have of catching up with volume growth. Auto-classification is the mainspring of scaling for the business.
Maybe I’m missing something obvious; if so, help me through the haze.