Originating Author: G Berton Latamore
Executive summary
An international health care insurer with 20,000+ employees and approximately 60,000 mailboxes decided it needed to create a searchable archive after being required to produce historic emails during the discovery phase of a legal action. The project was complicated by the legal requirement to encrypt and restrict access to any emails or attachments containing personally identifiable health information, in order to meet the privacy and security requirements of HIPAA (the Health Insurance Portability and Accountability Act of 1996) in the United States, and by the difficulty of identifying the authoritative version of emails and attachments that have been forwarded several times. Other complications were the geographic area involved, which includes the United States and several European and Asian nations, and the sheer size of the archive. Although individual mailboxes are theoretically limited to 50 MB each, many are in fact much larger, due in part to the large files routinely attached to emails in normal operations.
Concern over the size of the archive caused the insurer to change its policy on access to personal mailboxes on Internet-based services such as Gmail, Hotmail, and Yahoo. The organization now encourages employees to maintain such mailboxes and allows them to access those mailboxes from their desks, in an effort to keep personal email separate from business messaging and, in part, to decrease the size of the archive.
Because access to the personally identifiable health information of insured individuals must be restricted, the search engine searches headers and metadata only, and restricted information remains encrypted in the archive.
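The header-only search described above can be illustrated with a minimal sketch. It is not the insurer's or any vendor's implementation: the list of searchable headers, the function names, and the sample message are all assumptions made for illustration. The point it shows is that an index built from header fields alone can answer searches while the message body is never read by the indexer.

```python
from email.parser import HeaderParser

# Header fields exposed to search. This list is an assumption for illustration.
SEARCHABLE_HEADERS = ("From", "To", "Cc", "Subject", "Date", "Message-ID")


def index_headers(raw_message: str) -> dict:
    """Return only the searchable header fields of one message.

    The body is left untouched and is never added to the index.
    """
    msg = HeaderParser().parsestr(raw_message)
    return {name: msg[name] for name in SEARCHABLE_HEADERS if msg[name] is not None}


def search(index: dict, term: str) -> list:
    """Return archive IDs whose indexed headers contain the term (case-insensitive)."""
    needle = term.lower()
    return [
        archive_id
        for archive_id, headers in index.items()
        if any(needle in value.lower() for value in headers.values())
    ]


# Hypothetical message: the body stands in for an encrypted payload.
raw = (
    "From: adjuster@example.com\n"
    "To: legal@example.com\n"
    "Subject: Claim review schedule\n"
    "\n"
    "<encrypted payload, not indexed>\n"
)
archive_index = {"msg-001": index_headers(raw)}
hits = search(archive_index, "claim review")
```

A search for a phrase that appears only in the body returns nothing, which is the trade-off of this design: restricted content stays protected, but it cannot be found by its content.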
Original email snapshot
- Total IT budget $250M, Enterprise Platforms and Infrastructure (EPI) division budget $4.5M.
- Total IT staff 9,000; EPI division staff 300.
- 55,000-60,000 email boxes serving more than 20,000 employees across the United States and several European and Asian countries.
- Internationally distributed email system with an Exchange front end and a SAN-based back end, with dedicated SMTP servers to optimize performance.
- Early (pre-Microsoft purchase) FrontBridge spam-control user.
- Some externalized services including encryption.
Pain points
- Total email storage is in the 10 Tbyte range, making a full backup solution too costly.
- Lack of backup for local Exchange servers.
- Very long restore times for individual mailboxes make the backup/restore approach impractical.
- Encryption needs to be maintained in the archive to preserve HIPAA compliance.
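The gap between the nominal mailbox limit and the storage actually in use can be checked with simple arithmetic. The sketch below uses only figures from this case study (60,000 mailboxes, a 50 MB limit, storage in the 10 Tbyte range); treating "the 10 Tbyte range" as exactly 10 TB and using decimal units are assumptions.

```python
MAILBOXES = 60_000        # upper end of the 55,000-60,000 range
QUOTA_MB = 50             # nominal per-mailbox limit
ACTUAL_TOTAL_TB = 10      # "in the 10 Tbyte range", taken as 10 TB (assumption)
MB_PER_TB = 1_000_000     # decimal units (assumption)

# Storage the system would need if every mailbox sat exactly at its limit.
nominal_total_tb = MAILBOXES * QUOTA_MB / MB_PER_TB

# Average mailbox size implied by the reported total.
average_mailbox_mb = ACTUAL_TOTAL_TB * MB_PER_TB / MAILBOXES

# How far actual storage exceeds the nominal ceiling.
overshoot = ACTUAL_TOTAL_TB / nominal_total_tb

print(f"Nominal ceiling: {nominal_total_tb:.1f} TB")
print(f"Implied average mailbox: {average_mailbox_mb:.0f} MB")
print(f"Actual vs nominal: {overshoot:.1f}x")
```

Under these assumptions the limit would cap storage at about 3 TB, so the reported total implies an average mailbox more than three times its nominal limit.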
Business case for archiving
- The payer is a defendant in a major legal action and had to produce a large volume of historic emails as part of the discovery process. This caused the legal office to realize the organization needed an active, searchable email archive that allows legal to locate the specific emails required in court efficiently.
- Failure to meet court discovery requirements could create the appearance of hiding relevant information, and the resulting potential damage to the brand is a major concern.
- The potential for HIPAA-related compliance issues was also a concern.
- HR also wanted search capabilities to help implement/enforce policy.
- Performance and cost were also concerns. Archiving was seen as a higher performance, less costly solution than server-level backup/restore.
Solutions considered
- Microsoft email archiving would seem to be a natural choice given the organization's five-plus year history with FrontBridge, now owned by Microsoft. However, the organization had some concerns specific to the U.S. health care industry about maintaining compliance with HIPAA and over Microsoft's ability to handle a 40 Tbyte data store.
- Sun/STK's archiving system on the mainframe was also considered. While it clearly had the capacity to handle 40 Tbytes of data, IT was unsure that it could deliver the business functionality the firm needed.
- IBM was also considered and rejected due to concerns about whether it could deliver the functionality the organization needed.
- Outsourcing was also considered but rejected because of HIPAA concerns.
Solution strategy
- EMC CLARiiON active archiving system.
- EMC Centera search engine.
- Active envelope journaling was also required to establish chains of events.
- Archiving project started in 2005.
- The initial system test was conducted in IT and legal in mid-2005, making the test an unbudgeted expense. However, the financial team also appreciated the urgency of the need for this system and helped to make funds available.
- First operational roll-out was in U.S. Midwest with a budget of $500,000-$750,000 and an implementation team of 20-25 people.
- The full implementation was originally scheduled as a three-year rollout. However, concern over potential further legal exposure caused the organization to complete the rollout of the entire 40 Tbyte system globally in 2006 at a total cost of $75M to $90M and a team of 60-100 including engineers and experts from legal and HR.
- The project includes a seek appliance to provide higher performance in searches.
- Other search engines also run against the metadata fields of the emails to provide a reconstructive search capability.
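The case study does not describe how the journaling and metadata searches reconstruct a chain of events. One plausible approach, sketched here purely as an assumption, is to follow the standard Message-ID and In-Reply-To header fields back from a given message to the start of its conversation. The `Envelope` record and `chain_to_root` function are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Envelope:
    """Metadata for one archived message (hypothetical record layout)."""
    message_id: str
    in_reply_to: Optional[str]
    sender: str
    sent: str  # ISO 8601 timestamp


def chain_to_root(envelopes: Iterable[Envelope], message_id: str) -> List[Envelope]:
    """Walk In-Reply-To links back from a message and return the chain, oldest first."""
    by_id = {e.message_id: e for e in envelopes}
    chain: List[Envelope] = []
    seen = set()  # guards against malformed data that links in a loop
    current = by_id.get(message_id)
    while current is not None and current.message_id not in seen:
        seen.add(current.message_id)
        chain.append(current)
        current = by_id.get(current.in_reply_to) if current.in_reply_to else None
    chain.reverse()
    return chain


# Hypothetical three-message conversation.
archive = [
    Envelope("<a@example.com>", None, "adjuster@example.com", "2006-03-01T09:00:00"),
    Envelope("<b@example.com>", "<a@example.com>", "manager@example.com", "2006-03-01T10:15:00"),
    Envelope("<c@example.com>", "<b@example.com>", "legal@example.com", "2006-03-02T08:30:00"),
]
chain = chain_to_root(archive, "<c@example.com>")
ordered_ids = [e.message_id for e in chain]
```

Because the walk uses metadata only, it is compatible with leaving restricted message bodies encrypted.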
Benefits
- Legal, financial, and HR can now do full reconstruction of events after-the-fact to satisfy court discovery requirements or investigate possible policy or compliance issues.
- HR was able to set policies for acceptable email practices and the kinds of content that may be put in emails and has developed a training course for new hires.
Next steps
- Increase automation of email classification, policy enforcement, identification of authoritative sources for specific pieces of unstructured content, and data mining of the archive to achieve business advantage beyond timely response to discovery requirements.
- Expand the archive to include voicemails and other kinds of unstructured content if and when the courts require them. Specifically, the organization is now considering adding voicemail to the archive, potentially using either Microsoft’s or Cisco's unified messaging technology. Archiving voicemail is a growing trend, particularly in the financial trading and pharma industries. However, at this time neither the U.S. Federal Rules of Civil Procedure nor any other jurisdiction requires voicemail to be archived. The dilemma is that if the insurer voluntarily archives voicemails they become discoverable, whereas if it does not, no legal penalty currently attaches to the inability to produce them.
- IT is considering using a service-oriented architecture (SOA) to make add-on applications working with unstructured information more efficient.
- IT plans to automate the ties between archived attachments and their applications to make them more accessible to authorized users.
Conclusions
- As much as 80% of corporate information is unstructured, and the volume of unstructured data is growing much faster than structured in most industries. This unstructured information is particularly valuable in the health care industry, making it important to preserve it in a form that is accessible and searchable across the enterprise.
- In retrospect, the head of EPI thinks the organization should have created a certificate store using a permanent encryption module working through Exchange instead of outsourcing encryption. However, because it needed a solution quickly to meet HIPAA's April 1, 2005, deadline, it contracted an outsourcer that scans all email and encrypts every message, together with its attachments, that carries the expression “secure” in the header. The firm relies on email users to mark all email that needs to be encrypted with that expression.
- Data classification for unstructured attachments is not meaningful unless that classification is tied back to the application that generated it. Thus classifying a document as “surgical procedure” has little value unless it is also attached to its application so that it can be accessed easily. This is particularly true with home grown applications such as claims management, where the filename extension may not provide an obvious connection to the application. IT is investigating solutions to automate the linking of email attachments to their applications to solve this issue.
- One error the organization made was to focus too heavily on the technical side of the project and to neglect the behavioral changes required. These are at least as important as the technology, and implementing them across a huge, geographically dispersed user population, spanning multiple cultures, languages, and expectations, is complex. Global enterprises implementing projects of this scope need to involve HR early in the process and expect to dedicate at least as much effort and budget to training and enforcement of behavioral change as to the technical challenges.
- The planning phase of a project of this scope should also include assessment of sociological impacts among users, including possible social discontent with some aspects of the project. These need to be identified and planned for from the start. This may require using a sociologist experienced in the impacts of such projects as a consultant.
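The encryption rule described in the conclusions above, in which the outsourcer encrypts any message that users have marked “secure” in the header, can be sketched as a simple check. The source does not say which header field carries the expression or how the match is performed. Checking the Subject field and matching “secure” as a whole word are both assumptions, and `needs_encryption` is a hypothetical name.

```python
import re
from email.parser import HeaderParser

# The marker users are asked to add, per the case study.
MARKER = re.compile(r"\bsecure\b", re.IGNORECASE)


def needs_encryption(raw_message: str, header: str = "Subject") -> bool:
    """Return True if the chosen header contains the marker as a whole word.

    Using the Subject header is an assumption; the source says only "the header".
    """
    value = HeaderParser().parsestr(raw_message)[header] or ""
    return MARKER.search(value) is not None


# Hypothetical messages.
marked = "From: nurse@example.com\nSubject: [secure] Lab results\n\nbody\n"
unmarked = "From: nurse@example.com\nSubject: Lab results\n\nbody\n"
lookalike = "From: it@example.com\nSubject: Insecure password found\n\nbody\n"

results = [needs_encryption(m) for m in (marked, unmarked, lookalike)]
```

The second message shows the weakness of relying on users: identical content is left unencrypted whenever the sender forgets the marker.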
Footnotes: Legal: © Wikibon 2008. This document is copyright protected by Wikibon and does not fall under the GNU general license terms for Wikibon.org. Links to this article from external sources are allowed, however any other re-distribution of this content for commercial purposes is strictly prohibited. Please contact Wikibon for more information.
The cases cited herein are real; however, the name of the customer (CS5) is fictitious. Wikibon case studies are developed independently and their development is not initiated for or funded by any single company. Wikibon reports actual customer experiences and results with no attempt to emphasize any one vendor’s strengths or weaknesses. Read the full disclaimer.