Tuesday, May 11, 2010

eDiscovery Data Mapping Should be a Top Priority for General Counsel and CIOs within the Global 2000

A key aspect and legal requirement of eDiscovery is the creation of a data map to determine precisely what information is available within an organization and where it resides. This is a process that should begin long before a company ever finds itself in court. Surprisingly, over the past two years, I haven’t found more than a handful of General Counsel and CIOs at some of the largest companies in the world who truly understand the importance of eDiscovery Data Mapping, the risks involved in not having a working eDiscovery Data Map, or the fact that they should be leading the charge to develop and manage an enterprise-wide eDiscovery Data Map.

Ganesh Vednere, a manager at Capgemini, wrote an excellent overview of the key aspects of eDiscovery Data Mapping titled “The Quest for eDiscovery: Creating a Data Map” that appeared on the November/December 2009 Informatics site. In this overview, Ganesh indicated that, at a minimum, the enterprise should consider completing the following tasks as a general practice to start the eDiscovery Data Mapping process:

Get a list of all systems – and be prepared for a few surprises
Begin the process by creating a list of all systems that exist in the company. This is easier said than done, as in many cases, IT does not even have a full list of all systems. Sure, they usually have a list of systems, but don’t take that as the final list! Due diligence involves talking to business process owners, employees, and contractors, which often brings to light hidden systems, utilities, and home-grown applications that were unbeknownst to IT. Ensure that all types of systems are covered, e.g. physical servers, virtual servers, networks, externally hosted systems, backups (including tapes), archival systems, and desktops, etc. Pay special attention to emails, instant messaging, core business systems, collaboration software, and file shares, etc.

Document system information
After the list of all systems is known, gather as much information about each as possible. This exercise can be performed with the help of system infrastructure teams, application support teams, development teams, and business teams. Here are some types of information that can be gathered: system name, description, owner, platform type, location; is it a home-grown package, and does it store both structured and unstructured data; system dependencies (i.e., what systems are dependent on it and what systems does it depend on); business processes supported, business criticality of the system, security and access controls, format of data stored, format of data produced, reporting capabilities, how/where the system is hosted; backup process and schedule, archival process and schedule, whether data is purged or not; if purged, how often and what data gets purged; how many users, is there external access allowed (outside of the company firewall), are retention policies applied, what are the audit-trail capabilities, what is the nature of data stored, e.g. confidential data, nonpublic personal information, or still others.
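To make the list of attributes above more concrete, here is a minimal Python sketch of what one record in a data map might look like. The class name, the field names, and the sample values are all assumptions chosen for illustration; they are not a standard schema and do not come from Ganesh's overview or from any particular product.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class SystemRecord:
    """One system in the data map. Field names are illustrative only."""
    name: str
    description: str
    owner: str
    platform: str
    location: str
    home_grown: bool
    stores_structured: bool
    stores_unstructured: bool
    depends_on: List[str] = field(default_factory=list)
    business_processes: List[str] = field(default_factory=list)
    data_formats: List[str] = field(default_factory=list)
    backup_schedule: Optional[str] = None
    retention_policy: Optional[str] = None
    purge_schedule: Optional[str] = None
    external_access: bool = False
    data_sensitivity: List[str] = field(default_factory=list)

    def gaps(self) -> List[str]:
        """Names of fields that are still empty (None or an empty list)."""
        return [k for k, v in asdict(self).items() if v is None or v == []]


# Hypothetical example entry; every value here is made up.
crm = SystemRecord(
    name="crm",
    description="Customer records",
    owner="Sales Ops",
    platform="Linux/PostgreSQL",
    location="Primary data center",
    home_grown=False,
    stores_structured=True,
    stores_unstructured=True,
    depends_on=["email"],
    business_processes=["order-to-cash"],
)

print(crm.gaps())
```

The `gaps()` helper is one possible way to track which questions still need to be put to the infrastructure, application support, or business teams for a given system.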

Get a list of business processes
Inventory the list of business processes and map it to the system list obtained in the step above to ensure that all the various types of ESI are documented. The list of business processes is also useful during the discovery process, when one can leverage the list to hone in on a particular type of ESI and obtain information about how it was generated, who owned the data, how the data was processed, how it was stored, and so on. A list of business processes can also be useful when assessing information flows.
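As a rough illustration of mapping business processes to the system list, the following Python sketch cross-checks the two. The function name and the sample process and system names are hypothetical. The idea is simply that a system a process relies on but that is missing from the inventory may be one of the "hidden" systems mentioned earlier, while an inventoried system that no process claims deserves a follow-up question.

```python
from typing import Dict, List, Set, Tuple


def cross_check(
    process_to_systems: Dict[str, List[str]],
    known_systems: Set[str],
) -> Tuple[List[str], List[str]]:
    """Compare the systems named by business processes with the inventory.

    Returns (unlisted, unclaimed):
      unlisted  - systems a process uses that are not in the inventory
      unclaimed - inventoried systems that no process refers to
    """
    referenced = {s for systems in process_to_systems.values() for s in systems}
    unlisted = sorted(referenced - known_systems)
    unclaimed = sorted(known_systems - referenced)
    return unlisted, unclaimed


# Hypothetical sample data.
processes = {
    "payroll": ["hr-db", "email"],
    "invoicing": ["erp", "finance-spreadsheets"],
}
inventory = {"hr-db", "email", "erp", "legacy-fax"}

unlisted, unclaimed = cross_check(processes, inventory)
print("Used but not inventoried:", unlisted)
print("Inventoried but unclaimed:", unclaimed)
```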

Develop a list of roles, groups, and users (custodians)
Obtain the organizational chart and determine the roles and groups across the business and the business processes. Document the process custodians and map out who had privileges to do what. Understand the human actors in the information lifecycle flow.

Document the information flow across the entire organization
Determine where critical pieces of information got initiated, how the information was/is manipulated, what systems touch the information, who processes the information, what systems depend on the information, and so on. Understanding the flow of information is key to the data mapping/discovery process.
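One simple way to record information flow is as a directed graph, with an edge from each system to the systems it feeds. The Python sketch below assumes that representation (the system names are invented) and uses a breadth-first search to answer a question counsel is likely to ask: if a piece of information entered here, which other systems could it have reached?

```python
from collections import deque
from typing import Dict, List, Set


def downstream(flows: Dict[str, List[str]], start: str) -> Set[str]:
    """Return every system reachable from `start` by following data flows."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in flows.get(current, []):
            # Skip systems already visited so cycles do not loop forever.
            if target not in seen and target != start:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical flows: each key sends data to the systems in its list.
flows = {
    "web-orders": ["erp"],
    "erp": ["data-warehouse", "email"],
    "data-warehouse": ["reporting"],
    "reporting": ["erp"],
}

print(sorted(downstream(flows, "web-orders")))
```

Reversing the edges and running the same search would answer the opposite question, namely which systems a given system depends on.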

Determine how email is stored, processed, and consumed
Given the large percentage of business information and business records that reside in email, special attention needs to be placed on email ESI. Typically email is the first thing that opposing counsel go after, so determining whether email retention and disposition policies are consistently enforced will be key to proving good faith. There are a number of automated tools that will enable you to create email maps, link threads of conversation, heuristically perform relevancy search, extract underlying metadata, and so on. Before deciding to buy the best-of-breed solution, however, perform due diligence on existing email processes. Understand how employees are using email. Are they creating local archives (.PST files), are they storing emails on a network or a repository, are they disposing of them at the end of retention periods, are they using personal emails to conduct official business, and so on. Identify deficiencies and violations in email policies before the opposing counsel does.
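As one small example of due diligence on existing email practices, the Python sketch below walks a directory tree and reports local archive (.PST) files and their sizes. It is only a sketch under stated assumptions: it uses the standard library's `pathlib`, assumes the share is mounted and readable by the account running it, and does not open or parse the archives themselves.

```python
from pathlib import Path
from typing import Iterator, Tuple


def find_pst_files(root: Path) -> Iterator[Tuple[Path, int]]:
    """Yield (path, size in bytes) for every .pst file under `root`."""
    for path in root.rglob("*"):
        # Compare the suffix case-insensitively so .PST and .pst both match.
        if path.is_file() and path.suffix.lower() == ".pst":
            yield path, path.stat().st_size


def total_size(root: Path) -> int:
    """Total bytes held in .pst files under `root`."""
    return sum(size for _, size in find_pst_files(root))
```

Calling `find_pst_files(Path(...))` with the path of a mounted file share (the path is whatever applies in your environment) would list the archives found there. A large total is a hint that employees are keeping email outside the retention process.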

Identify use of collaboration tools
SharePoint will have the lion’s share of the collaboration space in many organizations, but even then you must ensure that all other tools – whether they are social networking tools, Web-based tools, or home-grown tools – are included in the data-mapping process. You need to carefully document the types of information being stored on each of these tools. Sometimes company information has a nasty habit of being found in the most unlikely of places. Wherever possible work with compliance, information management, or records management groups to establish usage policies to prevent runaway viral growth of these tools. If the organization already has thousands of unmanaged SharePoint sites, work with IT and business to institute governance controls to prevent further runaway growth.

Don’t forget offsite storage
After inventorying and mapping all systems, one would think the job is done. Alas, there is more work ahead. Offsite storage is an often under-appreciated aspect of the discovery process. It is quite reasonable to assume that there might be substantial evidence stored offsite which might become incriminating at a later date. Offsite storage may contain boxes or tapes full of records whose existence was somehow never properly documented, with the result that they cannot be located unless someone opens the box or attempts to recover the tape data. These records continue to live well past their onsite cousins. This means the organization continues to have the record in backup tapes (or paper) and other formats that it purportedly claimed to have destroyed. The search for records in offsite storage is made more complicated if the offsite storage process did not create detailed indices about the contents. If there are tapes labeled “2007 Backup Y: Drive,” then it may become quite an arduous task to determine what information is really contained in those tapes. Nevertheless the journey must be started. It could involve anything from a full-scale review of all tapes, followed by reclassifying and re-filing the tapes, to perhaps a review of just the offsite storage manifests. It could also involve a search for critical information or a clean-up of the last three years’ worth of tapes, and so on.
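If the offsite storage manifests are available in electronic form, even a very simple script can help decide where to start. The Python sketch below assumes a manifest with three hypothetical fields (a label, the date stored, and whether a content index exists) and builds a review queue of unindexed items from recent years, newest first. The field names, the sample entries, and the three-year default are illustrative assumptions.

```python
from datetime import date
from typing import List, NamedTuple


class ManifestEntry(NamedTuple):
    label: str
    stored_on: date
    has_index: bool  # True if a detailed index of the contents exists


def review_queue(
    manifest: List[ManifestEntry], today: date, years: int = 3
) -> List[ManifestEntry]:
    """Unindexed items stored in the last `years` calendar years, newest first."""
    cutoff_year = today.year - years
    candidates = [
        e for e in manifest
        if not e.has_index and e.stored_on.year >= cutoff_year
    ]
    return sorted(candidates, key=lambda e: e.stored_on, reverse=True)


# Hypothetical manifest entries.
manifest = [
    ManifestEntry("2007 Backup Y: Drive", date(2007, 12, 31), False),
    ManifestEntry("2009 Finance Q4", date(2009, 12, 15), True),
    ManifestEntry("2009 Backup Y: Drive", date(2009, 6, 30), False),
    ManifestEntry("2005 HR boxes", date(2005, 3, 1), False),
]

for entry in review_queue(manifest, today=date(2010, 5, 11)):
    print(entry.label)
```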

eDiscovery Data Mapping Platform
This is an impressive starting set of best practices. However, I would add one requirement: seriously consider investing in an eDiscovery Data Mapping platform to help guide you through the eDiscovery Data Mapping process and then manage all of the information that you discover. After all, creating your eDiscovery Data Map is just the beginning of the process. The real value of the eDiscovery Data Map will be seen when your enterprise uses it to support your first legal matter and it enables you to more fully meet the legal requirements of the Federal Rules of Civil Procedure (FRCP) and the associated state and local rules.

Genome from Exterro is an excellent example of a new generation of data mapping solutions that are dynamic, shifting to reflect your company's information universe as it evolves. Through intelligent workflows and automated processes, Genome enables IT departments and legal teams to quickly visualize and analyze data source information to proactively scope case parameters. Over the next couple of weeks, I will be reviewing the solutions that are currently available along with some recommendations for which platforms will provide the best return on your investment (ROI).

The full text of Ganesh Vednere’s overview is as follows:

A key aspect of ediscovery is the creation of a data map to determine precisely what information is available within an organization and where it resides. This is a process that should begin long before a company ever finds itself in court.

The phone rings. It is the general counsel. The organization may be sued over patent infringement. Counsel knows that this could be “The Big One.” All sorts of data, documents, metadata, emails, and other forms of information may be required. Counsel asks IT: “Do you have, or can you get together, a list of all systems and the data they contain?” There is a long, silent pause on the phone. Then the IT manager says, “Well, we do have a list of systems. Let me send it your way.” Counsel gets the list. It is nothing close to the data map it needs. Instead it is a list of servers, their IP addresses, platform configuration, and their physical rack location in the data center. Good information for disaster recovery purposes, but not particularly helpful in court.

“Well, this is the best I’ve got,” comes the retort from IT. “We do not have a data map nor would we know how to create one – and, by the way, do you really think we have the bandwidth to work on this now?”

Why You? The Challenge of Data Mapping
So who gets stuck with the job? You do. You might argue that “IT manages all the infrastructure and stuff, why couldn’t they just run an inventory on their systems?” And IT will reply that “Well, we do manage the infrastructure, but we know very little about the inputs, outputs, documents, records, and other information on these applications. Go talk to the business side.” And you go to business, and business will tell you that “I just use the system and click these buttons on the screen. The system is a black box to me. I have no idea about all of the underlying data, metadata, and data structures. I suggest you talk to the operational folks.” And you talk to the operational folks, and they say, “What are you talking about? We just execute business processes. Don’t ask us about data and metadata. Go talk to the analyst who worked on the system design.” And you look for the analyst, and you eventually learn that …“Oh, she was a consultant and she left the project three years ago.”
The challenges are many but the data map must be created. And the job is yours. So where do you start? By going back to the beginning … the very beginning.

How Did We Get in Such a Mess?
Let’s take a look at a typical mid-size organization. It has several thousand employees and contractors with offices in the U.S. and E.U. The sheer volume of information that resides in just one division is mind-boggling. New information sources keep popping up, employees keep creating new SharePoint sites on their own, and there is use (or misuse) of social collaboration tools, to say nothing of several hundred IT systems in play at any point in time. Data is moved and migrated from one place to another without proper documentation or communication, more and more tape backups are being created, and some employees are making copies of data on thumb drives or worse, emailing them to their personal email addresses.

How did things ever become such a mess? There are manifold reasons: IT is traditionally kept at arm’s length on compliance and uninvolved with information management and governance during systems design and development. Records management departments, on the other hand, often institute sound policies and retention schedules but have a tough time putting these into practice and getting people to adhere to them. On the legal side, general counsels often work against themselves: becoming increasingly exasperated over the large amount of money spent on searching, processing, and producing electronically stored information (ESI), they often push hard to cut costs, thereby short-circuiting the process.

Must an Organization Have a Data Map?

It may seem surprising that even today, many successful organizations do not have a data map or have, at best, a superficial one. It is not that organizations are lacking in will, however, but that the process seems too daunting. Consider the “typical organization” above. If there are several hundred IT systems and other home-grown business applications, one must know not only what these systems are and where they are located, but also the types of information (documents, records, other content) that are produced from these systems and additional information such as data format, data location, and whether the data is updated by other systems or transformed into other formats.

To add to the complexity, a determination also needs to be made as to whether a piece of ESI can be extracted and presented using reasonable and customary means. For example, if an IT system was retired and the data backed up on tape, it is reasonable to assume that extracting the tape, processing the information, and presenting it in a readable format may not be easy, since the underlying version of the software no longer exists. Counsel, however, must be able to assess whether this is indeed the case. Having a data map eases some of these tasks and makes it easier for counsel to relate the information as needed.

Data Mapping Considerations

In the current economic environment, companies are bracing themselves for an uptick in the number of lawsuits. Whether the matter is related to regulators, customers, consumers, employees, or business partners, companies are often required to provide ESI in court. While this should be sine qua non for most organizations, many are simply too overwhelmed to be able to react fast enough and are thus placing themselves at a much greater risk. If that’s the case in your enterprise, here are some initial steps that will help you move forward.
  1. Understand the prevailing legal environment. Organizations are not created equally, and not all have the same set of applicable legal requirements. It is therefore important to analyze the type of environment that the organization operates in, the jurisdiction it is under, and the various federal and state laws, regulations, and common industry standards that apply to it, with regard to ESI. While the contents of a data map by itself do not directly correlate to a particular law or regulation, it is useful to know what checks and controls need to be established during the data-mapping process and ensure that there are no “show-stopper” questions in court around how the data map was created or what the process was.
  2. Use a partnership model and obtain buy-in from senior management. It is important that each entity within an organization have a vested stake in the success of any data-mapping project. This means that management in each of these organizational fiefdoms must understand what a data map is, how it will be used, and what the process of creating one is. Getting buy-in from these senior managers is a crucial first step and must be completed prior to the start of the process. Additionally, it is important that people of the appropriate rank are selected to work on the project. Folks who are deep in the weeds will generally have a lot more information about data flows and how processes and people work together versus the senior executive who operates in more of a decision-making capacity.
  3. Work towards a phased approach. There is little point in pursuing a “big-bang” approach for the data map. Instead, prioritize which divisions or lines of business to focus on first and then address the remaining ones later. Work with line managers to determine what, if any, information has been collected on systems and processes within their particular areas. Standard industry lists may be employed as a starting point, e.g. HR, Accounting, Communications and Marketing, etc. Begin the first phase of the process here and then iteratively build upon what’s already available.
  4. Use the right technology. As more capital is allocated towards automating ediscovery, vendors will naturally gravitate towards building specialized software for this mission. Time, cost, and relevancy of results will drive the success of vendor products. While some organizations have attempted to build custom tools, more and more prefer choosing established products or service offerings to guide them through the ediscovery and data-mapping process. Already many vendors have begun mapping their offerings to the electronic discovery reference model (EDRM) and other industry standards. This market is still maturing and organizations should not go out and immediately purchase a top-rated vendor’s software without due consideration of the organization’s unique circumstances.

Creating the Data Map
Once you’ve worked your way through each of the considerations above and taken action as needed, you’re ready to start the actual data-mapping process. It is lengthy but well-defined and can be broken down into the following steps:

The Data Mapping Process

  1. Get a list of all systems – and be prepared for a few surprises. Begin the process by creating a list of all systems that exist in the company. This is easier said than done, as in many cases, IT does not even have a full list of all systems. Sure, they usually have a list of systems, but don’t take that as the final list! Due diligence involves talking to business process owners, employees, and contractors, which often brings to light hidden systems, utilities, and home-grown applications that were unbeknownst to IT. Ensure that all types of systems are covered, e.g. physical servers, virtual servers, networks, externally hosted systems, backups (including tapes), archival systems, and desktops, etc. Pay special attention to emails, instant messaging, core business systems, collaboration software, and file shares, etc.
  2. Document system information. After the list of all systems is known, gather as much information about each as possible. This exercise can be performed with the help of system infrastructure teams, application support teams, development teams, and business teams. Here are some types of information that can be gathered: system name, description, owner, platform type, location; is it a home-grown package, and does it store both structured and unstructured data; system dependencies (i.e., what systems are dependent on it and what systems does it depend on); business processes supported, business criticality of the system, security and access controls, format of data stored, format of data produced, reporting capabilities, how/where the system is hosted; backup process and schedule, archival process and schedule, whether data is purged or not; if purged, how often and what data gets purged; how many users, is there external access allowed (outside of the company firewall), are retention policies applied, what are the audit-trail capabilities, what is the nature of data stored, e.g. confidential data, nonpublic personal information, or still others.
  3. Get a list of business processes. Inventory the list of business processes and map it to the system list obtained in the step above to ensure that all the various types of ESI are documented. The list of business processes is also useful during the discovery process, when one can leverage the list to hone in on a particular type of ESI and obtain information about how it was generated, who owned the data, how the data was processed, how it was stored, and so on. A list of business processes can also be useful when assessing information flows.
  4. Develop a list of roles, groups, and users (custodians). Obtain the organizational chart and determine the roles and groups across the business and the business processes. Document the process custodians and map out who had privileges to do what. Understand the human actors in the information lifecycle flow.
  5. Document the information flow across the entire organization. Determine where critical pieces of information got initiated, how the information was/is manipulated, what systems touch the information, who processes the information, what systems depend on the information, and so on. Understanding the flow of information is key to the data mapping/discovery process.
  6. Determine how email is stored, processed, and consumed. Given the large percentage of business information and business records that reside in email, special attention needs to be placed on email ESI. Typically email is the first thing that opposing counsel go after, so determining whether email retention and disposition policies are consistently enforced will be key to proving good faith. There are a number of automated tools that will enable you to create email maps, link threads of conversation, heuristically perform relevancy search, extract underlying metadata, and so on. Before deciding to buy the best-of-breed solution, however, perform due diligence on existing email processes. Understand how employees are using email. Are they creating local archives (.PST files), are they storing emails on a network or a repository, are they disposing of them at the end of retention periods, are they using personal emails to conduct official business, and so on. Identify deficiencies and violations in email policies before the opposing counsel does.
  7. Identify use of collaboration tools. SharePoint will have the lion’s share of the collaboration space in many organizations, but even then you must ensure that all other tools – whether they are social networking tools, Web-based tools, or home-grown tools – are included in the data-mapping process. You need to carefully document the types of information being stored on each of these tools. Sometimes company information has a nasty habit of being found in the most unlikely of places. Wherever possible work with compliance, information management, or records management groups to establish usage policies to prevent runaway viral growth of these tools. If the organization already has thousands of unmanaged SharePoint sites, work with IT and business to institute governance controls to prevent further runaway growth.
  8. Don’t forget offsite storage. After inventorying and mapping all systems, one would think the job is done. Alas, there is more work ahead. Offsite storage is an often under-appreciated aspect of the discovery process. It is quite reasonable to assume that there might be substantial evidence stored offsite which might become incriminating at a later date. Offsite storage may contain boxes or tapes full of records whose existence was somehow never properly documented, with the result that they cannot be located unless someone opens the box or attempts to recover the tape data. These records continue to live well past their onsite cousins. This means the organization continues to have the record in backup tapes (or paper) and other formats that it purportedly claimed to have destroyed. The search for records in offsite storage is made more complicated if the offsite storage process did not create detailed indices about the contents. If there are tapes labeled “2007 Backup Y: Drive,” then it may become quite an arduous task to determine what information is really contained in those tapes. Nevertheless the journey must be started. It could involve anything from a full-scale review of all tapes, followed by reclassifying and re-filing the tapes, to perhaps a review of just the offsite storage manifests. It could also involve a search for critical information or a clean-up of the last three years’ worth of tapes, and so on.

Conclusion
In today’s highly litigious world, creating a data map is one of the primary steps in responding to litigation requests. It is vital that organizations build a solid foundation by focusing time, energy, and resources on doing it right – and creating it long before it’s needed.
