Forensic Implications of Metadata in Electronic Files

By John Ruhnka and John W. Bagby

JUNE 2008 - In this digital age, most business activities are transacted and recorded using networked information systems. Business and accounting records are prepared, reviewed, audited, and preserved in electronic form, commonly called electronically stored information (ESI). It is estimated that 94% to 99% of all business records are created and maintained in electronic form (National Law Journal, July 17, 2006), and most are never transformed into hard copy. A unique characteristic of electronic records is that they include hidden metadata that comprises extensive information about the creation of a file, including “MAC dates” (the dates a file was modified, accessed, and created), the date last printed, and, if deleted, when it was deleted and by whom. Metadata can also reveal the location of a file on a computer or network, the computer on which it was created, the name of the person who last saved the document, the number of revisions made, and any document ID or properties added to the document. E-mails contain metadata that indicates the sender’s address book information; the dates a message was sent, received, replied to, or forwarded; to whom copies were sent; and the existence of attachments.
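Much of the e-mail metadata described above is carried in message headers. As a minimal sketch, and not a forensic tool, Python's standard email package can list those fields from a raw message. The sample message, its addresses, and the summarize_headers helper are hypothetical and were invented for this illustration.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical raw message, invented for illustration only.
RAW_MESSAGE = """\
From: Pat Sender <pat@example.com>
To: Lee Reader <lee@example.com>
Cc: audit@example.com
Date: Mon, 02 Jun 2008 09:15:00 -0400
Subject: Q2 estimates
In-Reply-To: <1234@example.com>
Content-Type: text/plain

See attached figures.
"""


def summarize_headers(raw: str) -> dict:
    """Return a few header fields that commonly matter in discovery."""
    msg = message_from_string(raw)
    return {
        "from": getaddresses(msg.get_all("From", [])),
        "to": getaddresses(msg.get_all("To", [])),
        "cc": getaddresses(msg.get_all("Cc", [])),
        "sent": parsedate_to_datetime(msg["Date"]),
        "in_reply_to": msg["In-Reply-To"],
        # True only if some MIME part is explicitly marked as an attachment.
        "has_attachments": any(
            part.get_content_disposition() == "attachment" for part in msg.walk()
        ),
    }


summary = summarize_headers(RAW_MESSAGE)
```

Headers such as these can be edited by a sender or altered in transit, so a sketch like this shows what a message claims about itself, not what has been independently verified.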

Metadata has been called “the electronic equivalent of DNA,” and it can shed light on the origins, context, authenticity, and distribution of electronic evidence (Craig Ball, Beyond Data about Data: The Litigator’s Guide to Metadata, 2005). CPAs, especially those involved in forensic accounting and litigation support, should be aware of how metadata is generated by software and of the potential significance of metadata in electronic business records and communications.

Metadata Production, Location, and Access

Metadata can be broadly divided into two categories. Application metadata is automatically created by an application and is embedded in every file created or edited using that software. Operating systems that control individual computers, servers, and other devices create systems metadata, which assigns file allocation table fields (file name, creation, length, and use) to all files stored on the system so that the operating system can identify and locate that file. Systems metadata resides in the system registry of the computer system or server used to access and store that file.
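To make systems metadata concrete, the sketch below asks the operating system for its record of a file (name, length, and dates) without opening the file's contents. It uses Python's standard os.stat call; the file_system_metadata helper and its field names are hypothetical. How each date is defined varies by operating system and file system, so the comments flag one assumption that should be checked on the platform in question.

```python
import os
from datetime import datetime, timezone


def _utc(timestamp: float) -> datetime:
    """Convert a POSIX timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def file_system_metadata(path: str) -> dict:
    """Read the operating system's record for a file without reading its contents."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": _utc(info.st_mtime),
        "accessed": _utc(info.st_atime),
        # Assumption to verify per platform: st_ctime is the creation time on
        # Windows but the last metadata-change time on most Unix systems.
        "created_or_changed": _utc(info.st_ctime),
    }
```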

Many CPAs use Microsoft Office programs, including Word, Excel, PowerPoint, and Outlook. All of these applications automatically produce dozens of fields (types) of application metadata for each file they create. Application and systems metadata fields are created and updated for Word, Excel, and PowerPoint files each time a file is created, opened, or used. These fields exist in addition to the optional information about changes or versions that a user may intentionally track in a file. Adobe Acrobat software creates detailed metadata path information that can provide forensic information on PDF files.
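As an illustration of application metadata, files saved in the Office 2007 XML formats (.docx, .xlsx, .pptx) are ZIP packages that conventionally store author, revision, and date fields in a part named docProps/core.xml. The sketch below reads that part with Python's standard library. It assumes the part sits at that conventional location, and it does not apply to the older binary .doc, .xls, and .ppt formats. The read_core_properties helper and the FIELDS mapping are hypothetical names.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the package's core-properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly label -> element name inside docProps/core.xml.
FIELDS = {
    "author": "dc:creator",
    "last_saved_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "last_printed": "cp:lastPrinted",
}


def read_core_properties(source) -> dict:
    """Return selected application metadata from an Office Open XML file.

    `source` may be a file path or a binary file-like object.
    Fields that are absent from the file are returned as None.
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    return {label: root.findtext(tag, namespaces=NS) for label, tag in FIELDS.items()}
```

For files subject to a litigation hold, a reader like this should be run only against a working copy, never against the preserved original.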

Significance of Metadata in Litigation

A 2007 survey of litigation activities of 253 U.S. corporations revealed that 83% of respondents had new lawsuits filed against them in 2006 (Fulbright & Jaworski, Fourth Annual Litigation Trends Survey Findings, October 2007). The most common subjects of these lawsuits were labor/employment, contract enforcement, and personal injury. Litigation was also significant at the smaller companies surveyed: 17% had at least one lawsuit claiming $20 million or more, and 98% of mid-sized companies reported one or more lawsuits of $20 million or larger. After a lawsuit is filed, a pre-trial discovery phase occurs during which the litigants are required to identify and disclose (produce) all information in their possession that is requested by the opposition as potentially relevant to the subject of the litigation. Because most settlements in litigation occur before a trial is held, electronic records and e-mails disclosed and evaluated by the parties during the discovery phase can often determine the outcome.

Once a lawsuit is filed or a party has been served with a document preservation request, a “litigation hold” prevails that requires the parties to preserve all evidence under their control that is potentially relevant to the subject of the litigation. In some circumstances, a legal duty to preserve potentially relevant evidence can even arise before a lawsuit is filed. The watershed 2003 Zubulake discovery ruling imposed legal duties to preserve potentially relevant evidence as soon as litigation is “reasonably anticipated” (Zubulake v. UBS Warburg, 2003 WL 22410619 at 4, S.D.N.Y. 2003).

Because metadata in electronic files reveals forensic information about the creation, authorship, history, and even intent of a document, it can play a potentially critical role in litigation outcomes. In the Vioxx product liability litigation that resulted in a large judgment against its producer, Merck, the New England Journal of Medicine reported that residual “tracked changes” accidentally left in a Merck internal document indicated that Merck knew of potentially dangerous side effects of Vioxx (including heart attacks) two years before placing the drug on the market (Forbes, Dec. 8, 2005). The general rule on disclosing the metadata associated with files demanded in litigation has been stated as follows:

“[W]hen a party is ordered to produce electronic documents as they are maintained in the ordinary course of business, the producing party should produce the electronic documents with their metadata intact, unless that party timely objects to production of metadata, the parties agree that the metadata need not be produced, or the producing party requests a protective order” (Williams v. Sprint/United Mgmt. Co., 230 F.R.D. 640, D. Kan. Sept. 29, 2005) [emphasis added].

Preserving Metadata when Reviewing or Producing Files

Larger organizations are often involved in frequent litigation, which requires the entity to identify, preserve, and disclose potentially relevant electronic files in response to successive legal discovery demands. To effectively manage this complicated process, which can have significant implications for liability, it is sound practice to institute an enterprisewide “ESI discovery team” to manage and coordinate this complex and costly discovery. An ESI discovery team includes key decision-makers who need to be involved in the ongoing process of planning for and responding to discovery requests. This typically includes the CIO, IT system managers, in-house legal counsel, representatives from administrative units most closely involved in the litigation (e.g., an HR director in a wrongful termination lawsuit), as well as outside counsel and any third-party legal and electronic discovery consultants and forensic experts who will be involved. The ESI discovery team designs an organization’s “litigation hold” procedures and deploys litigation holds, often involving multiple and overlapping lawsuits, across all enterprise locations.

A business subject to a litigation hold must act quickly to prepare a written “preservation plan” identifying all potentially relevant information at all enterprise locations. The identification process can use keywords describing the subject matter of the litigation; identify specific users whose e-mails, instant messaging, and voicemails may be relevant; and notify identified users to preserve all data on desktop computers, laptops, and messaging devices. The 2006 Federal Rules of Civil Procedure (FRCP) require both accessible and “inaccessible” ESI, such as network and server back-up tapes, to be preserved. Any over-writing or reuse of back-up tapes that may include e-mails potentially relevant to the litigation must be immediately halted. The 2006 FRCP Rule 26(f) requires a “meet-and-confer” to occur early in the litigation to negotiate the scope of discovery by and for each side. This conference should decide which files are to be collected, reviewed for potential relevance, and, if relevant, produced. It should also settle the format in which files are to be produced, whether they will include metadata, and a timetable for discovery.
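The identification step described above, flagging items by keyword or by named user, can be sketched in a few lines. This is a deliberately simple illustration. The Item record, the potentially_relevant function, and plain substring matching are all assumptions made for the example and do not describe how any particular e-discovery product works.

```python
from dataclasses import dataclass


@dataclass
class Item:
    custodian: str  # user who holds the file or message
    text: str       # extracted text of the item


def potentially_relevant(items, keywords, custodians):
    """Flag items held by a named custodian or containing any keyword."""
    wanted_users = {c.lower() for c in custodians}
    terms = [k.lower() for k in keywords]
    flagged = []
    for item in items:
        body = item.text.lower()
        if item.custodian.lower() in wanted_users or any(t in body for t in terms):
            flagged.append(item)
    return flagged
```

A filter of this kind is intentionally over-inclusive: at the preservation stage, the cost of keeping an irrelevant file is usually far lower than the cost of losing a relevant one.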

The FRCP contains a default preference for delivery of electronic files in “native file format” (the format in which the data is ordinarily preserved), including all associated metadata. Thus, potentially relevant ESI needs to be preserved in native file formats with metadata intact before the multiple steps involved in collecting and reviewing files for relevance are initiated. Opening a file for review can alter its metadata (for example, the date last accessed) and could be viewed as “tampering” with evidence. Parties producing files requested for discovery need to be able to show an unbroken chain of custody to assure the files’ admissibility as evidence and to avoid judicial sanctions. To ensure this, it is advisable to make a secure “snapshot” digital record of all potentially relevant enterprise server systems and files that is separately archived before any review for potential relevance is conducted, so that original files and metadata remain intact.
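One common way to support a chain-of-custody showing is to record a cryptographic hash of each preserved file, so that any later change to its contents can be detected. The sketch below builds such a manifest with Python's standard library. The build_manifest and sha256_of helpers and the manifest fields are hypothetical. Reading a file to hash it may itself update the last-accessed date on some systems, so the sketch assumes it is run against a read-only copy or forensic image, not the live original.

```python
import hashlib
import os


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: str) -> list:
    """Record path, size, modified time, and SHA-256 for every file under root."""
    manifest = []
    for folder, _subdirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(folder, name)
            info = os.stat(path)  # captured before the file is opened for hashing
            manifest.append({
                "path": os.path.relpath(path, root),
                "size_bytes": info.st_size,
                "modified_epoch": info.st_mtime,
                "sha256": sha256_of(path),
            })
    return manifest
```

Re-running the same routine later and comparing the hashes gives a simple check that the archived files have not changed since the snapshot was taken.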

If the parties disagree about the format in which files are to be produced or whether file and system metadata are to be included, the federal courts will look at the potential relevance of metadata to the issues in dispute. In an options-backdating case, for example, metadata showing the dates of successive entries contained in options documents could be critical. A second consideration is the cost of producing metadata. If metadata already exists in the native file formats, it is more likely to be required, whereas if it is not present in the native file format and must be reconstituted from other sources, it is less likely to be required.

Forensic Uses of Metadata

Forensic accounting provides an evidentiary basis for economic transactions and reporting events by identifying the process of capturing, using, storing, and transmitting business and financial data. This can involve manual processes, such as data entry, computations, verifications, and interpersonal communications, in conjunction with a company’s IT and network systems. Metadata can help to identify the human and system actions in information systems; can be used to investigate and verify fraud, abuse, mistakes, or system failures; and can help to establish elements such as causation, timing, and the extent of knowledge or mens rea (guilty knowledge)—all of which are at issue in criminal or civil litigation. An example of the forensic use of metadata is in stock options-backdating investigations, where the integrity of the dates entered on written option documents is often the crux of the dispute. In Ryan v. Gifford (2007 Del. Ch. Lexis 168), the Delaware Chancery Court ordered respondents to produce the disputed stock option documents in an electronic format that would permit examination of all metadata associated with the documents, noting that “Maxim’s special committee as well as Deloitte & Touche undoubtedly reviewed metadata as part of their investigation into the backdating problems at Maxim.”

In Williams v. Sprint/United Management Co. [230 F.R.D. 640 (D. Kan. 2005)], the plaintiff in an age discrimination lawsuit sought an Excel workbook in its native file format. But the defendant, Sprint, stripped out all metadata from the Excel spreadsheet files that it produced, arguing that the metadata could reveal privileged information that the company had a right to withhold (the formulas and calculations used to derive information in the Excel spreadsheets that were linked to spreadsheet cells). The court held that blanket withholding of metadata from the requested accounting records went too far, and ordered Sprint to produce all of the metadata in its accounting files as maintained in the ordinary course of business, except for specific metadata that it claimed was protected by attorney-client privilege.

Confidentiality and Malpractice Implications: Client Files

Some potentially relevant information demanded in litigation, such as attorney-client communications and litigation work product including associated metadata, may be withheld from disclosure as “privileged” information, subject to judicial review. Claims of privilege must be identified in a “privilege log” that identifies the author, recipients, subject matter, and dates of all withheld files. The privilege log alerts opposing parties to the fact that potentially relevant information has been withheld, and dates and the identity of participants enable opponents to review and challenge these claims. The 2006 FRCP amendments impose a faster pace for discovery, which increases the risk of accidental disclosure of privileged documents, but also provide that parties may request a “claw back” (a court-ordered return) of privileged files or trade secrets in the event of inadvertent disclosure. Litigants may also request court protective orders that prohibit the disclosure of proprietary, confidential, or private information that has been accessed by an adversary or its experts.
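The privilege log described above is, in effect, a small structured record per withheld file. As a sketch only, the example below models one entry with the fields the text names (author, recipients, subject matter, and date) and writes entries out as CSV. The PrivilegeLogEntry class, the write_privilege_log function, and the extra privilege_claimed column are hypothetical; the required contents of an actual log are a matter for counsel and the court.

```python
import csv
from dataclasses import dataclass, asdict, fields
from datetime import date


@dataclass
class PrivilegeLogEntry:
    author: str
    recipients: str         # semicolon-separated names
    subject_matter: str
    document_date: date
    privilege_claimed: str  # hypothetical extra column, e.g. "attorney-client"


def write_privilege_log(entries, stream) -> None:
    """Write entries as CSV with one header row and one row per withheld file."""
    columns = [f.name for f in fields(PrivilegeLogEntry)]
    writer = csv.DictWriter(stream, fieldnames=columns)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
```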

CPA firms play an important role in providing electronic discovery and forensic services in litigation. The 2007 Socha-Gelbmann Electronic Discovery survey (www.sochaconsulting.com/2007survey.htm) indicates that $2.6 billion was spent on electronic discovery services in 2006, and that CPA firms are increasingly significant vendors in this arena. (Ernst & Young and KPMG were ranked in the top 10 electronic discovery service providers in 2007.) CPAs who provide forensic information and damage calculations for clients need to be aware of the liability implications of metadata contained in client files. Inadvertent disclosure of metadata in client files could result in a waiver of subsequent client claims of legal privilege for the metadata, or enable opponents to use metadata against clients’ interests.

CPAs are increasingly being held to the same malpractice standard as lawyers. Mattco-Forge, Inc. v. Arthur Young & Co. (6 Cal.Rptr.2d 281; Cal. Ct. App. 1992) involved a suit by a client against a CPA firm. Arthur Young & Co. was hired as an expert witness and damages consultant to assist Mattco in a lawsuit against General Electric. Mattco claimed that Arthur Young negligently provided unsubstantiated calculations for the profits allegedly lost because GE had struck Mattco from its supplier list. Because original estimate sheets were not available for all contracts, Arthur Young had asked Mattco to prepare noncontemporaneous estimates for the missing estimate sheets. These estimates, not identified as being noncontemporaneous, were turned over to GE, which used them to have Mattco’s legal claims dismissed. The California Appellate Court noted that in technologically driven litigation, the engineers, physicians, real estate appraisers, and other professionals—including accountants—hired to assist a party in preparing and presenting a legal case can play as great a role in shaping and evaluating their clients’ case as do lawyers. Accordingly, the court said, they should be held to the same malpractice standards.

Managing Metadata

While metadata should not, without prior judicial approval, be intentionally altered or removed from documents subject to a litigation hold or demanded in litigation, metadata may be removed in the ordinary course of business as necessary to preserve enterprise and client confidentiality, as well as to safeguard proprietary information. The AICPA Code of Professional Conduct, Rule 301, Confidential Client Information, provides that: “A member in public practice shall not disclose any confidential client information without the specific consent of the client.” If a CPA assisting a client with bid calculations were to send an amended version of a bid proposal to the opposing side which included metadata that revealed that the client had initially approved much higher bid amounts, the CPA could be liable for breach of client confidentiality or even a malpractice claim for jeopardizing the contract. Because disclosure of metadata harmful to client interests creates potential liability, firms need metadata confidentiality policies that will pass muster with both legal discovery rules and AICPA ethics rules. Such policies, including pre-release metadata viewing and “scrubbing” (intentional metadata removal) of security-sensitive files, should be adopted and applied on a company-wide basis and not be left to individual discretion.

CPA firm personnel should possess the necessary technical skills both to view and to remove metadata from electronic files. Metadata is viewable in several ways. Basic metadata in Microsoft Office documents is viewable from the “File” menu, under “Properties.” There are tabs for “General,” “Statistics,” and “Contents” information. Word will reveal to any user a Word document’s authors, the date of creation, the date last modified, the number of revisions, and where the document is stored. If optional user-added features such as “Track Changes” or “Comments” were enabled when a Word document was created or edited, any user can see which other users made specific edits to a document and when.
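A pre-release check for residual tracked changes can also be scripted. In the Office 2007 .docx format, tracked insertions and deletions are stored as ins and del elements in the part word/document.xml, each carrying the reviewer's name and a timestamp. The sketch below lists them with Python's standard library. The list_tracked_changes helper is hypothetical, and the sketch looks only at the main document body, so comments, headers, footers, and footnotes would need separate handling.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by .docx files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_tracked_changes(source) -> list:
    """List insertions and deletions still recorded in a .docx document body.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    changes = []
    for kind, tag in (("insertion", "ins"), ("deletion", "del")):
        for element in root.iter(f"{{{W}}}{tag}"):
            changes.append({
                "kind": kind,
                "author": element.get(f"{{{W}}}author"),
                "date": element.get(f"{{{W}}}date"),
            })
    return changes
```

An empty result from this check does not mean a file is free of metadata; it means only that no tracked insertions or deletions remain in the document body.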

In addition, commercially available metadata viewers can be used to access a much larger array of metadata. For examples, see www.payneconsulting.com/products and www.docscrubber.com. Payne Group also produces Metadata Assistant, a widely used metadata “scrubber.” Detailed instructions on removing metadata from electronic files are beyond the scope of this article; nonetheless, a Microsoft Office 2003/XP add-in called “Remove Hidden Data” can remove most, but not all, metadata from Office 2003 documents. Microsoft also offers “Office Document Inspector” for Office 2007, which can remove most metadata from Word, Excel, and PowerPoint files. For a whitepaper on the uses and limitations of MS Office Document Inspector, see esqinc.com/Content/WhitePapers/Document-Inspector.php.



John Ruhnka, JD, LLM, is the Bard Family Term Professor of Entrepreneurship at the business school of the University of Colorado at Denver.
John W. Bagby, JD, is a professor and co-director of the Institute for Information Policy in the college of information sciences and technology at the Pennsylvania State University, University Park, Pa.

The CPA Journal is broadly recognized as an outstanding, technical-refereed publication aimed at public practitioners, management, educators, and other accounting professionals. It is edited by CPAs for CPAs. Our goal is to provide CPAs and other accounting professionals with the information and news to enable them to be successful accountants, managers, and executives in today's practice environments.

©2009 The New York State Society of CPAs.