Guest Column | August 29, 2017

Does Data Conversion Equate To Destroying Evidence?

How to maintain archive data security and fidelity, without breaking the bank.

By Bill Tolson, Vice President of Marketing, Archive360

When faced with the need to archive data (usually email) due to regulatory requirements, eDiscovery responsibilities, or business requirements, most organizations have historically sought solutions based on capabilities, cost, vendor reputation, security, and regulatory requirements. Pretty straightforward, right? Organizations that required archiving solutions purchased one of the many on-premises or cloud-based solutions that met their needs.

However, the majority of these archiving solutions actually converted the data so as to enable more efficient storage, indexing, and search. Again, seems to make sense. Unfortunately, as many have learned the hard way, with data conversion it is not uncommon for the data to be corrupted, or metadata changed or lost, nullifying the “golden copy” or copy of record status.

So what’s the big deal? Maybe not a huge deal, unless of course you are ever involved in litigation.

Data Conversion = Destroying Evidence?
Text messages, video footage, emails, and so on: you need only open today’s paper to read about how seriously the legal community takes the destruction of, or tampering with, potential electronic evidence, and how it prosecutes individuals who are thought to be responsible.

For business organizations, nonprofits, and even governmental agencies, this is an area of growing importance and focus, especially because in the case of actual or even “anticipated” litigation, converting data may mean that you are destroying evidence, whether inadvertently or knowingly. With that said, let’s take a minute to revisit the data responsibilities around eDiscovery.

The 2006 revisions to the Federal Rules of Civil Procedure (FRCP) established the concept of anticipated litigation. FRCP Rule 37(e), in the form adopted by the 2015 amendments, states, “If electronically stored information that should have been preserved in the anticipation or conduct of litigation is lost because a party failed to take reasonable steps to preserve it, and it cannot be restored or replaced through additional discovery, the court:

(1) upon finding prejudice to another party from loss of the information, may order measures no greater than necessary to cure the prejudice; or
(2) only upon finding that the party acted with the intent to deprive another party of the information’s use in the litigation may:
(A) presume that the lost information was unfavorable to the party;
(B) instruct the jury that it may or must presume the information was unfavorable to the party; or
(C) dismiss the action or enter a default judgment.”

In reality, organizations are free to store or archive data in any way they choose, unless they choose a method in an obvious attempt to thwart eDiscovery. The litigation hold responsibility arises when companies should reasonably anticipate future litigation. Up to that point, data can be converted, deleted, or changed without the risk of eDiscovery repercussions. But, once anticipation of possible legal action arises, data must be secured in the state (including all metadata) it was in at the time the litigation hold responsibility came into effect.
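One way to picture “securing data in the state it was in” is a preservation manifest taken when the hold begins. The following is a minimal sketch, not a legal or vendor-endorsed procedure: it assumes the archived items are ordinary files in their native format, and the function names `fingerprint` and `changed_since_hold` are hypothetical. It records a SHA-256 hash and basic file-system details for each file, then reports any file whose content no longer matches.

```python
import hashlib
import os
from datetime import datetime, timezone


def fingerprint(path):
    """Record a file's content hash and basic file-system metadata."""
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large archive files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            sha256.update(chunk)
    info = os.stat(path)
    return {
        "path": path,
        "sha256": sha256.hexdigest(),
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(
            info.st_mtime, tz=timezone.utc
        ).isoformat(),
    }


def changed_since_hold(manifest):
    """Return the paths whose current hash or size differs from the manifest.

    Only content (hash and size) is compared; the modification time is
    recorded for reference but not checked here.
    """
    changed = []
    for record in manifest:
        current = fingerprint(record["path"])
        then = (record["sha256"], record["size_bytes"])
        now = (current["sha256"], current["size_bytes"])
        if now != then:
            changed.append(record["path"])
    return changed
```

Because an email’s headers travel inside the native file, a content hash of that file also covers the message metadata. An archive that converts the item on ingest would, in this sketch, show up as a changed file.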

This is a long way of saying that archives that convert original data during the archiving process need to be scrutinized. If the archive copy is the only copy of record, consider temporarily suspending that archiving process once the litigation hold responsibility arises.

A related issue occurs when organizations responding to an eDiscovery order need to migrate responsive data out of an archive that has converted data. To respond, the data must be converted back into the original format, which carries a risk of data corruption and loss. If not handled properly, the migration process can violate the legal requirement to keep potentially responsive data unchanged from the format it is in when litigation commences.

Here’s the good news… Archives that store and manage data in its original native format nullify this risk.

Data Conversion And Analytics
One of the biggest challenges associated with archived data that has been converted is running data analytics processes against it. Data analytics (DA) is the process (via specialized systems and software) of examining large data sets in order to draw conclusions about the information they contain. Data analytics is used virtually everywhere today, from business-to-business and business-to-consumer settings to internal, nonprofit, and governmental applications. For example, business organizations collect and analyze data associated with customer activities such as purchasing practices and customer support, business processes, market economics, and other activities. Large data sets are categorized, stored, and analyzed to study purchasing, usage, and problem trends as well as numerous other patterns. Data analytics is huge and growing, and its possibilities are virtually limitless.

Here is a sticking point with data analytics and archives, however: if the data has been converted, and the converted format is not a standard one such as PST or EML, the data analytics application won’t be able to use the converted data, negating the value of both the data set and the DA software. And by the way, most archives that convert data use a proprietary format. Uh oh.
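To illustrate why a standard format matters, here is a minimal sketch of a trivial analytics step run directly against EML content using only Python’s standard `email` package. The function name `sender_counts` and the sample messages are hypothetical and invented for illustration; a real analytics pipeline would be far more involved. The point is simply that a standard format can be read without any vendor-specific reconversion.

```python
from collections import Counter
from email import message_from_bytes, policy
from email.utils import parseaddr


def sender_counts(raw_messages):
    """Count messages per sender address from an iterable of raw EML bytes."""
    counts = Counter()
    for raw in raw_messages:
        message = message_from_bytes(raw, policy=policy.default)
        # parseaddr splits "Ann <ann@example.com>" into name and address.
        _, address = parseaddr(str(message.get("From", "")))
        counts[address.lower()] += 1
    return counts


# Hypothetical sample messages, invented for illustration only.
SAMPLE = [
    b"From: Ann <ann@example.com>\r\nSubject: Order 1\r\n\r\nThanks.\r\n",
    b"From: ann@example.com\r\nSubject: Order 2\r\n\r\nThanks again.\r\n",
    b"From: Bob <bob@example.com>\r\nSubject: Support\r\n\r\nHelp.\r\n",
]

if __name__ == "__main__":
    for address, total in sender_counts(SAMPLE).most_common():
        print(address, total)
```

The same few lines have nothing to work with if the archive hands back a proprietary container instead of the original messages.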

Another Problem With Data Conversion: It Lays The Groundwork For Data Ransom
Organizations that store their content in proprietary cloud-based archives are more susceptible to being charged large amounts of money to extract their data for any reason, even (or especially) when they are leaving because they are dissatisfied with the vendor. These cloud archive vendors use the excuse that they must convert the data back to its original format before they can allow it to be moved, and that this reconversion will take a great deal of time and money. In fact, some cloud archive vendors charge huge amounts for the reconversion, sometimes 20 or more times the monthly storage cost. Some will even tell you that it will take a ridiculously long time to convert and return your data (during which time you are still paying fees to store your data in their archive). In reality, they are holding your data for ransom, hoping you will balk at the exorbitant cost and instead acquiesce and leave the data in their archive.
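The arithmetic behind that squeeze is easy to sketch. The function below is a rough illustration only: the name `exit_cost` and every figure in the example are hypothetical, and real contracts will have their own terms. It adds a reconversion fee expressed as a multiple of the monthly storage fee to the storage fees that keep accruing while the vendor converts and returns the data.

```python
def exit_cost(monthly_storage_fee, extraction_multiple, months_to_extract):
    """Estimate the total cost of leaving an archive that must reconvert data.

    monthly_storage_fee: what the customer pays per month for storage.
    extraction_multiple: reconversion fee as a multiple of the monthly fee.
    months_to_extract:   months of storage billed while extraction runs.
    """
    extraction_fee = monthly_storage_fee * extraction_multiple
    storage_during_extraction = monthly_storage_fee * months_to_extract
    return extraction_fee + storage_during_extraction


if __name__ == "__main__":
    # Hypothetical figures: a 5,000 monthly fee, a 20x reconversion charge,
    # and six months to get the data back.
    print(exit_cost(5000, 20, 6))  # 100000 + 30000 = 130000
```

With those assumed numbers, leaving costs the equivalent of more than two years of storage fees, which is exactly the kind of figure that persuades customers to stay put.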

My advice here is when dealing with cloud archive vendors, you need to ask a few questions:

Do you store data in its original format or do you convert it?
Can I move my data out at any time without charge or penalty?
Can I get this in writing?

Bottom line: avoid archives that do not store your content in its original format or that charge you ransom to remove it.

Instead, seek an archive solution that maintains your data in its original format. The archive should also offer or be able to point you to solutions that will enable you to maintain control over all of your corporate information while leveraging the cloud to reduce your on-premises information management and storage costs. Specifically, the solution(s) should allow you to:

maintain information in its true, native format, including metadata
apply granular access controls and powerful data encryption
proactively meet regulatory storage and management requirements
retain control by storing your information in your Azure tenancy
reduce costs with usage-based pricing

By seeking a solution (or solutions) that delivers these key features, you will not only put yourself in an ideal position for legal eDiscovery and for demonstrating regulatory compliance, you will also be able to eliminate the cost and complexity inherent in traditional backup and on-premises archiving solutions and transform your information into a data-rich source of business intelligence, all while reducing your on-premises information management and storage costs by as much as 90 percent. Not bad, right?

About The Author
Bill Tolson has more than 25 years of experience with multinational corporations and technology start-ups, including 15-plus years in the archiving, ECM, information governance, regulatory compliance, and legal eDiscovery markets. Prior to joining Archive360, Bill held leadership positions at Actiance, Recommind, Hewlett Packard, Iron Mountain, Mimosa Systems, and StorageTek. Bill is a much sought-after and frequent speaker at legal, regulatory compliance, and information governance industry events and has authored numerous articles and blogs. Bill is the author of two eBooks: “The Know IT All’s Guide to eDiscovery” and “The Bartenders Guide to eDiscovery.” He is also the author of the book “Cloud Archiving for Dummies” and co-author of the book “Email Archiving for Dummies.” Bill holds a Bachelor of Science degree in Business Management from California State University Dominguez Hills.