By David Jones, Nuxeo
Enterprise leaders have long awaited the potential information management benefits of artificial intelligence (AI). Machine learning, natural language processing, and other AI-based technologies are already helping companies by automating the classification of files and simplifying the way employees engage with content. But the real promise of AI goes beyond just helping organizations classify content during the capture or ingestion process – it lies somewhere in the “digital landfill” that exists within modern organizations.
Today’s companies have a mass of content and data stored – often randomly, and almost always in silos – across various systems and repositories. This is the so-called ‘digital landfill’. Employees looking for important information must sort through this information quagmire, which is a thankless task that’s both time-consuming and inefficient. Imagine having to search the town dump for a pair of car keys mistakenly thrown out with the trash – that’s basically what information search comes down to for most organizations.
This is where AI comes in.
AI can help companies quickly and efficiently sift through their own digital landfills by automatically unearthing specific items of relevant information. AI also can go through the digital landfill and recycle useful information, as well as discard any content that no longer serves a useful organizational purpose.
This is a sizable opportunity. The era of Big Data and Big Content is upon us, and information management challenges will only increase as organizations begin to include audio, video, and image content as part of their digital transformation journey. Having these digital assets stuck in the mud of an organization’s digital landfill makes it impossible to extract value from them without the proper technology tools in place.
Here are three ways that AI can help companies extract that value from the digital assets residing within their landfill.
Information about information – that’s what metadata fundamentally provides, and it’s invaluable to companies that want to manage their content in an effective way.
Suppose you have a legacy enterprise content management (ECM) system that your company uses to store customer documents. These contracts and other customer information are invariably managed in a haphazard fashion, and eventually customer reference numbers are the only relevant metadata attributes associated with these documents.
Sound familiar? This was the classic document management scenario in days of yore. Each stored document served as the focal point for invoice processing, claims management, and other processes. Moreover, each of those documents had a set of metadata attributes, or tags, associated with it. Typically, these were limited to things such as filename, date created, author, and type of content. For most systems, once the set of metadata stored – or metadata “schema” – was defined, it usually remained untouched because changing metadata schemas required tedious development work and mass updates to all content related to that metadata.
The modern content services platform (CSP) changes that. By using a CSP to pass that content through an AI enrichment engine, you can potentially append additional metadata attributes to each and every one of the files currently stored, which automatically injects more context, intelligence, and insight into your content management ecosystem.
This increased capability and the ability to utilize metadata much more effectively is a distinct benefit of a modern CSP over a legacy Document Management or ECM solution. But what about the content stored in those legacy solutions?
Another unique aspect of a CSP is that it can connect to content from legacy systems, leaving the content itself in place (in its legacy repository), but providing access to that content from the CSP. It also offers the ability for legacy content to make use of a modern metadata schema from the CSP - effectively allowing you to add metadata properties and data to the legacy content, without making any changes to the legacy system at all. This is massively powerful - especially when combined with AI so that this process is automated.
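To make the idea concrete, here is a minimal Python sketch of enriching legacy content without touching the legacy system. Everything in it is an assumption made for illustration: `LegacyDocument`, `MetadataOverlay`, and `enrich` are hypothetical names, not part of any real CSP or Nuxeo API, and the keyword rules in `enrich` merely stand in for a real AI enrichment engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegacyDocument:
    """A document as the legacy system stores it; frozen so it is never modified."""
    doc_id: str
    customer_ref: str  # the only metadata the legacy system holds
    text: str


def enrich(doc: LegacyDocument) -> dict:
    """Hypothetical stand-in for an AI enrichment engine (plain keyword rules)."""
    text = doc.text.lower()
    if "agreement" in text or "contract" in text:
        doc_type = "contract"
    elif "invoice" in text:
        doc_type = "invoice"
    else:
        doc_type = "unknown"
    return {"customer_ref": doc.customer_ref, "doc_type": doc_type}


class MetadataOverlay:
    """Keeps enriched metadata outside the legacy repository, keyed by doc_id."""

    def __init__(self):
        self._metadata = {}

    def index(self, doc: LegacyDocument) -> None:
        # Only the overlay is written to; the legacy document is read, not changed.
        self._metadata[doc.doc_id] = enrich(doc)

    def find(self, **criteria) -> list:
        """Return sorted doc_ids whose overlay metadata matches every criterion."""
        return sorted(
            doc_id
            for doc_id, meta in self._metadata.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        )


# Illustrative sample data, not taken from any real repository.
legacy_repo = [
    LegacyDocument("D1", "C-100", "Master services agreement with the customer."),
    LegacyDocument("D2", "C-100", "Invoice for consulting services, due in 30 days."),
    LegacyDocument("D3", "C-200", "Meeting notes from the quarterly review."),
]

overlay = MetadataOverlay()
for document in legacy_repo:
    overlay.index(document)

contracts = overlay.find(doc_type="contract")
customer_docs = overlay.find(customer_ref="C-100")
print(contracts, customer_docs)
```

The design point is the separation: the legacy records stay exactly as they were, while the new `doc_type` attribute lives in the overlay and can be searched alongside the original customer reference.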
Identifying Mission-Critical Content
What is the ‘what’, exactly? Providing insight into this question is a central element of enriching metadata.
This capability is a core facet of knowledge management, including simply identifying a document as a presentation, brochure, contract, or invoice. It comes down to the ability to surface and share information and content that is relevant to other situations. Without reliable metadata on the content, these insights are impossible – whether it’s providing existing solutions to technical support questions on a helpdesk, providing all contracts that relate to a particular customer, or anything in between.
Moreover, compliance requirements within each industry mandate that organizations retain different types of documents and records for specific periods of time - these are known as retention policies or rules. If you can’t determine the type of the content, how on earth can you apply a retention policy to it? In the past, companies attempted to comply in one of two ways - manually, or not at all. The manual approach was incredibly tedious, error-prone, and very time-consuming – prompting many organizations to adopt a “keep everything, just in case” approach.
But by using an AI-driven engine to classify content stored within legacy systems, this becomes much easier to do. Even simple AI tools can identify the difference between a contract and a resume, but advanced engines expand this principle to build AI models based on content specific to an organization. So, for example, if your business needs to know the difference between a personal life insurance document and a life annuity document, then this can be incorporated into a specifically-trained AI model, which in turn will deliver a much more detailed classification than could ever be possible with a generic classification.
And using a CSP to apply this to the mass of content stored in those legacy systems can add significant benefit to your business and increase the visibility you have into both your key information assets and liabilities.
Out With The Old
The “keep it all just in case” approach described above not only exacerbated the digital landfill effect but also meant that a lot of information that could (and often should) have been destroyed was not. Aside from the cost of having to store this content ad infinitum, there are significant legal issues that arise from keeping information longer than you need to.
There is a whole industry dedicated to managing records, and we’re not going to get into the technicalities of that here. But AI can be used to help mitigate this problem significantly.
Part of the challenge of managing records, or even simply applying retention policies, is the sheer volume of content that needs to be managed. And the only way to go through this in the past was document by document.
A key point here is that, due to the legal ramifications of incorrectly declaring (or not) a record, there is a desire to still include a human interaction (or checkpoint) as part of this process in most organizations.
AI can help with this. By using AI classification of content with a CSP, it is possible, at a massive scale, to quickly and easily determine what is NOT a record. According to numerous research studies, the significant majority of content stored is ROT (redundant, obsolete, or trivial) - so by clearing out huge chunks of that ROT, the task of identifying relevant content to apply retention policies to becomes much, much easier. And yes, AI can then be used on the remaining content to identify the type of content in more detail, match that to the retention rules, and then make recommendations to the relevant staff members. This makes the whole process of identifying, declaring and managing records (by which I really mean anything that needs to be retained against a retention rule) incredibly straightforward, much more scalable than before, and much more cost-effective given that the storage requirements for old content just got slashed.
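The triage flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a real records management implementation: the `RETENTION_YEARS` values are invented for the example and are not legal guidance, the `doc_type` and `is_duplicate` fields are assumed to come from an upstream AI classifier, and every result is only a recommendation for a human reviewer to confirm.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention rules: document type -> years to retain.
RETENTION_YEARS = {"contract": 7, "invoice": 6}


@dataclass(frozen=True)
class ClassifiedDoc:
    doc_id: str
    doc_type: str          # label from an upstream classifier (assumed)
    created: date
    is_duplicate: bool = False


@dataclass(frozen=True)
class Recommendation:
    doc_id: str
    action: str            # "rot_candidate", "retain", or "review_for_disposal"
    reason: str


def add_years(start: date, years: int) -> date:
    """Add whole years, mapping Feb 29 to Feb 28 when the target year is not a leap year."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def triage(doc: ClassifiedDoc, today: date) -> Recommendation:
    """Recommend an action for one document; a person makes the final call."""
    # Step 1: clear out likely ROT, i.e. content that is not a record.
    if doc.is_duplicate:
        return Recommendation(doc.doc_id, "rot_candidate", "duplicate copy")
    years = RETENTION_YEARS.get(doc.doc_type)
    if years is None:
        return Recommendation(doc.doc_id, "rot_candidate", "no retention rule applies")
    # Step 2: match the remaining content to its retention rule.
    expiry = add_years(doc.created, years)
    if today < expiry:
        return Recommendation(doc.doc_id, "retain", f"retain until {expiry.isoformat()}")
    return Recommendation(
        doc.doc_id, "review_for_disposal", f"retention ended {expiry.isoformat()}"
    )


# Illustrative sample data.
docs = [
    ClassifiedDoc("A", "contract", date(2020, 5, 1)),
    ClassifiedDoc("B", "invoice", date(2015, 3, 1)),
    ClassifiedDoc("C", "lunch_menu", date(2019, 9, 9)),
    ClassifiedDoc("D", "contract", date(2021, 1, 15), is_duplicate=True),
]
today = date(2024, 1, 1)
recommendations = {doc.doc_id: triage(doc, today) for doc in docs}
for rec in recommendations.values():
    print(rec.doc_id, rec.action, "-", rec.reason)
```

Returning a `Recommendation` instead of deleting anything keeps the human checkpoint in the loop: the automated pass narrows the pile, and staff confirm what is declared or destroyed.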
Trash removal can be gratifying – and rewarding.
Whoever thought that sorting out the trash could be such a rewarding exercise? It is when it’s about optimizing your organization’s digital landfill.
About The Author
David Jones (Twitter: @InstinctiveDave) is VP of Product Marketing for Content Services at Nuxeo, responsible for developing the global go-to-market strategy and execution plan for Nuxeo’s modern enterprise Content Services Platform. He has over 20 years’ experience in the emerging technologies space across multiple industries including Big Data, analytics, cloud and enterprise content management. Prior to joining Nuxeo, David was Vice President of European Operations with AIIM, was CEO and Founder of a Document Management startup for 8 years, and has held Product and Marketing Management roles with Konica Minolta and Hyland. David also holds a place on the Board of Directors of AIIM, the nonprofit industry association. David has a holistic understanding of the challenges of stakeholders from every facet of the organization and is passionate about delivering modern, future-forward technologies and solutions that truly make a difference to organizations, employees and customers.