There are many use cases in Enterprise Content Management (ECM) for which machine learning can be deployed. In fact, I’d argue that you can apply machine learning at every stage of the content life cycle. You can apply:
Supervised learning, e.g., to automatically classify images, archive documents, delete files that are no longer required (and unlikely to be required in future), classify records and much more
Unsupervised learning, e.g., to tag audio and video, improve your business processes (e.g., approve a credit limit based on a machine learning algorithm instead of fixed rules), bundle related documents using clustering and so on
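To make the "bundle related documents" idea concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not how any ECM product does it: the file names and texts are made up, the `bundle` function and its `threshold` parameter are hypothetical, and the grouping is a simple greedy similarity threshold rather than a full clustering algorithm such as k-means.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def bundle(docs, threshold=0.3):
    """Greedy single-pass grouping (hypothetical helper).

    Each document joins the first bundle whose seed document is similar
    enough; otherwise it becomes the seed of a new bundle.
    """
    bundles = []  # list of (seed_vector, [document names])
    for name, text in docs.items():
        vec = Counter(tokenize(text))
        for seed, members in bundles:
            if cosine(seed, vec) >= threshold:
                members.append(name)
                break
        else:
            bundles.append((vec, [name]))
    return [members for _, members in bundles]


# Made-up sample content standing in for documents in a repository.
docs = {
    "invoice_01.txt": "invoice payment due amount total",
    "invoice_02.txt": "invoice amount payment overdue",
    "contract_01.txt": "contract agreement parties terms signature",
    "contract_02.txt": "agreement terms parties renewal contract",
}

groups = bundle(docs)
for members in groups:
    print(members)
```

No labels are supplied here; the documents are grouped purely by how much vocabulary they share, which is what makes this an unsupervised approach.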
What are ECM vendors currently offering?
Not much, I’d say. These are still early days.
To be fair, Artificial Intelligence and Machine Learning have been used in enterprise applications for a long time, but mostly for complicated scenarios such as enterprise search (e.g., for proximity, sounds, etc.) or sentiment analysis of social media content. It has never been easy to use machine learning for relatively simple use cases. Additionally, no vendor provided SDKs or APIs that let you apply machine learning on your own to your specific use cases.
But things are gradually changing and vendors are upping their game.
In particular, the “infrastructure” ECM vendors (IBM, Oracle, OpenText and Microsoft) all have AI and ML offerings that integrate with their ECM systems to varying degrees.
OpenText Magellan is OpenText’s AI + ML engine based on open source technologies such as Apache Spark (for data processing), Spark ML (for machine learning), Jupyter and Hadoop. Magellan is integrated with other OpenText products (including Content, Experience Suites and others) and offers some pre-integrated solutions. Specifically for ECM, you can apply machine learning algorithms to find related documents, classify them, do content analysis and analyse patterns. You can of course create your own machine learning programs using Python, R or Scala.
IBM’s Watson and Microsoft Azure Machine Learning integrate with several other enterprise applications and also have connectors for their own repositories (FileNet P8 and Office 365).
Amongst the specialised ECM vendors, Box is going to make its offerings generally available this year.
Box introduced Box Skills in October 2017. It’s still in beta but appears promising. You can apply machine learning to images, audio and video stored in Box to extract additional metadata, create transcripts (for audio and video files), use facial recognition to identify people and so on. In addition, you will also be able to integrate with external providers (e.g., IBM’s Watson) to create your own machine learning use cases with content stored in Box.
Finally, there are some service providers such as Zaizi who provide machine learning solutions for specific products (Zaizi is an Alfresco partner).
Don’t wait for your vendors to start offering AI and ML
Given the rate at which content repositories are growing, you will need automatic ways of classifying content and automating other aspects of the content life cycle. It will soon be impossible to do all of that manually, and machine learning provides a good alternative for this type of functionality. If the ECM vendor provides AI/ML capabilities, that’s excellent, because you need not only access to machine learning libraries but also integration with the underlying repository, security model and processes. An AI/ML engine that is pre-integrated will be hugely useful. But if your vendor doesn’t provide these capabilities yet, you still have alternatives. I’ve said this before and it applies to ECM as well:
There is no need to wait for your vendors to start offering additional AI/ML capabilities. Almost all programming languages provide APIs and libraries for all kinds of machine learning algorithms: clustering, classification, prediction, regression, sentiment analysis and so on. The key point is that AI and ML have now evolved to a point where entry barriers are really low. You can start experimenting with simpler use cases and then graduate to more sophisticated ones once you are comfortable with the basics.
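As an example of how low the entry barrier is, here is a small sketch of a supervised document classifier (multinomial Naive Bayes with add-one smoothing) written with nothing but the Python standard library. Everything specific in it is an assumption for illustration: the `NaiveBayes` class, the "finance" and "legal" labels and the training sentences are all made up, and in practice you would more likely reach for an existing library such as scikit-learn and train on real, labelled content from your repository.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training docs
        self.vocab = set()                       # all words seen in training

    def fit(self, samples):
        """Learn word frequencies per label from (text, label) pairs."""
        for text, label in samples:
            words = tokenize(text)
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocab.update(words)
        return self

    def predict(self, text):
        """Return the label with the highest log-probability for the text."""
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior: share of training documents carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for w in words:
                count = self.word_counts[label][w]
                # Smoothed log likelihood of the word given the label.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data; real training sets would be far larger.
training = [
    ("invoice for payment of outstanding amount", "finance"),
    ("payment reminder invoice overdue", "finance"),
    ("employment contract signed by both parties", "legal"),
    ("non disclosure agreement contract terms", "legal"),
]

model = NaiveBayes().fit(training)
print(model.predict("overdue invoice amount"))
print(model.predict("signed agreement terms"))
```

A classifier like this is only the first step. As noted above, the harder part is usually wiring the predictions into the repository, its security model and the surrounding processes.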