The other week I ran a webinar with my old friend from Alfresco, George Parapadakis, on the topic of Information Governance. I confess that when I was first asked to do the webinar I was a little tentative, as governance is such a broad topic and I was unsure what we could really achieve in a 30-minute presentation. The challenges of where to start and where to stop, what to cover and what not to cover, were overwhelming. But then I had a bit of a light bulb moment: why not just start at the start?
When you think about it, finding the right starting point is often the biggest stumbling block for any information governance initiative. As we like to say back in England, “It can be like fighting with fog.” There is so much to do, so much legacy, so many points of contact, and so many people that it is hard to know where to start. In reality, many organizations don’t do anything at all. That became our focus: where and how to start. You can check out the webinar here.
But as with any presentation I give, there is always a surprise takeaway that I never really considered fully. After the fact, I am made aware of something that resonated in a way I never expected. In this case, it was the value of decision trees. A decision tree is a visual (unsurprisingly tree-like) representation of a series of decisions. In Information Management, ECM, and Records Management we base most of our work around things like File Plans, Retention Schedules, and Classification systems. Each of these is, if you think about it, little more than a series of decisions. If a document contains this, then it is X, and when it is X, then these Z things should happen, in Y order. To put it another way, every type of information, be it a contract, a purchase order, an invoice, or a delivery note, is classified by asking a series of questions about its structure, content, origin, etc. When we have done that analysis, we classify it and apply a certain lifecycle to that category. Again, a lifecycle is simply a series of decisions, such as: when it is this old we move it here, store it there, or even destroy it. Simple though that seems, it all gets complicated quickly. With the sheer volume of data produced and stored today, there is little likelihood that manual systems of classification and management can ever deal with the scale of the decision tasks set. Hence, we typically end up doing little and living in a state of inertia.
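To make the "series of questions" idea concrete, here is a minimal sketch of a classification decision tree in Python. Everything in it is an assumption for illustration: the `Question` type, the `classify` function, the two keyword tests, and the category names are hypothetical, and a real file plan would ask far more (and far better) questions than these.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Question:
    """An internal node of the tree: one yes/no question about a document."""
    text: str
    test: Callable[[dict], bool]
    yes: "Node"
    no: "Node"


# A leaf is simply the name of the category we end up in.
Node = Union[Question, str]


def classify(doc: dict, node: Node) -> str:
    """Walk the tree from the root, answering each question, until a leaf."""
    while isinstance(node, Question):
        node = node.yes if node.test(doc) else node.no
    return node


# Hypothetical two-question tree; the keyword checks are deliberately naive.
TREE = Question(
    "Does it mention signing parties?",
    lambda d: "parties" in d["text"].lower(),
    yes="contract",
    no=Question(
        "Does it state an amount due?",
        lambda d: "amount due" in d["text"].lower(),
        yes="invoice",
        no="unclassified",
    ),
)

if __name__ == "__main__":
    print(classify({"text": "Agreement between the Parties named below"}, TREE))
    print(classify({"text": "Amount due: 120.00"}, TREE))
```

The point of the sketch is the shape rather than the rules: each branch is one decision, and adding a new document type means adding one more question rather than rewriting the whole scheme.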
It doesn’t need to be that way. You can simply automate some fixed sets of rules, or go even further and leverage machine learning capabilities to automate more complex sets of rules and decision making. It is worth noting that machine learning and AI systems often run on the basis of decision trees and so-called random forests to achieve their goals – they were designed for exactly this kind of situation.
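As a rough illustration of what "a fixed set of rules" might look like, the sketch below turns a retention schedule into a small lookup that decides what should happen to a document based on its category and age. The `RETENTION` table, the `lifecycle_action` function, and the retention periods are all made up for the example; they are not taken from any regulation or product, and a real schedule would come from your own records policy.

```python
from datetime import date

# Hypothetical retention schedule: category -> (minimum age in years, action).
# Rules are listed oldest-first so the most severe applicable action wins.
RETENTION = {
    "invoice": [(7, "destroy"), (2, "archive")],
    "contract": [(10, "destroy"), (3, "archive")],
}


def lifecycle_action(category: str, created: date, today: date) -> str:
    """Return the action the schedule prescribes for a document of this age."""
    age_years = (today - created).days / 365.25
    for min_age, action in RETENTION.get(category, []):
        if age_years >= min_age:
            return action
    # No rule applies yet (or the category is unknown): leave it where it is.
    return "keep"


if __name__ == "__main__":
    today = date(2023, 1, 1)
    print(lifecycle_action("invoice", date(2020, 1, 1), today))
    print(lifecycle_action("invoice", date(2010, 1, 1), today))
```

Run on a schedule across a repository, a function like this makes the same decision a records manager would, only for every document rather than the handful anyone has time to review by hand.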
It’s also worth noting that even in the most cost-sensitive organization with limited resources, the business case for automated and self-governing content and information is strong. Many organizations have designed file plans and retention programs, but due to the massive throughput of data they don’t and can’t apply them effectively. Automating that work is a logical starting point for any governance initiative. As we stressed in the webinar, governance is best when it is led by a desire to improve the business, and not simply to comply with regulations. Well-governed information assets are more useful in business decision making and easier to find and access, and good governance typically reduces the number of duplicate, redundant, and low-value assets. In turn, this reduces management and storage costs.
Policies, procedures, and governance structures and standards are all important, but they are hard to get right, and it is hard to get any real organizational impetus behind them. Just doing something is more important, and exploring the world of decision trees and simple automation can be the starting point for something of real value.