In today’s market, I.T. dollars are in short supply and there’s an increasing requirement for organizations to reduce operating costs. Projects are scrutinized closely in order to ensure a solid ROI before any significant budgetary expenditure can be authorized. In this restricted operational model, automated document classification can easily demonstrate its value simply on the basis of the hardware and storage savings that it can permit.
Unstructured data is commonly estimated to account for about 80% of all content in a given organization. It’s also true that organizations can lose track of data due to mergers, organizational changes, lack of a consistently applied document management policy, and other factors. Unrealistic email retention policies, unmanaged file shares, or a general “save everything” mentality can result in the accumulation of massive archives containing data that is, to be frank, largely useless. What’s the point of having every file or email ever sent by each employee if (a) no one is interested in them, (b) few employees know they exist, and (c) the cost of maintaining the servers outweighs any possible benefit of retention?
Given a well-structured taxonomy, a coherent document retention policy, and a well-trained classifier, organizations suffering from the type of storage nightmare described above can eliminate a significant percentage of pointlessly archived data, realizing a substantial ROI while making the truly actionable materials hidden within their existing repositories easier to find and access.
Evaluating the long-term cost savings of such a project requires a solid analysis of existing archival data and its overall relevance to current business and regulatory requirements. Once such an analysis has been performed, and a content classifier has been trained to a level of accuracy appropriate to the data to be classified, the ongoing task generally involves monitoring activity and making corrections via a feedback mechanism as content changes over time. Each content item is evaluated by the classifier and then re-filed in a centralized repository, left in place, or removed from the system. Individual organizations can design their own solution and final document disposition policies based on their specific requirements.
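To make the three possible outcomes concrete, here is a minimal Python sketch of a disposition rule. It is an illustration only, not a description of any particular product: the category names, the retention periods, the 0.8 confidence threshold, and the names `ClassifiedItem` and `decide` are all assumptions chosen for the example. It also assumes that items the classifier is unsure about are left in place so they can be reviewed and fed back as corrections.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Disposition(Enum):
    REFILE = "re-file in centralized repository"
    LEAVE = "leave in place"
    REMOVE = "remove from system"


@dataclass
class ClassifiedItem:
    path: str
    category: str         # label assigned by the classifier
    confidence: float     # classifier score between 0 and 1
    last_modified: date


# Hypothetical retention policy: category -> years to keep (None = keep indefinitely).
RETENTION_YEARS = {"contract": 7, "invoice": 5, "newsletter": 1, "policy": None}

# Hypothetical threshold below which the classifier's label is not trusted.
MIN_CONFIDENCE = 0.8


def decide(item: ClassifiedItem, today: date) -> Disposition:
    """Map one classified item to a disposition under the assumed policy."""
    # Uncertain or unknown categories stay where they are for human review.
    if item.confidence < MIN_CONFIDENCE or item.category not in RETENTION_YEARS:
        return Disposition.LEAVE
    years = RETENTION_YEARS[item.category]
    if years is None:
        return Disposition.REFILE
    age_years = (today - item.last_modified).days / 365.25
    # Past its retention period: remove. Otherwise move it to the managed store.
    return Disposition.REMOVE if age_years > years else Disposition.REFILE


if __name__ == "__main__":
    sample = ClassifiedItem("shares/old/news-2015.doc", "newsletter", 0.95, date(2015, 1, 1))
    print(decide(sample, date(2020, 1, 1)).value)
```

In practice the policy table would come from the organization's own retention schedule, and the threshold would be tuned against the accuracy the classifier actually achieves on that organization's data.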
Think of some of the potential savings in your own organization. Are you operating older systems solely for the purpose of maintaining years of unorganized or semi-organized files, with no clear idea of how much of the information on these shares is usable or in use? How much does each system, potentially running an out-of-support OS or a locally developed content archive, cost to operate in a given year? What’s the organization’s legal or regulatory exposure should the system die unexpectedly? How much time do your employees spend managing such servers? How much space and other resources do such systems consume in your data center? Even worse, are some or all of these systems located in unmanaged offices where data can be compromised or lost due to a lack of security?
Answering these questions will help you understand the benefits of implementing a centralized, managed content store that can also assist in filtering out irrelevant, outdated data using automated content classification.