ECM: preparing for the future
Data cleanup—one of the most time-consuming aspects of enterprise content management—is a precursor to cognitive computing. “There is an ideal data model for analysis, and if the data is in that structure—meaning tagged and organized—a lot of time can be saved,” says Praful Krishna, CEO of Coseer, a cognitive computing solutions provider. “Unfortunately, this happens rarely, for two reasons. First, cognitive computing deals with data that is by definition unstructured and nebulous. Second, when these systems were designed, very few people realized that the data had potential value beyond its use at the time, so they did not plan for efficient retrieval. Then it is a journey from what the data is today to data that can be consumed by computers.” Coseer is focused on extracting information from large repositories for such tasks as identifying actionable information, automating tedious workflows or providing natural language interactions with customers.
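To make Krishna's "ideal data model" concrete, tagged and organized content can be pictured as a simple record that pairs the raw text with metadata a machine can act on. The sketch below is purely illustrative; the field names and tags are hypothetical, not Coseer's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class TaggedDocument:
    """Content organized for analysis: plain text plus descriptive metadata."""
    doc_id: str
    text: str                  # body extracted to plain text
    source: str                # originating repository or system
    tags: list[str] = field(default_factory=list)  # labels that aid retrieval

# The same unstructured sentence, before and after organization.
raw = "Quarterly maintenance report: pump 7 shows abnormal vibration."
doc = TaggedDocument(
    doc_id="rpt-2017-042",
    text=raw,
    source="maintenance-reports",
    tags=["maintenance", "equipment", "report"],
)
print(doc.tags)  # ['maintenance', 'equipment', 'report']
```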
Putting metadata on top of the original raw data is a good first step. “Another step is indexing the data,” Krishna says. “When the data is indexed, it reduces the volume of data that needs to be ingested because the relevant content can be more quickly identified.” Finally, data sometimes needs to be put in a more accessible format. “Plain text or PDF documents are great,” he adds, “but if the content consists of images or a proprietary format, then the process becomes more challenging.”
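A rough sketch of two of the steps Krishna describes: metadata sitting on top of the raw content, and an index that lets relevant documents be identified without ingesting the whole repository. The corpus and the inverted index here are toy stand-ins; a production system would use a search platform rather than a dictionary, and the third step, converting images or proprietary formats to plain text, would need OCR or format-specific tooling not shown here.

```python
from collections import defaultdict

# Step 1: metadata layered on top of the raw data (hypothetical corpus).
corpus = {
    "doc-1": {"text": "Invoice for pump repair approved by finance",
              "meta": {"type": "invoice", "dept": "finance"}},
    "doc-2": {"text": "Safety inspection found abnormal pump vibration",
              "meta": {"type": "report", "dept": "operations"}},
}

# Step 2: a toy inverted index mapping each term to the documents that
# contain it, so relevant content is found without scanning everything.
index = defaultdict(set)
for doc_id, doc in corpus.items():
    for term in doc["text"].lower().split():
        index[term].add(doc_id)

# Only the matching documents need to be ingested by the cognitive system.
print(sorted(index["pump"]))  # ['doc-1', 'doc-2']
```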
One of the challenges in making content more useful for advanced analyses is that the generic models provided by some cognitive computing solutions are usually not effective. "We develop specific models for each problem," Krishna says. "The comment that cognitive computing works best in a finite domain is very true. Those who are not happy with their model have often tried to use a generic model to analyze their content when a customized one is necessary."
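The gap between generic and customized models is easy to illustrate: a small classifier fitted to the customer's own labeled documents learns the vocabulary of that finite domain. Below is a minimal scikit-learn sketch with invented training data; it stands in for the kind of problem-specific model Krishna describes, not for Coseer's methods.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled documents drawn from the customer's own domain.
texts = [
    "Pump 7 vibration exceeds tolerance, schedule maintenance",
    "Invoice 4411 approved for payment by finance",
    "Bearing temperature alarm on compressor line 2",
    "Purchase order 982 issued to vendor for spare parts",
]
labels = ["operations", "finance", "operations", "finance"]

# A model trained on this vocabulary, rather than a one-size-fits-all model.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["Vibration alarm on pump 3"]))  # expected: ['operations']
```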