Today’s biggest untapped corporate data sits in the unstructured and semi-structured world of documents, emails and even the larger varchar() and text() fields in all those relational databases. For the last 30 years we have tried to automate the extraction of value from this data, but it is still generally treated as a BLOB, a ‘Binary Large OBject’. We tag these objects as best we can and then link this ‘metadata’ to the file. The primary evolution has been the CLOB, a ‘Character Large OBject’, which raises the size limit of the data type but nothing else.
How far have we evolved? NOT MUCH!!! We still mostly rely on humans to tag, integrate, re-type and read all of this to figure out what to do next.
As we move into the next decade of the 21st century, we are still contending with:
- Chaos in knowing where the right documents and information live. Is it in a proper ECM system, sync-and-share in the cloud, a network drive, attached to an email – or probably ‘all of the above’?
- Can you find it?
- Can someone else in the future find it?
- Is it the right version?
- Is the value of what is communicated in the document or email clear?
- Does it exist in a taxonomy/category so it can be properly managed?
- How much of everything described above is ROT – redundant, outdated or trivial – and should be deleted?
Organizations first need to deem this a problem – then act. The problem is big and getting bigger. The frequency, variety and complexity of content are greater than ever. One client found that their customers use over 30 file types to communicate with them.
InfoDNA is here to help. We are experts in this world of unstructured information and the creators of Topla, a platform that helps extrapolate metadata, analyze content across systems and drive a transformation and load process into a new system.
If you need to bring some structure to your bloat and BLOBs, give us a moment to help assess and assist.