Is ECM exciting (again)?

When I joined Accenture about 10 years ago, I was a member of AIMS (Accenture Information Management Services). At the time, it specialised in internet solutions (mainly portals and web content management), document management ("ECM", with solutions such as Documentum or Filenet) and BI.
With my background in internet technologies, I spent most of my time at Accenture working on digital projects (video on demand, eCommerce and digital transformation programmes for players ranging from media to publishing and telcos).
However, when I joined I was also trained on Documentum, and I knew then that I had to stick with the "digital" stuff... yes, indeed, documents were not that exciting for me.

What is ECM about?

Enterprise Content Management (ECM) is the strategies, methods, and tools used to capture, manage, store, preserve, and deliver content and documents related to organisational processes.
ECM covers the management of information whether it is in the form of a paper document, an electronic file, a database print stream, email and other mainly unstructured sources.

Since ECM was first defined in the early 2000s, neither the definition nor the solutions have changed much. Filenet and Documentum, for example, are still deployed in major insurance, finance and manufacturing organisations.

But requirements have been changing since the early 2000s

Stop thinking in silos

Business operations need to access all available content and data anywhere (in the office or in the field) and at any time.

Operations need a real 360-degree view of all available data, from unstructured documents to structured data coming from multiple heterogeneous systems.

ECM solutions were designed to deal with document assets, with limited capabilities for managing unstructured and structured data together. More importantly, organisations are still highly influenced by ECM technologies, with dedicated teams focussing only on document management.

Big ECM?


Due to market consolidation, major players in insurance and finance now have to deal with:

  • Variety: Multiple heterogeneous ECM solutions coming from departments or subsidiaries
  • Velocity: Tens and sometimes hundreds of thousands of documents inserted/updated per day
  • Volumes: Archiving constraints with up to ten years of documents and metadata which must remain accessible, adding up to hundreds of millions or even billions of documents

ECM has met big data, with solutions designed for the early internet age.
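
To see how the velocity and volume figures relate, here is a quick back-of-envelope sketch. The daily rate, the retention period and the number of subsidiaries are illustrative assumptions picked from within the ranges above, not measurements from a real system.

```python
# Back-of-envelope estimate: documents that must stay accessible
# for a given daily intake and retention period.
# All input values below are illustrative assumptions.

def retained_documents(docs_per_day: int, retention_years: int) -> int:
    """Documents accumulated over the retention period (365-day years)."""
    return docs_per_day * 365 * retention_years

# One ECM system ingesting 100,000 documents a day, kept for ten years.
single_system = retained_documents(100_000, 10)

# A group consolidating three such systems after mergers (hypothetical).
consolidated = single_system * 3

print(f"single system: {single_system:,}")  # 365,000,000
print(f"consolidated:  {consolidated:,}")   # 1,095,000,000
```

A single system at that rate already lands in the hundreds of millions, and a handful of consolidated subsidiaries crosses the billion mark.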

GDPR, again!

Now comes GDPR, which aims to protect the data privacy of all EU citizens, give them more control over it, and reshape the way organisations across the region approach data privacy. And of course, documents are full of personal data.

What could be done?

The first time I got involved with MarkLogic in a presale related to document management, I thought: this is probably too simple for MarkLogic. OK, it's operational and requires some simple search and query features, but in the end the metadata is very simple and not that heterogeneous.
But if you consider the previous paragraphs, there are real opportunities to bring value.

First come the foundations:

  • Scalability: MarkLogic has a shared-nothing architecture and can easily handle hundreds of millions or billions of documents with a relatively small footprint
  • ACID: MarkLogic has been ACID from day one (it is now at version 9). MarkLogic’s ACID properties also apply to multi-document, multi-statement, and XA transactions (transactions between clusters), providing the reliability needed to run large-scale, operational systems for mission-critical use cases.
  • Security: MarkLogic is Common Criteria certified and supports multiple access control scenarios (at the document or even the property level). It also supports redaction and encryption at rest, using internal keys or an external KMS.
  • Enterprise capabilities: of course, high availability, disaster recovery and backups are a must.
  • Search and query: search and query are at the core of the product, which combines schema-agnostic storage (to deal with the variety of metadata) with full-text search and query (over structured and unstructured data)
  • REST extensions at the core: MarkLogic lets you create REST extensions directly in the database to expose any data-related logic. In the ECM context, this makes it possible to create bespoke access logic with minimum effort (a virtual classification structure, for example)

All this greatly simplifies the overall architecture by providing all the expected services in a single solution, avoiding the integration (and the costs) of the multiple components found in traditional approaches (relational database + search engine + the ECM product itself).

Then comes the additional business value (and exciting opportunities):

Personalisation to match business user contexts

Documents are consumed by multiple business users who all have their own requirements. ECM solutions usually expose documents via a classification that matches the business requirements at a given moment in time. By leveraging Semantics, we had the opportunity to deliver dynamic classifications that match the various requirements of the business profiles accessing the documents. The classification is adapted on the fly depending on the end-user context.
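
To make the idea of a classification adapted on the fly more concrete, here is a minimal sketch in plain Python. It is a simplification: it groups documents by metadata fields chosen per profile, and does not use MarkLogic or its semantic features. The documents, the profile names and the `classify` function are all hypothetical.

```python
# Minimal sketch of a "virtual" classification: the same documents are
# presented as different folder trees depending on the user's profile.
# Documents, profiles and field names are hypothetical examples.

DOCUMENTS = [
    {"id": "d1", "customer": "C001", "type": "contract", "year": 2016},
    {"id": "d2", "customer": "C001", "type": "claim", "year": 2017},
    {"id": "d3", "customer": "C002", "type": "claim", "year": 2017},
]

# Each profile lists the metadata fields that form its folder levels.
PROFILE_LEVELS = {
    "claims_handler": ["customer", "type"],
    "compliance_officer": ["type", "year"],
}

def classify(documents, profile):
    """Build a nested folder tree for the given profile.

    Inner levels are dicts keyed by the field value (as a string);
    the last level maps to the list of matching document ids.
    """
    levels = PROFILE_LEVELS[profile]
    tree = {}
    for doc in documents:
        node = tree
        for field in levels[:-1]:
            node = node.setdefault(str(doc[field]), {})
        node.setdefault(str(doc[levels[-1]]), []).append(doc["id"])
    return tree

print(classify(DOCUMENTS, "claims_handler"))
# {'C001': {'contract': ['d1'], 'claim': ['d2']}, 'C002': {'claim': ['d3']}}
print(classify(DOCUMENTS, "compliance_officer"))
# {'contract': {'2016': ['d1']}, 'claim': {'2017': ['d2', 'd3']}}
```

Nothing is stored per classification: the tree is computed at request time from the metadata, so adding a new business profile only means adding a new list of levels.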

360 of anything


As mentioned before, documents are just one additional dimension of the customer* 360 (*can be replaced by any other relevant concept), and MarkLogic is a great platform for managing golden records and 360 views.
As soon as you can create a data hub mixing multiple dimensions, it becomes easier to extract value from the data and create new operational services.
It is possible, for example, to extract knowledge from the documents using semantic analysis and then match this knowledge with other sources by leveraging the semantic relations managed in the MarkLogic triple store. This can then be used to tackle fraud, improve customer care or provide new services to end users.
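
As an illustration of matching extracted knowledge with other sources, here is a small sketch that keeps triples in an in-memory Python set. It only mimics the idea of a triple store and does not use MarkLogic's APIs. The identifiers, the predicates (`mentions`, `sameAs`) and the helper functions are hypothetical.

```python
# Toy triple store: (subject, predicate, object) tuples in a set.
# "mentions" triples stand for knowledge extracted from documents;
# "sameAs" triples link extracted entities to customer records.
# All identifiers and predicate names are hypothetical.

TRIPLES = {
    ("doc:claim-42", "mentions", "person:jane-doe"),
    ("doc:claim-42", "mentions", "garage:fastfix"),
    ("doc:claim-57", "mentions", "garage:fastfix"),
    ("person:jane-doe", "sameAs", "customer:C001"),
}

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is a known triple."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}

def subjects(predicate, obj):
    """All subjects s such that (s, predicate, obj) is a known triple."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}

def customers_for_document(doc):
    """Customer records reachable from the entities a document mentions."""
    customers = set()
    for entity in objects(doc, "mentions"):
        customers |= objects(entity, "sameAs")
    return customers

print(customers_for_document("doc:claim-42"))
# {'customer:C001'}
print(sorted(subjects("mentions", "garage:fastfix")))
# ['doc:claim-42', 'doc:claim-57']
```

The first query attaches a document to the customer 360; the second shows the kind of cross-document link (two claims mentioning the same garage) that a fraud team might want to look at.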





In short: yes, documents are simple for MarkLogic, but simple is beautiful, especially when it opens up new opportunities!





