Domain-centric architecture: data-driven business processes powered by Snowflake Data Sharing

A few months ago I was talking with the Data Management director of a large manufacturer about the architectural principles of a domain-centric data service layer that would feed business processes with harmonised, contextualised and validated data.

In each domain, the data would be owned by domain experts with extensive knowledge of its requirements, source characteristics and usage. Each domain then serves its data "as a service", to be consumed by business processes and joined with data from other domains.

This vision makes it possible to quickly drive new use cases and value by leveraging pre-existing domains (and the effort already invested in them) without reinventing the overall data stack for each new "product". Realising it at scale, however, faces multiple challenges and barriers. Today, data is usually moved multiple times to be prepared, and then sometimes moved again to each individual product that requires it; this creates delays, governance and quality issues, and of course infrastructure and operational costs. APIs could be an alternative, but they do not provide the ability to join multiple datasets from different domains at scale.

Breaking the barriers: The Data Cloud

Here comes the Snowflake Data Cloud. The Data Cloud is a network connecting thousands of customers where data can be shared without moving it. The fundamental benefit is that the data exists only once, at the producer. Consumers query it from there, leveraging Snowflake's compute layer to join it with their own datasets, and potentially with datasets from other producers.
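As a minimal sketch of what this looks like on the producer side (the database, schema, table and account names below are hypothetical), a domain team exposes a curated dataset through a share without copying any data:

    -- Producer side: expose a curated dataset through a share (no data is copied).
    CREATE SHARE parts_share;
    GRANT USAGE ON DATABASE parts_db TO SHARE parts_share;
    GRANT USAGE ON SCHEMA parts_db.curated TO SHARE parts_share;
    GRANT SELECT ON TABLE parts_db.curated.parts TO SHARE parts_share;
    -- Entitle a consumer account (placeholder organisation and account names).
    ALTER SHARE parts_share ADD ACCOUNTS = myorg.consumer_acct;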

What is the impact:

  • Immediate time to value
  • Data always fresh (data as-is from the producer)
  • Data always at producer quality (no degradation during export/reprocessing)
  • End-to-end governance of the data
  • No data export, ingestion, processing or validation == faster and cheaper

This is a real game changer for making the domain-centric architecture happen. We usually think about Data Sharing in the client/supplier relationship, and it can indeed be used to share data with "the extended enterprise", but it can also be used between entities of a single company or group.

Now imagine that each domain of a company has its own Snowflake account, where it consolidates multiple sources and designs datasets to be served "as a service" to consuming business processes. The global organisation can apply common governance and security rules to all the domains, while each domain remains managed by its experts.


It's then possible to create new business processes and projects on top of these domains. These processes can consume, join and even enrich the domain-specific datasets on the fly, while keeping global governance rules and without moving the data around.
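On the consumer side, a business process account simply mounts each shared domain as a read-only database; a minimal sketch, with illustrative identifiers:

    -- Consumer side: mount a domain's share as a read-only database.
    CREATE DATABASE parts FROM SHARE myorg.parts_acct.parts_share;
    -- The data is immediately queryable and always reflects the producer's state.
    SELECT COUNT(*) FROM parts.curated.parts;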

For example, a Design to Cost business process can consume live data from the parts, BOM and finance domains using Shares. Snowflake makes it possible to query and join these Shares, all at massive scale.
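As an illustrative sketch (the table and column names are invented for the example), such a cross-domain query could look like this:

    -- Hypothetical Design to Cost query joining three shared domain databases.
    SELECT b.product_id,
           p.part_number,
           b.quantity,
           f.unit_cost,
           b.quantity * f.unit_cost AS extended_cost
    FROM bom.curated.bom_lines      AS b
    JOIN parts.curated.parts        AS p ON p.part_id = b.part_id
    JOIN finance.curated.part_costs AS f ON f.part_id = b.part_id;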

In order to deliver the service, it's then possible to leverage all the capabilities of Snowflake by running multiple workloads on these datasets concurrently: enriching the data, training machine learning models and making predictions, delivering exploration and visualisation experiences to end users, and so on.
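Because compute is decoupled from storage, each of these workloads can run on its own virtual warehouse so they never compete for resources. A minimal sketch, with hypothetical warehouse names and sizes:

    -- One virtual warehouse per workload, sized and auto-suspended independently.
    CREATE WAREHOUSE enrichment_wh WITH WAREHOUSE_SIZE = 'MEDIUM' AUTO_SUSPEND = 60;
    CREATE WAREHOUSE ml_wh WITH WAREHOUSE_SIZE = 'LARGE' AUTO_SUSPEND = 60;
    CREATE WAREHOUSE bi_wh WITH WAREHOUSE_SIZE = 'SMALL' AUTO_SUSPEND = 60;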




Data Sharing is a major opportunity to rethink how data is managed and distributed across organisations and within the extended ecosystem.
Happy to discuss this topic with our partner ecosystem (and clients, of course).






