Why is Data Mesh critical to a company’s data architecture?

By Sridhar Jayaraman, VP-Engineering, Qentelli

CXOs across the globe are working on their organizations’ data literacy and striving to build efficient data architectures, because Data is certainly the new Water. Hoping to automate data preparation and distribution sooner rather than later, they are constantly looking for ways to exploit semantics, metadata, AI/ML algorithms, and knowledge graphs. Every Chief Officer is doubling as a Chief Data Officer and exploring various strategies and Data Architecture models to fulfill their BI goals.

Why do organizations need a Data Architecture?
In today’s digital economy, every business wants to be data-driven. It is one of the top strategic goals of nearly every organization. Designing, building, and enhancing data-driven systems increases the value of enterprise data, which in turn can improve business performance. So having the right Data Architecture in place is crucial. It forms a solid foundation for efficient and scalable data-driven systems that can improve business throughput and help the organization gain a competitive advantage.

Over time, Data Fabric has grown to be the architecture of choice for many data and analytics (D&A) leaders. But forcing a unified data environment onto an organization that needs a decentralized data architecture makes it difficult for those leaders to establish a self-service BI culture. Proactive organizations have already taken the plunge and decentralized data processing and delegation, which lets distributed teams access data effortlessly from a heterogeneous data environment.

Where/When should we build a Data Mesh?
The advancements in information technology over the last two decades have addressed the challenges of scaling data volumes and have reduced the compute time needed to process that data. Yet some dimensions remain unaddressed, such as the variety of use cases and data consumers, frequent changes in the data landscape, and the number of data sources. Data Mesh is a paradigm that enables data transformation in ETL models by offering data-as-a-product to domain-specific data consumers.

Data Mesh delivers the best results when:
• You have multiple business domains (Business Units and Functions) and there is low cohesion of data between different domains
• The number of data sources, the volume of data, and the complexity of the data structures differ from domain to domain, but all are large
• Each domain needs deep analysis of its own data to make meaningful business decisions and create new products/services
• Data maturity across domains is not consistent

Data Mesh tries to close the gaps left by traditional data management techniques by re-imagining the data ownership structure. It enables observability and empowers individual business teams to build data systems that meet their own BI needs, with certain cross-domain governance in place, of course.

Enablers for Data Mesh Architecture
Data Mesh is not a technology or a product but an organizational and architectural structure that requires teams and culture to be at its center. Typically in Data Mesh, each functional business unit is responsible for the QA of its data lake and is the owner of its data products. Each unit takes responsibility for data quality and produces ready-to-consume data products for the other business units to access. Producing discoverable, addressable, interoperable, and secure data sets is the primary objective of Data Mesh architecture.
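To make the four properties concrete, here is a minimal sketch in Python of how a domain team might describe a data product and how a shared catalog might check it on registration. Everything in it is an assumption made for illustration: the DataProduct and Catalog names, the fields, and the in-memory storage are hypothetical, and a real mesh would rely on a shared platform service rather than a dictionary.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical descriptor a domain team publishes for one data product."""
    name: str                 # discoverable: a searchable name
    domain: str               # the owning business unit
    address: str              # addressable: a stable, unique location
    schema: dict              # interoperable: column name -> type
    allowed_roles: frozenset  # secure: roles permitted to read
    description: str = ""     # discoverable: free-text summary


class Catalog:
    """Illustrative in-memory catalog, not a real platform API."""

    def __init__(self):
        self._products = {}

    def register(self, product):
        # Reject any product that misses one of the four properties.
        if not product.name or not product.description:
            raise ValueError("not discoverable: name and description are required")
        if not product.address or product.address in self._products:
            raise ValueError("not addressable: address is missing or already in use")
        if not product.schema:
            raise ValueError("not interoperable: a schema is required")
        if not product.allowed_roles:
            raise ValueError("not secure: at least one reader role is required")
        self._products[product.address] = product

    def search(self, term):
        # Case-insensitive match on name or description.
        term = term.lower()
        return [p for p in self._products.values()
                if term in p.name.lower() or term in p.description.lower()]

    def can_read(self, address, role):
        return role in self._products[address].allowed_roles
```

A sales domain could, for example, register a product named "orders_daily" with a description, a schema, and a single reader role; consumers in other business units would then find it through search and be granted or refused access by role.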

When we imagine Data Mesh in a Plan View, it is a topology of layered Operational and Analytical Data. A Data Mesh is not completely equipped without:
• A common Data Infrastructure Platform made up of Technology [to ingest, process, transform, and store data], Process [common data standards, governance, and quality measures], and People [expertise in advising domains on how to analyze their data]
• Monitoring platforms and tools to determine how the domain-specific pipelines are working in production
• A dedicated Data Science capability: a model led by a Center of Excellence (CoE) in which Data Scientists and Engineers work at the domain level to create analytics and insights for that domain, supported by tools and platforms for building visualizations and ML models
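The monitoring item in the list above can be pictured with a small Python sketch that looks at the most recent run of each domain pipeline and flags the ones that failed, went stale, or wrote nothing. The PipelineRun record, the find_unhealthy function, and the 24-hour freshness threshold are all hypothetical choices made for this illustration, not features of any particular monitoring tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class PipelineRun:
    """Hypothetical record of one pipeline execution."""
    domain: str
    pipeline: str
    finished_at: datetime
    succeeded: bool
    rows_written: int


def find_unhealthy(runs, now, max_age=timedelta(hours=24)):
    """Return (domain, pipeline, reason) for each pipeline whose latest run looks unhealthy."""
    # Keep only the most recent run per (domain, pipeline).
    latest = {}
    for run in runs:
        key = (run.domain, run.pipeline)
        if key not in latest or run.finished_at > latest[key].finished_at:
            latest[key] = run

    problems = []
    for key in sorted(latest):
        run = latest[key]
        if not run.succeeded:
            problems.append((*key, "last run failed"))
        elif now - run.finished_at > max_age:
            problems.append((*key, "data is stale"))
        elif run.rows_written == 0:
            problems.append((*key, "last run wrote no rows"))
    return problems
```

Each domain would feed its own run records into a check of this kind, while the thresholds and the reporting format would be part of the common standards set by the shared platform.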

Conclusion
On one side we have Data Fabric, which Gartner calls the future of Data Management, and on the other side we have Data Mesh, the key to moving beyond monolithic Data Lakes according to Thoughtworks. Yet these two architectures need not be an either-or choice. They can be complementary in larger organizations with larger sets of data, or in organizations where new business operating models can be created by leveraging data across domains. Because they are produced under federated computational governance, the processed data sets that each domain publishes in a Data Mesh can become ingestion sources for the Data Fabric. Similarly, a separate ‘Central’ team can correlate Data Fabric outputs and use them at a strategic level.
