By Ananth Chakravarthy, RVP Sales, Denodo India
Data has become an essential asset for today’s organisations. The current “data big bang,” and how it is managed, plays a critical role in the success of any business. If managed effectively, information collated from a wide array of sources can transform the way a company does business. Every conceivable metric, from revenue to employee productivity and customer satisfaction, can be positively impacted.
However, the steady shift towards digitisation has presented its own challenge: data volumes have been growing at an exponential rate, and the flood is proving overwhelming. This overload, drawn from cloud platforms, social media, and other big data sources, is becoming a burden on organisations. Sixty-five percent of companies say they have more data than they can analyse, and over 73 percent of collated data goes unused in analytics. Sifting through this information to extract actionable insights is time-consuming and laborious, and stakeholders ultimately spend more time searching for and preparing data than actually analysing it. To counter this, companies have begun devising ways to access and analyse data more efficiently.
Unified Data Storage Mechanisms
In the past decade, businesses have adopted a wide variety of strategies to address this issue. The primary focus has been data consolidation, in which all information is gathered in one common location. Accomplished first through data warehouses and more recently through data lakes, these storage mechanisms helped to optimise the analysis of vast quantities of information.
However, although this centralised approach addressed the problem of fragmented, incomplete data scattered across various silos, it introduced a new set of issues. Chief among them, individual data and business teams struggled to extract usable information from these large pools of data. The infrastructure was rigid and data access was time-consuming, leaving companies with information gaps and unsustainable time-to-value.
Cloud Data Storage as a Natural Evolution
In the post-enterprise-data-warehouse era, cloud data storage represented a step forward, but it also highlighted the drawbacks of relying on a monolithic data approach. Under the traditional extract, transform, load (ETL) model, data is copied from its point of origin and moved to the desired destination. Because the same data is replicated over and over, unnecessary and occasionally severe bottlenecks arise. The end result is a consolidated database, but at the cost of speed and resource efficiency. To strike a balance between the advantages and limitations inherent to cloud storage, organisations have begun to turn to the concept of a data mesh and, by extension, data virtualization.
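To make the replication cost concrete, the following is a minimal sketch of a batch ETL job in Python; the database paths, the orders table, and its columns are purely hypothetical. Every run, and every additional consumer that schedules a job like it, produces another physical copy of the same data.

```python
import sqlite3

# Minimal batch ETL sketch (hypothetical schema): extract rows from an
# operational source, transform them, and load a copy into a warehouse.
def etl_orders(source_path: str, warehouse_path: str) -> None:
    src = sqlite3.connect(source_path)
    wh = sqlite3.connect(warehouse_path)

    # Extract: pull the full table from the system of origin.
    rows = src.execute("SELECT order_id, amount, currency FROM orders").fetchall()

    # Transform: normalise amounts to a single currency (toy rate table).
    rates = {"USD": 1.0, "EUR": 1.1, "INR": 0.012}
    normalised = [(oid, amt * rates[cur]) for oid, amt, cur in rows]

    # Load: replicate the data into the warehouse, duplicating it at rest.
    wh.execute("CREATE TABLE IF NOT EXISTS orders_usd (order_id INTEGER, amount_usd REAL)")
    wh.executemany("INSERT INTO orders_usd VALUES (?, ?)", normalised)
    wh.commit()
    src.close()
    wh.close()
```

Each downstream team that needs the same orders data typically runs its own copy of a job like this, which is exactly where the bottlenecks described above come from.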
The Impact of New Data Architecture
Data mesh is a decentralised, modern architectural approach to data infrastructure. Through the incorporation of logical data layers, data mesh configurations enable a democratised approach to data management through the creation of multiple domains. Each domain is an individual organisational unit, typically at a departmental level, that owns and is responsible for a bounded set of data, and all of the domains are tied together as part of a larger collective data network.
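Data mesh is an organisational pattern rather than a product, but a toy model can make the ownership structure concrete. In the sketch below (all names are illustrative), each domain publishes its own data products, and the mesh is nothing more than a thin catalogue that federates the domains without taking ownership of their data.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DataProduct:
    name: str
    owner_domain: str
    schema: dict  # column name -> type, maintained by the owning domain

@dataclass
class Domain:
    """An organisational unit (e.g. sales) that owns a bounded set of data."""
    name: str
    products: dict[str, DataProduct] = field(default_factory=dict)

    def publish(self, product: DataProduct) -> None:
        self.products[product.name] = product

class Mesh:
    """A thin catalogue over domains; it federates but never owns the data."""
    def __init__(self) -> None:
        self.domains: dict[str, Domain] = {}

    def register(self, domain: Domain) -> None:
        self.domains[domain.name] = domain

    def find(self, product_name: str) -> Optional[DataProduct]:
        for domain in self.domains.values():
            if product_name in domain.products:
                return domain.products[product_name]
        return None

# Usage: the sales domain owns "orders"; consumers discover it via the mesh.
sales = Domain("sales")
sales.publish(DataProduct("orders", "sales", {"order_id": "int", "amount": "float"}))
mesh = Mesh()
mesh.register(sales)
print(mesh.find("orders").owner_domain)  # -> sales
```

The design point the sketch captures is that responsibility stays with the domain that understands the data, while discovery happens at the mesh level.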
This collective network is achieved through data virtualization. An essential component of cloud computing, data virtualization enables organisations to access and integrate data from multiple sources, regardless of where that data is stored. Organisations can therefore query and analyse data from different sources without undertaking complex data integration projects. What differentiates data virtualization from earlier batch-oriented integration approaches is its ability to provide access to data without first replicating it to a centralised repository.
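As a contrast to the ETL sketch above, the following is a minimal, hypothetical illustration of query-time federation in the spirit of data virtualization; it is not how any particular product is implemented. Two physical sources, a SQLite database and a CSV file, are exposed as one logical view, and data is read only when a query arrives.

```python
import csv
import sqlite3
from typing import Iterator, Tuple

# Two hypothetical physical sources, each keeping its data where it lives.
def customers_from_db(db_path: str) -> Iterator[Tuple[str, str]]:
    con = sqlite3.connect(db_path)
    for customer_id, region in con.execute("SELECT customer_id, region FROM customers"):
        yield str(customer_id), region
    con.close()

def customers_from_csv(csv_path: str) -> Iterator[Tuple[str, str]]:
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            yield row["customer_id"], row["region"]

# One logical view over both sources: results are assembled at query time,
# and nothing is replicated to a centralised repository.
def virtual_customers(db_path: str, csv_path: str, region: str) -> Iterator[str]:
    for source in (customers_from_db(db_path), customers_from_csv(csv_path)):
        for customer_id, r in source:
            if r == region:
                yield customer_id
```

A real virtualization layer adds query optimisation, caching, and security on top of this idea, but the core distinction from ETL is visible even in the sketch: the original sources remain the only place the data is stored.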
Data virtualization is also critical in helping organisations to improve the quality and consistency of their data. By creating a single, unified view of all existing information, it ensures that every stakeholder has access to the same data, improving consistency and reducing data errors. This is especially critical in large businesses with multiple departments and thousands of users, who might otherwise work at cross-purposes. In this way, the data virtualization layer provides the self-serve data platform functionality required in a data mesh architecture.
Cloud computing has emerged as a game-changer for organisations looking to modernise their data capabilities. However, to fully leverage the benefits of cloud data storage, organisations need to invest time and resources in new architectures such as data mesh and in transformational technologies such as data virtualization.