Gartner: Unlocking the potential of observability

By: Mrudula Bangera, Director Analyst, Gartner

Digital transformation and the growth of telemetry data in cloud environments are prompting infrastructure and operations (I&O) leaders to rethink their observability strategies. Enterprises are finding that the costs of storing and analysing observability data, whether through in-house solutions or vendor tools, often outweigh the benefits. Additionally, the proliferation of disparate monitoring tools adds inefficiency, requiring the management of multiple interfaces and data formats.

To keep pace with the growing speed and complexity of modern architectures, I&O leaders must transition from traditional monitoring to observability. Observability allows software and systems to be understood through their outputs, enabling questions about their behaviour to be answered effectively.

While there are many benefits to adopting observability as a practice, and many organisations are rushing to implement observability platforms, it is crucial to consider the following key factors to ensure those benefits are actually realised.

Build and Continuously Evolve Your Observability Strategy

The cost of observability is under scrutiny due to macroeconomic conditions, consumption-based pricing and an increased focus on cloud spending. Unlike traditional monitoring, which was typically priced by the number of devices or hosts, modern observability platforms are priced around data consumption, putting the focus on data analytics and the value of the insights they provide.

Evaluate Value Versus Cost

To optimise observability spend, continuously assess the value of the data collected against its cost: weigh the cost of observing an application against the value of the insights it delivers. With commercial vendor tools as well, understanding usage and billing patterns helps mitigate unexpected spikes and overspending. Detailed billing information aids in budgeting and in planning which data needs to be sent to the observability tool and how long it must be retained.
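To make this review concrete, the back-of-the-envelope sketch below estimates monthly observability spend from an assumed daily ingest volume, per-GB ingestion price and retention window. Every figure is a placeholder rather than a vendor quote; the point is simply that spend can be modelled and compared against the value of the insights delivered.

```python
# Hypothetical back-of-the-envelope model of observability spend.
# All volumes and prices are illustrative assumptions, not vendor quotes.

DAILY_INGEST_GB = 500                 # assumed telemetry ingested per day
INGEST_PRICE_PER_GB = 0.30            # assumed ingestion price (USD per GB)
RETENTION_DAYS = 30                   # assumed retention window
RETENTION_PRICE_PER_GB_MONTH = 0.05   # assumed retained-storage price (USD per GB-month)

monthly_ingest_cost = DAILY_INGEST_GB * 30 * INGEST_PRICE_PER_GB
retained_gb = DAILY_INGEST_GB * RETENTION_DAYS
monthly_retention_cost = retained_gb * RETENTION_PRICE_PER_GB_MONTH

print(f"Monthly ingestion cost: ${monthly_ingest_cost:,.0f}")
print(f"Monthly retention cost: ${monthly_retention_cost:,.0f}")
print(f"Total monthly spend:    ${monthly_ingest_cost + monthly_retention_cost:,.0f}")
```

Replacing the assumed figures with actual billing data turns this into a per-application view of cost that can be set against the operational value each application's telemetry provides.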

Adopt Data Standards

OpenTelemetry (OTel) is an emerging standard supported by most vendors. It simplifies telemetry data management and can reduce costs by eliminating some software licensing fees. Standardising data collection with OTel accelerates development cycles and reduces operational costs, such as customisation and consulting services, by allowing organisations to tailor telemetry data collection and processing to their specific needs.
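As a minimal illustration, the sketch below uses the OpenTelemetry Python SDK (the opentelemetry-api and opentelemetry-sdk packages) to emit a trace span; the service name, span name and attribute are illustrative. Because the instrumentation is vendor-neutral, changing where the data goes means swapping the exporter, not rewriting application code.

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Describe the service emitting telemetry; the name is illustrative.
resource = Resource.create({"service.name": "checkout-service"})

# Wire up a tracer provider; swapping ConsoleSpanExporter for an OTLP
# exporter changes the destination without touching instrumentation code.
provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

# Emit a span around a unit of work; span name and attribute are illustrative.
with tracer.start_as_current_span("process-order") as span:
    span.set_attribute("order.id", "12345")
```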

Create a Center of Excellence

Form centralised teams for observability initiatives, comprising key stakeholders from I&O and platform teams. These teams must be responsible for standardising data formats, governing the toolset, defining best practices and introducing automation to improve efficiency.

Standardise the Observability Toolset

Assess and consolidate existing tools to align with organisational requirements. Additionally, enterprises should standardise data formats and automate repetitive tasks to streamline operations and minimise unnecessary workload.

Manage Tooling and Life Cycle

In cloud-native environments, most organisations generate a huge volume of telemetry, often 5TB to 10TB or more daily, much of it log data. This volume creates cost-driving complexity in managing the data. Implement data management strategies to handle the large volumes of telemetry generated daily, and create retention policies for telemetry data based on compliance and operational needs. Apply processing techniques, such as filtering, aggregating or performing calculations, to transform and enrich the data and derive insights from it. Regularly review and improve telemetry life cycle management by continuously refining data collection, processing, optimisation and analysis workflows as business requirements and technology evolve.
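As one possible shape for such a policy, the sketch below filters out low-value records, tags the rest with an assumed retention period, and aggregates duplicates to reduce downstream volume. The severity levels and retention durations are illustrative assumptions, not a standard.

```python
from collections import Counter

# Illustrative retention policy: severity level -> days to retain.
# The levels and durations are assumptions, not a standard.
RETENTION_DAYS = {"ERROR": 365, "WARN": 90, "INFO": 30}

def apply_lifecycle_policy(records):
    """Filter low-value records, enrich the rest with a retention tag,
    and aggregate duplicates to cut downstream volume."""
    kept = []
    for rec in records:
        if rec["level"] not in RETENTION_DAYS:   # filter: drop e.g. DEBUG noise early
            continue
        kept.append(rec)

    # Aggregate: collapse identical (level, message) pairs into counts.
    counts = Counter((r["level"], r["message"]) for r in kept)
    return [
        {"level": lvl, "message": msg, "count": n, "retain_days": RETENTION_DAYS[lvl]}
        for (lvl, msg), n in counts.items()
    ]

logs = [
    {"level": "DEBUG", "message": "cache miss"},
    {"level": "ERROR", "message": "timeout calling payments"},
    {"level": "ERROR", "message": "timeout calling payments"},
]
print(apply_lifecycle_policy(logs))
# -> [{'level': 'ERROR', 'message': 'timeout calling payments',
#      'count': 2, 'retain_days': 365}]
```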

Implement Telemetry Pipelines

Adoption of telemetry pipelines is increasing as organisations address the growing complexity and volume of telemetry collected by observability tools. Telemetry pipelines allow organisations to collect, enrich, transform and route data to multiple destinations, which helps them control their data and reduce ingestion and storage costs. In this approach, processing takes place in the pipeline itself, and ingestion can be reduced by filtering, discarding, routing and transforming data before it reaches the backend.
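A conceptual sketch of these pipeline stages is shown below: each record is enriched, transformed (here, a sensitive field is redacted) and then routed to one or more destinations. The destination names and routing rules are hypothetical; production pipelines are typically built with dedicated pipeline or collector tooling rather than hand-rolled code.

```python
# Conceptual sketch of telemetry pipeline stages: enrich, transform, route.
# Destination names and routing rules are illustrative assumptions.

def enrich(record, environment):
    record["environment"] = environment          # add context missing at the source
    return record

def transform(record):
    record.pop("user_email", None)               # e.g. redact a sensitive field in-flight
    return record

def route(record):
    destinations = ["archive_bucket"]            # everything goes to cheap long-term storage
    if record["level"] in ("ERROR", "WARN"):
        destinations.append("observability_backend")   # only high-value data is ingested
    return destinations

def run_pipeline(records, environment="prod"):
    routed = {}
    for rec in records:
        rec = transform(enrich(dict(rec), environment))
        for dest in route(rec):
            routed.setdefault(dest, []).append(rec)
    return routed
```

Because only the high-value subset is routed to the observability backend, ingestion and storage costs fall while the full data set remains available in cheaper storage if it is needed later.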

Leverage AI Capabilities to Improve Processes and Productivity

AI plays a crucial role in enhancing observability's strategic potential. By utilising AI and machine learning (ML), enterprises can efficiently analyse the vast amounts of data collected by observability tools, gaining insights that would otherwise be unattainable. Organisations should adopt AI incrementally, focusing on tangible benefits.

Steps to Take in AI Adoption

Set realistic expectations for AI.

Explore and leverage AI/ML capabilities in observability tools.

Prioritise use cases with achievable outcomes, such as anomaly detection (a minimal sketch follows this list), probable cause analysis, and triage.

Implement predictive analytics to foresee issues and causal analytics to understand cause and effect.

Use generative AI (GenAI) for natural language data access and content creation.

Enhance current tools by automating IT operations (ITOps) workflows and remediation.
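
As one concrete illustration of the anomaly-detection use case listed above, the sketch below flags metric samples that deviate sharply from a rolling baseline using a simple z-score test. The window size, threshold and latency values are illustrative; commercial observability tools apply far more sophisticated models.

```python
import statistics

def detect_anomalies(values, window=20, threshold=3.0):
    """Flag points more than `threshold` standard deviations away from the
    mean of the preceding `window` samples (a deliberately simple baseline)."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mean = statistics.fmean(history)
        stdev = statistics.pstdev(history)
        if stdev > 0 and abs(values[i] - mean) > threshold * stdev:
            anomalies.append((i, values[i]))
    return anomalies

# Example: latency samples in milliseconds with one obvious spike.
latencies = [102, 98, 105, 101, 99, 103, 100, 97, 104, 102,
             99, 101, 98, 100, 103, 97, 102, 99, 101, 100, 450]
print(detect_anomalies(latencies))   # -> [(20, 450)]
```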

AI can significantly advance observability and the troubleshooting of performance issues. By leveraging AI/ML capabilities, organisations can better detect, diagnose, and resolve issues, ultimately improving system performance and user experience.
