The Case for Composability
There's significant debate around observability effectiveness and applying data filters to vendor streams to reduce costs. While cost management matters, I believe the community is missing the bigger picture. We should be asking: "What value does observability truly deliver? Are we using the right solution, or is something essential missing?"
The hard truth is that observability currently doesn't offer enough value. It needs to be composable, and I'll explain why.
The Million-Dollar Question
Six years ago, I found myself in a room with executives celebrating a major win - upgrading a client from a $4-6 million annual commitment to $10-12 million. I raised concerns that observability seemed like expensive insurance for production monitoring. To justify that investment, we needed to identify more use cases and deliver additional value to customers.
I embarked on what I can only call a "feature journey," searching for standout tools that would make observability truly indispensable. In retrospect, we didn't fully achieve that goal.
Don't misunderstand - observability has matured and delivers value during incidents and when engineers need to answer questions like "have we ever experienced downtime in this scenario?" or "how much did that outage impact users?" However, legacy vendor solutions face two key challenges:
Telemetry data extends beyond what any single vendor's agents emit
The most valuable telemetry data use cases require support for real-time or run-time decision-making
The Limitations Exposed
It's inevitable that we'll need more insights than a vendor can provide, and two key use cases illustrate this gap:
Root Cause Analysis: For effective analysis, we need detailed dependency maps showing which database instance an app instance in a mesh is interacting with. Most traditional vendors struggle with this, while eBPF vendors are making better progress on topology mapping.
Braided Context: To link error logs to traces, we need to tag data with agent IDs before emission. Some vendors can support this, but only after modifying their products to enable this type of braiding.
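To make the braiding idea concrete, here is a minimal sketch in Python. It assumes a hypothetical on-host `Agent` that stamps every record with its own ID before emission, and a `braid` function that later pairs error logs with spans from the same agent. The field names (`agent_id`, `ts`, `start`, `end`, `level`) are illustrative assumptions, not any vendor's or OpenTelemetry's schema.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical on-host agent that stamps records before they leave the host."""
    agent_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def tag(self, record: dict) -> dict:
        # Return a tagged copy so the caller's record is not mutated.
        return {**record, "agent_id": self.agent_id}


def braid(logs, spans):
    """Pair each error log with spans from the same agent whose time range covers it."""
    spans_by_agent = {}
    for span in spans:
        spans_by_agent.setdefault(span["agent_id"], []).append(span)

    pairs = []
    for log in logs:
        if log.get("level") != "error":
            continue  # only error logs are braided in this sketch
        for span in spans_by_agent.get(log["agent_id"], []):
            if span["start"] <= log["ts"] <= span["end"]:
                pairs.append((log, span))
    return pairs
```

Because the tag is applied before emission, the join needs no vendor-side changes: any downstream component that sees both streams can match them on `agent_id`.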
The key idea is that OpenTelemetry and eBPF can address these challenges either while data remains within the detection agent on the host or during transit through the network in real time. I've seen numerous use cases go unsolved because they were deemed "too expensive" to address in the cloud after all the data had been transmitted from the customer's network, or because the necessary signals weren't captured early enough - making the answer impossible to compute.
Beyond Technical Monitoring
There are more use cases than just these two. SecOps teams need to know if vulnerable libraries are being used in production systems, exposing them to attack risks. eBPF-based solutions can detect this effectively, but why should security products recreate the entire monitoring chain just to alert SecOps engineers? And why alert SecOps at all when developers must patch the code? Furthermore, if vulnerabilities are severe enough, can SecOps block deployment to production? Usually not, because the necessary integrations don't exist.
What Composable Observability Looks Like
What is needed is a programmable platform to manage not just telemetry data but the decisions being made when observing that data.
For the SecOps example, I would leverage a "composable observability" platform to store and manage a lookup table of known vulnerabilities (for example, CVE entries from the list MITRE maintains), linking them to library data emitted by applications during startup. If my observability vendor didn't capture this data, I'd incorporate eBPF to gather the missing information and forward it to my platform. Additionally, I could capture events when vulnerable libraries are loaded into production systems and route those events back to a handler component that would prevent deployments in my CI/CD pipeline until the vulnerability is resolved.
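A rough sketch of that lookup-and-gate flow might look like the following. Everything here is an assumption for illustration: the library names, the `EXAMPLE-…` advisory IDs (placeholders, not real CVEs), the severity scale, and the `deployment_allowed` hook that a CI/CD pipeline would call.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class Advisory:
    advisory_id: str  # placeholder ID, not a real CVE
    severity: str


# Hypothetical lookup table keyed by (library name, version).
VULN_TABLE = {
    ("examplelib", "1.2.0"): Advisory("EXAMPLE-0001", "critical"),
    ("otherlib", "0.9.1"): Advisory("EXAMPLE-0002", "low"),
}


def find_advisories(loaded_libraries, table=VULN_TABLE):
    """Return advisories matching the (name, version) pairs an app reported at startup."""
    return [table[lib] for lib in loaded_libraries if lib in table]


def deployment_allowed(loaded_libraries, block_at="high", table=VULN_TABLE):
    """Gate a deployment: block if any advisory is at or above the block_at severity."""
    threshold = SEVERITY_RANK[block_at]
    return all(
        SEVERITY_RANK[advisory.severity] < threshold
        for advisory in find_advisories(loaded_libraries, table)
    )
```

The point of the sketch is the shape of the loop: the same table that drives an alert can also answer a yes/no question for the deployment pipeline, so the decision no longer depends on someone reading a dashboard.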
For dependency discovery in root cause analysis, I would use eBPF across every host and container in my environment to map socket endpoints. This communication map would be invaluable for managing alerts, slowdowns, or error rate spikes. By traversing the map, I could identify related signals from other components in my architecture, providing a comprehensive view for the user. This solution would need to reside in-memory, combining signals that arrive at different times from different components.
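As a sketch of the traversal step, assume the eBPF collectors have already reduced socket observations to `(source, destination)` pairs of component names. The functions below build an in-memory communication map and walk it breadth-first from the component that raised an alert. The function names and the hop limit are illustrative choices, not part of any eBPF tooling.

```python
from collections import defaultdict, deque


def build_map(connections):
    """Build an undirected adjacency map from observed (source, destination) pairs."""
    graph = defaultdict(set)
    for src, dst in connections:
        graph[src].add(dst)
        graph[dst].add(src)
    return graph


def related_components(graph, start, max_hops=2):
    """Breadth-first walk returning {component: hop distance} within max_hops of start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]  # the starting component is not "related" to itself
    return distance
```

Given an alert on one component, the returned distances tell the platform which other components' signals to pull into the same view, nearest first.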
AIOps tools like Moogsoft attempt to build this map using statistical analysis of data from observability vendors. I've tried to create this same map with similar methods, but it doesn't work reliably enough to be integrated into a decision-making system that can automatically take corrective actions.
Why Composability Matters
Composable observability means harnessing the signals from telemetry data, enhancing them, and building intelligence until you can act directly from the data—without human intervention.
Why must observability be composable? Because combining different approaches is essential for meaningful results. Sometimes you need information only eBPF can capture. Other times, you need to merge multiple signals arriving at different times, requiring buffers to hold everything together. You may also need to reroute, rewrite, or manipulate data during processing.
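The buffering requirement can be sketched as a small correlation buffer: it holds partial signals under a shared key until every expected kind has arrived, and expires entries that never complete. The class name, the `kind` labels, and the TTL default are assumptions for illustration. The caller passes the current time explicitly, which keeps the sketch deterministic.

```python
class CorrelationBuffer:
    """Hold partial signals per key until every expected kind arrives or a TTL expires."""

    def __init__(self, expected_kinds, ttl_seconds=30.0):
        self.expected = frozenset(expected_kinds)
        self.ttl = ttl_seconds
        self._pending = {}  # key -> (first_seen, {kind: payload})

    def add(self, key, kind, payload, now):
        """Store one signal; return the merged parts once all expected kinds are present."""
        _first_seen, parts = self._pending.setdefault(key, (now, {}))
        parts[kind] = payload
        if self.expected.issubset(parts):
            del self._pending[key]
            return parts
        return None

    def expire(self, now):
        """Drop and return incomplete entries that have waited longer than the TTL."""
        stale = {
            key: parts
            for key, (first_seen, parts) in self._pending.items()
            if now - first_seen > self.ttl
        }
        for key in stale:
            del self._pending[key]
        return stale
```

Returning the expired entries rather than silently discarding them leaves room for a downstream decision, such as emitting the partial context anyway.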
Data filtration is just one use case. OpenTelemetry provides the foundation for enabling composability. We need to move beyond passive observation (where value depends on humans spotting issues on dashboards) into decision-making. This requires complex systems integrating buffers, ETL, data routers, and event brokers—all working together to trigger the right actions at the right time.
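To illustrate how a router and an event broker fit together, here is a deliberately tiny in-process sketch. The topic names (`remediate`, `archive`), the `error_rate` field, and the 5% threshold are all hypothetical. A real deployment would sit on a proper message broker rather than a Python dictionary.

```python
from collections import defaultdict


class EventBroker:
    """Minimal in-process publish/subscribe broker."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, event):
        """Deliver the event to every handler for the topic and return their results."""
        return [handler(event) for handler in self._handlers.get(topic, [])]


def route(signal, error_rate_threshold=0.05):
    """Pick a topic for a signal: act on error-rate breaches, otherwise archive it."""
    if signal.get("error_rate", 0.0) > error_rate_threshold:
        return "remediate"
    return "archive"
```

A handler subscribed to `remediate` is where an automated action, such as a rollback or a deployment block, would be triggered.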
Watch for the next article, where I dive into the technical details of what makes composability possible.
Thoughts? Join us on Slack: MyDecisive community Slack
Ari