We all know how important observability is. Open-source tools are a popular option, but choosing among them is always a challenge. Most organizations end up with a few best-of-breed tools, spanning different projects and databases.
As organizations continue to adopt microservices-based architectures, operational data is becoming larger and more complex because of cloud-native technologies. The old approach of sorting through logs doesn't scale because the data is scattered.
As a result, organizations are turning to distributed tracing as a way to gain insight into their systems. Distributed tracing helps determine where to start investigating problems, ultimately reducing the time spent on root cause analysis. It serves as an observability signal that captures the entire lifecycle of a given request as it traverses distributed services. A trace can include multiple service hops, called spans, which together constitute the entire operation.
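To make the trace and span relationship concrete, here is a minimal Python sketch. It is not the Jaeger or OpenTelemetry data model: the `Span` fields, the helper functions, and the sample services are all hypothetical, chosen only to show how several spans add up to one end-to-end operation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Span:
    """One service hop inside a trace (hypothetical, simplified fields)."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    service: str
    operation: str
    start_us: int     # start time in microseconds
    duration_us: int  # how long this hop took, in microseconds


def root_span(spans: List[Span]) -> Optional[Span]:
    """Return the span without a parent, i.e. the entry point of the request."""
    for span in spans:
        if span.parent_id is None:
            return span
    return None


def trace_duration_us(spans: List[Span]) -> int:
    """End-to-end duration: latest span end minus earliest span start."""
    start = min(s.start_us for s in spans)
    end = max(s.start_us + s.duration_us for s in spans)
    return end - start


# Illustrative trace: one request fanning out to two downstream services.
EXAMPLE_TRACE = [
    Span("t1", "a", None, "frontend", "GET /checkout", 0, 120_000),
    Span("t1", "b", "a", "cart", "GetCart", 5_000, 30_000),
    Span("t1", "c", "a", "payment", "Charge", 40_000, 70_000),
]
```

In this sketch, the root span belongs to the frontend service, and the child spans show where the request spent its time downstream.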
Jaeger
One of the most popular open-source solutions for distributed tracing is Jaeger. Jaeger is an open-source, end-to-end solution hosted by the Cloud Native Computing Foundation (CNCF). Jaeger leverages data from OpenTelemetry (OTel)-based instrumentation SDKs and supports multiple open-source data stores, such as Cassandra, OpenSearch, and Elasticsearch, for trace storage.
While Jaeger provides a UI for visualizing and analyzing traces along with monitoring data from Prometheus, OpenSearch now provides the option to visualize traces in OpenSearch Dashboards, the native OpenSearch visualization tool.
Trace analysis
OpenSearch offers extensive support for log analytics and observability use cases. Starting with version 1.3, OpenSearch added support for distributed trace data analysis through the Observability feature. Observability lets you analyze the key rate, error, and duration (RED) metrics in your trace data. In addition, you can evaluate the various components of your system for latency and errors and identify the services that need attention.
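As a rough illustration of what RED metrics mean, the sketch below computes them per service from a list of spans. It does not reflect how OpenSearch calculates these values internally; the dictionary keys (`service`, `error`, `duration_ms`), the function name, and the sample data are assumptions made for the example.

```python
from collections import defaultdict


def red_metrics(spans, window_seconds):
    """Compute per-service rate, error ratio, and average duration.

    `spans` is a list of dicts with hypothetical keys:
    "service" (str), "error" (bool), and "duration_ms" (float).
    """
    grouped = defaultdict(list)
    for span in spans:
        grouped[span["service"]].append(span)

    metrics = {}
    for service, items in grouped.items():
        count = len(items)
        errors = sum(1 for s in items if s["error"])
        metrics[service] = {
            "rate_per_s": count / window_seconds,  # Rate
            "error_ratio": errors / count,         # Errors
            "avg_duration_ms": sum(s["duration_ms"] for s in items) / count,  # Duration
        }
    return metrics


# Illustrative spans collected over a 10-second window.
SAMPLE_SPANS = [
    {"service": "cart", "error": False, "duration_ms": 10.0},
    {"service": "cart", "error": False, "duration_ms": 30.0},
    {"service": "payment", "error": True, "duration_ms": 300.0},
    {"service": "payment", "error": False, "duration_ms": 100.0},
]

RESULT = red_metrics(SAMPLE_SPANS, window_seconds=10)
```

With this sample, the payment service stands out: half of its spans failed and its average duration is ten times that of the cart service, which is the kind of signal that points to a service needing attention.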
The OpenSearch Project launched the trace analysis feature with support for OTel-compliant trace data provided by Data Prepper, the OpenSearch server-side data collector. To incorporate the popular Jaeger trace data format, OpenSearch version 2.5 added Jaeger support to the trace analysis feature in Observability.
With Observability, you can now filter traces to isolate the spans that contain errors and quickly identify the relevant logs. The same rich analysis capabilities that are available for Data Prepper trace data apply here as well: RED metrics and contextual linking of traces and spans to their related logs. The following image shows how to view traces using Observability.
Note that there are some differences between the OTel and Jaeger formats. For details, see the OpenTelemetry to Jaeger transformation section of the OpenTelemetry documentation.
Try it
To try out this new feature, see the Analyzing Jaeger trace data documentation. The documentation includes a Docker Compose file that shows how to add sample data using a demo and then visualize it using trace analysis. To enable this feature, you need to set the --es.tags-as-fields.all flag to true, as described in the related GitHub issue. This is necessary because of a limitation in OpenSearch Dashboards.
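If you configure Jaeger through environment variables rather than command-line flags (for example, in a Docker Compose file), the flag name needs to be translated. To our understanding, Jaeger derives environment variable names by upper-casing the option name and replacing `-` and `.` with `_`. The sketch below applies that convention; the helper name `flag_to_env_var` is hypothetical, and the result should be checked against the Jaeger documentation for your version.

```python
import re


def flag_to_env_var(flag: str) -> str:
    """Translate a Jaeger-style CLI flag into its environment variable name.

    Assumes the convention of upper-casing the option name and replacing
    '-' and '.' with '_'. This helper is illustrative, not part of Jaeger.
    """
    name = flag.lstrip("-")  # drop the leading "--"
    return re.sub(r"[-.]", "_", name).upper()


ENV_NAME = flag_to_env_var("--es.tags-as-fields.all")
print(f"{ENV_NAME}=true")
```

Under that assumption, setting the printed variable on the Jaeger container has the same effect as passing the flag with a value of true.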
In OpenSearch Dashboards, you can see the service and operation combinations with the highest latency and the greatest number of errors. Selecting a service or operation automatically takes you to the Traces page with the appropriate filters applied, as shown in the following image. You can also apply various filters to explore traces and services on your own.
Next steps
To try out the OpenSearch trace analysis feature, check out the OpenSearch Playground or download the latest version of OpenSearch. We welcome your feedback on the community forum!