Outline
- *Metrics*
- *Meter Provider*
- *Meter*
- *Metric Exporter*
- *Metric Instruments*
- *Aggregation*
- *Views*
- *Language Support*
- *Specification*
Metrics
A measurement captured at runtime.
A metric is a measurement of a service captured at runtime. The moment of capturing a measurement is known as a metric event, which consists not only of the measurement itself, but also of the time at which it was captured and the associated metadata.
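To make the three parts of a metric event concrete, here is a minimal sketch using only the Python standard library. The `MetricEvent` class and the `capture` helper are hypothetical names chosen for illustration; they are not OpenTelemetry types.

```python
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetricEvent:
    """Hypothetical model of one metric event (not an OpenTelemetry type)."""
    value: float                                    # the measurement itself
    time_unix_nano: int                             # when it was captured
    attributes: dict = field(default_factory=dict)  # associated metadata


def capture(value, **attributes):
    # Bundle the measurement with its capture time and metadata.
    return MetricEvent(value, time.time_ns(), dict(attributes))


event = capture(12.5, route="/checkout", method="POST")
print(event.value, event.attributes)
```

The timestamp is taken at capture time rather than at export time, which is what distinguishes a metric event from a later aggregated data point.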
Application and request metrics are important indicators of availability and performance. Custom metrics can provide insights into how availability indicators impact user experience or the business. Collected data can be used to alert on an outage or to trigger scheduling decisions that scale up a deployment automatically under high demand.
To understand how metrics in OpenTelemetry work, let's look at a list of components that play a part in instrumenting our code.
Meter Provider
A Meter Provider (sometimes called MeterProvider) is a factory for Meters. In most applications, a Meter Provider is initialized once and its lifecycle matches the application's lifecycle. Meter Provider initialization also includes Resource and Exporter initialization. It is typically the first step in metering with OpenTelemetry. In some language SDKs, a global Meter Provider is already initialized for you.
Meter
A Meter creates metric instruments, which capture measurements about a service at runtime. Meters are created from Meter Providers.
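The provider, meter, and instrument relationship can be sketched as a chain of factories. The classes below are toy stand-ins written with the standard library only; the method names `get_meter` and `create_counter` are borrowed for readability, and nothing here should be read as the actual SDK implementation.

```python
class Counter:
    """Toy instrument: a sum that only goes up."""

    def __init__(self, name, unit="", description=""):
        self.name, self.unit, self.description = name, unit, description
        self.value = 0

    def add(self, amount):
        if amount < 0:
            raise ValueError("a counter only goes up")
        self.value += amount


class Meter:
    """Toy meter: creates instruments and remembers them by name."""

    def __init__(self, name):
        self.name = name
        self.instruments = {}

    def create_counter(self, name, unit="", description=""):
        # Hand back the existing instrument if the name is already registered.
        if name not in self.instruments:
            self.instruments[name] = Counter(name, unit, description)
        return self.instruments[name]


class MeterProvider:
    """Toy provider: a factory that hands out one Meter per name."""

    def __init__(self):
        self._meters = {}

    def get_meter(self, name):
        if name not in self._meters:
            self._meters[name] = Meter(name)
        return self._meters[name]


provider = MeterProvider()  # created once, lives as long as the application
meter = provider.get_meter("checkout-service")  # hypothetical service name
requests = meter.create_counter("http.server.requests", unit="{request}")
requests.add(1)
requests.add(2)
print(requests.value)  # 3
```

Asking the provider for the same meter name twice returns the same object in this sketch, which is why a single provider per application is enough.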
Metric Exporter
Metric Exporters send metric data to a consumer. This consumer can be standard output for debugging during development, the OpenTelemetry Collector, or any open source or vendor backend of your choice.
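The idea that the same metric data can go to different consumers can be illustrated with two interchangeable exporters. This is a sketch under assumptions: `ConsoleExporter`, `InMemoryExporter`, `flush`, and the dictionary layout of a data point are all hypothetical and do not mirror the real exporter interface.

```python
import io
import json
import sys


class ConsoleExporter:
    """Toy exporter: writes each data point as one JSON line to a stream."""

    def __init__(self, out=sys.stdout):
        self.out = out

    def export(self, points):
        for point in points:
            self.out.write(json.dumps(point, sort_keys=True) + "\n")


class InMemoryExporter:
    """Toy exporter: keeps data points in a list, handy in unit tests."""

    def __init__(self):
        self.points = []

    def export(self, points):
        self.points.extend(points)


def flush(points, exporters):
    # Every configured exporter receives the same batch of data points.
    for exporter in exporters:
        exporter.export(points)


buffer = io.StringIO()  # stands in for standard output
memory = InMemoryExporter()
points = [{"name": "http.server.requests", "value": 3}]
flush(points, [ConsoleExporter(buffer), memory])
print(buffer.getvalue(), end="")
```

Because both classes expose the same `export` method, swapping a debugging consumer for a backend does not require touching the instrumented code.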
Metric Instruments
In OpenTelemetry, measurements are captured by metric instruments. A metric instrument is defined by:
- Name
- Kind
- Unit (optional)
- Description (optional)
The name, unit, and description are chosen by the developer or defined via semantic conventions for common ones like request and process metrics.
The instrument kind is one of the following:
- Counter: A value that accumulates over time. You can think of this like an odometer on a car; it only ever goes up.
- Asynchronous Counter: Same as the Counter, but is collected once for each export. Could be used if you don't have access to the continuous increments, but only to the aggregated value.
- UpDownCounter: A value that accumulates over time, but can also go down again. An example could be a queue length, which increases and decreases with the number of work items in the queue.
- Asynchronous UpDownCounter: Same as the UpDownCounter, but is collected once for each export. Could be used if you don't have access to the continuous changes, but only to the aggregated value (e.g., current queue size).
- Gauge: Measures a current value at the time it is read. An example would be the fuel gauge in a vehicle. Gauges are asynchronous.
- Histogram: A client-side aggregation of values, such as request latencies. A histogram is a good choice if you are interested in value statistics. For example: How many requests take fewer than 1s?
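The behavioural difference between these kinds can be sketched with three toy classes: a counter that rejects negative increments, an up-down counter that accepts them, and an asynchronous gauge whose value is read through a callback only when it is collected. The class names follow the list above, but the code is an illustration built on plain Python, not the OpenTelemetry API; `collect` in particular is a hypothetical method.

```python
class Counter:
    """Synchronous, monotonic: like an odometer."""

    def __init__(self):
        self.value = 0

    def add(self, amount):
        if amount < 0:
            raise ValueError("a counter only goes up")
        self.value += amount


class UpDownCounter:
    """Synchronous, non-monotonic: like a queue length."""

    def __init__(self):
        self.value = 0

    def add(self, amount):
        self.value += amount  # negative amounts are allowed


class AsyncGauge:
    """Asynchronous: the value is read via a callback at collection time."""

    def __init__(self, callback):
        self._callback = callback

    def collect(self):
        return self._callback()


odometer = Counter()
odometer.add(5)
odometer.add(7)

queue_length = UpDownCounter()
queue_length.add(3)   # three items enqueued
queue_length.add(-1)  # one item processed

fuel = {"level": 0.75}
fuel_gauge = AsyncGauge(lambda: fuel["level"])
fuel["level"] = 0.5  # changes between collections are never seen

print(odometer.value, queue_length.value, fuel_gauge.collect())  # 12 2 0.5
```

The gauge reports 0.5 and never observes 0.75, which shows why an asynchronous instrument fits cases where only the current value is available, not every intermediate change.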
For more on synchronous and asynchronous instruments, and which kind is best suited for your use case, see Supplementary Guidelines.
Aggregation
In addition to the metric instruments, the concept of aggregations is an important one to understand. An aggregation is a technique whereby a large number of measurements are combined into either exact or estimated statistics about metric events that took place during a time window. The OTLP protocol transports such aggregated metrics. The OpenTelemetry API provides a default aggregation for each instrument, which can be overridden using Views. The OpenTelemetry project aims to provide default aggregations that are supported by visualizers and telemetry backends.
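As a rough illustration of combining many measurements into statistics for one time window, the sketch below computes a sum aggregation and a bucketed histogram aggregation over a list of latencies. The function names, the sample values, and the bucket boundaries are assumptions made for the example; boundaries are treated as inclusive upper bounds.

```python
import bisect


def sum_aggregation(measurements):
    # Exact statistic: the total of everything recorded in the window.
    return sum(measurements)


def histogram_aggregation(measurements, boundaries):
    # One bucket per boundary plus an overflow bucket for larger values.
    counts = [0] * (len(boundaries) + 1)
    for value in measurements:
        # bisect_left places a value equal to a boundary in the lower bucket,
        # so each boundary acts as an inclusive upper bound.
        counts[bisect.bisect_left(boundaries, value)] += 1
    return {
        "count": len(measurements),
        "sum": sum(measurements),
        "bucket_counts": counts,
    }


# Request latencies in milliseconds observed during one window (made-up data).
latencies_ms = [120, 400, 950, 1000, 1700, 3200]
boundaries_ms = [500, 1000, 2500]

histogram = histogram_aggregation(latencies_ms, boundaries_ms)
at_most_one_second = sum(histogram["bucket_counts"][:2])

print(sum_aggregation(latencies_ms))  # 7370
print(histogram["bucket_counts"])     # [2, 2, 1, 1]
print(at_most_one_second)             # 4
```

Six raw measurements shrink to four bucket counts plus a count and a sum, and that compact form is what gets exported instead of the individual events.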
Unlike request tracing, which is intended to capture request lifecycles and provide context to the individual pieces of a request, metrics are intended to provide statistical information in aggregate. Some examples of use cases for metrics include:
- Reporting the total number of bytes read by a service, per protocol type.
- Reporting the total number of bytes read and the bytes per request.
- Reporting the duration of a system call.
- Reporting request sizes in order to determine a trend.
- Reporting CPU or memory usage of a process.
- Reporting average balance values from an account.
- Reporting current active requests being handled.
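The first use case in the list, total bytes read per protocol type, can be sketched as a counter that keeps one running sum per attribute set. `AttributedCounter` and the `protocol` attribute key are hypothetical names for this example, and the byte counts are invented.

```python
from collections import defaultdict


class AttributedCounter:
    """Toy counter that keeps a separate sum for each attribute set."""

    def __init__(self):
        self._sums = defaultdict(int)

    def add(self, amount, attributes):
        if amount < 0:
            raise ValueError("a counter only goes up")
        # Sort the items so equal attribute sets map to the same key.
        self._sums[tuple(sorted(attributes.items()))] += amount

    def value(self, attributes):
        return self._sums[tuple(sorted(attributes.items()))]

    def total(self):
        return sum(self._sums.values())


bytes_read = AttributedCounter()
bytes_read.add(512, {"protocol": "http"})
bytes_read.add(2048, {"protocol": "grpc"})
bytes_read.add(256, {"protocol": "http"})

print(bytes_read.value({"protocol": "http"}))  # 768
print(bytes_read.value({"protocol": "grpc"}))  # 2048
print(bytes_read.total())                      # 2816
```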
Views
A view provides SDK users with the flexibility to customize the metrics output by the SDK. You can customize which metric instruments are to be processed or ignored. You can also customize aggregation and what attributes you want to report on metrics.
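Two of the customizations a view offers, ignoring an instrument and limiting the reported attributes, can be approximated with a small filtering function. This is a sketch, not the SDK's view mechanism: `apply_view`, its parameters, the instrument names, and the data point layout are hypothetical, and the re-aggregation step assumes a sum aggregation.

```python
def apply_view(points, drop_instruments=(), keep_attributes=None):
    """Drop unwanted instruments, trim attributes, and re-sum what remains."""
    result = {}
    for point in points:
        if point["name"] in drop_instruments:
            continue  # this instrument is ignored entirely
        attributes = point["attributes"]
        if keep_attributes is not None:
            attributes = {
                key: value
                for key, value in attributes.items()
                if key in keep_attributes
            }
        # Points that become identical after trimming are merged by summing.
        key = (point["name"], tuple(sorted(attributes.items())))
        result[key] = result.get(key, 0) + point["value"]
    return result


points = [
    {"name": "http.server.requests",
     "attributes": {"method": "GET", "user_id": "u1"}, "value": 2},
    {"name": "http.server.requests",
     "attributes": {"method": "GET", "user_id": "u2"}, "value": 3},
    {"name": "debug.cache.probes", "attributes": {}, "value": 9},
]

viewed = apply_view(
    points,
    drop_instruments={"debug.cache.probes"},
    keep_attributes={"method"},
)
print(viewed)  # {('http.server.requests', (('method', 'GET'),)): 5}
```

Removing the high-cardinality `user_id` attribute collapses two series into one, which is a common reason to restrict attributes in the first place.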
Language Support
Metrics are a stable signal in the OpenTelemetry specification. For the individual language-specific implementations of the Metrics API & SDK, the status is as follows:
Language | Metrics |
---|---|
C++ | Stable |
C#/.NET | Stable |
Erlang/Elixir | Experimental |
Go | Stable |
Java | Stable |
JavaScript | Stable |
PHP | Stable |
Python | Stable |
Ruby | In development |
Rust | Alpha |
Swift | Experimental |
Specification
To learn more about metrics in OpenTelemetry, see the metrics specification.