# [Service/Component Name] Metrics

This document details the telemetry metrics exposed by the [Service or Component Name]. These metrics provide insight into the performance, reliability, and operational status of [briefly describe the component's purpose].

The primary meter for these metrics is `[Your.Meter.Name]`.
## Service Goal

[Provide a concise, one-sentence description of what this service is responsible for.]
## Metrics

Use this table to list all metrics exposed by the component.

| Metric Name | Type | Description |
|---|---|---|
| `metric_name_total` | Counter | [Description of what the counter tracks, e.g., "Total number of operations attempted."] |
| `metric_name_errors_total` | Counter | [Description of what the error counter tracks.] |
| `metric_name_duration_seconds` | Histogram | [Description of the histogram, e.g., "The duration, in seconds, of each operation."] |
| `metric_name_last_event_timestamp` | Observable Gauge | [Description of the gauge, e.g., "The Unix epoch timestamp of the last successful event."] |
| `metric_name_active_items` | Observable Gauge | [Description of the gauge, e.g., "The number of items currently being processed."] |
## Dimensions

Use this table to describe the dimensions (tags) associated with the metrics listed above.

| Metric Name | Dimension Name | Possible Values | Description |
|---|---|---|---|
| `metric_name_total` | `operation` | `success`, `skipped`, `failure` | [Description of the dimension.] |
| `metric_name_total` | `reason` | [value1], [value2] | [Description of the dimension, e.g., "The reason an operation was skipped."] |
| `metric_name_errors_total` | `error_type` | The exception name | [Description of the dimension.] |
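Each distinct combination of dimension values produces its own time series under the same metric name. A minimal Python sketch of that behavior (the `DimensionedCounter` class and the `duplicate` reason value are made up for illustration):

```python
from collections import Counter as SeriesCounter

class DimensionedCounter:
    """Counts increments per tag combination, mirroring how a counter such as
    metric_name_total is split by dimensions like 'operation' and 'reason'."""
    def __init__(self, name):
        self.name = name
        self.series = SeriesCounter()

    def add(self, amount=1, **tags):
        # Each distinct tag combination becomes its own time series.
        key = tuple(sorted(tags.items()))
        self.series[key] += amount

ops = DimensionedCounter("metric_name_total")
ops.add(operation="success")
ops.add(operation="skipped", reason="duplicate")
ops.add(operation="success")

# Two series result: operation=success counted twice,
# and operation=skipped/reason=duplicate counted once.
print(dict(ops.series))
```

Note that summing across all series recovers the undimensioned total, which is why queries typically either group by a dimension or ignore it entirely.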
## Example KQL Queries

Provide useful KQL queries that users can adapt for their own monitoring and alerting needs in Azure Application Insights.

Count operations by outcome over the last day:

```kusto
customMetrics
| where name == "metric_name_total"
| where timestamp > ago(1d)
| extend operation = tostring(customDimensions.operation)
| summarize Count = sum(value) by operation
| order by Count desc
```

Track the 95th-percentile operation duration per day over the last week:

```kusto
customMetrics
| where name == "metric_name_duration_seconds"
| where timestamp > ago(7d)
| summarize percentiles(value, 95) by bin(timestamp, 1d)
```

Detect staleness, i.e. when no successful event has been recorded for more than an hour:

```kusto
customMetrics
| where name == "metric_name_last_event_timestamp"
| summarize LastEvent = max(value)
| extend AgeInSeconds = datetime_diff('second', now(), unixtime_seconds_todatetime(LastEvent))
| where AgeInSeconds > 3600 // 1 hour
| project readable_time_utc = unixtime_seconds_todatetime(LastEvent), AgeInSeconds
```
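The last query is a staleness check: it compares the age of the most recent event timestamp against a threshold. The same logic in plain Python, for reference when testing instrumentation locally (the timestamps are made up; the 3600-second threshold mirrors the query):

```python
from datetime import datetime, timezone

def seconds_since(last_event_unix, now=None):
    """Age of the last event in whole seconds, like the AgeInSeconds column."""
    now = now or datetime.now(timezone.utc)
    last = datetime.fromtimestamp(last_event_unix, tz=timezone.utc)
    return int((now - last).total_seconds())

def is_stale(last_event_unix, threshold_seconds=3600, now=None):
    return seconds_since(last_event_unix, now) > threshold_seconds

# Hypothetical timestamps: last event at 12:00 UTC, checked at 14:00 UTC.
now = datetime(2024, 1, 1, 14, 0, tzinfo=timezone.utc)
event = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc).timestamp()
print(is_stale(event, now=now))  # -> True (7200 s > 3600 s)
```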
## Metric Flow Diagram

Use Mermaid to create a simple diagram illustrating the process and showing where each metric is recorded.

```mermaid
graph TD
    A[Start] --> B{Decision Point};
    B -- Success --> C["Record: metric_name_total(success)"];
    B -- Failure --> D[Record: metric_name_errors_total];
```