Metrics Standards

Emit quantitative metrics to Prometheus/Mimir and Datadog using CustomMetricsService.

Metric Types

  • Counter - Only increases (requests, errors, events)
  • Gauge - Can go up or down (connections, queue depth, memory)
  • Histogram - Distribution of observed values in buckets (latency, sizes)
  • Summary - Percentiles over a sliding window (p50, p95, p99)

Basic Usage

import { Injectable } from '@nestjs/common';
import { CustomMetricsService } from '../../common/metrics/metrics.service';

@Injectable()
export class UserService {
  constructor(private readonly metricsService: CustomMetricsService) {}

  async createUser(values: UserCreateInput): Promise<User> {
    const startTime = Date.now();

    try {
      const user = await this.repository.save(values);

      // Increment success counter
      this.metricsService.incrementCounter('users_created_total', {
        status: 'success',
      });

      // Record duration
      const duration = (Date.now() - startTime) / 1000;
      this.metricsService.recordHistogram('user_creation_duration_seconds', duration);

      return this.toUser(user);
    } catch (error) {
      // Count errors
      this.metricsService.incrementCounter('users_created_total', {
        status: 'error',
      });
      throw error;
    }
  }
}

Methods

incrementCounter(name, labels, value) - Increment a counter metric.

this.metricsService.incrementCounter('operations_total', {
  operation: 'create',
  status: 'success',
});

setGauge(name, value, labels) - Set a gauge to a specific value.

this.metricsService.setGauge('active_connections', 42, {
  pool: 'database',
});

incrementGauge(name, value, labels) - Increase a gauge value.

decrementGauge(name, value, labels) - Decrease a gauge value.
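
A minimal sketch of one common use for incrementGauge and decrementGauge, tracking in-flight work, might look like the following. FakeGaugeMetrics is a hypothetical in-memory stand-in written only so the sketch runs on its own; it is not CustomMetricsService, and its default increment of 1 is an assumption. The jobs_in_flight metric, the queue label, and the runJob helper are also illustrative. Only the argument order (name, value, labels) comes from the signatures above.

```typescript
type Labels = Record<string, string>;

// Hypothetical stand-in for CustomMetricsService (assumption, not the real class).
class FakeGaugeMetrics {
  private readonly gauges = new Map<string, number>();

  // Build one key per (name, label set) so each label combination is tracked separately.
  private key(name: string, labels: Labels = {}): string {
    const parts = Object.keys(labels)
      .sort()
      .map((k) => `${k}=${labels[k]}`);
    return `${name}{${parts.join(',')}}`;
  }

  incrementGauge(name: string, value: number = 1, labels?: Labels): void {
    const k = this.key(name, labels);
    this.gauges.set(k, (this.gauges.get(k) ?? 0) + value);
  }

  decrementGauge(name: string, value: number = 1, labels?: Labels): void {
    this.incrementGauge(name, -value, labels);
  }

  // Test helper: read the current value of a gauge.
  read(name: string, labels?: Labels): number {
    return this.gauges.get(this.key(name, labels)) ?? 0;
  }
}

// Increment on entry and decrement in `finally`, so a thrown error
// cannot leave the gauge stuck at a higher value.
function runJob<T>(metrics: FakeGaugeMetrics, queue: string, job: () => T): T {
  metrics.incrementGauge('jobs_in_flight', 1, { queue });
  try {
    return job();
  } finally {
    metrics.decrementGauge('jobs_in_flight', 1, { queue });
  }
}
```

Pairing the decrement with finally is the main design choice here: the gauge returns to its previous value whether the job succeeds or throws.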

recordHistogram(name, value, labels) - Record an observation in a histogram.

const duration = (Date.now() - startTime) / 1000;
this.metricsService.recordHistogram('operation_duration_seconds', duration, {
  operation: 'query',
});
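
To make the histogram model concrete: Prometheus-style histograms count observations into cumulative buckets, where each bucket holds every observation less than or equal to its upper bound, plus a final +Inf bucket that holds all of them. The sketch below reproduces that counting for a few durations. The bucketCounts function and the bucket bounds are assumptions chosen for illustration; they are not the buckets CustomMetricsService configures.

```typescript
// Count observations into cumulative buckets keyed by upper bound.
// Illustration only; a metrics library does this incrementally per observation.
function bucketCounts(observations: number[], upperBounds: number[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const bound of upperBounds) {
    // Cumulative: a bucket includes everything at or below its bound.
    counts.set(String(bound), observations.filter((v) => v <= bound).length);
  }
  // The +Inf bucket always equals the total number of observations.
  counts.set('+Inf', observations.length);
  return counts;
}

// Four request durations in seconds against three hypothetical bounds.
const exampleBuckets = bucketCounts([0.02, 0.2, 0.4, 1.5], [0.1, 0.5, 1]);
```

Because buckets are cumulative, a duration of 1.5 seconds appears only in +Inf here, which is why bucket bounds should cover the latencies you actually expect.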

recordSummary(name, value, labels) - Record an observation in a summary for percentile calculation.
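
As a rough illustration of what "percentiles over a sliding window" means, the sketch below keeps the most recent observations and reads percentiles from them with the nearest-rank method. SlidingWindowSummary is a hypothetical class written for this explanation. It is an assumption-laden simplification: real summary implementations typically use time-based windows and streaming quantile estimation, and nothing here describes how CustomMetricsService computes its values.

```typescript
// Illustration only: count-based sliding window with nearest-rank percentiles.
class SlidingWindowSummary {
  private readonly values: number[] = [];
  private readonly maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  // Add an observation and drop the oldest one once the window is full.
  observe(value: number): void {
    this.values.push(value);
    if (this.values.length > this.maxSize) {
      this.values.shift();
    }
  }

  // Nearest rank: the smallest value such that at least a fraction q
  // of the window is at or below it. Returns NaN for an empty window.
  quantile(q: number): number {
    if (this.values.length === 0) return NaN;
    const sorted = [...this.values].sort((a, b) => a - b);
    const rank = Math.max(1, Math.ceil(q * sorted.length));
    return sorted[rank - 1];
  }
}
```

Older observations fall out of the window, so the reported percentiles describe recent behavior rather than the whole process lifetime.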

Naming Conventions

Format: {domain}_{metric}_{unit}

Examples:

  • http_requests_total - Total HTTP requests (counter)
  • http_request_duration_seconds - Request duration (histogram)
  • active_connections - Current connections (gauge)
  • database_query_duration_seconds - Query time (histogram)

Units:

  • Seconds: _seconds
  • Bytes: _bytes
  • Counts: _total (counter) or no suffix (gauge)
  • Ratios: _ratio (0.0-1.0)
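
The conventions above could be checked mechanically, for example in a unit test or lint step. The following is a minimal sketch under stated assumptions: checkMetricName and MetricKind are hypothetical names, and the rule that only counters carry _total is inferred from the Units list rather than stated outright.

```typescript
type MetricKind = 'counter' | 'gauge' | 'histogram' | 'summary';

// Lowercase words separated by single underscores.
const SNAKE_CASE = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;

// Return a list of convention problems; an empty list means the name passes.
function checkMetricName(name: string, kind: MetricKind): string[] {
  const problems: string[] = [];
  if (!SNAKE_CASE.test(name)) {
    problems.push('use lowercase snake_case');
  }
  if (kind === 'counter' && !name.endsWith('_total')) {
    problems.push('counters should end in _total');
  }
  if (kind !== 'counter' && name.endsWith('_total')) {
    problems.push('_total is for counters only');
  }
  if (/_(ms|millis|milliseconds)$/.test(name)) {
    problems.push('record durations in seconds and use _seconds');
  }
  return problems;
}
```

With this sketch, the four example names listed above return an empty list for their stated types, while a name such as duration_ms is flagged.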

Labels

Use labels to add dimensions to metrics.

this.metricsService.incrementCounter('api_requests_total', {
  method: 'POST',
  route: '/v1/users',
  status: '201',
});

Common labels:

  • method - HTTP method (GET, POST, etc.)
  • route - API route
  • status - HTTP status code or operation status
  • operation - Operation type (create, update, delete)
  • error_type - Error class name

Anti-Patterns

❌ Don't use high-cardinality labels. Each distinct combination of label values creates a new time series. Avoid user IDs, timestamps, or other unbounded values.

// Bad - creates unlimited time series
this.metricsService.incrementCounter('requests_total', {
  user_id: userId, // WRONG - unbounded
  timestamp: Date.now().toString(), // WRONG - unique every time
});

// Good - use bounded label values
this.metricsService.incrementCounter('requests_total', {
  method: 'GET', // Limited values
  status: '200', // Limited values
});
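
The route label is an easy place to leak unbounded values, because raw request paths often embed IDs. One possible way to keep it bounded is to collapse ID-like path segments before using the path as a label value. The toRouteLabel helper below is a hypothetical sketch, and its rule (numeric and UUID segments become :id) is an assumption. Where the framework exposes the matched route template, using that directly is likely the better choice.

```typescript
const UUID_SEGMENT = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const NUMERIC_SEGMENT = /^\d+$/;

// Turn a raw request path into a bounded label value:
// drop the query string and replace ID-like segments with ':id'.
function toRouteLabel(path: string): string {
  const withoutQuery = path.split('?')[0];
  return withoutQuery
    .split('/')
    .map((segment) =>
      NUMERIC_SEGMENT.test(segment) || UUID_SEGMENT.test(segment) ? ':id' : segment,
    )
    .join('/');
}

// '/v1/users/42' and '/v1/users/43' now share one label value.
const exampleRoute = toRouteLabel('/v1/users/42');
```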

❌ Don't forget to track errors. Count both successes and failures.

// Bad - only counts successes
this.metricsService.incrementCounter('operations_total');

// Good - label success/error
try {
  await operation();
  this.metricsService.incrementCounter('operations_total', { status: 'success' });
} catch (error) {
  this.metricsService.incrementCounter('operations_total', { status: 'error' });
  throw error;
}

❌ Don't record durations in milliseconds. Use seconds for duration metrics.

// Bad - milliseconds
this.metricsService.recordHistogram('duration_ms', Date.now() - start);

// Good - seconds
const duration = (Date.now() - start) / 1000;
this.metricsService.recordHistogram('duration_seconds', duration);
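
Repeating the division by 1000 at every call site makes it easy to forget once. A small helper can keep the conversion in one place. The startTimer function below is a hypothetical sketch, not part of CustomMetricsService; the injectable clock parameter is an assumption added so the helper can be tested without waiting on real time.

```typescript
// Start a timer and return a function that reports elapsed time in seconds.
// `now` must return milliseconds; it defaults to Date.now.
function startTimer(now: () => number = Date.now): () => number {
  const startedAt = now();
  return () => (now() - startedAt) / 1000;
}

// Deterministic check with a fake clock: 2500 ms elapsed reads as 2.5 seconds.
let fakeClockMs = 1000;
const stopTimer = startTimer(() => fakeClockMs);
fakeClockMs = 3500;
const elapsedSeconds = stopTimer();
```

At a call site, the returned function would supply the value passed to recordHistogram, so the unit conversion never appears in business code.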

❌ Don't use counters for values that decrease. Use gauges for values that go up and down.

// Bad - counter for connections
this.metricsService.incrementCounter('active_connections'); // Can't decrease!

// Good - gauge for connections
this.metricsService.setGauge('active_connections', count);

Other mistakes:

  • ❌ Not tracking operation duration
  • ❌ Missing error metrics
  • ❌ Inconsistent naming conventions
  • ❌ Not exposing the /metrics endpoint