Monitoring & Observability Setup
Introduction
My setup
Resources
| Component | CPU Request | CPU Limit | Memory Request | Memory Limit |
|---|---|---|---|---|
| Prometheus | 500m | 1–2 cores | 512Mi | 2–4Gi |
| Grafana | 100m | 500m | 128Mi | 512Mi–1Gi |
| Loki | 300m | 1 core | 512Mi | 2Gi |
| OTEL Collector | 200m | 500m | 256Mi | 512Mi–1Gi |
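As a concrete starting point, these requests and limits can be set through Helm chart values. A minimal sketch for Prometheus, assuming the `prometheus-community/prometheus` chart (key paths differ in other charts such as `kube-prometheus-stack`):

```yaml
# values-prometheus.yaml: resource sizing from the table above.
server:
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: "2"        # upper end of the 1-2 core range
      memory: 4Gi     # upper end of the 2-4Gi range
```

Applied with `helm install prometheus prometheus-community/prometheus -n observability -f values-prometheus.yaml`.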
Persistent Data
| Component | Needs Persistence? | What It Stores | Notes |
|---|---|---|---|
| Prometheus | Yes | Time-series metrics (TSDB) | Use a PVC for `/prometheus` or `/data`. Retention defaults to 15d; tune via `--storage.tsdb.retention.time`. |
| Grafana | Optional | Dashboards, users, config | If using SQLite (default), persist `/var/lib/grafana`. For Postgres/MySQL, persist the DB. |
| Loki | Yes | Log chunks, index, metadata | Persist `/loki` or use object storage (e.g., MinIO, S3). Index and chunk retention are configurable. |
| OTEL Collector | No | N/A | Stateless by design. Doesn't store data unless you add a file exporter or buffering. |
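For example, a PVC for the Prometheus TSDB might look like the sketch below. The `microk8s-hostpath` storage class comes from the MicroK8s `hostpath-storage` addon; the name, namespace, and 10Gi size are placeholder assumptions.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data        # hypothetical name
  namespace: observability
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: microk8s-hostpath   # from the hostpath-storage addon
  resources:
    requests:
      storage: 10Gi            # assumed size; tune to your retention window
```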
Backups
With short retention and declarative provisioning, this stack holds little worth backing up: metrics and logs age out quickly, and dashboards live in config. The table below describes that minimal, rebuild-from-config variant.

| Component | Persistence Required? | Suggested Setup | Retention Strategy |
|---|---|---|---|
| Prometheus | Minimal | Use emptyDir or hostPath | Retention: ~6h to 24h. Set `--storage.tsdb.retention.time=6h` |
| Grafana | Optional | No volume needed unless storing user dashboards | Use provisioning (ConfigMaps) for dashboards |
| Loki | Recommended | Use emptyDir or hostPath | Set retention via config (`table_manager`, `index`, `chunks`) |
| OTEL Collector | No | Stateless | No action needed |
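For Loki, time-based retention lives in the config file. A sketch assuming a single-binary Loki 2.x deployment; `table_manager` is the legacy mechanism, and newer releases use the compactor plus `limits_config` instead:

```yaml
# Legacy retention (table_manager based):
table_manager:
  retention_deletes_enabled: true
  retention_period: 24h

# Compactor-based retention (Loki >= 2.3):
compactor:
  retention_enabled: true
limits_config:
  retention_period: 24h
```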
Migration
| Step | Action | Notes |
|---|---|---|
| 1 | Install MicroK8s | `sudo snap install microk8s --classic`, then `microk8s enable dns hostpath-storage helm3` |
| 2 | Export kubeconfig | `microk8s config > ~/.kube/config` gives kubectl/helm access |
| 3 | Prepare Helm charts | Use official charts for Prometheus, Grafana, Loki, OTEL Collector |
| 4 | Convert local configs | Translate local YAML/configs into Helm `values.yaml` or Kubernetes manifests |
| 5 | Deploy stack | `helm install` each component into the `observability` namespace |
| 6 | Wire services | Use `Service`, `Ingress`, or `NodePort` to expose Grafana/Prometheus (see the sketch below) |
| 7 | Validate data flow | Confirm metrics/logs/traces are flowing via the OTEL Collector |
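For step 6, a NodePort Service is the simplest exposure on a single node. A minimal sketch, assuming the Grafana chart created a Service labelled `app.kubernetes.io/name: grafana` on port 3000 (verify with `kubectl get svc -n observability`):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: grafana-nodeport       # hypothetical name
  namespace: observability
spec:
  type: NodePort
  selector:
    app.kubernetes.io/name: grafana   # label assumed from the Grafana chart
  ports:
    - port: 3000
      targetPort: 3000
      nodePort: 30300          # arbitrary port in the 30000-32767 range
```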
Exporters
| Exporter | Run Locally? | Why It Works Well | Notes |
|---|---|---|---|
| Caddy (custom Go exporter) | ✅ Yes | Direct access to logs/filesystem | Avoids volume mounts or sidecars in K8s |
| Redis Exporter | ✅ Yes | Reads Redis metrics via TCP | Connects to Redis on `localhost:6379`; expose the exporter's `/metrics` port to Prometheus |
| MySQL Exporter | ✅ Yes | Connects via socket or TCP | Use local credentials; expose `/metrics` to Prometheus |
| Postgres Exporter | ✅ Yes | Uses local DB connection | Can run as a systemd service or container |
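With the exporters on the host and Prometheus in the cluster, the scrape targets are just the host's IP plus each exporter's default port. A sketch, with `10.0.0.10` standing in for the host's LAN IP:

```yaml
# prometheus.yml excerpt: scraping host-local exporters from the cluster
scrape_configs:
  - job_name: redis
    static_configs:
      - targets: ["10.0.0.10:9121"]   # redis_exporter default port
  - job_name: mysql
    static_configs:
      - targets: ["10.0.0.10:9104"]   # mysqld_exporter default port
  - job_name: postgres
    static_configs:
      - targets: ["10.0.0.10:9187"]   # postgres_exporter default port
```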
Network
| Layer | Component | Network Role | Notes |
|---|---|---|---|
| Host OS | Caddy, Redis, MySQL, Postgres | Local services | Expose metrics/logs via `localhost` or `0.0.0.0` |
| MicroK8s Cluster | Prometheus, Grafana, Loki, OTEL Collector | Observability stack | Internal K8s networking (`ClusterIP`, `NodePort`, `hostNetwork`) |
| MicroK8s Cluster | Prometheus | Scrapes exporters | Targets local IPs or `host.docker.internal`; use `hostNetwork: true` or static IPs |
| MicroK8s Cluster | OTEL Collector | Ingests telemetry | Receives from app pods or local exporters via OTLP over HTTP/gRPC |
| MicroK8s Cluster | Grafana | Visualizes data | Access via NodePort or Ingress; can query Prometheus, Loki, Tempo |
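To make the OTLP ingestion row concrete, here is a minimal Collector config sketch. It assumes the contrib distribution (which ships the `loki` exporter) and an in-cluster Service named `loki`; substitute your actual release names:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # OTLP/gRPC default port
      http:
        endpoint: 0.0.0.0:4318   # OTLP/HTTP default port

exporters:
  prometheus:
    endpoint: 0.0.0.0:8889       # exposes /metrics for Prometheus to scrape
  loki:
    endpoint: http://loki:3100/loki/api/v1/push   # assumed Service name

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
    logs:
      receivers: [otlp]
      exporters: [loki]
```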