Telemetry Data Lifecycle Management

Design a data lifecycle balancing cost and query performance. Hot, warm, cold tiers with retention policies.

Tags: data-lifecycle, retention, storage-tiering, compliance

Quick take

Size hot, warm, and cold tiers by how often the data is actually queried, not by fear of losing it. Roughly 95% of trace value is in the first 48 hours.

Most organizations apply one retention policy to all telemetry. That's like paying for same-day delivery on everything.

The Lifecycle Model

| Tier | Query latency | Cost ($/GB/month) | Use case |
|---|---|---|---|
| Hot | <1s | $5-20 | Active dashboards, alerting |
| Warm | 1-10s | $1-5 | Recent investigation |
| Cold | 10-60s | $0.05-0.50 | Forensics, compliance |
| Frozen | Minutes | $0.01-0.05 | Long-term archive |
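
To see how these per-GB prices turn into a monthly bill, here is a rough sketch in Python. It assumes steady-state stored volume is simply daily ingest multiplied by the days retained in each tier, and it ignores query, egress, and indexing charges. The names `TIER_PRICE` and `monthly_cost_range` are illustrative and not taken from any vendor API.

```python
from typing import Dict, Tuple

# $/GB/month (low, high) price ranges copied from the tier table above.
TIER_PRICE: Dict[str, Tuple[float, float]] = {
    "hot": (5.00, 20.00),
    "warm": (1.00, 5.00),
    "cold": (0.05, 0.50),
    "frozen": (0.01, 0.05),
}


def monthly_cost_range(gb_per_day: float,
                       days_in_tier: Dict[str, int]) -> Tuple[float, float]:
    """Return (low, high) monthly storage cost in dollars.

    Assumption: at steady state, a tier holds gb_per_day * days GB.
    """
    low = high = 0.0
    for tier, days in days_in_tier.items():
        stored_gb = gb_per_day * days
        low_price, high_price = TIER_PRICE[tier]
        low += stored_gb * low_price
        high += stored_gb * high_price
    return low, high


# 100 GB/day kept hot for 30 days.
print(monthly_cost_range(100, {"hot": 30}))  # (15000.0, 60000.0)

# Same ingest, but only 7 days hot and the following 83 days in cold storage.
print(monthly_cost_range(100, {"hot": 7, "cold": 83}))
```

Under this simplified model the hot tier dominates the bill, so shortening hot retention has far more effect than shortening cold retention.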

Signal-Specific Retention

Metrics: alerting 90d hot, then downsampled in warm for 1yr; dashboard 30d hot, then 90d warm; debug 7d, then delete.

Logs: error 30d hot, then cold for 1yr; info 7d hot, then cold for 90d; debug 3d, then delete; audit 90d hot, then frozen for 7yr.

Traces: error 30d hot, then cold for 90d; normal 3-7d, then delete; span metrics 90d hot.
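
These rules can be written down as a lookup table that a router or cleanup job consults. The Python sketch below is one way to do that, not a vendor configuration format. It assumes each number is the age at which data leaves that tier (the text does not say whether durations are cumulative), and it uses the upper bound of 7 days for normal traces. `POLICIES` and `tier_for` are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple

# (tier, age_limit_days) stages in order. Data older than the last stage is deleted.
# Assumption: each limit is measured from ingest time, not from entry into the tier.
POLICIES: Dict[Tuple[str, str], List[Tuple[str, int]]] = {
    ("metrics", "alerting"): [("hot", 90), ("warm", 365)],  # warm copy is downsampled
    ("metrics", "dashboard"): [("hot", 30), ("warm", 90)],
    ("metrics", "debug"): [("hot", 7)],
    ("logs", "error"): [("hot", 30), ("cold", 365)],
    ("logs", "info"): [("hot", 7), ("cold", 90)],
    ("logs", "debug"): [("hot", 3)],
    ("logs", "audit"): [("hot", 90), ("frozen", 7 * 365)],
    ("traces", "error"): [("hot", 30), ("cold", 90)],
    ("traces", "normal"): [("hot", 7)],  # upper bound of the 3-7d range
    ("traces", "span_metrics"): [("hot", 90)],
}


def tier_for(signal: str, data_class: str, age_days: float) -> Optional[str]:
    """Return the tier that should hold data of this age, or None to delete it."""
    for tier, age_limit in POLICIES[(signal, data_class)]:
        if age_days < age_limit:
            return tier
    return None


print(tier_for("logs", "error", 10))    # hot
print(tier_for("logs", "error", 100))   # cold
print(tier_for("logs", "debug", 5))     # None
```

Keeping the policy as data rather than scattered vendor settings makes it easier to review against compliance requirements and to apply the same rules in more than one backend.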

Cost Impact

Mid-size company (100 GB/day logs, 50K metrics, 10K spans/min):

| Strategy | Monthly cost | Savings |
|---|---|---|
| Uniform 30-day hot | $15,000 | Baseline |
| Tiered | $7,500 | 50% |
| Tiered + sampling | $4,500 | 70% |
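
The savings column is each strategy's cost measured against the uniform baseline. A small Python check, using only the figures from the table (the function name `savings_pct` is illustrative), makes the arithmetic explicit so you can swap in your own numbers:

```python
def savings_pct(baseline: float, cost: float) -> float:
    """Percentage saved relative to the baseline monthly cost."""
    return round((baseline - cost) / baseline * 100, 1)


BASELINE = 15_000  # uniform 30-day hot retention, $/month

# Monthly costs from the table above.
strategies = {
    "Uniform 30-day hot": 15_000,
    "Tiered": 7_500,
    "Tiered + sampling": 4_500,
}

for name, cost in strategies.items():
    print(f"{name}: ${cost:,}/month, {savings_pct(BASELINE, cost)}% saved")
```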

Compliance

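SOC 2: typically 1yr of audit logs (the framework sets no fixed period, so retention should cover your audit window). HIPAA: 6yr, the documentation retention period commonly applied to access logs. PCI-DSS: at least 1yr of audit log history, with the most recent 3 months immediately available for analysis. GDPR: delete PII on request. Separate compliance logs from operational logs: compliance data goes to cheap cold storage.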

Tiered retention template

| Tier | Data | Hot (query) | Warm | Cold / archive |
|---|---|---|---|---|
| Incidents | Error logs, slow traces | 14d | 30d | 1yr object storage |
| SLOs | RED metrics | 90d | 1yr | - |
| Compliance | Audit logs | 30d indexed | - | 7yr S3/Glacier |
| Debug | Verbose app logs | 3d | 7d | drop |

Cost insight: Moving 70% of log volume to "debug" tier with 3-day hot retention often cuts storage 40%+ with minimal incident impact.

What to do this week

  • [ ] Classify each log source into tier 1–4 from audit framework
  • [ ] Configure vendor retention policies per tier
  • [ ] Route compliance streams to cheapest durable store
  • [ ] Document which tiers are allowed for prod vs staging

---

Related Reading

Use the SignalCost Calculator → to model these scenarios with your own numbers.
