# Grafana Mimir: Scaling Prometheus Metrics to 1 Billion Series and Beyond
As observability demands grow exponentially, DevOps engineers and SREs face a critical challenge: how to store, query, and manage Prometheus metrics at massive scale without breaking the bank or sacrificing performance. Enter Grafana Mimir, an open-source time series database that fundamentally changes how organizations approach long-term metrics storage.
In this technical guide, I'll walk you through Grafana Mimir's architecture, deployment strategies, and practical implementation patterns that will help you scale your observability infrastructure efficiently.
## What is Grafana Mimir?
Grafana Mimir is an open-source, horizontally scalable time series database purpose-built for long-term storage of Prometheus and OpenTelemetry metrics.[1] Launched in 2022 by Grafana Labs, Grafana Mimir combines the best features from the Cortex project with innovations learned from running Grafana Cloud at massive scale.[1] The project has already gained significant traction with over 4,700 GitHub stars and 30 active maintainers.[2]
The core promise of Grafana Mimir is ambitious yet achievable: scale to 1 billion active series and beyond while maintaining high availability, multi-tenancy, and blazing-fast query performance.[1][7] Unlike traditional Prometheus setups that struggle with retention periods beyond a few weeks, Grafana Mimir enables durable, long-term storage without operational complexity.
## Architectural Foundation
Grafana Mimir follows a microservices-based architecture where all components compile into a single binary.[3][4] This design elegance means you control which components run using the `-target` parameter, enabling flexible deployment patterns from monolithic mode to distributed setups.
The architecture comprises several key components working in concert:
- **Distributor**: Handles incoming writes from Prometheus instances via the remote write API
- **Ingester**: Buffers incoming samples and makes them available for querying
- **Compactor**: Performs background compaction of time series data
- **Querier**: Executes PromQL queries against stored data
- **Query Frontend**: Routes and caches query requests for optimal performance
This separation of concerns ensures that each component can scale independently, preventing resource contention and enabling precise capacity planning.
## The Mimir 3.0 Revolution: Decoupled Architecture
The recently released Mimir 3.0 represents a watershed moment for the project.[2] The headline feature is a fundamentally redesigned architecture that decouples read and write operations through Apache Kafka as an asynchronous buffer.[2]
Why does this matter? In earlier versions, the ingester handled both reads and writes, meaning heavy query loads could degrade ingestion performance. This architectural limitation created cascading failures during traffic spikes. Mimir 3.0 eliminates this coupling entirely.
The performance implications are substantial:
- **92% reduction in peak memory usage** through the streaming approach of the new Mimir Query Engine (MQE)[2]
- **15% resource savings** in large clusters, alongside improved performance and reliability[2]
- **Significantly reduced risk** of read-path outages caused by ingester failures[2]
The new Mimir Query Engine uses a streaming approach rather than bulk sample processing, loading only necessary samples at each query execution step.[2] This maintains 100% PromQL compatibility while dramatically improving efficiency.
## Deployment Patterns for DevOps Teams
Grafana Mimir supports multiple deployment modes, allowing you to start small and scale horizontally as needs grow.
### Monolithic Mode
Perfect for testing and development environments, monolithic mode runs all components in a single process:
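The commands in this section reference a `mimir-config.yaml`. A minimal single-process configuration might look like the following sketch (the filesystem paths and replication factor are illustrative assumptions for local evaluation; production deployments typically use object storage such as S3 or GCS, so check the Mimir configuration reference for your version):

```yaml
# Minimal monolithic configuration sketch (illustrative values only)
multitenancy_enabled: false

blocks_storage:
  backend: filesystem
  filesystem:
    dir: /data/mimir/blocks

ingester:
  ring:
    replication_factor: 1
```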
```bash
./mimir -config.file=mimir-config.yaml -target=all
```

### Read-Write Mode
For production environments, separate read and write paths for better resource isolation:
```bash
# Write path
./mimir -config.file=mimir-config.yaml -target=write

# Read path
./mimir -config.file=mimir-config.yaml -target=read
```

### Distributed Mode
Enterprise deployments leverage individual component targeting for maximum flexibility and scalability.
## Prometheus Integration
Integrating Prometheus with Grafana Mimir leverages the Prometheus remote write API, which emits batched, Snappy-compressed Protocol Buffer messages.[4] Configuration is straightforward:
```yaml
global:
  scrape_interval: 15s

remote_write:
  - url: http://mimir-distributor:9009/api/prom/push
    headers:
      X-Scope-OrgID: tenant-id
    queue_config:
      capacity: 10000
      max_shards: 200
      min_shards: 1
      max_samples_per_send: 10000
      batch_send_deadline: 5s
      min_backoff: 30ms
      max_backoff: 100ms
```

Each HTTP request must include a tenant ID header for multi-tenancy isolation.[4] External reverse proxies handle authentication and authorization, maintaining a clean separation of concerns.
## Multi-Tenancy and Resource Isolation
Grafana Mimir's built-in multi-tenancy enables metrics from different teams or departments to be completely isolated in storage.[6] Each tenant can only query their own data, providing security and performance guarantees without complex external systems.
This is particularly valuable for managed service providers or large enterprises where data isolation is non-negotiable. Tenant federation further enables cross-tenant queries when needed, controlled through explicit configuration.
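When tenant federation is enabled in the Mimir configuration, a query can span tenants by passing pipe-separated tenant IDs in the `X-Scope-OrgID` header. As a sketch, a provisioned Grafana data source could issue federated queries like this (the data source name, URL, and tenant IDs are hypothetical):

```yaml
# Grafana data source provisioning sketch (tenant IDs are hypothetical)
apiVersion: 1
datasources:
  - name: Mimir (federated)
    type: prometheus
    url: http://mimir-query-frontend:9009/prometheus
    jsonData:
      httpHeaderName1: X-Scope-OrgID
    secureJsonData:
      httpHeaderValue1: team-a|team-b
```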
## Query Performance and Optimization
Query sharding in Grafana Mimir splits single queries across multiple machines, enabling sub-second response times even across billions of series.[6] The query frontend caches results and intelligently routes requests, reducing load on downstream components.
For organizations migrating from Prometheus with local storage, the performance gains are transformative. Query latency remains predictable regardless of dataset size, and the streaming query engine prevents memory spikes during high-cardinality queries.
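Query sharding is configured on the query frontend, with the shard count set as a per-tenant limit. A sketch of the relevant options (option names reflect recent Mimir releases; confirm against the configuration reference for your version):

```yaml
# Query sharding sketch (verify option names for your Mimir version)
frontend:
  parallelize_shardable_queries: true

limits:
  query_sharding_total_shards: 16
```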
## Operational Observability
Grafana Mimir includes best-practice dashboards and runbooks for monitoring cluster health.[5] The Overview dashboard provides a high-level view with drill-down capabilities into specific components, making troubleshooting straightforward for SRE teams.
This operational maturity reflects lessons learned from running massive Grafana Mimir deployments at organizations like CERN, where scale and reliability are non-negotiable.[2]
## Migration Path and Upgrade Strategy
For organizations running existing Prometheus infrastructure, migration to Grafana Mimir requires careful planning. Grafana Labs recommends a parallel deployment approach:[2]
1. Deploy a new Grafana Mimir cluster alongside your existing infrastructure
2. Reconfigure Prometheus write clients to send metrics to both endpoints
3. Gradually migrate read clients to the new cluster
4. Validate query results match before decommissioning legacy systems
This zero-downtime approach minimizes risk and allows for easy rollback if issues arise.
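The dual-write step is a small Prometheus configuration change: list both endpoints under `remote_write`. A sketch (the legacy endpoint URL is a hypothetical placeholder):

```yaml
remote_write:
  # Existing long-term storage endpoint (hypothetical URL)
  - url: http://legacy-storage:9201/write
  # New Grafana Mimir cluster
  - url: http://mimir-distributor:9009/api/prom/push
    headers:
      X-Scope-OrgID: tenant-id
```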
## Licensing and Community
Grafana Mimir is released under the AGPLv3 license, with the full source code available on GitHub.[1] For organizations requiring commercial support, Grafana Labs offers Grafana Enterprise Metrics as a self-managed alternative with dedicated support.
## Conclusion
Grafana Mimir represents a mature, production-ready solution for scaling Prometheus metrics infrastructure. The combination of horizontal scalability, multi-tenancy, and architectural innovations in version 3.0 makes it the ideal choice for DevOps teams and SREs managing observability at scale.
Whether you're running metrics for dozens of services or billions of time series, Grafana Mimir provides the reliability, performance, and operational simplicity needed for modern observability platforms. Start with monolithic mode for evaluation, then scale horizontally as your workload grows.