Grafana Mimir, our new open source time series database, introduces a horizontally scalable split-and-merge compactor that can easily handle a large number of series. In a previous blog post, we described the extensive load testing we did to ensure high performance at 1 billion active series. In this article, we will discuss the challenges with the existing Prometheus and Cortex compactors and the new features of Grafana Mimir's compactor. These features enable us to scale horizontally with ease: we ingest 1 billion active series, internally replicate them to 3 billion time series for redundancy, and compact them back down to 1 billion for long-term storage.

How do we ingest 1 billion series into Grafana Mimir? One thing is clear – we need a lot of ingesters! If a single ingester can handle 10 million active series, and we also use 3x replication to ensure fault tolerance and guarantee durability, we need to run 300 of them.

But with this many ingesters, we have a new problem: each ingester produces a single TSDB block every 2 hours. At ~5.5 GB per TSDB block with 10 million series, that's about 20 TB of data daily. We cannot efficiently search all of these blocks in long-term storage when a user runs a query or opens a dashboard in Grafana.

Mimir uses compaction to solve this problem. Compaction reduces the number of blocks in storage, which speeds up querying and also deduplicates samples. We need to deduplicate samples because, with 3x replication, each sample is accepted by three different ingesters and is therefore stored in three different ingester-generated blocks.
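The back-of-the-envelope numbers above can be reproduced with a short sketch. The constants (10 million series per ingester, 3x replication, ~5.5 GB per block, one block every 2 hours) come from this article; the program itself is purely illustrative:

```go
package main

import "fmt"

func main() {
	const (
		totalSeries       = 1_000_000_000 // target: 1 billion active series
		seriesPerIngester = 10_000_000    // what a single ingester can handle
		replicationFactor = 3             // each sample is written to 3 ingesters
		blockSizeGB       = 5.5           // approx. TSDB block size at 10M series
		blocksPerDay      = 12            // one block per ingester every 2 hours
	)

	// Number of ingesters needed, including replication.
	ingesters := totalSeries / seriesPerIngester * replicationFactor

	// Total block data produced per day before compaction.
	dailyTB := float64(ingesters) * blocksPerDay * blockSizeGB / 1000

	fmt.Printf("ingesters needed: %d\n", ingesters)     // 300
	fmt.Printf("daily block data: ~%.1f TB\n", dailyTB) // ~19.8 TB
}
```

Compaction later merges the 3 replicated copies back into a single deduplicated block, which is why long-term storage holds roughly a third of this raw daily volume.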