Mach5: A Modern Integrated Search and Analytics Platform
12/16/2024
Introduction:
Mach5 revolutionizes search infrastructure with its cutting-edge architecture, establishing a new benchmark in performance and efficiency compared to conventional solutions like OpenSearch and Elasticsearch.
Mach5 introduces an innovative disaggregated storage architecture, which sets it apart from traditional cluster-based implementations with tightly coupled storage and compute. This architectural advancement delivers three transformative benefits:
- Significant infrastructure cost reduction,
- Adaptability to varying workloads through dynamic horizontal scalability, and
- Converged architecture allowing seamless compatibility with various standard APIs
In this blog, we will examine how Mach5 delivers substantial cost benefits while ensuring better performance compared to OpenSearch and Elasticsearch.
Cost Drivers of Infrastructure:
To understand the cost advantages of Mach5, we must first look at the key cost drivers that contribute to the total cost of ownership (TCO) in traditional search infrastructure. These drivers include:
- Infrastructure Provisioning: Effective capacity planning and optimal resource allocation.
- Data Replication: Costs of replicating data for high availability, durability, and data loss prevention.
- Compute Resources: Requirements for processing power and memory to sustain operations.
In the following sections, we will examine how each of these cost drivers impacts the performance and efficiency of a tightly coupled Elasticsearch/OpenSearch architecture compared to Mach5's disaggregated architecture.
Traditional Infrastructure: Expensive, Rigid, Complex:
Traditional architectures leverage a distributed cluster model, where data is partitioned into shards and distributed across multiple nodes. Each node manages both the storage and processing of its assigned data. However, due to the tight coupling of compute and storage, scaling one resource necessitates scaling the other—even when only one component requires additional capacity.
This architectural rigidity introduces significant challenges for modern workloads, especially when demand patterns are unpredictable or highly variable. To understand the implications of this design, let’s examine its impact on the key cost drivers:
1. Infrastructure Provisioning:
The architecture requires nodes to be allocated based on anticipated peak usage, creating two significant challenges:
- Over-provisioning results in significant resource underutilization during normal operations, leading to unnecessarily high infrastructure costs.
- Under-provisioning leads to service degradation or outages during peak loads when clusters cannot handle traffic spikes. Users experience request timeouts, increased latency, or HTTP 429 (Too Many Requests) errors during high-demand periods.
This creates a major operational and financial challenge in environments with variable workloads, particularly when there's a substantial gap between average and peak usage. Organizations must balance resource allocation to maintain optimal performance while controlling costs. The result is often complex capacity planning and deployment strategies that lead to either costly over-provisioning or risky under-provisioning—both of which compromise service quality.
2. Data Replication:
Traditional node-coupled storage architectures require data replication across multiple nodes to maintain high availability and fault tolerance. This requirement multiplies storage and compute costs by a factor of 2-3, depending on the replication configuration, and the total storage footprint grows with it as data volumes increase. Large-scale deployments are hit hardest, since storage often makes up a significant portion of the total infrastructure budget, so the cost compounds as organizations scale.
3. Compute Resources:
The architecture enforces a fixed ratio between compute and storage resources, so the entire node, including its storage capacity, must be scaled even when only additional compute power is required. Because compute cannot be scaled separately from storage, organizations must over-provision entire nodes just to meet specific computational demands, resulting in inefficient resource utilization and higher operational costs.
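
The effect of a fixed compute-to-storage ratio can be sketched with a little arithmetic. The snippet below is only an illustration: the node size (32 vCPUs, 8 TB) and the workload (640 vCPUs, 40 TB) are hypothetical numbers chosen for the example, not figures from any real deployment, and the function names are made up for this sketch.

```python
import math

def coupled_node_count(compute_needed, storage_needed_tb,
                       compute_per_node, storage_per_node_tb):
    """Nodes required when every node bundles a fixed amount of compute and storage.

    The cluster must satisfy whichever resource needs more nodes.
    """
    by_compute = math.ceil(compute_needed / compute_per_node)
    by_storage = math.ceil(storage_needed_tb / storage_per_node_tb)
    return max(by_compute, by_storage)

def idle_storage_share(storage_needed_tb, node_count, storage_per_node_tb):
    """Fraction of provisioned storage that the workload does not need."""
    provisioned_tb = node_count * storage_per_node_tb
    return 1 - storage_needed_tb / provisioned_tb

# Hypothetical compute-heavy workload: 640 vCPUs of query load, only 40 TB of data.
# Hypothetical node shape: 32 vCPUs and 8 TB of attached storage.
nodes = coupled_node_count(640, 40, compute_per_node=32, storage_per_node_tb=8)
idle = idle_storage_share(40, nodes, storage_per_node_tb=8)

print(f"nodes required: {nodes}")               # 20, driven entirely by compute
print(f"idle attached storage: {idle:.0%}")     # 75% of the storage is paid for but unused
```

Under these assumed numbers, storage alone would need only 5 nodes, yet compute demand forces 20, so three quarters of the attached storage sits idle.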
The Mach5 Approach: Cost-effective, Adaptive, Converged:
Is it possible to design a system that addresses the above challenges encountered with traditional search and analytics platforms?
YES!
Mach5 answers this question with an innovative, disaggregated architecture powered by Kubernetes. While Mach5 leverages Kubernetes for autoscaling, its core advantages stem from two key architectural elements working in tandem:
- A disaggregated architecture that completely separates compute and storage resources
- A specialized storage and indexing system that:
  - Eliminates replication overhead
  - Uses cloud object storage instead of node-attached storage
  - Enables true independent scaling of compute resources for specific workloads
This comprehensive architectural approach enables significant cost reductions (21x lower total infrastructure costs and 30-45x lower storage costs per GB), which would be impossible to achieve by simply adding Kubernetes to a traditional search infrastructure. You can learn more about our architecture here.
To better understand how this architecture transforms infrastructure management, let’s evaluate its impact on the same cost drivers:
1. Infrastructure Provisioning:
Unlike traditional systems that rely on static peak estimates, Mach5 uses Kubernetes’ native autoscaling capabilities to orchestrate resources dynamically. The system continuously monitors usage patterns in real-time, automatically adjusting resources to match demand. This adaptive approach ensures sustained, high-performance operations while reducing infrastructure costs by at least a factor of 10. By allocating and deallocating resources in response to actual usage, the platform eliminates the need to maintain excess capacity during off-peak hours, delivering optimal efficiency at all times. This dynamic scalability ensures that organizations no longer need to compromise between cost and performance, delivering unmatched flexibility.
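
This post does not say which Kubernetes autoscaler Mach5 relies on or how it is tuned. As a rough sketch of what demand-driven scaling looks like, the snippet below assumes a rule of the kind documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance (0.1 by default). The function name, the metric values, and the `min_replicas` floor are illustrative assumptions.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     tolerance=0.1, min_replicas=1):
    """HPA-style scaling rule (an assumption, not Mach5's confirmed behavior).

    current_metric and target_metric are in the same unit,
    e.g. average CPU utilization in percent.
    """
    ratio = current_metric / target_metric
    # Within the tolerance band, leave the replica count alone to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return max(min_replicas, math.ceil(current_replicas * ratio))

# Hypothetical readings against a 60% CPU target with 4 replicas running.
print(desired_replicas(4, 90, 60))  # 6: traffic spike, scale out
print(desired_replicas(4, 30, 60))  # 2: off-peak, scale in
print(desired_replicas(4, 63, 60))  # 4: within tolerance, no change
```

The point of the sketch is the contrast with static peak sizing: capacity follows the measured load in both directions instead of being fixed at the highest expected value.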
2. Data Replication:
Mach5 leverages cloud object storage such as Amazon S3, Google Cloud Storage (GCS), Azure Blob Storage, and MinIO instead of conventional node-attached storage. By adopting this cloud-first architecture, the platform eliminates the costly replication overhead typically seen in traditional search solutions like OpenSearch and Elasticsearch. At the same time, it ensures robust data durability through the advanced redundancy mechanisms provided by these storage systems. This streamlined approach results in an extraordinary 30-45x reduction in storage costs per gigabyte of raw data, offering unparalleled value and cost efficiency for organizations of all sizes.
3. Compute Resources:
The disaggregated architecture enhances resource management by enabling compute and storage to scale independently based on demand. This allows compute resources to be precisely matched to specific workloads, such as data ingestion or querying, without the need for over-provisioning. By isolating workloads and optimizing compute power for each task, the system ensures that resources are used efficiently and only when needed. This level of granularity not only improves performance but also drives significant cost savings, as organizations only pay for the compute capacity they actually use, eliminating the need for costly, underutilized infrastructure.
Operational Cost Breakdown:
To understand the impact of these architectural choices, let’s compare the infrastructure costs of Mach5 and OpenSearch in a large-scale deployment scenario.
Storage Costs Comparison
Here’s a breakdown of the average storage cost per GB of raw data under different replication scenarios:
Metric | OpenSearch | Mach5
---|---|---
Space used | Raw data size * (1 + number of replicas) * 1.45 | Raw data size / 3
Cost / GB-month | $0.08 | $0.023
Total cost / GB (primary + 1 replica) | $0.232 | $0.0076
Total cost / GB (primary + 2 replicas) | $0.348 | $0.0076
The table reveals stark differences in storage efficiency and cost between OpenSearch deployments and Mach5's architecture:
- OpenSearch stores 2-3 full copies of the data because of replication, on top of the 1.45x overhead factor, and has a higher base cost per GB.
- Mach5 achieves significant savings through efficient compression and elimination of needless replication.
- Unlike OpenSearch's costs, which increase linearly with each replica, Mach5's cost per GB stays constant regardless of replication.
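
The per-GB figures in the storage table can be reproduced directly from its formulas. The sketch below uses only the numbers given in the table (the 1.45 overhead factor, $0.08 and $0.023 per GB-month, and the divide-by-3 compression ratio). The constant and function names are made up for the example.

```python
# Inputs taken from the storage cost table above.
OPENSEARCH_OVERHEAD = 1.45          # multiplier on raw data size
OPENSEARCH_PRICE_GB_MONTH = 0.08    # $ per stored GB-month
MACH5_COMPRESSION = 3               # stored size = raw size / 3
MACH5_PRICE_GB_MONTH = 0.023        # $ per stored GB-month (object storage)

def opensearch_cost_per_raw_gb(replicas):
    """Monthly cost to hold 1 GB of raw data in OpenSearch with the given replica count."""
    return (1 + replicas) * OPENSEARCH_OVERHEAD * OPENSEARCH_PRICE_GB_MONTH

def mach5_cost_per_raw_gb():
    """Monthly cost to hold 1 GB of raw data in Mach5; independent of replica count."""
    return MACH5_PRICE_GB_MONTH / MACH5_COMPRESSION

for replicas in (1, 2):
    os_cost = opensearch_cost_per_raw_gb(replicas)
    m5_cost = mach5_cost_per_raw_gb()
    print(f"primary + {replicas} replica(s): OpenSearch ${os_cost:.3f}, "
          f"Mach5 ${m5_cost:.4f}, ratio {os_cost / m5_cost:.1f}x")
```

With one replica the ratio comes out at roughly 30x, and with two replicas at roughly 45x, which is where the 30-45x range comes from.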
Real-World Deployment Cost Comparison:
Let’s take a closer look at the costs involved in a real-world deployment of OpenSearch vs Mach5:
- Amount of raw data under management: 240 TB
- Number of queries per day: 1 million
Metric | OpenSearch (Primary + 1 Replica) | Mach5
---|---|---
Stored data size | 696 TB | 80 TB
Machine type | i3.8xlarge.search ($3.994 / hour) | Search: i4i.8xlarge ($2.746 / hour); Ingestion: m6id.2xlarge ($0.4746 / hour)
Number of machines | ~100 | 4 (search) + 2 (ingestion)
Cost / hour (on-demand) | $399.40 | $11.9332
Storage cost | Included | $2.616 / hour (S3 storage)
Mach5 licensing cost | N/A | $36,864
Total cost / year | $3,498,744 | $127,451 + $36,864 = $164,315
Key Insights:
- Storage Cost Comparison: Mach5’s cloud-based storage architecture is 30-45x cheaper (depending on replication factor) per GB compared to OpenSearch due to efficient compression and the elimination of replication overhead.
- Total Infrastructure Cost Comparison: In this scenario, Mach5’s total annual cost is $164,315, compared to OpenSearch's $3,498,744, resulting in a 21x reduction. This significant saving comes from better resource utilization, fewer required machines, and optimized storage.
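
The annual totals can be checked from the hourly figures in the deployment table. The sketch below uses only the table's numbers. The one assumption is a year of 8,760 hours (365 days), which the table does not state but which reproduces its totals. Variable names are illustrative.

```python
HOURS_PER_YEAR = 8760  # assumed: 365 days * 24 hours

# OpenSearch: ~100 i3.8xlarge.search nodes, storage included in the node price.
opensearch_hourly = 100 * 3.994
opensearch_annual = opensearch_hourly * HOURS_PER_YEAR

# Mach5: 4 search nodes + 2 ingestion nodes, plus object storage and licensing.
mach5_compute_hourly = 4 * 2.746 + 2 * 0.4746
mach5_storage_hourly = 2.616
mach5_license = 36_864

mach5_infra_annual = (mach5_compute_hourly + mach5_storage_hourly) * HOURS_PER_YEAR
mach5_annual = mach5_infra_annual + mach5_license

print(f"OpenSearch: ${opensearch_annual:,.0f} per year")
print(f"Mach5:      ${mach5_infra_annual:,.0f} + ${mach5_license:,} = ${mach5_annual:,.0f} per year")
print(f"Ratio:      {opensearch_annual / mach5_annual:.1f}x")
```

Under that assumption the script prints $3,498,744 for OpenSearch and $164,315 for Mach5, a ratio of about 21.3x.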
Conclusion:
The analysis confirms that Mach5 offers exceptional cost efficiency compared to traditional search infrastructures like OpenSearch and Elasticsearch. With its disaggregated storage architecture and dynamic resource scaling, Mach5 delivers:
- A 21x reduction in total infrastructure costs for large-scale deployments
- 30-45x lower storage costs per GB
These benefits stem from Mach5’s fundamental architectural advantages, including:
- Elimination of replication overhead
- Independent scaling of compute and storage resources
- Real-time resource provisioning based on actual demand
For organizations seeking to optimize search infrastructure costs while maintaining high performance, Mach5 provides a transformative solution that redefines cost efficiency at scale.
References:
OpenSearch sizing: AWS OpenSearch Sizing Guide