EKS Cluster configuration for deploying Mach5 Search
This document contains the EKS cluster configuration requirements for deploying Mach5 Search.
Kubernetes Version
Mach5 Search has been verified on EKS clusters running the following Kubernetes versions:
- 1.32
- 1.31
- 1.29
- 1.28
Amazon EBS CSI driver
Mach5 Search requires the Amazon EBS CSI driver add-on to be installed on your cluster. The driver manages the lifecycle of the Amazon EBS volumes that back the Kubernetes volumes Mach5 Search creates.
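If the cluster is managed with Terraform, the add-on can be installed with the `aws_eks_addon` resource. The following is a minimal sketch, not the Mach5 reference configuration; the variable `cluster_name` is a hypothetical input holding the name of your existing EKS cluster.

```hcl
# Hypothetical input: name of the existing EKS cluster.
variable "cluster_name" {
  type = string
}

# Installs the Amazon EBS CSI driver as an EKS managed add-on.
# No add-on version is pinned here, so EKS picks its default version.
resource "aws_eks_addon" "ebs_csi" {
  cluster_name = var.cluster_name
  addon_name   = "aws-ebs-csi-driver"
}
```

The driver also needs IAM permissions to manage EBS volumes; this sketch assumes they come from the node group IAM role described later in this document.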
EKS Node Groups
Mach5 Search uses EKS node groups so that different parts of the system can scale independently, use resources efficiently, and perform better.
Managed node group configuration for Mach5 Search:
Node group name | Desired, min size | Max size | Instance type | Labels | Pre-bootstrap steps | Tags |
---|---|---|---|---|---|---|
mach5-nodes | 1, 1 | 1 | m6a.2xlarge | mach5-main-role = "true" | pre_bootstrap_user_data = <<-EOT setup-local-disks raid0 EOT | "k8s.io/cluster-autoscaler/cluster-name" = "owned", "k8s.io/cluster-autoscaler/enabled" = "true", "k8s.io/cluster-autoscaler/node-template/label/group" = "mach5-nodes", "k8s.io/cluster-autoscaler/node-template/label/mach5-main-role" = "true" |
mach5-ccs-nodes | 1, 1 | 1 | c5ad.large | mach5-ccs-role = "true" | pre_bootstrap_user_data = <<-EOT setup-local-disks raid0 EOT | "k8s.io/cluster-autoscaler/cluster-name" = "owned", "k8s.io/cluster-autoscaler/enabled" = "true", "k8s.io/cluster-autoscaler/node-template/label/group" = "mach5-ccs-nodes", "k8s.io/cluster-autoscaler/node-template/label/mach5-ccs-role" = "true" |
mach5-ingestor-nodes | 0, 0 | 10 | m6id.2xlarge | mach5-ingestor-role = "true" | pre_bootstrap_user_data = <<-EOT setup-local-disks raid0 EOT | "k8s.io/cluster-autoscaler/cluster-name" = "owned", "k8s.io/cluster-autoscaler/enabled" = "true", "k8s.io/cluster-autoscaler/node-template/label/group" = "mach5-ingestor-nodes", "k8s.io/cluster-autoscaler/node-template/label/mach5-ingestor-role" = "true" |
mach5-compactor-nodes | 0, 0 | 10 | m6id.2xlarge | mach5-compactor-role = "true" | pre_bootstrap_user_data = <<-EOT setup-local-disks raid0 EOT | "k8s.io/cluster-autoscaler/cluster-name" = "owned", "k8s.io/cluster-autoscaler/enabled" = "true", "k8s.io/cluster-autoscaler/node-template/label/group" = "mach5-compactor-nodes", "k8s.io/cluster-autoscaler/node-template/label/mach5-compactor-role" = "true" |
mach5-warehouse-nodes | 0, 0 | 10 | i4i.2xlarge | mach5-warehouse-worker-role = "true" | pre_bootstrap_user_data = <<-EOT setup-local-disks raid0 EOT | "k8s.io/cluster-autoscaler/cluster-name" = "owned", "k8s.io/cluster-autoscaler/enabled" = "true", "k8s.io/cluster-autoscaler/node-template/label/group" = "mach5-warehouse-nodes", "k8s.io/cluster-autoscaler/node-template/label/mach5-warehouse-worker-role" = "true" |
mach5-warehouse-head-nodes | 0, 0 | 10 | t3a.2xlarge | mach5-warehouse-head-role = "true" | NA | "k8s.io/cluster-autoscaler/cluster-name" = "owned", "k8s.io/cluster-autoscaler/enabled" = "true", "k8s.io/cluster-autoscaler/node-template/label/group" = "mach5-warehouse-head-nodes", "k8s.io/cluster-autoscaler/node-template/label/mach5-warehouse-head-role" = "true" |
Important notes:
- The pre-bootstrap command in the node group configurations above is required to configure the local instance SSDs as the root volume.
- Propagate all node group tags to the Auto Scaling group of the corresponding node group as well.
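One way to copy node group tags onto the underlying Auto Scaling group in Terraform is the `aws_autoscaling_group_tag` resource. The sketch below does this for the mach5-ingestor-nodes group only and is an illustration rather than the Mach5 reference setup. The variable `ingestor_asg_name` is a hypothetical input; the tag keys and values are copied from the table above.

```hcl
# Hypothetical input: name of the Auto Scaling group that EKS created for the
# mach5-ingestor-nodes node group.
variable "ingestor_asg_name" {
  type = string
}

locals {
  # Tags copied from the mach5-ingestor-nodes row of the node group table.
  ingestor_node_group_tags = {
    "k8s.io/cluster-autoscaler/cluster-name"                            = "owned"
    "k8s.io/cluster-autoscaler/enabled"                                 = "true"
    "k8s.io/cluster-autoscaler/node-template/label/group"               = "mach5-ingestor-nodes"
    "k8s.io/cluster-autoscaler/node-template/label/mach5-ingestor-role" = "true"
  }
}

# Creates one tag on the Auto Scaling group per entry in the map above.
resource "aws_autoscaling_group_tag" "mach5_ingestor" {
  for_each = local.ingestor_node_group_tags

  autoscaling_group_name = var.ingestor_asg_name

  tag {
    key   = each.key
    value = each.value

    # The tag is set on the Auto Scaling group itself; false means instances
    # launched by the group do not also receive it.
    propagate_at_launch = false
  }
}
```

The same pattern would be repeated for each of the other node groups with its own tag map.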
Log rotation on the nodes
To enable log rotation for Kubernetes pod logs, configure the following post_bootstrap_user_data script for each of the node groups listed above:
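#!/usr/bin/env bash
# ${var.log_max_size} and ${var.log_max_files} are expected to be Terraform
# variables, interpolated before this script runs on the node.
SRC_CONF="/etc/systemd/system/kubelet.service.d/30-kubelet-extra-args.conf"
DST_CONF="/etc/systemd/system/kubelet.service.d/40-kubelet-extra-args.conf"
if [ -f "$SRC_CONF" ]; then
  # Append the log rotation flags before the closing quote of the existing
  # KUBELET_EXTRA_ARGS line and write the result to a new drop-in file.
  content=$(cat "$SRC_CONF")
  modified_content=$(echo "$content" | sed "s/'$/ --container-log-max-size=${var.log_max_size} --container-log-max-files=${var.log_max_files}'/")
  echo "$modified_content" | tee "$DST_CONF"
else
  # No existing drop-in: create one that sets only the log rotation flags.
  echo '[Service]
Environment="KUBELET_EXTRA_ARGS=--container-log-max-size=${var.log_max_size} --container-log-max-files=${var.log_max_files}"
' | tee "$DST_CONF"
fi
# Reload systemd and restart the kubelet so the new flags take effect.
systemctl daemon-reexec
systemctl daemon-reload
systemctl restart kubelet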
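The script refers to `var.log_max_size` and `var.log_max_files`, which suggests it is rendered through Terraform. A minimal sketch of the matching variable declarations could look like the following; the descriptions and default values are illustrative assumptions, not values prescribed by Mach5.

```hcl
# Maximum size of a container log file before it is rotated.
# The default is an illustrative assumption.
variable "log_max_size" {
  description = "Value passed to the kubelet flag --container-log-max-size"
  type        = string
  default     = "10Mi"
}

# Maximum number of log files kept per container.
# The default is an illustrative assumption.
variable "log_max_files" {
  description = "Value passed to the kubelet flag --container-log-max-files"
  type        = number
  default     = 5
}
```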
Node group IAM role
The following policies must be attached to the IAM role used by each node group:
- arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
- arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
- arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
- arn:aws:iam::aws:policy/AmazonEC2FullAccess
- arn:aws:iam::aws:policy/AWSMarketplaceMeteringFullAccess
- arn:aws:iam::aws:policy/AWSMarketplaceMeteringRegisterUsage
- arn:aws:iam::aws:policy/AmazonS3FullAccess
- Alternatively, instead of AmazonS3FullAccess, attach a custom policy with limited S3 permissions
- Terraform snippet defining the custom policy document:
data "aws_iam_policy_document" "mach5_vm_inline_policy_document" {
  # Bucket-level permissions: read bucket settings and list its contents.
  statement {
    actions = [
      "s3:GetBucketAcl",
      "s3:GetBucketEncryption",
      "s3:GetBucketLocation",
      "s3:GetBucketPolicy",
      "s3:GetBucketPolicyStatus",
      "s3:GetBucketPublicAccessBlock",
      "s3:GetBucketVersioning",
      "s3:ListBucket"
    ]
    effect = "Allow"
    resources = [
      "arn:aws:s3:::${var.mach5_search_s3_bucket}"
    ]
  }

  # Object-level permissions: full access to the objects in the bucket.
  statement {
    actions = [
      "s3:*"
    ]
    effect = "Allow"
    resources = [
      "arn:aws:s3:::${var.mach5_search_s3_bucket}/*"
    ]
  }
}
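The policies above still have to be attached to the node group role. The sketch below shows one possible way to do that in Terraform, using the limited S3 policy document in place of AmazonS3FullAccess. The variable `node_role_name` and the inline policy name `mach5-s3-limited-access` are hypothetical; the managed policy ARNs are the ones listed above.

```hcl
# Hypothetical input: name of the IAM role used by the node groups.
variable "node_role_name" {
  type = string
}

locals {
  # AWS managed policies from the list above, without AmazonS3FullAccess.
  mach5_managed_policy_arns = [
    "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy",
    "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy",
    "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly",
    "arn:aws:iam::aws:policy/AmazonEC2FullAccess",
    "arn:aws:iam::aws:policy/AWSMarketplaceMeteringFullAccess",
    "arn:aws:iam::aws:policy/AWSMarketplaceMeteringRegisterUsage",
  ]
}

# Attaches each managed policy to the node group role.
resource "aws_iam_role_policy_attachment" "mach5_node" {
  for_each = toset(local.mach5_managed_policy_arns)

  role       = var.node_role_name
  policy_arn = each.value
}

# Adds the limited S3 permissions as an inline policy on the same role.
resource "aws_iam_role_policy" "mach5_s3" {
  name   = "mach5-s3-limited-access" # hypothetical policy name
  role   = var.node_role_name
  policy = data.aws_iam_policy_document.mach5_vm_inline_policy_document.json
}
```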
S3 Bucket
Mach5 Search needs:
- An S3 bucket in the same AWS region as the EKS cluster
- Mach5 Search uses this bucket to store data and OTLP logs
- Add a lifecycle rule that applies to all objects in the bucket and aborts incomplete multipart uploads.
Terraform snippet to add the lifecycle rule to the bucket:
resource "aws_s3_bucket_lifecycle_configuration" "abort_incomplete_multipart_upload" {
  bucket = aws_s3_bucket.mach5_s3_bucket.id

  rule {
    id     = "abort-incomplete-multipart-upload"
    status = "Enabled"

    # Empty filter: the rule applies to all objects in the bucket.
    filter {}

    # Abort multipart uploads that are still incomplete one day after they started.
    abort_incomplete_multipart_upload {
      days_after_initiation = 1
    }
  }

  depends_on = [aws_s3_bucket.mach5_s3_bucket]
}
- To reduce NAT gateway costs, configure an S3 VPC endpoint in the cluster's VPC
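An S3 gateway endpoint can be declared with the `aws_vpc_endpoint` resource so that S3 traffic from the nodes bypasses the NAT gateway. This is a minimal sketch under stated assumptions: the variables `aws_region`, `vpc_id`, and `private_route_table_ids` are hypothetical inputs describing your existing network.

```hcl
# Hypothetical inputs describing the existing network.
variable "aws_region" {
  type = string
}

variable "vpc_id" {
  type = string
}

# Route tables of the subnets that host the EKS nodes.
variable "private_route_table_ids" {
  type = list(string)
}

# Gateway endpoint for S3: adds routes to the given route tables so that
# S3 traffic stays inside the AWS network instead of using the NAT gateway.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = var.vpc_id
  service_name      = "com.amazonaws.${var.aws_region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = var.private_route_table_ids
}
```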