v6.1.0 Release Notes

Migration Guide

General Updates

  • Upgrading from v5.9.x or earlier: follow the v5.10.0 migration guide before upgrading to v6.1.0. See the migration guides covering v4.7.0 through v6.1.0.
  • Rename orphanedFileGracePeriodDurationseconds to orphanedFileGracePeriodDurationSeconds in any custom config before upgrading.
  • Upgrade to the latest mach5-sdk-ts to use the new gRPC query API.

API Updates

  • Update the ingest pipeline API payloads for S3, Iceberg, and Kafka as follows:

S3 pipeline

Move the scan-tuning fields inside the connector config:

Previous
{
  "source_config": {
    "config": {
      "type": "s3"
    }
  },
  "scan_filter_mpl": 1,
  "scan_filter_batch_size": 8192,
  "segment_bin_capacity_bytes": 104857600,
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "max_files_per_ingestor": 100,
  "workflow_timeout_seconds": 7200
}
v6.1.0 snippet
{
  "source_config": {
    "config": {
      "type": "s3",
      "scan_mode": "enumerated",
      "scan_filter_mpl": 1,
      "scan_filter_batch_size": 8192,
      "segment_bin_capacity_bytes": 104857600
    }
  },
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200
}

Iceberg pipeline

Drop the scan-tuning keys and max_files_per_ingestor entirely:

Previous
{
  "source_config": {
    "config": {
      "type": "iceberg"
    }
  },
  "scan_filter_mpl": 1,
  "scan_filter_batch_size": 8192,
  "segment_bin_capacity_bytes": 104857600,
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200,
  "max_files_per_ingestor": 100
}
v6.1.0 snippet
{
  "source_config": {
    "config": {
      "type": "iceberg"
    }
  },
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200
}

Kafka pipeline

Remove the scan-filter fields, segment_bin_capacity_bytes, and max_files_per_ingestor:

Previous
{
  "source_config": {
    "config": {
      "type": "kafka"
    }
  },
  "scan_filter_mpl": 1,
  "scan_filter_batch_size": 8192,
  "segment_bin_capacity_bytes": 104857600,
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200,
  "max_files_per_ingestor": 100
}
v6.1.0 snippet
{
  "source_config": {
    "config": {
      "type": "kafka"
    }
  },
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200
}
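The three migrations above follow one pattern: S3 moves the scan-tuning fields into the connector config, while Iceberg and Kafka drop them (and max_files_per_ingestor) outright. A minimal sketch of a migration helper, with field names taken from the snippets above; treating "enumerated" as the scan_mode default is an assumption based on the S3 snippet:

```python
import copy

# Top-level keys that move into the connector config for S3
# and are dropped for Iceberg and Kafka.
SCAN_KEYS = ("scan_filter_mpl", "scan_filter_batch_size", "segment_bin_capacity_bytes")


def migrate_pipeline_payload(payload: dict) -> dict:
    """Rewrite a pre-6.1 ingest pipeline payload into the v6.1.0 shape."""
    payload = copy.deepcopy(payload)
    moved = {k: payload.pop(k) for k in SCAN_KEYS if k in payload}
    payload.pop("max_files_per_ingestor", None)  # removed for all three connectors

    config = payload["source_config"]["config"]
    if config["type"] == "s3":
        # S3 keeps the scan-tuning fields, nested inside the connector config.
        config.update(moved)
        # scan_mode is new in v6.1.0; "enumerated" matches the snippet above.
        config.setdefault("scan_mode", "enumerated")
    # Iceberg and Kafka simply drop the keys popped above.
    return payload
```

Run this over stored pipeline definitions before recreating them through the v6.1.0 API; it leaves the remaining top-level fields (segment_cache_size, append_batch_size, and so on) untouched.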

What’s Changed

Query Language & MDX Enhancements

  • typeof() and to-typed() add runtime type inspection and explicit casting for finer control over value interpretation across heterogeneous sources.
  • Composite aggregations surface missing-value buckets via missing_bucket; Terms aggregations now produce correct results for boolean and IP fields.
  • mdx-cli now accepts a query as a command-line argument, simplifying scripting without intermediate files.
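For illustration, a composite aggregation that surfaces documents lacking a value as their own bucket. This assumes Elasticsearch-compatible aggregation syntax; the aggregation and field names (by_team, team) are hypothetical:

{
  "aggs": {
    "by_team": {
      "composite": {
        "sources": [
          { "team": { "terms": { "field": "team", "missing_bucket": true } } }
        ]
      }
    }
  }
}

With "missing_bucket": true, documents without a team value land in an explicit null-keyed bucket instead of being silently dropped from the results.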

SQL Connectivity

  • Mach5 now offers PostgreSQL compatibility, so you can connect using any PostgreSQL-compatible client or BI tool.
  • Trino connectivity reaches production readiness with stability and correctness fixes for distributed query workloads.
  • SQL jobs now carry owner identity and track lifecycle state end-to-end, enabling resource governance in multi-tenant environments.

Garbage Collection

  • Orphaned file cleanup is more robust: several edge cases that left files uncollected are fixed, and the deletion grace period is increased to 7 days.
  • A task leak in the fdb-reconciler that caused unbounded memory growth on retry loops has been resolved.

Performance, Reliability & Infrastructure

  • Full-text indexing throughput is improved via batching and parallelism at the segment and term level.
  • Memory estimates now include projected columns, and both FSM and ART structures have a reduced baseline footprint.
  • Hash joins can spill to disk under memory pressure, preventing OOM failures on large join workloads.
  • Several correctness fixes land in this release: disk cache hot path, cstore snapshot consistency, Dremel null handling across all read modes, and three concurrency issues (PostOffice race, MutationGapError, job cancellation deadlock).
  • nginx worker_connections is now configurable via values.yaml for high-concurrency deployments.
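In a Helm-based deployment the override would look roughly like the sketch below. The exact key path is an assumption; consult the chart's values.yaml for the actual location of the nginx settings:

# Hypothetical values.yaml override; the key path may differ in your chart.
nginx:
  workerConnections: 16384

Raising worker_connections lifts the per-worker cap on simultaneous client connections, which matters for high-concurrency deployments fronted by a small number of nginx workers.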

External Integrations

  • OpenSearch and Elasticsearch can be connected as external data sources with query pushdown support.
  • Azure Blob Storage is now a supported ingest target with full Iceberg table support. GCS users can ingest Iceberg tables via native BigLake bucket integration.
  • BigQuery SQL pushdown is corrected for several patterns that previously fell back to client-side execution.

UI (Dex) & API changes

  • Cell outputs are updated in-place as data arrives rather than being fully recreated, reducing flicker.
  • Browser sessions refresh more reliably, reducing stale-session errors in long-running notebooks.
  • Editor fixes: Ctrl/Cmd+A selects the full active cell, connection resolution works in the ingest pipeline edit view, and dynamic value rendering is consistent.
  • Ingest pipeline source configs must be updated to the new per-type proto format when created through the API.
  • When creating a warehouse, you can choose how memory is managed during query execution. This helps balance performance, reliability, and cost based on your workload.
  • Route indices to different stores using namespace-based regex patterns.
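The routing above is first-match regex dispatch on the index namespace. A minimal sketch of that matching logic; the patterns and store names here are hypothetical, and the product's actual configuration surface may differ:

```python
import re

# Hypothetical namespace patterns mapped to store names; first match wins.
ROUTES = [
    (r"^logs-.*", "cold-store"),
    (r"^metrics-.*", "hot-store"),
]


def resolve_store(index_name: str, default: str = "default-store") -> str:
    """Return the store whose namespace pattern first matches the index name."""
    for pattern, store in ROUTES:
        if re.match(pattern, index_name):
            return store
    return default
```

Because the first matching pattern wins, order broad catch-all patterns after the specific ones.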