v6.1.0 Release Notes

Migration Guide

General Updates

  • Upgrading from v5.9.x or earlier: follow the v5.10.0 migration guide before upgrading to v6.1.0. See the migration guide covering v4.7.0 through v6.1.0.
  • Rename orphanedFileGracePeriodDurationseconds to orphanedFileGracePeriodDurationSeconds in any custom config before upgrading.
  • Upgrade to the latest mach5-sdk-ts to use the new gRPC query API.
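The grace-period key rename above can be applied mechanically. A minimal Python sketch, assuming the custom config has been loaded into a dict (the helper name is illustrative):

```python
import json

OLD_KEY = "orphanedFileGracePeriodDurationseconds"
NEW_KEY = "orphanedFileGracePeriodDurationSeconds"

def rename_grace_period_key(config: dict) -> dict:
    """Return a copy of the config with the pre-6.1 key renamed."""
    migrated = dict(config)
    if OLD_KEY in migrated:
        migrated[NEW_KEY] = migrated.pop(OLD_KEY)
    return migrated

# A custom config that still uses the pre-6.1 key name.
config = {"orphanedFileGracePeriodDurationseconds": 604800}
print(json.dumps(rename_grace_period_key(config)))
```

Configs that already use the new key name pass through unchanged.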

API Updates

  • Update the ingest pipeline API payloads for S3, Iceberg, and Kafka as follows:

S3 pipeline

Move the scan-tuning fields (scan_filter_mpl, scan_filter_batch_size, segment_bin_capacity_bytes) inside the connector config, set the new scan_mode field, and drop max_files_per_ingestor:

Previous
{
  "source_config": {
    "config": {
      "type": "s3"
    }
  },
  "scan_filter_mpl": 1,
  "scan_filter_batch_size": 8192,
  "segment_bin_capacity_bytes": 104857600,
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "max_files_per_ingestor": 100,
  "workflow_timeout_seconds": 7200
}
v6.1.0 snippet
{
  "source_config": {
    "config": {
      "type": "s3",
      "scan_mode": "enumerated",
      "scan_filter_mpl": 1,
      "scan_filter_batch_size": 8192,
      "segment_bin_capacity_bytes": 104857600
    }
  },
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200
}
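The S3 payload change above can be scripted when many pipelines need migrating. A sketch of the transformation (the helper name is illustrative; scan_mode defaults to the "enumerated" value shown in the snippet):

```python
# Top-level keys that move into source_config.config in v6.1.0.
SCAN_TUNING_KEYS = ("scan_filter_mpl", "scan_filter_batch_size",
                    "segment_bin_capacity_bytes")

def migrate_s3_payload(payload: dict) -> dict:
    """Move scan-tuning keys into the connector config and drop
    max_files_per_ingestor, per the v6.1.0 S3 pipeline shape."""
    migrated = {k: v for k, v in payload.items()
                if k not in SCAN_TUNING_KEYS and k != "max_files_per_ingestor"}
    connector = dict(payload["source_config"]["config"])
    connector.setdefault("scan_mode", "enumerated")  # new field in 6.1
    for key in SCAN_TUNING_KEYS:
        if key in payload:
            connector[key] = payload[key]
    migrated["source_config"] = {**payload["source_config"], "config": connector}
    return migrated
```

Running this over the "Previous" payload above yields the v6.1.0 shape shown in the snippet.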

Iceberg pipeline

Drop the scan-tuning keys and max_files_per_ingestor entirely:

Previous
{
  "source_config": {
    "config": {
      "type": "iceberg"
    }
  },
  "scan_filter_mpl": 1,
  "scan_filter_batch_size": 8192,
  "segment_bin_capacity_bytes": 104857600,
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200,
  "max_files_per_ingestor": 100
}
v6.1.0 snippet
{
  "source_config": {
    "config": {
      "type": "iceberg"
    }
  },
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200
}

Kafka pipeline

Remove the scan filters, segment_bin_capacity_bytes, and max_files_per_ingestor:

Previous
{
  "source_config": {
    "config": {
      "type": "kafka"
    }
  },
  "scan_filter_mpl": 1,
  "scan_filter_batch_size": 8192,
  "segment_bin_capacity_bytes": 104857600,
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200,
  "max_files_per_ingestor": 100
}
v6.1.0 snippet
{
  "source_config": {
    "config": {
      "type": "kafka"
    }
  },
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200
}
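For Iceberg and Kafka the change is purely subtractive, so one helper covers both. A minimal sketch (the function name is illustrative):

```python
# Top-level keys removed from Iceberg and Kafka pipeline payloads in v6.1.0.
REMOVED_KEYS = ("scan_filter_mpl", "scan_filter_batch_size",
                "segment_bin_capacity_bytes", "max_files_per_ingestor")

def strip_removed_keys(payload: dict) -> dict:
    """Drop the top-level keys that v6.1.0 no longer accepts."""
    return {k: v for k, v in payload.items() if k not in REMOVED_KEYS}
```

Applying this to either "Previous" payload above produces the corresponding v6.1.0 snippet.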

What’s Changed

Query Language & MDX Enhancements

  • typeof() and to-typed() add runtime type inspection and explicit casting for finer control over value interpretation across heterogeneous sources.
  • Composite aggregations surface missing-value buckets via missing_bucket; Terms aggregations now produce correct results for boolean and IP fields.
  • mdx-cli now accepts a query as a command-line argument, simplifying scripting without intermediate files.

SQL Connectivity

  • Mach5 now offers PostgreSQL compatibility, so you can connect using any PostgreSQL-compatible client or BI tool.
  • Trino connectivity reaches production readiness with stability and correctness fixes for distributed query workloads.
  • SQL jobs now carry owner identity and track lifecycle state end-to-end, enabling resource governance in multi-tenant environments.

Garbage Collection

  • Orphaned file cleanup is more robust: several edge cases leaving files uncollected are fixed, and the deletion grace period is increased to 7 days.
  • A task leak in the fdb-reconciler causing unbounded memory growth on retry loops has been resolved.

Performance, Reliability & Infrastructure

  • Full-text indexing throughput is improved via batching and parallelism at the segment and term level.
  • Memory estimates now include projected columns, and both FSM and ART structures have a reduced baseline footprint.
  • Hash joins can spill to disk under memory pressure, preventing OOM failures on large join workloads.
  • Several correctness fixes land in this release: disk cache hot path, cstore snapshot consistency, Dremel null handling across all read modes, and three concurrency issues (PostOffice race, MutationGapError, job cancellation deadlock).
  • nginx worker_connections is now configurable via values.yaml for high-concurrency deployments.
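The nginx setting might look like the following values.yaml fragment; the exact key path is an assumption, so verify it against the chart's documented values:

```yaml
# Hypothetical values.yaml fragment: raise nginx worker_connections
# for high-concurrency deployments. Confirm the key path in the chart docs.
nginx:
  workerConnections: 16384
```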

External Integrations

  • OpenSearch and Elasticsearch can be connected as external data sources with query pushdown support.
  • Azure Blob Storage is now a supported ingest target with full Iceberg table support. GCS users can ingest Iceberg tables via native BigLake bucket integration.
  • BigQuery SQL pushdown is corrected for several patterns that previously fell back to client-side execution.

UI (Dex) & API changes

  • Cell outputs are updated in-place as data arrives rather than being fully recreated, reducing flicker.
  • Browser sessions refresh more reliably, reducing stale-session errors in long-running notebooks.
  • Editor fixes: Ctrl/Cmd+A selects the full active cell, connection resolution works in the ingest pipeline edit view, and dynamic value rendering is consistent.
  • Ingest pipeline source configs must be updated to the new per-type proto format when creating through API.
  • When creating a warehouse, you can choose how memory is managed during query execution. This helps balance performance, reliability, and cost based on your workload.
  • Route indices to different stores using namespace-based regex patterns.
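Namespace-based routing is essentially first-match regex dispatch. A minimal Python sketch of the idea; the rule list, store names, and function are hypothetical illustrations, not the actual configuration syntax:

```python
import re

# Hypothetical routing rules: the first pattern that matches an index's
# namespace selects its store; unmatched indices use the default store.
ROUTING_RULES = [
    (re.compile(r"^logs-.*"), "cold-store"),
    (re.compile(r"^metrics-.*"), "hot-store"),
]

def route_index(namespace: str, default: str = "default-store") -> str:
    """Return the store for an index namespace via first-match regex rules."""
    for pattern, store in ROUTING_RULES:
        if pattern.match(namespace):
            return store
    return default

print(route_index("logs-2024"))  # → cold-store
print(route_index("users"))      # → default-store
```

Rule order matters with first-match dispatch, so place more specific patterns before broader ones.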
