v6.1.0 Release Notes
Migration Guide
General Updates
- Upgrading from v5.9.x or earlier: follow the v5.10.0 migration guide before upgrading to v6.1.0. See the migration guide covering v4.7.0 through v6.1.0.
- Rename `orphanedFileGracePeriodDurationseconds` → `orphanedFileGracePeriodDurationSeconds` in any custom config before upgrading.
- Upgrade to the latest `mach5-sdk-ts` to use the new gRPC query API.
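If the custom config is stored as JSON, the rename can be scripted. A minimal sketch, assuming a flat JSON config object; the helper name and config layout are hypothetical:

```python
import json

# Sketch only: assumes the custom config is a flat JSON object.
# The helper name is hypothetical, not part of any Mach5 tooling.
OLD_KEY = "orphanedFileGracePeriodDurationseconds"
NEW_KEY = "orphanedFileGracePeriodDurationSeconds"

def rename_grace_period_key(config: dict) -> dict:
    # Move the value to the corrected key; leave other keys untouched.
    if OLD_KEY in config:
        config[NEW_KEY] = config.pop(OLD_KEY)
    return config

cfg = {"orphanedFileGracePeriodDurationseconds": 86400}
print(json.dumps(rename_grace_period_key(cfg)))
# → {"orphanedFileGracePeriodDurationSeconds": 86400}
```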
API Updates
- Update the ingest pipeline API payloads for S3, Iceberg, and Kafka as follows:
S3 pipeline
Move the scan-tuning fields inside the connector config:
Previous

```json
{
  "source_config": {
    "config": {
      "type": "s3"
    }
  },
  "scan_filter_mpl": 1,
  "scan_filter_batch_size": 8192,
  "segment_bin_capacity_bytes": 104857600,
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "max_files_per_ingestor": 100,
  "workflow_timeout_seconds": 7200
}
```
6.1 snippet

```json
{
  "source_config": {
    "config": {
      "type": "s3",
      "scan_mode": "enumerated",
      "scan_filter_mpl": 1,
      "scan_filter_batch_size": 8192,
      "segment_bin_capacity_bytes": 104857600
    }
  },
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200
}
```
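The relocation above can be applied mechanically to existing payloads. A minimal sketch, not part of the Mach5 SDK: it moves the scan-tuning keys into the connector config and drops the removed `max_files_per_ingestor` key, but does not set the new `scan_mode` field, which should be chosen per pipeline.

```python
# Hypothetical migration helper for S3 pipeline payloads.
SCAN_TUNING_KEYS = (
    "scan_filter_mpl",
    "scan_filter_batch_size",
    "segment_bin_capacity_bytes",
)

def migrate_s3_payload(payload: dict) -> dict:
    # Shallow copy of the top level; the nested connector config is
    # mutated in place.
    payload = dict(payload)
    connector = payload["source_config"]["config"]
    for key in SCAN_TUNING_KEYS:
        if key in payload:
            connector[key] = payload.pop(key)
    # max_files_per_ingestor was removed in v6.1.0.
    payload.pop("max_files_per_ingestor", None)
    return payload
```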
Iceberg pipeline
Drop the scan-tuning keys and `max_files_per_ingestor` entirely.
Previous

```json
{
  "source_config": {
    "config": {
      "type": "iceberg"
    }
  },
  "scan_filter_mpl": 1,
  "scan_filter_batch_size": 8192,
  "segment_bin_capacity_bytes": 104857600,
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200,
  "max_files_per_ingestor": 100
}
```
6.1 snippet

```json
{
  "source_config": {
    "config": {
      "type": "iceberg"
    }
  },
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200
}
```
Kafka pipeline
Remove the scan filters, `segment_bin_capacity_bytes`, and `max_files_per_ingestor`.
Previous

```json
{
  "source_config": {
    "config": {
      "type": "kafka"
    }
  },
  "scan_filter_mpl": 1,
  "scan_filter_batch_size": 8192,
  "segment_bin_capacity_bytes": 104857600,
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200,
  "max_files_per_ingestor": 100
}
```
6.1 snippet

```json
{
  "source_config": {
    "config": {
      "type": "kafka"
    }
  },
  "max_ingest_workflows_limit": 4,
  "segment_cache_size": 134217728,
  "append_batch_size": 104857600,
  "ignore_mapping_errors": false,
  "workflow_timeout_seconds": 7200
}
```
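For the Iceberg and Kafka pipelines the migration is a plain key removal at the top level of the payload. A minimal sketch, assuming the deprecated keys only ever appear at the top level; the function and constant names are hypothetical:

```python
# Hypothetical cleanup helper, not an official tool. Removes the keys that
# no longer exist in the v6.1.0 Iceberg and Kafka payload schemas.
REMOVED_KEYS = frozenset({
    "scan_filter_mpl",
    "scan_filter_batch_size",
    "segment_bin_capacity_bytes",
    "max_files_per_ingestor",
})

def strip_removed_keys(payload: dict) -> dict:
    # Top-level keys only; the nested connector config is left untouched.
    return {k: v for k, v in payload.items() if k not in REMOVED_KEYS}
```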
What’s Changed
Query Language & MDX Enhancements
- `typeof()` and `to-typed()` add runtime type inspection and explicit casting for finer control over value interpretation across heterogeneous sources.
- Composite aggregations surface missing-value buckets via `missing_bucket`; `Terms` aggregations now produce correct results for boolean and IP fields.
- `mdx-cli` now accepts a query as a command-line argument, simplifying scripting without intermediate files.
SQL Connectivity
- Mach5 now offers PostgreSQL compatibility, so you can connect using any PostgreSQL-compatible client or BI tool.
- Trino connectivity reaches production readiness with stability and correctness fixes for distributed query workloads.
- SQL jobs now carry owner identity and track lifecycle state end-to-end, enabling resource governance in multi-tenant environments.
Garbage Collection
- Orphaned file cleanup is more robust: several edge cases leaving files uncollected are fixed, and the deletion grace period is increased to 7 days.
- A task leak in the fdb-reconciler causing unbounded memory growth on retry loops has been resolved.
Performance, Reliability & Infrastructure
- Full-text indexing throughput is improved via batching and parallelism at the segment and term level.
- Memory estimates now include projected columns, and both FSM and ART structures have a reduced baseline footprint.
- Hash joins can spill to disk under memory pressure, preventing OOM failures on large join workloads.
- Several correctness fixes land in this release: disk cache hot path and `cstore` snapshot consistency, Dremel null handling across all read modes, and three concurrency issues (PostOffice race, `MutationGapError`, job cancellation deadlock).
- nginx `worker_connections` is now configurable via `values.yaml` for high-concurrency deployments.
External Integrations
- OpenSearch and Elasticsearch can be connected as external data sources with query pushdown support.
- Azure Blob Storage is now a supported ingest target with full Iceberg table support. GCS users can ingest Iceberg tables via native BigLake bucket integration.
- BigQuery SQL pushdown is corrected for several patterns that previously fell back to client-side execution.
UI (Dex) & API Changes
- Cell outputs are updated in-place as data arrives rather than being fully recreated, reducing flicker.
- Browser sessions refresh more reliably, reducing stale-session errors in long-running notebooks.
- Editor fixes: Ctrl/Cmd+A selects the full active cell, connection resolution works in the ingest pipeline edit view, and dynamic value rendering is consistent.
- Ingest pipeline source configs must be updated to the new per-type proto format when creating pipelines through the API.
- When creating a warehouse, you can choose how memory is managed during query execution. This helps balance performance, reliability, and cost based on your workload.
- Route indices to different stores using namespace-based regex patterns.
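The routing described in the last bullet can be pictured as a first-match lookup over regex rules. An illustrative sketch only: the actual routing configuration syntax is not shown in these notes, so the rule table, store names, and first-match-wins semantics below are all assumptions.

```python
import re

# Hypothetical namespace-based routing table: pattern -> store name.
ROUTING_RULES = [
    (re.compile(r"^logs-.*"), "cold-store"),
    (re.compile(r"^metrics-.*"), "hot-store"),
]

def resolve_store(index_name: str, default: str = "default-store") -> str:
    # First pattern that matches the index name wins; otherwise fall back.
    for pattern, store in ROUTING_RULES:
        if pattern.match(index_name):
            return store
    return default

print(resolve_store("logs-2024-06"))  # cold-store
```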