Documentation

Trino Opensearch Connector with Mach5 Search

This guide explains how to set up the Trino OpenSearch connector with Mach5 Search, deploy it using Helm, and run queries (including joins) across Mach5 Search indices.

Install helm charts

 helm repo add trino https://trinodb.github.io/charts/
 helm repo update
 helm upgrade --install my-trino-cluster trino/trino --namespace mach5 -f trino-values.yaml

Trino values file

# trino-values.yaml

# General Trino configuration
server:
  workers: 2 # You can adjust the number of worker nodes here
  nodeScheduler:
    includeCoordinator: false

# Catalogs configuration
additionalCatalogs:
  opensearch: |
    connector.name=opensearch
    opensearch.host=warehouse-os-f2311022-19d1-44b3-9fa5-8f2f6da80b #CHANGE_ME
    opensearch.port=9200
    #opensearch.scroll-field=id
    #opensearch.sort.field=age

Port-forward to access trino

 kubectl --namespace mach5 port-forward svc/my-trino-cluster-trino 8022:8080

Download trino cli

curl -L https://repo1.maven.org/maven2/io/trino/trino-cli/476/trino-cli-476-executable.jar -o trino
chmod +x trino

show catalogs; 

SHOW TABLES FROM opensearch.default;

./trino --server http://localhost:8022 --catalog opensearch --schema default --debug --execute "select count(*) from table_name;"

Restart all trino pods

kubectl delete pod -l app.kubernetes.io/name=trino -n mach5

Join example with Trino

Suppose we have the following indices with a few documents in Mach5:

# Users index
PUT users
{
  "mappings": {
    "properties": {
      "user_id": { "type": "keyword" },
      "name": { "type": "keyword" },
      "email": { "type": "keyword" }
    }
  }
}

# Orders index
PUT orders
{
  "mappings": {
    "properties": {
      "order_id": { "type": "keyword" },
      "user_id": { "type": "keyword" },
      "amount": { "type": "double" }
    }
  }
}

# Insert users
POST users/_bulk
{ "index": {} }
{ "user_id": "u1", "name": "Alice", "email": "alice@example.com" }
{ "index": {} }
{ "user_id": "u2", "name": "Bob", "email": "bob@example.com" }

# Insert orders
POST orders/_bulk
{ "index": {} }
{ "order_id": "o1", "user_id": "u1", "amount": 120.5 }
{ "index": {} }
{ "order_id": "o2", "user_id": "u2", "amount": 75.0 }
{ "index": {} }
{ "order_id": "o3", "user_id": "u1", "amount": 200.0 }

Run a join query on the 2 indices created above, using Trino CLIi as follows:

trino:default> SELECT 
            ->     u.user_id,
            ->     u.name,
            ->     o.order_id,
            ->     o.amount
            -> FROM opensearch.default.users u
            -> JOIN opensearch.default.orders o
            ->     ON u.user_id = o.user_id
            -> WHERE o.amount > 100;
 user_id | name  | order_id | amount 
---------+-------+----------+--------
 u1      | Alice | o3       |  200.0 
 u1      | Alice | o1       |  120.5 
(2 rows)

We showed how the Trino Opensearch connector allows running SQL queries, including joins across multiple indices, directly on search data, enabling teams to use familiar SQL syntax for performing analytics on Mach5 Search.