A Prometheus exporter that collects output buffer metrics from Trino workers, enabling safe Horizontal Pod Autoscaler (HPA) scale-down decisions for Trino clusters running on Kubernetes.
Trino workers involved in distributed queries hold output buffers containing intermediate data destined for downstream stages. Scaling down a worker that has active output buffers kills the query for all users. Standard CPU/memory metrics cannot detect this -- you need visibility into the query execution layer.
This exporter solves the problem by:
- Querying each Trino coordinator for all running queries
- Streaming-parsing the (often massive) query detail JSON to extract per-worker output buffer state
- Mapping worker IPs to Kubernetes pod names via Prometheus
- Exposing per-pod buffer metrics that HPA or custom controllers can use to protect active workers from scale-down
```
+---------------------+          +------------------------+
|  Trino Coordinator  |          |   Prometheus Server    |
|  /v1/query (REST)   |<---------+  (IP-to-pod mapping)   |
+----------+----------+          +----------+-------------+
           |                                |
           v                                v
+----------+--------------------------------+-------------+
|                  Trino Buffer Exporter                   |
|                                                          |
|  1. Query Prometheus for worker IP -> pod name mapping   |
|  2. GET /v1/query?state=RUNNING from each coordinator    |
|  3. Stream-parse /v1/query/{id} for outputBuffers stats  |
|  4. Aggregate per-worker and expose as Prometheus gauges |
+----------------------------+----------------------------+
                             |
                             v
                   +---------+---------+
                   | /metrics endpoint |
                   |    (port 8000)    |
                   +---------+---------+
                             |
                             v
                   +---------+---------+
                   | Prometheus scrape |-----> HPA / Alerts
                   +-------------------+
```
| Metric | Type | Labels | Description |
|---|---|---|---|
| `trino_worker_output_buffered_bytes` | Gauge | pod, namespace, release | Total bytes in output buffers waiting to be consumed |
| `trino_worker_output_buffered_pages` | Gauge | pod, namespace, release | Total pages in output buffers |
| `trino_worker_active_output_buffers` | Gauge | pod, namespace, release | Count of output buffers NOT in FINISHED state |
| `trino_worker_output_pages_sent` | Gauge | pod, namespace, release | Total pages already sent from output buffers |
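When scraped, these gauges appear in the standard Prometheus text exposition format. As a hedged illustration of what one sample line looks like (the exporter itself presumably uses a client library rather than hand-rendering; `render_gauge` is a name invented for this sketch):

```python
def render_gauge(name: str, help_text: str, samples: list) -> str:
    """Render one gauge in Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        # Labels are sorted for a stable, reproducible output order.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines)

print(render_gauge(
    "trino_worker_output_buffered_bytes",
    "Total bytes in output buffers waiting to be consumed",
    [({"pod": "trino-worker-0", "namespace": "trino", "release": "blue"}, 1048576.0)],
))
```

A scale-down controller can then treat any pod whose `trino_worker_output_buffered_bytes` is nonzero as unsafe to evict.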
```bash
docker build -t trino-buffer-exporter:latest .
```

```bash
helm upgrade --install trino-buffer-exporter ./chart \
  -n trino \
  -f chart/values.yaml
```

For environment-specific overrides, layer an additional values file:

```bash
helm upgrade --install trino-buffer-exporter ./chart \
  -n trino \
  -f chart/values.yaml \
  -f examples/values-blue-green.yaml
```

| Parameter | Default | Description |
|---|---|---|
| `image.repository` | `trino-buffer-exporter` | Docker image repository |
| `image.tag` | `latest` | Docker image tag |
| `image.pullPolicy` | `IfNotPresent` | Image pull policy |
| `namespace` | `trino` | Kubernetes namespace for deployment |
| `prometheus.url` | `http://prometheus-server.prometheus.svc:80` | Prometheus server URL for IP-to-pod queries |
| `prometheus.auth.type` | `none` | Prometheus auth type: `none`, `basic`, `bearer` |
| `prometheus.auth.existingSecret` | `""` | K8s Secret name for Prometheus credentials |
| `trinoAuth.type` | `header` | Trino auth type: `header`, `basic`, `bearer` |
| `trinoAuth.headerName` | `X-Trino-User` | Header name for header-based auth |
| `trinoAuth.headerValue` | `""` | Header value for header-based auth |
| `trinoAuth.existingSecret` | `""` | K8s Secret name for Trino credentials |
| `coordinators` | `{default: {url: ..., release: ...}}` | Map of Trino coordinator endpoints |
| `workerHttpPort` | `8080` | HTTP port workers listen on (for IP mapping) |
| `ipToPodMetric` | `trino_execution_executor_TaskExecutor_Tasks` | Prometheus metric used for IP-to-pod mapping |
| `pollIntervalSeconds` | `15` | Seconds between collection cycles |
| `requestTimeoutSeconds` | `60` | HTTP request timeout for Trino API calls |
| `metricsPort` | `8000` | Port for the `/metrics` endpoint |
| `logging.level` | `INFO` | Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR` |
| `serviceMonitor.enabled` | `true` | Create a Prometheus ServiceMonitor resource |
| `serviceMonitor.interval` | `15s` | Scrape interval for the ServiceMonitor |
| `serviceMonitor.namespace` | `""` | Namespace for the ServiceMonitor (defaults to deployment namespace) |
| `serviceMonitor.additionalLabels` | `{}` | Extra labels on the ServiceMonitor |
| `resources.requests.cpu` | `100m` | CPU request |
| `resources.requests.memory` | `256Mi` | Memory request |
| `resources.limits.cpu` | `500m` | CPU limit |
| `resources.limits.memory` | `1Gi` | Memory limit |
| `nodeAffinity.requiredLabels` | `{}` | Node affinity label requirements (empty = no constraint) |
| `tolerations` | `[]` | Pod tolerations |
| `nodeSelector` | `{}` | Pod node selector |
| `datadog.enabled` | `false` | Enable Datadog log annotations |
| `datadog.source` | `trino-buffer-exporter` | Datadog log source |
| `datadog.service` | `trino-buffer-exporter` | Datadog log service |
Each coordinator entry requires:

```yaml
coordinators:
  <name>:
    url: "http://<service>.<namespace>.svc:<port>"
    release: "<helm-release-name>"
```

The `release` label is attached to exported metrics so you can distinguish workers from different Trino deployments.
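For example, a hypothetical blue/green setup might define two coordinators (the service names here are invented for illustration):

```yaml
coordinators:
  blue:
    url: "http://trino-blue-coordinator.trino.svc:8080"
    release: "trino-blue"
  green:
    url: "http://trino-green-coordinator.trino.svc:8080"
    release: "trino-green"
```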
Three auth modes are supported.

**Header-based (default):** sends a custom header with each request. This is the standard Trino approach for single-user service accounts.

```yaml
trinoAuth:
  type: "header"
  headerName: "X-Trino-User"
  headerValue: "monitoring"
```

**Basic auth:** username/password via HTTP Basic Authentication. Store the password in a Kubernetes Secret.

```yaml
trinoAuth:
  type: "basic"
  existingSecret: "trino-credentials"  # Must have a 'password' key
```

**Bearer token:** token-based authentication (e.g., OAuth2, JWT).

```yaml
trinoAuth:
  type: "bearer"
  existingSecret: "trino-credentials"  # Must have a 'token' key
```

If your Prometheus server requires authentication:
```yaml
prometheus:
  url: "https://prometheus.example.com"
  auth:
    type: "bearer"
    existingSecret: "prometheus-credentials"  # Must have a 'token' key
```

For Trino basic auth:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: trino-credentials
type: Opaque
stringData:
  password: "your-password"
```

For bearer token auth:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: trino-credentials
type: Opaque
stringData:
  token: "your-token"
```

See the `examples/` directory for ready-to-use values files:
- `values-basic.yaml` -- single coordinator, minimal configuration
- `values-blue-green.yaml` -- blue/green deployment with two coordinators
- `values-with-auth.yaml` -- Trino basic auth + Prometheus bearer token
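All three Trino auth modes described above reduce to attaching one HTTP header per request. A hedged sketch of that dispatch (`build_auth_header` is a name invented here, and where the `username` for basic auth comes from is an assumption, since the Secret is documented to carry only the password):

```python
import base64

def build_auth_header(cfg: dict) -> dict:
    """Return the HTTP header for one of the three Trino auth modes."""
    auth_type = cfg.get("type", "header")
    if auth_type == "header":
        # Custom identity header, e.g. X-Trino-User: monitoring
        return {cfg.get("headerName", "X-Trino-User"): cfg["headerValue"]}
    if auth_type == "basic":
        # HTTP Basic: base64("user:password")
        creds = f"{cfg['username']}:{cfg['password']}".encode()
        return {"Authorization": "Basic " + base64.b64encode(creds).decode()}
    if auth_type == "bearer":
        return {"Authorization": "Bearer " + cfg["token"]}
    raise ValueError(f"unknown auth type: {auth_type}")
```

The same shape applies to the Prometheus side, which supports `none`, `basic`, and `bearer`.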
The exporter runs a continuous collection loop:

1. **Build IP-to-pod map:** queries Prometheus for a known Trino worker metric (default: `trino_execution_executor_TaskExecutor_Tasks`) to map each worker's `instance` IP to its Kubernetes pod name.
2. **Discover running queries:** for each configured coordinator, calls `GET /v1/query?state=RUNNING` to get the list of active query IDs.
3. **Stream-parse query details:** for each running query, calls `GET /v1/query/{queryId}` with streaming enabled. The response can be tens of megabytes for complex queries. The exporter uses `ijson` to incrementally parse the JSON without loading it into memory, extracting `taskStatus.self` (worker address) and `outputBuffers` stats (buffered bytes, pages, state).
4. **Aggregate and export:** buffer stats are aggregated per worker across all queries. Workers are mapped to pod names, and Prometheus gauges are updated. Metrics are cleared each cycle to avoid stale label sets.
5. **Sleep and repeat:** the loop sleeps for the remaining time in the poll interval before starting the next cycle.
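The "sleep for the remaining time" step keeps cycles on a fixed cadence rather than drifting by the cost of each collection. A minimal sketch of that timing logic (the function name is an assumption, not the exporter's actual code):

```python
import time

def remaining_sleep(interval_s: float, cycle_started: float, now: float = None) -> float:
    """Seconds left in the poll interval after a collection cycle.

    Returns 0 when the cycle overran the interval, so the next cycle
    starts immediately instead of sleeping a negative duration.
    """
    if now is None:
        now = time.monotonic()
    elapsed = now - cycle_started
    return max(0.0, interval_s - elapsed)

# e.g. with a 15s interval, a cycle that took 4.2s leaves ~10.8s to sleep
print(remaining_sleep(15.0, cycle_started=100.0, now=104.2))
```

Using a monotonic clock matters here: wall-clock time can jump (NTP corrections), which would distort the sleep calculation.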
The streaming JSON parser is critical for production use. A single Trino query detail response can exceed 100MB when the query has many splits. Loading these into memory would cause OOM kills. The ijson-based parser processes the response incrementally, keeping memory usage constant regardless of response size.
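The aggregation that consumes the stream can be illustrated in stdlib Python. Here `events` stands in for whatever the real ijson-based parser yields; the tuple shape is an assumption made for this sketch. The key property is that only a rolling per-worker total is kept, so memory scales with the number of workers, not the response size:

```python
from collections import defaultdict

def aggregate_buffers(events):
    """Aggregate per-worker buffer stats from a stream of
    (worker_ip, buffered_bytes, buffered_pages, state) tuples.
    """
    totals = defaultdict(lambda: {"bytes": 0, "pages": 0, "active": 0})
    for worker_ip, buffered_bytes, buffered_pages, state in events:
        agg = totals[worker_ip]
        agg["bytes"] += buffered_bytes
        agg["pages"] += buffered_pages
        if state != "FINISHED":
            # Matches the trino_worker_active_output_buffers definition:
            # count buffers NOT in FINISHED state.
            agg["active"] += 1
    return dict(totals)
```

Because `events` is consumed lazily, this composes with a generator that parses the HTTP response chunk by chunk, never materializing the full document.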
```bash
pip install -r requirements.txt
```

```bash
python trino-buffer-exporter.py --config config.yaml --log-level DEBUG
```

```bash
# Lint
python -m py_compile trino-buffer-exporter.py

# Helm lint
helm lint ./chart
```

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
Trino Buffer Exporter is maintained by Simon, the agentic marketing platform that combines customer data with real-world signals to orchestrate personalized, 1:1 campaigns at scale. We built this tool to safely autoscale our Trino query clusters and open-sourced it so others can benefit.
Apache License 2.0. See LICENSE for details.