Golf provides comprehensive monitoring capabilities including configurable health check endpoints for containerized deployments and OpenTelemetry integration for distributed tracing.

Health check endpoints

New in v0.2.11: Golf supports health and readiness check endpoints with custom logic, Kubernetes compatibility, and flexible response formats.
Golf provides built-in health check endpoints designed for containerized deployments, Kubernetes health probes, and load balancers. The framework supports both a liveness endpoint (/health) and a readiness endpoint (/ready) with customizable logic: create health and readiness check files in your project root to implement your own checks.

Custom health check

Create health.py in your project root for liveness probes:
from datetime import datetime, timezone

from starlette.responses import JSONResponse

def check():
    """Health check function for liveness probe."""
    timestamp = datetime.now(timezone.utc).isoformat()
    try:
        # Check critical dependencies
        # (replace these helpers with checks specific to your application)
        database_ok = check_database_connection()
        external_apis_ok = check_external_apis()

        if database_ok and external_apis_ok:
            return JSONResponse(
                {
                    "status": "pass",
                    "timestamp": timestamp,
                    "version": "1.0.0",
                    "services": {
                        "database": "healthy",
                        "external_apis": "healthy"
                    }
                },
                status_code=200  # Healthy - Kubernetes won't restart the container
            )
        else:
            return JSONResponse(
                {
                    "status": "fail",
                    "timestamp": timestamp,
                    "error": "Critical services unavailable",
                    "services": {
                        "database": "healthy" if database_ok else "failed",
                        "external_apis": "healthy" if external_apis_ok else "failed"
                    }
                },
                status_code=503  # Unhealthy - Kubernetes will restart the container
            )
    except Exception as e:
        return JSONResponse(
            {
                "status": "fail",
                "timestamp": timestamp,
                "error": f"Health check exception: {str(e)}"
            },
            status_code=503
        )
Generated endpoint: GET /health

Custom readiness check

Create readiness.py in your project root for readiness probes:
from datetime import datetime, timezone

from starlette.responses import JSONResponse

def check():
    """Readiness check function for readiness probe."""
    timestamp = datetime.now(timezone.utc).isoformat()
    try:
        # Check whether the server is ready to serve traffic
        # (readiness can be stricter than the liveness check;
        # replace these helpers with checks specific to your application)
        database_ready = check_database_connection_pool()
        cache_ready = check_redis_connection()
        queue_ready = check_message_queue()

        if database_ready and cache_ready and queue_ready:
            return JSONResponse(
                {
                    "status": "pass",
                    "timestamp": timestamp,
                    "services": {
                        "database": "ready",
                        "cache": "ready",
                        "message_queue": "ready"
                    },
                    "ready_since": "2025-09-26T12:00:00Z"  # e.g. recorded at startup
                },
                status_code=200  # Ready - Kubernetes will send traffic
            )
        else:
            return JSONResponse(
                {
                    "status": "fail",
                    "timestamp": timestamp,
                    "error": "Services not ready for traffic",
                    "services": {
                        "database": "ready" if database_ready else "not_ready",
                        "cache": "ready" if cache_ready else "not_ready",
                        "message_queue": "ready" if queue_ready else "not_ready"
                    }
                },
                status_code=503  # Not ready - Kubernetes removes the pod from the load balancer
            )
    except Exception as e:
        return JSONResponse(
            {
                "status": "fail",
                "timestamp": timestamp,
                "error": f"Readiness check failed: {str(e)}"
            },
            status_code=503
        )
Generated endpoint: GET /ready
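
Once the server is running, you can exercise both endpoints directly. Below is a minimal sketch using only the Python standard library; the base URL and port are assumptions, so adjust them to match your deployment:

import urllib.error
import urllib.request

BASE_URL = "http://localhost:3000"  # assumed local address; adjust for your server

def probe(path: str) -> None:
    """Request a probe endpoint and print its HTTP status code and body."""
    try:
        with urllib.request.urlopen(f"{BASE_URL}{path}") as response:
            status, body = response.status, response.read().decode()
    except urllib.error.HTTPError as exc:  # a 503 response raises HTTPError
        status, body = exc.code, exc.read().decode()
    print(path, status, body)

probe("/health")  # liveness: 200 when healthy, 503 when failing
probe("/ready")   # readiness: 200 when ready to serve traffic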

Response format options

Your check() function can return either of two formats:

1. JSONResponse (explicit status control)

from starlette.responses import JSONResponse

def check():
    return JSONResponse(
        {"status": "pass", "timestamp": "2025-09-26T13:01:10Z"}, 
        status_code=200  # Explicit control over HTTP status
    )

2. Structured dictionary (auto-converted)

def check():
    return {
        "status": "pass",  # or "fail" 
        "timestamp": "2025-09-26T13:01:10Z",
        "additional_data": "value"
    }
    # Automatically becomes JSONResponse with HTTP 200

HTTP status code guidelines

For Kubernetes health probes:
  • HTTP 200: Service is healthy/ready - continue normal operation
  • HTTP 503: Service is unhealthy/not ready - restart container or remove from load balancer
# ✅ Healthy - HTTP 200
return JSONResponse({"status": "pass"}, status_code=200)

# ❌ Unhealthy - HTTP 503 (triggers restart/removal)
return JSONResponse({"status": "fail"}, status_code=503)

# ⚠️ Degraded - HTTP 200 (still serving traffic)
return JSONResponse({
    "status": "pass", 
    "degraded": True,
    "message": "Running with reduced functionality"
}, status_code=200)

Default behavior

When custom files don’t exist but health checks are enabled via configuration:
  • Readiness endpoint: GET /ready returns {"status": "pass"} (HTTP 200)
  • Health endpoint: GET /health returns plain text “OK” (configurable)

Legacy configuration (deprecated)

The golf.json configuration approach is deprecated in favor of custom health.py and readiness.py files. This legacy approach will be removed in a future version.
{
  "health_check_enabled": true,
  "health_check_path": "/health",
  "health_check_response": "OK"
}

Troubleshooting

Common issues

Files not copied into the build:
  • Ensure health.py and readiness.py are in the project root, at the same level as golf.json
Import errors:
  • Verify the files have valid Python syntax and each defines a check() function
Wrong status codes:
  • Return JSONResponse objects for explicit control over the HTTP status code (see the sketch below)
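
To debug these issues locally, it can help to call your check() function directly before deploying. A minimal sketch, assuming health.py is importable from the current directory (readiness.py can be tested the same way):

from starlette.responses import JSONResponse

from health import check

result = check()
if isinstance(result, JSONResponse):
    print("status code:", result.status_code)
    print("body:", result.body.decode())
else:
    # Plain dictionaries are auto-converted to a JSONResponse with HTTP 200
    print("dict result (returned as HTTP 200):", result)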

Telemetry

Golf provides comprehensive telemetry capabilities through automatic OpenTelemetry integration and enhanced instrumentation.

Overview

Golf’s telemetry system provides:
  • OpenTelemetry integration - Configurable distributed tracing
  • Detailed tracing - For tools, resources, prompts, and Golf utilities
  • Input/output capture - Safe serialization of request and response data
  • Configurable detail levels - Control what gets traced and monitored

OpenTelemetry configuration

Basic configuration

Configure OpenTelemetry in your golf.json:
{
  "name": "my-project",
  "opentelemetry_enabled": true,
  "opentelemetry_default_exporter": "otlp_http"
}

Detailed tracing (optional)

By default, Golf captures basic telemetry like execution timing, success/failure rates, and error information. You can optionally enable detailed tracing to capture input and output data:
{
  "name": "my-project",
  "opentelemetry_enabled": true,
  "detailed_tracing": true  // Enable input/output capture
}
What detailed tracing captures:
  • Tool Input/Output - Function parameters and return values
  • Resource Content - Resource parameters and returned data
  • Golf Utilities - Full elicitation prompts/responses and sampling conversations
  • Request Context - Complete request metadata and session information
Security considerations:
  • Detailed tracing may capture sensitive user data
  • Recommended only for development and testing environments
  • Uses safe serialization with size limits to keep captured payloads bounded
  • Can be disabled in production while keeping basic telemetry active
// Production configuration - basic telemetry only
{
  "detailed_tracing": false,  // Disable for production
  "opentelemetry_enabled": true  // Keep basic telemetry active
}

Advanced configuration

For custom telemetry setups, use environment variables:
# OTLP HTTP Exporter (recommended)
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318/v1/traces
OTEL_SERVICE_NAME=my-mcp-server
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer your-token

# Console Exporter (for debugging)
OTEL_TRACES_EXPORTER=console
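
These variables follow standard OpenTelemetry SDK conventions. As a reference point, the sketch below shows roughly what an equivalent manual OTLP HTTP exporter setup looks like with the opentelemetry-sdk and opentelemetry-exporter-otlp-proto-http packages; it illustrates the underlying SDK behavior, not Golf's internal wiring:

from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# Equivalent to OTEL_SERVICE_NAME
resource = Resource.create({"service.name": "my-mcp-server"})

provider = TracerProvider(resource=resource)
provider.add_span_processor(
    BatchSpanProcessor(
        # Equivalent to OTEL_EXPORTER_OTLP_ENDPOINT / OTEL_EXPORTER_OTLP_HEADERS
        OTLPSpanExporter(
            endpoint="http://localhost:4318/v1/traces",
            headers={"Authorization": "Bearer your-token"},
        )
    )
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("my-mcp-server")
with tracer.start_as_current_span("example-operation"):
    pass  # spans created here are exported over OTLP HTTP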