
# Telemetry & monitoring

> Health checks, OpenTelemetry integration, and monitoring capabilities in Golf.

Golf provides two monitoring features: configurable health check endpoints for containerized deployments, and OpenTelemetry integration for distributed tracing.

## Health check endpoints

<Note>
  **New in v0.2.11**: Golf supports health and readiness check endpoints with custom logic, Kubernetes compatibility, and a choice of response formats.
</Note>

Golf provides built-in health check endpoints designed for containerized deployments, Kubernetes health probes, and load balancers. The framework supports both liveness (`/health`) and readiness (`/ready`) endpoints with customizable logic.

### Modern approach (recommended)

To implement your own check logic, create custom health and readiness check files in your project root.

#### Custom health check

Create `health.py` in your project root for liveness probes:

```python theme={null}
from datetime import datetime, timezone

from starlette.responses import JSONResponse


def check_database_connection() -> bool:
    """Placeholder: replace with a real database ping."""
    return True


def check_external_apis() -> bool:
    """Placeholder: replace with real checks against upstream APIs."""
    return True


def _now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


def check():
    """Health check function for liveness probe."""
    try:
        # Check critical dependencies
        database_ok = check_database_connection()
        external_apis_ok = check_external_apis()

        if database_ok and external_apis_ok:
            return JSONResponse(
                {
                    "status": "pass",
                    "timestamp": _now(),
                    "version": "1.0.0",
                    "services": {
                        "database": "healthy",
                        "external_apis": "healthy"
                    }
                },
                status_code=200  # Healthy - Kubernetes won't restart
            )
        else:
            return JSONResponse(
                {
                    "status": "fail",
                    "timestamp": _now(),
                    "error": "Critical services unavailable",
                    "services": {
                        "database": "healthy" if database_ok else "failed",
                        "external_apis": "healthy" if external_apis_ok else "failed"
                    }
                },
                status_code=503  # Unhealthy - Kubernetes will restart container
            )
    except Exception as e:
        return JSONResponse(
            {
                "status": "fail",
                "timestamp": _now(),
                "error": f"Health check exception: {e}"
            },
            status_code=503
        )
```

**Generated endpoint**: `GET /health`

#### Custom readiness check

Create `readiness.py` in your project root for readiness probes:

```python theme={null}
from datetime import datetime, timezone

from starlette.responses import JSONResponse


def check_database_connection_pool() -> bool:
    """Placeholder: replace with a real connection pool check."""
    return True


def check_redis_connection() -> bool:
    """Placeholder: replace with a real cache ping."""
    return True


def check_message_queue() -> bool:
    """Placeholder: replace with a real message queue check."""
    return True


def _now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


# Recorded once at import time as an approximation of when the service became ready
READY_SINCE = _now()


def check():
    """Readiness check function for readiness probe."""
    try:
        # Check if ready to serve traffic (can be more strict than health)
        database_ready = check_database_connection_pool()
        cache_ready = check_redis_connection()
        queue_ready = check_message_queue()

        if database_ready and cache_ready and queue_ready:
            return JSONResponse(
                {
                    "status": "pass",
                    "timestamp": _now(),
                    "services": {
                        "database": "ready",
                        "cache": "ready",
                        "message_queue": "ready"
                    },
                    "ready_since": READY_SINCE
                },
                status_code=200  # Ready - Kubernetes will send traffic
            )
        else:
            return JSONResponse(
                {
                    "status": "fail",
                    "timestamp": _now(),
                    "error": "Services not ready for traffic",
                    "services": {
                        "database": "ready" if database_ready else "not_ready",
                        "cache": "ready" if cache_ready else "not_ready",
                        "message_queue": "ready" if queue_ready else "not_ready"
                    }
                },
                status_code=503  # Not Ready - Kubernetes removes from load balancer
            )
    except Exception as e:
        return JSONResponse(
            {
                "status": "fail",
                "timestamp": _now(),
                "error": f"Readiness check failed: {e}"
            },
            status_code=503
        )
```

**Generated endpoint**: `GET /ready`

### Response format options

#### 1. JSONResponse with status codes (recommended)

```python theme={null}
from starlette.responses import JSONResponse

def check():
    return JSONResponse(
        {"status": "pass", "timestamp": "2025-09-26T13:01:10Z"}, 
        status_code=200  # Explicit control over HTTP status
    )
```

#### 2. Structured dictionary (auto-converted)

```python theme={null}
def check():
    return {
        "status": "pass",  # or "fail" 
        "timestamp": "2025-09-26T13:01:10Z",
        "additional_data": "value"
    }
    # Automatically becomes JSONResponse with HTTP 200
```

### HTTP status code guidelines

**For Kubernetes health probes:**

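* **HTTP 200**: Service is healthy/ready - continue normal operation
* **HTTP 503**: Service is unhealthy/not ready - a failing liveness probe (`/health`) leads Kubernetes to restart the container, while a failing readiness probe (`/ready`) removes it from the load balancer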

```python theme={null}
# ✅ Healthy - HTTP 200
return JSONResponse({"status": "pass"}, status_code=200)

# ❌ Unhealthy - HTTP 503 (triggers restart/removal)
return JSONResponse({"status": "fail"}, status_code=503)

# ⚠️ Degraded - HTTP 200 (still serving traffic)
return JSONResponse({
    "status": "pass", 
    "degraded": True,
    "message": "Running with reduced functionality"
}, status_code=200)
```
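The mapping from individual check results to a response body and status code can be factored into a small pure function, which keeps `check()` short and makes the logic easy to unit test. The following is a minimal sketch using only the standard library; the name `summarize` and the shape of its `services` argument are illustrative assumptions, not part of Golf.

```python
from datetime import datetime, timezone


def summarize(services: dict[str, bool]) -> tuple[dict, int]:
    """Map per-service check results to a response body and an HTTP status code.

    Hypothetical helper for illustration; not a Golf API.
    """
    healthy = all(services.values())
    body = {
        "status": "pass" if healthy else "fail",
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "services": {
            name: "healthy" if ok else "failed" for name, ok in services.items()
        },
    }
    if not healthy:
        body["error"] = "Critical services unavailable"
    # 200 keeps the container in service; 503 tells the probe to act.
    return body, 200 if healthy else 503
```

Inside `health.py`, `check()` could then call `body, code = summarize({...})` and return `JSONResponse(body, status_code=code)`.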

### Default behavior

When custom files don't exist but health checks are enabled via configuration:

* **Readiness endpoint**: `GET /ready` returns `{"status": "pass"}` (HTTP 200)
* **Health endpoint**: `GET /health` returns plain text "OK" (configurable)
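
To see which behavior a running server actually exhibits, you can query both endpoints and inspect the status code and body. This is a small standard-library sketch for plain HTTP; the `probe` helper is hypothetical, and the host and port in the usage note are assumptions to adjust for your deployment.

```python
import http.client
from urllib.parse import urlsplit


def probe(url: str, timeout: float = 2.0) -> tuple[int, str]:
    """Send a GET request and return (status_code, body).

    Non-2xx replies such as 503 are returned rather than raised,
    because they are meaningful probe results. Plain HTTP only.
    """
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.read().decode("utf-8", errors="replace")
    finally:
        conn.close()
```

For example, `probe("http://localhost:3000/ready")` returns the status code and raw body of the readiness endpoint, assuming the server listens on port 3000.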

### Legacy configuration (deprecated)

<Warning>
  The `golf.json` configuration approach is deprecated in favor of custom `health.py` and `readiness.py` files. This legacy approach will be removed in a future version.
</Warning>

```json theme={null}
{
  "health_check_enabled": true,
  "health_check_path": "/health",
  "health_check_response": "OK"
}
```

### Troubleshooting

#### Common issues

**Files not copied to build:**

* Ensure `health.py`/`readiness.py` are in project root, same level as `golf.json`

**Import errors:**

* Verify files have valid Python syntax and include a `check()` function

**Wrong status codes:**

* Use `JSONResponse` objects for explicit status code control
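
The first two issues above can be caught before building with a short preflight script. The sketch below is a hypothetical helper, not a Golf command: it checks that `golf.json` is present in the given directory and that any `health.py` or `readiness.py` found there parses and defines a top-level `check()` function.

```python
import ast
from pathlib import Path


def preflight(project_root: str) -> list[str]:
    """Return a list of problems found with health.py / readiness.py."""
    root = Path(project_root)
    problems = []
    if not (root / "golf.json").is_file():
        problems.append("golf.json not found; is this the project root?")
    for name in ("health.py", "readiness.py"):
        path = root / name
        if not path.is_file():
            continue  # custom check files are optional
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"), filename=name)
        except SyntaxError as exc:
            problems.append(f"{name}: syntax error on line {exc.lineno}")
            continue
        # Accepts both def and async def; this sketch only checks the name.
        has_check = any(
            isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
            and node.name == "check"
            for node in tree.body
        )
        if not has_check:
            problems.append(f"{name}: no top-level check() function")
    return problems
```

An empty list means no problems were detected; the script parses the files without importing them, so it will not catch errors that only occur at import time.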

## Telemetry

Golf provides telemetry through its OpenTelemetry integration, which instruments tools, resources, prompts, and Golf utilities.

## Overview

Golf's telemetry system provides:

* **OpenTelemetry integration** - Configurable distributed tracing
* **Detailed tracing** - For tools, resources, prompts, and Golf utilities
* **Input/output capture** - Safe serialization of request and response data
* **Configurable detail levels** - Control what gets traced and monitored

## OpenTelemetry configuration

### Basic configuration

Configure OpenTelemetry in your `golf.json`:

```json theme={null}
{
  "name": "my-project",
  "opentelemetry_enabled": true,
  "opentelemetry_default_exporter": "otlp_http"
}
```

### Detailed tracing (optional)

By default, Golf captures basic telemetry like execution timing, success/failure rates, and error information. You can optionally enable detailed tracing to capture input and output data:

```json theme={null}
{
  "name": "my-project",
  "opentelemetry_enabled": true,
  "detailed_tracing": true
}
```

**What detailed tracing captures:**

* **Tool Input/Output** - Function parameters and return values
* **Resource Content** - Resource parameters and returned data
* **Golf Utilities** - Full elicitation prompts/responses and sampling conversations
* **Request Context** - Complete request metadata and session information

**Security considerations:**

* Detailed tracing may capture sensitive user data
* Recommended only for development and testing environments
* Uses safe serialization with size limits to prevent issues
* Can be disabled in production while keeping basic telemetry active

A production configuration keeps basic telemetry active and disables detailed tracing:

```json theme={null}
{
  "detailed_tracing": false,
  "opentelemetry_enabled": true
}
```
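
To make "safe serialization with size limits" concrete, here is a minimal sketch of the general technique: convert a value to text without raising, then truncate it before it is attached to a span. It illustrates the idea only and is not Golf's implementation; the function name, the limit, and the truncation marker are assumptions.

```python
import json

MAX_CHARS = 1000  # assumption: illustrative limit, not Golf's actual value


def safe_serialize(value, max_chars: int = MAX_CHARS) -> str:
    """Serialize a value to a bounded string without raising on odd inputs."""
    try:
        # default=repr turns non-JSON types into their repr string
        text = json.dumps(value, default=repr)
    except (TypeError, ValueError):
        # e.g. circular references or unsupported dict keys
        text = repr(value)
    if len(text) > max_chars:
        text = text[:max_chars] + "...[truncated]"
    return text
```

Truncation bounds the size of each captured value but does not redact anything, so the security considerations above still apply to whatever fits within the limit.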

### Advanced configuration

For custom telemetry setups, use environment variables:

```bash theme={null}
# OTLP HTTP Exporter (recommended)
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318/v1/traces
OTEL_SERVICE_NAME=my-mcp-server
OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer your-token"

# Console Exporter (for debugging)
OTEL_TRACES_EXPORTER=console
```
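
The OpenTelemetry specification defines `OTEL_EXPORTER_OTLP_HEADERS` as a comma-separated list of `key=value` pairs. The sketch below shows how such a value breaks down into individual headers, which can help when debugging an authorization header that is not being sent as expected. It is an illustration of the format, not the parser Golf or the OpenTelemetry SDK uses; `parse_otlp_headers` is a hypothetical name.

```python
from urllib.parse import unquote


def parse_otlp_headers(raw: str) -> dict[str, str]:
    """Parse 'key1=value1,key2=value2' into a dict, splitting on the first '='."""
    headers = {}
    for pair in raw.split(","):
        if "=" not in pair:
            continue  # skip empty or malformed entries
        key, value = pair.split("=", 1)
        # Values may be URL-encoded (for example %20 for a space)
        headers[key.strip()] = unquote(value.strip())
    return headers
```

For the value shown above, this yields a single `Authorization` header whose value is `Bearer your-token`.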