Prerequisites
- MCP Testing Framework installed
- An MCP server to test (we’ll use a demo server)
1. Create a Server Configuration
Use the CLI command to create your first server configuration and follow the interactive prompts. For this tutorial, you can also create the configuration manually. The example server configuration lives in configs/servers/my-first-server.json; it configures a test server that provides Hacker News functionality via MCP.
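As an illustrative sketch only (the field names, transport, and URL below are assumptions, not the framework's actual schema — the CLI's generated file is authoritative), a minimal server configuration might resemble:

```json
{
  "name": "my-first-server",
  "description": "Demo server exposing Hacker News functionality over MCP",
  "transport": "http",
  "url": "http://localhost:3000/mcp",
  "timeout_seconds": 30
}
```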
2. Create a Test Suite
Use the CLI command to create your first test suite and follow the interactive prompts. For this tutorial, you can also create the configuration manually in configs/suites/my-first-suite.json.
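As an illustrative sketch (the field names here are assumptions for illustration; the CLI's generated schema is authoritative), a minimal suite with one test case might look like:

```json
{
  "name": "my-first-suite",
  "server": "my-first-server",
  "tests": [
    {
      "id": "greeting",
      "prompt": "Hi! What can you do?",
      "success_criteria": "Agent should respond politely and explain Hacker News capabilities",
      "max_turns": 3,
      "carry_context": false,
      "category": "happy-path",
      "priority": "high"
    }
  ]
}
```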
Understanding Test Case Structure
Each test case includes:
- A unique identifier for the test
- What the AI agent will say to your server
- A natural language description of what constitutes success
- The maximum number of conversation turns before a timeout
- Whether context carries between turns
- Optional categorization and priority
Writing Good Success Criteria
Success criteria should be specific but flexible.

Good examples:
- “Agent should respond politely and explain Hacker News capabilities”
- “Agent should list at least 2 available tools and explain their purpose”
- “Agent should successfully fetch and display story titles with URLs”

Examples to avoid:
- Too vague: “Agent should work correctly”
- Too specific: “Agent must respond with exactly ‘Welcome to Hacker News!’”
- Too technical: “Agent should make HTTP GET request to /stories endpoint”
3. Run Your First Test
Execute the test suite with the CLI. You’ll see a summary of test results showing which tests passed or failed, along with confidence scores from the LLM judge.
4. Analyze Your Results
Test Verdict Meanings
- ✅ PASS: Test met the success criteria
- ❌ FAIL: Test did not meet the success criteria
- ⚠️ TIMEOUT: Test exceeded maximum turns or time limit
- 🔥 ERROR: Technical error occurred (server unreachable, API issues)
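Once you have results in hand, a quick tally by verdict summarizes a run at a glance. This is a hypothetical sketch for your own tooling; the framework's actual result format may differ, and the `verdict` key is an assumption:

```python
from collections import Counter

# Hypothetical result records; the framework's actual output format may differ.
results = [
    {"test": "greeting", "verdict": "PASS"},
    {"test": "list-tools", "verdict": "PASS"},
    {"test": "fetch-stories", "verdict": "FAIL"},
]

# Count how many tests ended with each verdict.
tally = Counter(r["verdict"] for r in results)
print(f"{tally['PASS']} passed, {tally['FAIL']} failed")  # 2 passed, 1 failed
```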
Judge Reasoning
The LLM judge analyzes each conversation and provides:
- Verdict: Pass/fail decision
- Reasoning: Detailed explanation of why the test passed or failed
- Confidence score: How confident the judge is (0.0-1.0)
- Conversation quality: How natural and helpful the interaction was
Interpreting Confidence Scores
- 0.9-1.0: Very confident - clear pass/fail
- 0.7-0.89: Confident - good evidence for verdict
- 0.5-0.69: Somewhat confident - borderline cases
- Below 0.5: Low confidence - may need better success criteria
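The bands above translate directly into a small helper; this is a sketch for your own reporting, not part of the framework:

```python
def confidence_band(score: float) -> str:
    """Map an LLM-judge confidence score (0.0-1.0) to a descriptive band."""
    if score >= 0.9:
        return "very confident"
    if score >= 0.7:
        return "confident"
    if score >= 0.5:
        return "somewhat confident"
    return "low confidence - consider tightening success criteria"

print(confidence_band(0.95))  # very confident
print(confidence_band(0.72))  # confident
```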
5. View Detailed Results
For more detailed analysis, use the CLI to list recent test runs with their results and completion status.
6. Iterate and Improve
Adding More Test Cases
Add more test cases to your suite to cover different scenarios.
Running with Verbose Output
Enable verbose output to get more detail during test execution.
Testing Different Scenarios
Create suites for different types of testing:
- Happy path: Normal user workflows
- Edge cases: Unusual requests or error conditions
- Security: Authentication, input validation
- Performance: Response times, handling multiple requests
Common First-Time Issues
Server Connection Failures
If your server is unreachable, try these solutions:
- Verify the server URL is correct and accessible
- Check that the server is running and responding to the MCP protocol
- Test the server manually with curl or a browser
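As a quick manual check, MCP servers using the HTTP transport speak JSON-RPC 2.0, so you can POST an initialize request. The URL below is a placeholder for your server's endpoint, and the client name/version are arbitrary:

```shell
# A JSON-RPC 2.0 "initialize" request that an MCP server over HTTP should answer.
PAYLOAD='{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"manual-check","version":"0.0.0"}}}'
echo "$PAYLOAD"

# Send it with curl (uncomment and substitute your server's endpoint):
# curl -s -X POST http://localhost:3000/mcp \
#   -H 'Content-Type: application/json' \
#   -d "$PAYLOAD"
```

A healthy server should answer with a JSON-RPC result; connection errors or timeouts point to the server rather than the test framework.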
Authentication Errors
If authentication fails, try these solutions:
- Add authentication to server configuration
- Verify API keys or credentials are correct
- Check server authentication requirements
Next Steps
Now that you’ve created and run your first test:
- Learn Core Concepts - Understand why this testing approach works
- Explore Test Types - Learn about different testing strategies
- Master the CLI - Discover more powerful commands
- Advanced Configuration - Learn about templates and advanced options