MCP servers are starting to look like infrastructure.
That means the old readiness question is no longer enough:
Does the process start?
Even this is not enough:
Does tools/list return a clean schema?
A server can pass both checks and still fail every real agent loop because auth handoff, scopes, downstream permissions, environment setup, or data boundaries are broken.
So I shipped mcp-probe v1.4.0 with contract assertions for production MCP servers.
The problem: discovery is not readiness
A typical MCP smoke test looks like this:
- Start the server
- Run initialize
- Run tools/list
- Check that schemas exist
That catches broken startup and malformed tools.
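To make the limits of that check concrete, here is a minimal TypeScript sketch of a discovery-only smoke test: two JSON-RPC 2.0 requests and a schema check. The helper names (buildSmokeRequests, schemasExist), the client name, and the protocol version string are illustrative assumptions, not mcp-probe internals.

```typescript
// Hypothetical sketch of a discovery-only smoke test (not mcp-probe's code).
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface ToolDescriptor {
  name: string;
  inputSchema?: Record<string, unknown>;
}

// Build the two requests a typical smoke test sends once the server is up.
function buildSmokeRequests(): JsonRpcRequest[] {
  return [
    {
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2024-11-05", // assumed version string
        capabilities: {},
        clientInfo: { name: "smoke-test", version: "0.0.0" }, // placeholder client
      },
    },
    { jsonrpc: "2.0", id: 2, method: "tools/list" },
  ];
}

// All this proves: every advertised tool has a name and an input schema.
function schemasExist(tools: ToolDescriptor[]): boolean {
  return (
    tools.length > 0 &&
    tools.every((t) => t.name.length > 0 && t.inputSchema !== undefined)
  );
}
```

Nothing in this sketch ever issues a tools/call, which is why auth, permission, and data-boundary failures slip through.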
But it misses the failures that matter in production:
- The tool advertises correctly, but every call returns 401
- OAuth requires a browser redirect the agent cannot trigger
- The DB role is not actually read-only
- Write attempts leak raw SQL errors or stack traces
- Results omit metadata agents need to reason safely
- Tenant or project scope is not preserved
- Broad exports or admin actions are reachable
- Error codes are unstable, so agents cannot recover
In other words: the server starts, but the contract is broken.
v1.4.0: sidecar contract assertions
mcp-probe already supported sidecar inputs via .mcp-probe.json so teams could run real tools/call checks instead of relying on schema-minimum dummy inputs.
v1.4.0 extends that sidecar with assertions.
Example for a database-backed MCP server:
{
  "tools": {
    "execute_sql": {
      "input": {
        "project_id": "YOUR_PROJECT_ID",
        "query": "select 1 as health_check"
      },
      "expect": {
        "status": "pass",
        "requiredFields": ["rowCount", "limit", "source", "freshness"],
        "maxRows": 100
      }
    },
    "execute_sql_write_denied": {
      "input": {
        "project_id": "YOUR_PROJECT_ID",
        "query": "delete from users where id = 1"
      },
      "expect": {
        "status": "fail",
        "errorCode": "WRITE_NOT_ALLOWED",
        "notContains": ["DATABASE_URL", "password", "stack"]
      }
    }
  }
}
Now CI can validate the contract an agent actually depends on.
What assertions are supported?
expect.status
Declare whether a call should pass, fail, or warn.
This is important for negative probes. A write attempt against a read-only DB role should fail. In that case, failure is success.
{
  "expect": {
    "status": "fail"
  }
}
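The "failure is success" idea can be sketched in a few lines of TypeScript. This is an illustration under assumptions: the assertStatus helper and the AssertionResult shape are hypothetical, and the failure message wording is invented.

```typescript
// Hypothetical sketch: compare the observed call status with the declared one.
type Status = "pass" | "fail" | "warn";

interface AssertionResult {
  ok: boolean;
  message: string;
}

function assertStatus(expected: Status, observed: Status): AssertionResult {
  const ok = expected === observed;
  return {
    ok,
    message: ok
      ? `Tool status matched expected ${expected}`
      : `Tool status was ${observed}, expected ${expected}`,
  };
}

// A denied write is observed as "fail", which is what the negative probe wants.
const deniedWrite = assertStatus("fail", "fail");
// A write that unexpectedly succeeds breaks the contract.
const allowedWrite = assertStatus("fail", "pass");
```

Here deniedWrite.ok is true even though the underlying call failed, while allowedWrite.ok is false even though the call succeeded.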
expect.requiredFields
Validate that result metadata exists.
For database tools, an agent often needs more than rows. It needs context:
- rowCount
- limit
- source
- freshness
{
  "expect": {
    "requiredFields": ["rowCount", "limit", "source", "freshness"]
  }
}
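One way to picture this check is a function that lists which required fields are absent. The sketch below assumes fields are looked up at the top level of the result only; whether mcp-probe also searches nested objects is not stated here. The missingFields helper and the sample result are hypothetical.

```typescript
// Hypothetical sketch: report required metadata fields missing from a result.
// Assumption: only top-level keys are inspected.
function missingFields(
  result: Record<string, unknown>,
  required: string[],
): string[] {
  return required.filter(
    (field) => result[field] === undefined || result[field] === null,
  );
}

// Sample result (invented data) that omits "freshness".
const result: Record<string, unknown> = {
  rows: [{ health_check: 1 }],
  rowCount: 1,
  limit: 100,
  source: "replica",
};

const missing = missingFields(result, ["rowCount", "limit", "source", "freshness"]);
```

With this sample, missing contains only "freshness", so the probe would flag a result the agent cannot date.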
expect.maxRows
Catch broad exports or missing limits.
{
  "expect": {
    "maxRows": 100
  }
}
mcp-probe looks for common result shapes such as rowCount, rowsReturned, rows, data, items, and records.
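A lookup over those result shapes could work roughly as follows. This is a sketch, not mcp-probe's implementation: the precedence (numeric counters before arrays) and the choice to treat an unknown row count as a failure are both assumptions.

```typescript
// Hypothetical lookup order; the real precedence in mcp-probe may differ.
const COUNT_KEYS = ["rowCount", "rowsReturned"];
const ARRAY_KEYS = ["rows", "data", "items", "records"];

function inferRowCount(result: Record<string, unknown>): number | undefined {
  // Prefer an explicit numeric counter when the server reports one.
  for (const key of COUNT_KEYS) {
    const value = result[key];
    if (typeof value === "number") return value;
  }
  // Otherwise fall back to the length of the first array-shaped payload.
  for (const key of ARRAY_KEYS) {
    const value = result[key];
    if (Array.isArray(value)) return value.length;
  }
  return undefined;
}

function withinMaxRows(result: Record<string, unknown>, maxRows: number): boolean {
  const count = inferRowCount(result);
  // Assumption: a result with no discoverable row count does not pass.
  return count !== undefined && count <= maxRows;
}
```

For example, withinMaxRows({ rowCount: 1 }, 100) holds, while a result whose rows array has 101 entries does not.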
expect.errorCode
Require stable structured error codes.
{
  "expect": {
    "status": "fail",
    "errorCode": "WRITE_NOT_ALLOWED"
  }
}
This matters because agents can only recover if errors are predictable.
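To illustrate why predictability matters, consider the agent side. The TypeScript sketch below is hypothetical throughout: the Recovery actions, the planRecovery helper, and the AUTH_REQUIRED code are invented for illustration; only WRITE_NOT_ALLOWED comes from the example in this post.

```typescript
// Hypothetical agent-side recovery table keyed by stable error codes.
type Recovery = "stop_and_report" | "reauthenticate";

const RECOVERY = new Map<string, Recovery>([
  ["WRITE_NOT_ALLOWED", "stop_and_report"], // code used in this post's example
  ["AUTH_REQUIRED", "reauthenticate"], // invented code, for illustration only
]);

interface ToolError {
  code?: string;
  message: string;
}

function planRecovery(error: ToolError): Recovery | "unknown" {
  // Without a structured code the agent can only guess from free text.
  if (error.code === undefined) return "unknown";
  return RECOVERY.get(error.code) ?? "unknown";
}
```

An error carrying WRITE_NOT_ALLOWED maps to a definite action, whereas a bare message such as "permission denied for table users" leaves the agent with "unknown".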
expect.contains and expect.notContains
Check for expected output and leaked internals.
{
  "expect": {
    "notContains": ["DATABASE_URL", "password", "stack"]
  }
}
This catches errors that expose raw internals.
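A leak check of this kind can be approximated as a substring scan over the serialized output. The sketch below assumes case-sensitive matching against the JSON-serialized result; the findLeaks helper and both sample payloads are hypothetical.

```typescript
// Hypothetical sketch: list forbidden substrings found in a tool's output.
// Assumption: matching is a case-sensitive substring search.
function findLeaks(output: unknown, forbidden: string[]): string[] {
  const text =
    typeof output === "string" ? output : JSON.stringify(output) ?? "";
  return forbidden.filter((needle) => text.includes(needle));
}

const forbidden = ["DATABASE_URL", "password", "stack"];

// A structured denial that exposes nothing internal.
const cleanError = {
  error: { code: "WRITE_NOT_ALLOWED", message: "Writes are disabled for this role" },
};

// A raw failure that ships a stack trace to the agent.
const leakyError = {
  error: "query failed",
  stack: "Error: at Client.query",
};

const cleanLeaks = findLeaks(cleanError, forbidden);
const leakyLeaks = findLeaks(leakyError, forbidden);
```

Because the scan is a plain substring match, a short needle like "stack" also matches field names such as "stack" itself, so needles are worth choosing with care.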
expect.not_error_code
Treat known auth/permission status codes as warnings instead of hard failures.
{
  "expect": {
    "not_error_code": [401, 403]
  }
}
This keeps OAuth handoff failures visible without confusing them with transport or runtime crashes.
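The downgrade from failure to warning might be modelled like this. It is a sketch under assumptions: the classify helper is hypothetical, and treating a call with no error status code as a pass is a simplification for illustration.

```typescript
// Hypothetical sketch: downgrade listed status codes from failure to warning.
type Outcome = "pass" | "warn" | "fail";

function classify(statusCode: number | undefined, softCodes: number[]): Outcome {
  // Assumption: no error status code means the call succeeded.
  if (statusCode === undefined) return "pass";
  // Listed auth/permission codes stay visible, but as warnings.
  return softCodes.includes(statusCode) ? "warn" : "fail";
}

const softCodes = [401, 403];
const authOutcome = classify(401, softCodes);
const crashOutcome = classify(500, softCodes);
```

Here authOutcome is "warn" and crashOutcome is "fail", which keeps an expected OAuth handoff apart from a genuine runtime crash.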
Output example
When assertions pass:
Tool Call Dry-run
  ✓ db_query [sidecar] 1ms
    ✓ status: Tool status matched expected pass
    ✓ requiredFields.rowCount: Found required field "rowCount"
    ✓ requiredFields.limit: Found required field "limit"
    ✓ requiredFields.source: Found required field "source"
    ✓ requiredFields.freshness: Found required field "freshness"
    ✓ maxRows: Row count 1 is within maxRows 100
  ✓ db_write [sidecar] 0ms
    ✓ status: Tool status matched expected fail
    ✓ errorCode: Found expected error code WRITE_NOT_ALLOWED
    ✓ notContains.DATABASE_URL: Output does not contain "DATABASE_URL"
    ✓ notContains.password: Output does not contain "password"
    ✓ notContains.stack: Output does not contain "stack"
If a contract assertion fails, mcp-probe reports CONTRACT_ASSERTION_FAILED and includes per-assertion details in terminal output, JSON output, and GitHub Actions summaries.
Quick start
npx @k08200/mcp-probe@latest init \
  --target @your-org/your-mcp-server \
  --discover \
  --github-actions
Then edit .mcp-probe.json with real read-only probes and run:
npx @k08200/mcp-probe@latest --config mcp-probe.config.json --github-summary
Why this matters
MCP CI should test the contract an agent will actually depend on, not just whether the server process starts.
For database-backed MCP servers, that means validating things like:
- read-only role behavior
- denied writes
- stable error codes
- row limits
- tenant or project scope
- result metadata
- no leaked internals
mcp-probe should not know every server's semantics. But it can give teams a small, declarative way to encode the production contract their agents rely on.
That is the goal of v1.4.0.
























