CLI / platphormctl

Run Evals from the operator CLI

`platphormctl` is a first-class Evals client for discovery, MCP validation, policy inspection, dry-run harnesses, and evidence-producing release checks. This page documents the public examples; protected execution still requires `PLATPHORM_API_KEY`.
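As a hedged illustration of the protected-execution requirement above, the sketch below checks for the key before a protected step. It assumes `PLATPHORM_API_KEY` is supplied as an environment variable, which this page does not state explicitly; `require_api_key` is a hypothetical helper name, not part of `platphormctl`.

```shell
# Hypothetical guard, not a platphormctl feature.
# Assumption: PLATPHORM_API_KEY is read from the environment.
# Returns 0 when the variable is set and non-empty, 1 otherwise.
require_api_key() {
  if [ -z "${PLATPHORM_API_KEY:-}" ]; then
    echo "PLATPHORM_API_KEY is not set; skipping protected execution" >&2
    return 1
  fi
  return 0
}
```

A wrapper script could call `require_api_key || exit 1` before any protected step, while leaving the public-safe and dry-run commands on this page unguarded.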

site-inspect-evals

public-safe

Inspect the public route, policy, discovery, and health surfaces of Evals.

`platphormctl site inspect evals`

Status: documented. Last validated: not persisted in this deployment.

mcp-validate-evals

public-safe

Validate Evals MCP JSON-RPC introspection and tool schema metadata.

`platphormctl mcp validate evals`

Status: documented. Last validated: not persisted in this deployment.

policy-inspect-evals

public-safe

Inspect the agent, AI, trust, security, and robots policies of Evals.

`platphormctl policy inspect evals`

Status: documented. Last validated: not persisted in this deployment.

evals-list

public-safe

List public-safe Evals suites, templates, gates, and recent run summaries.

`platphormctl evals list`

Status: documented. Last validated: not persisted in this deployment.

evals-run-site-mcp

public-safe

Run a site-level evaluation plan for the MCP Hub target.

`platphormctl evals run-site mcp`

Status: documented. Last validated: not persisted in this deployment.

evals-run-mcp-mcp

public-safe

Run public-safe MCP introspection checks for MCP Hub.

`platphormctl evals run-mcp mcp`

Status: documented. Last validated: not persisted in this deployment.

grade-tool-health

public-safe

Grade the MCP `get_health` tool using deterministic output checks where possible.

`platphormctl evals grade-tool mcp get_health`

Status: documented. Last validated: not persisted in this deployment.

harness-discovery-full

public-safe

Run the full discovery harness with trace propagation.

`platphormctl harness run discovery-full --trace`

Status: documented. Last validated: not persisted in this deployment.

developer-validation-dry-run

dry-run

Preview developer validation without protected execution.

`platphormctl harness run developer-validation --target https://evals.platphormnews.com --dry-run`

Status: documented. Last validated: not persisted in this deployment.

spec-evals-browserops-loop

dry-run

Preview the Spec to Evals to BrowserOps loop without claiming provider evidence.

`platphormctl harness run spec-evals-browserops-loop --dry-run`

Status: documented. Last validated: not persisted in this deployment.
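The public-safe commands listed above could be chained into a single pass. The sketch below is one way to do that, under the assumption (not confirmed on this page) that `platphormctl` exits with a non-zero status when a check fails. `PUBLIC_SAFE_COMMANDS` and `run_each` are hypothetical names introduced for illustration.

```shell
# The public-safe commands documented on this page, one per line.
PUBLIC_SAFE_COMMANDS='platphormctl site inspect evals
platphormctl mcp validate evals
platphormctl policy inspect evals
platphormctl evals list
platphormctl evals run-site mcp
platphormctl evals run-mcp mcp
platphormctl evals grade-tool mcp get_health
platphormctl harness run discovery-full --trace'

# Hypothetical helper: run each non-empty line of stdin as a command.
# Assumption: a non-zero exit status means the check failed.
# Returns 0 only if every command succeeded.
run_each() {
  failures=0
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    # Unquoted on purpose: each line is a plain command with simple arguments.
    # stdin is redirected so the command cannot consume the remaining lines.
    if $line </dev/null; then
      echo "ok:   $line"
    else
      echo "FAIL: $line"
      failures=$((failures + 1))
    fi
  done
  [ "$failures" -eq 0 ]
}
```

With `platphormctl` installed, `printf '%s\n' "$PUBLIC_SAFE_COMMANDS" | run_each` would run the eight commands in order and report each as `ok` or `FAIL`. The output is a convenience summary only and is not evidence in the sense of the rule below.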

Evidence rule

This page documents commands; it does not claim that they were executed. Before using CLI evidence in scorecards or release gates, store real CLI output through a protected runner, a `platphormctl` dry-run artifact, or a persisted Evals run.
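One way to keep real output rather than a description of it is to wrap each invocation so that its combined output and exit code land in a file. The helper below is a minimal sketch using only standard shell tools. `capture_evidence` and the log layout are hypothetical, and whether such a file qualifies as a `platphormctl` dry-run artifact for scorecards is not something this page specifies.

```shell
# Hypothetical helper, not a platphormctl feature.
# Usage: capture_evidence <artifact-dir> <command> [args...]
# Writes the command line, a UTC timestamp, combined stdout/stderr, and the
# exit code to a new file, prints the file path, and returns the exit code.
capture_evidence() {
  out_dir=$1
  shift
  mkdir -p "$out_dir" || return 1
  stamp=$(date -u +%Y%m%dT%H%M%SZ)
  # mktemp avoids overwriting an earlier log created in the same second.
  log=$(mktemp "$out_dir/run-$stamp-XXXXXX") || return 1
  {
    printf 'command: %s\n' "$*"
    printf 'started_utc: %s\n' "$stamp"
    printf '%s\n' '--- output ---'
  } > "$log"
  "$@" >> "$log" 2>&1
  status=$?
  printf '%s\n' "--- exit_code: $status ---" >> "$log"
  printf '%s\n' "$log"
  return "$status"
}
```

For example, `capture_evidence ./evidence platphormctl harness run spec-evals-browserops-loop --dry-run` would record that dry run under `./evidence` (an arbitrary directory chosen for this example).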