Every agent evaluated by Pecta accumulates a reputation score tied to its agent_id. The score is a portable 0–1000 integer stored centrally, which means it travels with the agent across every platform and integration mode — whether the agent is gated via the SDK, the proxy, or the REST API. A single score reflects all of an agent’s evaluations regardless of where they originated.

Scoring formula

The score is a weighted sum of four components. Each component is normalised to the range 0–1 and multiplied by its weight, which is also its maximum contribution:
Score = (pass_rate × 400) + (latency_score × 250) + (streak × 200) + (volume × 150)
| Component | Weight | Description |
| --- | --- | --- |
| pass_rate | 400 | Fraction of evaluations that passed, in the rolling window. 1.0 = all pass. |
| latency_score | 250 | Normalised inverse of average latency. Faster agents score higher. |
| streak | 200 | Rewards consecutive passing evaluations. A long unbroken streak raises the score. |
| volume | 150 | Rewards consistent usage. More evaluations in the window increase this component up to its cap. |
The maximum possible score is 1000. A brand-new agent with perfect evaluations climbs toward 1000 as it accumulates volume.
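As a rough illustration of the formula, the sketch below combines four components that are assumed to be already normalised to the range 0–1. How Pecta derives latency_score, streak, and volume from raw data is not specified here, so those inputs are taken as given. The function name `reputation_score` and the rounding step are assumptions made for this example.

```python
# Weights from the scoring formula; they sum to 1000.
WEIGHTS = {
    "pass_rate": 400,
    "latency_score": 250,
    "streak": 200,
    "volume": 150,
}


def reputation_score(pass_rate: float, latency_score: float,
                     streak: float, volume: float) -> int:
    """Combine four components, each assumed pre-normalised to [0, 1]."""
    components = {
        "pass_rate": pass_rate,
        "latency_score": latency_score,
        "streak": streak,
        "volume": volume,
    }
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    total = sum(WEIGHTS[name] * value for name, value in components.items())
    # Assumption: round to the nearest integer. The documentation only
    # states that the score is an integer between 0 and 1000.
    return round(total)


# A perfect agent with a full window reaches the maximum.
print(reputation_score(1.0, 1.0, 1.0, 1.0))   # 1000
# A new agent with perfect passes and latency but little streak and volume.
print(reputation_score(1.0, 1.0, 0.1, 0.1))   # 685
```

The second call shows why a brand-new agent does not start at 1000: the streak and volume components only fill up as evaluations accumulate.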

Lifecycle states

The score goes through four lifecycle states as evaluations accumulate:
| State | Condition | Meaning |
| --- | --- | --- |
| new | 0 evaluations | Agent has never been evaluated. |
| calibrating | 1–49 evaluations | Score is being established; shown as N/50 in the dashboard. |
| active | 50–499 evaluations | Score is statistically reliable and fully displayed. |
| mature | 500+ evaluations | Rolling window is fully saturated; oldest evaluations age out as new ones arrive. |
The score only becomes visible to external consumers at 50 evaluations. Below that threshold the data is insufficient for a statistically reliable signal, so the dashboard displays the calibration progress instead of a raw number.
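The thresholds above can be expressed as a small lookup. This is a minimal sketch, not Pecta's implementation: the function names `lifecycle_state` and `display_value` are hypothetical, and the label shown for a never-evaluated agent is an assumption.

```python
CALIBRATION_THRESHOLD = 50   # score becomes visible at 50 evaluations
WINDOW_SIZE = 500            # the rolling window is saturated at 500


def lifecycle_state(eval_count: int) -> str:
    """Map an evaluation count to one of the four lifecycle states."""
    if eval_count < 0:
        raise ValueError("eval_count cannot be negative")
    if eval_count == 0:
        return "new"
    if eval_count < CALIBRATION_THRESHOLD:
        return "calibrating"
    if eval_count < WINDOW_SIZE:
        return "active"
    return "mature"


def display_value(eval_count: int, score: int) -> str:
    """Return what a dashboard might show for the given state."""
    state = lifecycle_state(eval_count)
    if state == "new":
        return "not evaluated"  # hypothetical label, not from the docs
    if state == "calibrating":
        # Calibration progress is shown as N/50 instead of a raw number.
        return f"{eval_count}/{CALIBRATION_THRESHOLD}"
    return str(score)


print(lifecycle_state(124), display_value(124, 812))  # active 812
print(lifecycle_state(12), display_value(12, 640))    # calibrating 12/50
```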

Rolling window

Reputation is computed over the last 500 evaluations. When a new evaluation arrives and the window is full, the oldest entry is evicted. This means an agent can recover from a bad period: sustained good behaviour will eventually push earlier failures out of the window.
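The eviction behaviour can be modelled with a fixed-length queue. The sketch below is illustrative only: it tracks pass/fail results and nothing else, and the class name `RollingWindow` is hypothetical. It shows how a run of early failures drops out of the pass rate once enough newer evaluations arrive.

```python
from collections import deque

WINDOW_SIZE = 500


class RollingWindow:
    """Keeps only the most recent `size` pass/fail results."""

    def __init__(self, size: int = WINDOW_SIZE) -> None:
        # A deque with maxlen discards the oldest entry on append when full.
        self._results = deque(maxlen=size)

    def record(self, passed: bool) -> None:
        self._results.append(passed)

    def pass_rate(self) -> float:
        if not self._results:
            return 0.0
        return sum(self._results) / len(self._results)

    def __len__(self) -> int:
        return len(self._results)


window = RollingWindow()
for _ in range(100):          # a bad period: 100 failures
    window.record(False)
for _ in range(400):          # followed by 400 passes, filling the window
    window.record(True)
rate_after_bad_period = window.pass_rate()   # 400 / 500 = 0.8

for _ in range(100):          # 100 more passes evict the 100 failures
    window.record(True)
rate_after_recovery = window.pass_rate()     # 500 / 500 = 1.0

print(rate_after_bad_period, rate_after_recovery, len(window))
```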

Portability

The score is stored centrally under a stable agent_id per organization. Any integration mode that uses the same agent_id writes to the same window:
  • An SDK evaluation in your Node.js service
  • A proxy evaluation from Claude Desktop on a developer’s laptop
  • A REST API evaluation from your Python test suite
All three update the same rolling window and the same score. You control the agent_id string, so you can segment agents as finely as you need (e.g. dsp-bidder-prod vs dsp-bidder-staging).
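One way to picture this is a store keyed by organization and agent_id, where the integration mode plays no part in the key. The following is a toy in-memory model under that assumption; `record_evaluation`, the `source` argument, and the organization name `acme` are all hypothetical and do not describe Pecta's storage.

```python
from collections import defaultdict, deque

WINDOW_SIZE = 500

# One rolling window per (organization, agent_id) pair.
windows = defaultdict(lambda: deque(maxlen=WINDOW_SIZE))


def record_evaluation(org_id: str, agent_id: str, passed: bool, source: str) -> None:
    """Append a result to the agent's window.

    `source` (for example "sdk", "proxy", or "rest") is accepted but
    deliberately not used in the key, so every integration mode lands
    in the same window.
    """
    windows[(org_id, agent_id)].append(passed)


# Three integration modes reporting for the same agent_id.
record_evaluation("acme", "dsp-bidder-prod", True, source="sdk")
record_evaluation("acme", "dsp-bidder-prod", True, source="proxy")
record_evaluation("acme", "dsp-bidder-prod", False, source="rest")

# A different agent_id string gets its own, separate window.
record_evaluation("acme", "dsp-bidder-staging", True, source="sdk")

print(len(windows[("acme", "dsp-bidder-prod")]))     # 3
print(len(windows[("acme", "dsp-bidder-staging")]))  # 1
```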

Reading the score

The REST API returns the current reputation inline on every /v1/evaluate response:
{
  "evaluation_id": "v7k2mQpXtR9fNwJz",
  "passed": true,
  "reputation": {
    "score": 812,
    "lifecycle": "active",
    "eval_count": 124
  }
}
You can also fetch the score directly:
curl https://api.pecta.ai/v1/reputation/research-bot-v2 \
  -H "Authorization: Bearer $PECTA_API_KEY"
Example response:
{
  "agent_id": "research-bot-v2",
  "score": 812,
  "lifecycle": "active",
  "eval_count": 124,
  "window_size": 500
}
Scores are served with low latency and reflect evaluations ingested within seconds.
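The same lookup can be made from Python using only the standard library. This sketch uses the endpoint, header, and response fields shown in the curl example; the helper names `build_request`, `fetch_reputation`, and `usable_score` are hypothetical. Treating a score as usable only in the active or mature state is an assumption based on the lifecycle rules, since the exact API behaviour for calibrating agents is not documented here.

```python
import json
import os
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "https://api.pecta.ai/v1/reputation/"


def build_request(agent_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for one agent's reputation."""
    url = BASE_URL + urllib.parse.quote(agent_id, safe="")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def fetch_reputation(agent_id: str, api_key: str) -> dict:
    """Call the API and decode the JSON body (requires network access)."""
    with urllib.request.urlopen(build_request(agent_id, api_key), timeout=10) as resp:
        return json.load(resp)


def usable_score(payload: dict) -> Optional[int]:
    """Return the score once the agent is past calibration, else None.

    Assumption: callers should ignore the score while the lifecycle
    is "new" or "calibrating".
    """
    if payload.get("lifecycle") in ("active", "mature"):
        return payload.get("score")
    return None


if __name__ == "__main__" and os.environ.get("PECTA_API_KEY"):
    reputation = fetch_reputation("research-bot-v2", os.environ["PECTA_API_KEY"])
    print(usable_score(reputation))
```

Because `usable_score` only reads the `lifecycle` and `score` fields, it can also be applied to the `reputation` object returned inline by /v1/evaluate.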