Backed by Y Combinator

Automatic prompt optimization with user feedback.

Turn user feedback into better prompts automatically. Refine your instructions against real-world examples, replacing manual tuning with continuous optimization.

Documentation
before

“You are a helpful assistant.”

after (auto-optimized)

“You are a technical support specialist. Keep answers under 3 sentences.”

from real user signals

HOW IT WORKS

Turn failures into better instructions

Your users already know what's wrong. We turn their feedback into prompt improvements.

1. Input

“You are a helpful assistant.”

78% negative

2. Signals

  • Too generic
  • Way too verbose
  • Needs step-by-step
  • Didn't answer my question
  • Too formal

3. Output

“You are a technical support specialist. Keep answers under 3 sentences.”

89% positive

FEEDBACK API

Collect user feedback

One API call per thumbs-down. Include why it failed and what the response should have been. Complaints become constraints.

  • 👎 or 👍 per response
  • Free-text field for the reason
  • Paste the expected output
  • Links to your traces automatically
import zeroeval as ze

# 1. Your user didn't like a response
completion_id = "chatcmpl-abc123..."

# 2. Submit their feedback (one API call)
ze.feedback(
    prompt="customer-support",
    completion_id=completion_id,
    thumbs_up=False,
    reason="Too verbose",
    expected="Keep it under 2 sentences.",
)
import zeroeval as ze

# 1. Run optimization (uses your collected feedback)
run = ze.optimize(
    prompt="customer-support",
    optimizer="gepa",  # or "bootstrap"
)

# 2. Get the optimized prompt in your app
prompt = ze.get_prompt("customer-support")
# Returns the latest version automatically

AUTO-OPTIMIZATION

Run optimization. Get a new prompt.

We read your feedback, run DSPy, and generate a rewritten prompt. Your code pulls it at runtime. Deploy when ready.

  • Stanford DSPy optimization
  • Negative feedback becomes constraints
  • A/B test before you deploy (sketched below)
  • Full version history with rollback
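
You can run that A/B test client-side before adopting anything: fetch the optimized candidate, keep the current prompt as the control, and split traffic deterministically. A minimal sketch; the hash-based 50/50 split and the prompt names are illustrative assumptions, not a built-in ZeroEval mechanism.

import hashlib
import zeroeval as ze

# Control: the prompt you serve today. Candidate: the optimized version.
PROMPT_A = "You are a helpful assistant."     # current (control)
PROMPT_B = ze.get_prompt("customer-support")  # optimized (candidate)

def pick_prompt(user_id: str) -> str:
    # Deterministic 50/50 split: the same user always sees the same arm
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 2
    return PROMPT_B if bucket == 1 else PROMPT_A

Log which arm served each response, and the thumbs-up rates of the two versions can be compared directly.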

HOW OPTIMIZATION WORKS

Users give feedback. Prompts get better.

STRUCTURED FEEDBACK

Each field has a job

DSPy uses each field differently during optimization.

👍 thumbs_up
Labels the example as positive or negative for training.

💬 reason
Added as a constraint in the optimized prompt.

expected_output
Becomes a few-shot example for the model.

Constraints persist

Negative feedback reasons are extracted and prepended to every optimized prompt as non-negotiable constraints.
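
To make those roles concrete, here is a hedged sketch of how collected feedback could be turned into DSPy training data. The record shape, field names, and constraint wording are illustrative assumptions, not ZeroEval's actual pipeline.

import dspy

# Hypothetical records in the shape the Feedback API collects
feedback = [
    {"input": "How do I reset my password?",
     "output": "Use the 'Forgot password' link on the login page.",
     "thumbs_up": True, "reason": None, "expected_output": None},
    {"input": "Why was I charged twice?",
     "output": "...a five-paragraph explanation...",
     "thumbs_up": False, "reason": "Too verbose",
     "expected_output": "A retry caused a duplicate charge; one has been refunded."},
]

# thumbs_up labels each example; expected_output supplies the target answer
trainset = [
    dspy.Example(
        question=f["input"],
        answer=f["expected_output"] or f["output"],
    ).with_inputs("question")
    for f in feedback
    if f["thumbs_up"] or f["expected_output"]
]

# Negative reasons persist as constraints prepended to the instruction
constraints = sorted({f["reason"] for f in feedback
                      if not f["thumbs_up"] and f["reason"]})
preamble = "\n".join(f"Constraint: avoid being {c.lower()}." for c in constraints)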

POWERED BY DSPY

Two optimization strategies

GEPA

Analyzes failures, rewrites instructions iteratively

Bootstrap

Selects best examples as few-shot demos
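
In DSPy terms, the two strategies roughly correspond to the GEPA and BootstrapFewShot optimizers. A sketch under assumptions: the program, metric, model names, and one-example trainset are placeholders, and real training data would come from the feedback mapping above.

import dspy
from dspy.teleprompt import BootstrapFewShot

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # placeholder model

program = dspy.Predict("question -> answer")
trainset = [
    dspy.Example(
        question="Why was I charged twice?",
        answer="A retry caused a duplicate charge; one has been refunded.",
    ).with_inputs("question")
]

def metric(gold, pred, trace=None, pred_name=None, pred_trace=None):
    # Stand-in metric: exact match against the expected answer
    return float(gold.answer.strip() == pred.answer.strip())

# Bootstrap: keeps the instruction and mines good traces as few-shot demos
bootstrap = BootstrapFewShot(metric=metric, max_bootstrapped_demos=4)
compiled = bootstrap.compile(program, trainset=trainset)

# GEPA: reflects on failures and rewrites the instruction text itself
gepa = dspy.GEPA(metric=metric, auto="light",
                 reflection_lm=dspy.LM("openai/gpt-4o"))
compiled = gepa.compile(program, trainset=trainset, valset=trainset)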

AUTOMATIC DEPLOY

Runtime prompt fetching

Your app calls the API to get the latest prompt version. Adopt an optimized prompt from the dashboard. No redeploy.

Dashboard → API → Your App

# Returns the latest version
GET /v1/prompts/support-agent
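
In application code, the same fetch goes through the ze.get_prompt call shown earlier. One possible runtime pattern, with the 60-second cache as an assumption (the SDK may already handle caching for you):

import time
import zeroeval as ze

_cache = {"prompt": None, "fetched_at": 0.0}
TTL_SECONDS = 60  # refetch so a dashboard "adopt" propagates without a redeploy

def current_prompt() -> str:
    # Serve the cached prompt; refresh it at most once per TTL window
    if time.time() - _cache["fetched_at"] > TTL_SECONDS:
        _cache["prompt"] = ze.get_prompt("support-agent")
        _cache["fetched_at"] = time.time()
    return _cache["prompt"]
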
VERSION CONTROL

Every optimization, versioned

Each run creates a new prompt version with its own eval score. Deploy the best one from the dashboard, or roll back in one click.

v1 · Dec 1 · 42%
v2 · Dec 8 · 58% (+16 pts)
v3 · Dec 15 · 71% (+13 pts)
v4 (live) · Dec 22 · 89% (+18 pts)

Close the loop with user feedback

Turn production signals from your users into automatic, measurable prompt improvements.