How to Set Up an AI Voice Agent for Lead Qualification

Set up an AI voice agent lead qualification system by defining your qualification rubric (ICP + disqualifiers + routing), choosing a voice platform (Vapi/Retell/Bland.ai), wiring it to your CRM via webhooks, and running a QA + measurement loop that tunes prompts and call flows weekly. This playbook gives you the stack, configs, prompts, and rollout plan.

Key takeaways:

  • Ship a pilot in 7–14 days with a tight scope: one call type, one routing path, one CRM object.
  • Your “secret sauce” is the qualification rubric + handoff logic, not the TTS voice.
  • Treat call QA like a growth funnel: score calls, label failures, and iterate on prompts and tool definitions.

I’ve built growth systems where speed matters more than perfection. At Uber and Postmates, we shipped scrappy call flows, measured ruthlessly, and iterated weekly. AI voice is the same discipline, just a new interface.

A working AI voice agent lead qualification setup is not “a bot that talks.” It’s a production workflow: call entry → identity + intent capture → structured qualification → routing → CRM writeback → QA scoring → prompt and flow iteration. If you skip the CRM writeback and QA loop, you’ll end up with a demo that sales can’t trust, and it’ll get turned off after one bad call.

This playbook assumes you’re a growth leader or marketing operator who needs an end-to-end system: what you’re building, the tools to pick, the exact call flow, copy-paste prompts, webhook payloads, measurement, and how to graduate from pilot to production. I’ll also show you where teams usually break things: compliance, hallucinated claims, bad transfers, and “pretty conversations” that don’t produce qualified meetings.

What this playbook produces (deliverables)

By the end, you will have:

  1. Qualification rubric

    • ICP definition (firmographics + intent signals)
    • Disqualifiers (hard “no” rules)
    • Required fields to capture (per lead type)
    • Routing matrix (AE/SDR queue, region, urgency)
  2. Call flows (state machine)

    • Inbound lead response flow (default)
    • Outbound follow-up flow (optional)
    • Human handoff and fallback paths
    • Voicemail + SMS follow-up templates
  3. Voice agent config

    • System prompt + tool/function definitions
    • Knowledge boundaries (“don’t answer pricing beyond X”)
    • Tone, pacing, and compliance language
  4. CRM + scheduling integration

    • Webhook receiver service (copy-paste code)
    • Lead creation/update logic
    • Calendar booking or “request to book” workflow
    • Transcript + recording attached to the CRM record
  5. Measurement + QA system

    • KPI definitions and targets
    • QA rubric and sampling plan
    • Failure labeling taxonomy and prompt iteration loop

Prerequisites and setup (before you touch the voice tool)

1) Decide your pilot scope (tight)

Pick one:

  • Inbound: “Call me now” from website or ads (highest intent, easiest ROI)
  • Inbound: missed call back (great for local services)
  • Outbound: speed-to-lead follow-up (higher risk for compliance and deliverability)

My operator advice: start with inbound. Sales teams forgive less on outbound.

2) Define your qualification fields (minimum viable)

Use 6–10 fields max. Example for B2B SaaS:

  • Full name
  • Company name
  • Email
  • Role/seniority
  • Team size or revenue band
  • Use case / problem
  • Timeline
  • Budget range (optional, depends on market)
  • Region/time zone
  • Consent for follow-up (if required)

3) Pick your routing rules

Example:

  • If Enterprise ICP + urgent timeline, warm transfer to AE queue.
  • If SMB ICP, book SDR meeting.
  • If not ICP, send resources + mark as nurture.
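
The three example rules above can be sketched as a small pure function. This is a minimal illustration, not a vendor API: the segment labels (`enterprise`, `smb`, `non_icp`), the `urgent` flag, and the action strings are all hypothetical names, and the default branch for cases the example rules do not cover is an assumption.

```javascript
// Hypothetical routing sketch. Segment names and action strings are
// illustrative placeholders, not values from any CRM or voice vendor.
function routeLead({ segment, urgent }) {
  if (segment === "enterprise" && urgent) return "warm_transfer_ae";
  if (segment === "smb") return "book_sdr_meeting";
  if (segment === "non_icp") return "nurture_with_resources";
  // Assumption: anything the three example rules do not cover
  // (e.g. enterprise without urgency) is booked so a human still sees it.
  return "book_sdr_meeting";
}

console.log(routeLead({ segment: "enterprise", urgent: true }));
```

Keeping routing in one function like this makes it easy to review with sales and to unit-test before it goes anywhere near a live call.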

4) Compliance checklist (non-negotiable)

  • Recordings: disclose if you record calls (jurisdiction-specific).
  • Outbound: confirm TCPA/consent requirements for your market and list sources.
  • Data: don’t collect sensitive info (SSN, payment) in the voice flow unless you have a hardened compliance setup.

(You should align with counsel; I’m not giving legal advice.)


Complete tool stack with configuration (pick your lane)

Lane A: Fastest time-to-value (recommended for most teams)

Voice platform: Vapi, Retell, or Bland.ai
CRM: HubSpot or Salesforce
Automation: Zapier/Make (pilot) → webhook service (production)
Storage: S3/GCS for recordings + transcripts (optional)
Analytics: PostHog/Amplitude + CRM dashboards
QA: Airtable/Google Sheet + human review, later add LLM grader

How to choose (decision matrix):

Requirement | Choose
You need maximum control over tool-calling, webhooks, latency tuning | Vapi
You want a more guided voice agent builder and quick iteration | Retell
You want an “SDR-like” packaged experience | Bland.ai
You have security/compliance constraints + custom telephony | Custom build (Twilio + model gateway)

Lane B: Custom build (Twilio + model gateway)

Use if you need:

  • Private networking, strict logging controls
  • Deep call routing through your existing contact center
  • Custom model selection / redaction

Tradeoff: slower shipping.


Step-by-step execution workflow (pilot → production)

Step 1: Map the call flow as a state machine (not a script)

Write it as states with exit criteria:

  1. Greeting + consent
  2. Intent confirmation (why they called)
  3. Qualification questions (branching)
  4. Value confirmation (repeat back needs + next step)
  5. Routing (book, transfer, or follow-up)
  6. CRM writeback + tags
  7. Wrap-up

A common mistake is over-writing “human-sounding” dialogue and under-building the routing logic. Sales cares about fields + next steps.
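
One way to make "states with exit criteria" concrete is a transition table plus a single `advance` function. This is a sketch under assumptions: the state keys are my own names for the seven steps listed above, and how you decide that an exit criterion is met (the `exitCriteriaMet` flag) is left to your platform.

```javascript
// States mirror the seven steps listed above; the key names are hypothetical.
const CALL_FLOW = {
  greeting: { next: "intent" },
  intent: { next: "qualification" },
  qualification: { next: "value_confirmation" },
  value_confirmation: { next: "routing" },
  routing: { next: "crm_writeback" },
  crm_writeback: { next: "wrap_up" },
  wrap_up: { next: null },
};

// Advance only when the current state's exit criterion is met;
// otherwise stay in the same state so the agent can re-ask.
function advance(state, exitCriteriaMet) {
  if (!Object.prototype.hasOwnProperty.call(CALL_FLOW, state)) {
    throw new Error(`unknown state: ${state}`);
  }
  if (!exitCriteriaMet) return state;
  return CALL_FLOW[state].next ?? state; // wrap_up is terminal
}

console.log(advance("greeting", true));
```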

Step 2: Create your qualification rubric + disqualifiers

Example disqualifiers (B2B):

  • Student/research only
  • Competitor
  • Outside supported regions
  • Needs features you don’t have (be explicit)

Keep disqualifiers tight. If you disqualify too aggressively, you’ll burn pipeline you could have closed.

Step 3: Build the agent prompt (system prompt)

This is where most teams fail. Your prompt must enforce:

  • Ask one question at a time
  • Don’t improvise product claims
  • Confirm spelling for email
  • Summarize and confirm before routing
  • Always write to CRM, even if disqualified

Copy-paste prompt (edit placeholders):

SYSTEM: You are an AI voice agent for {Company}. Your job is lead qualification for inbound calls.
Primary goal: collect structured qualification data and route to the correct next step (book meeting, warm transfer, or follow-up).
Secondary goals: be polite, concise, and accurate. Do not make up facts about {Company}. If unsure, say you will have a human follow up.

HARD RULES:
- Ask ONE question at a time.
- Never claim discounts, guarantees, legal/medical advice, or implementation specifics.
- If the caller asks a product question outside the approved FAQ, capture the question and offer a follow-up.
- Confirm spelling for email addresses and company names.
- Always end with a clear next step and confirmation of contact info.
- If user is angry, confused, or requests a human twice, attempt warm transfer immediately.

DATA TO COLLECT (required if possible):
full_name, company, email, phone, role, team_size_or_revenue_band, primary_use_case, timeline, region, consent_to_follow_up

QUALIFICATION LOGIC:
- If ICP_FIT = high AND timeline <= 60 days AND within business hours: route = warm_transfer_sales
- If ICP_FIT = high otherwise (longer timeline or outside business hours): route = book_meeting
- If ICP_FIT = medium: route = book_meeting
- If ICP_FIT = low OR disqualified: route = nurture_follow_up

DISQUALIFIERS:
{List your disqualifiers here}

TOOLS YOU CAN CALL:
1) create_or_update_lead(payload) -> writes to CRM
2) book_meeting(payload) -> schedules meeting (if enabled)
3) warm_transfer(payload) -> transfers to sales line
4) send_sms(payload) -> sends confirmation text

CALL FLOW:
1) Greeting + (if required) recording disclosure.
2) Ask intent: "What prompted your call today?"
3) Qualification questions (branch based on answers).
4) Confirm summary: "Here’s what I captured..."
5) Route and confirm next step.
6) Call create_or_update_lead with all collected fields + call metadata (duration, disposition).

Step 4: Define tool calls / function schema (webhook-ready)

Even if you use Zapier first, keep a stable schema so you can graduate to production.

Example JSON payload (what your agent should send):

{
  "lead": {
    "full_name": "Jane Doe",
    "company": "Acme Inc",
    "email": "jane@acme.com",
    "phone": "+14155551212",
    "role": "VP Marketing",
    "team_size_or_revenue_band": "51-200",
    "primary_use_case": "Inbound lead response automation",
    "timeline": "30 days",
    "region": "US - Pacific",
    "consent_to_follow_up": true
  },
  "call": {
    "direction": "inbound",
    "call_id": "abc123",
    "started_at": "2026-02-19T18:22:10Z",
    "duration_seconds": 412,
    "recording_url": "https://...",
    "transcript": "..."
  },
  "routing": {
    "icp_fit": "high",
    "disposition": "warm_transfer_sales",
    "notes": "Wants demo, asked about integrations with HubSpot."
  }
}
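
A stable schema is easier to keep stable if the receiver checks it. The sketch below lists which required lead fields are missing from a payload shaped like the example above. The field list comes from the prompt's DATA TO COLLECT section; the function name `missingLeadFields` is hypothetical.

```javascript
// Required lead fields, taken from the prompt's DATA TO COLLECT list.
const REQUIRED_LEAD_FIELDS = [
  "full_name", "company", "email", "phone", "role",
  "team_size_or_revenue_band", "primary_use_case", "timeline",
  "region", "consent_to_follow_up",
];

// Returns the lead fields that are missing or empty, so the receiver can
// still store a partial lead and record what the call did not capture.
// Note: `false` counts as captured (e.g. consent_to_follow_up: false).
function missingLeadFields(payload) {
  const lead = payload?.lead ?? {};
  return REQUIRED_LEAD_FIELDS.filter((field) => {
    const value = lead[field];
    return value === undefined || value === null || value === "";
  });
}

console.log(missingLeadFields({ lead: { full_name: "Jane Doe", email: "" } }));
```

The count of missing fields also feeds the "fields captured" metric you log per call.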

Step 5: Build the CRM webhook receiver (production-grade starter)

This avoids brittle Zapier chains and gives you retries + logging.

Node.js/Express webhook receiver (copy-paste):

import express from "express";
import crypto from "crypto";

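const app = express();
// Keep the raw request bytes so the HMAC is computed over exactly what the
// vendor signed (re-serializing req.body can change key order or spacing).
app.use(
  express.json({
    limit: "5mb",
    verify: (req, _res, buf) => {
      req.rawBody = buf;
    },
  })
);

// Optional: verify signature from your voice vendor if supported.
// The header name and hex-encoded HMAC-SHA256 scheme are placeholders;
// check your vendor's docs for the real ones.
function verifySignature(req) {
  const secret = process.env.WEBHOOK_SECRET;
  if (!secret) return true; // verification disabled (pilot only)

  const sig = req.header("X-Signature");
  if (!sig || !req.rawBody) return false; // secret configured, so a signature is required

  const expected = crypto
    .createHmac("sha256", secret)
    .update(req.rawBody)
    .digest("hex");

  const a = Buffer.from(sig);
  const b = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so compare lengths first
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}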

app.post("/webhooks/voice/lead", async (req, res) => {
  try {
    if (!verifySignature(req)) return res.status(401).send("bad signature");

    const payload = req.body;
    const { lead, call, routing } = payload;

    // 1) Basic validation
    if (!lead?.phone && !lead?.email) {
      return res.status(422).json({ error: "missing phone/email" });
    }

    // 2) TODO: write to your CRM (HubSpot/Salesforce)
    // Example pseudo:
    // const contactId = await upsertContact({lead});
    // await createCallEngagement({contactId, call, routing});
    // await setLifecycleStage({contactId, routing});

    console.log("Lead payload received", {
      phone: lead?.phone,
      email: lead?.email,
      disposition: routing?.disposition
    });

    // 3) Respond quickly to avoid vendor timeouts
    res.json({ ok: true });
  } catch (e) {
    console.error(e);
    res.status(500).json({ ok: false });
  }
});

app.listen(process.env.PORT || 3000, () =>
  console.log("Webhook server running")
);
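
If your voice vendor retries webhook deliveries after a timeout or error response (check your vendor's docs), the same call can arrive more than once. Here is a minimal sketch of de-duplicating on `call.call_id` from the payload above. The in-memory `Set` and the name `isFirstDelivery` are assumptions for illustration; a production service would more likely use a database constraint or a cache with expiry.

```javascript
// In-memory store of call ids already processed (illustration only:
// it is lost on restart and not shared across instances).
const seenCallIds = new Set();

// Returns true the first time a call_id is seen, false on repeats.
function isFirstDelivery(payload) {
  const callId = payload?.call?.call_id;
  if (!callId) return true; // nothing to de-duplicate on, so process it
  if (seenCallIds.has(callId)) return false;
  seenCallIds.add(callId);
  return true;
}

console.log(isFirstDelivery({ call: { call_id: "abc123" } })); // first delivery
console.log(isFirstDelivery({ call: { call_id: "abc123" } })); // repeat
```

In the handler, you would check this before the CRM write and still answer a repeat with a success response so the vendor stops retrying.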

Step 6: Handoff design (warm transfer without embarrassment)

Rules I like:

  • Attempt warm transfer only when (a) prospect is qualified and (b) sales is available.
  • If transfer fails (no answer, queue too long): revert to booking or “sales will call you back in X business hours.”

Your agent should say:

  • Who it’s transferring to
  • What the prospect will do next
  • What happens if nobody answers
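
The transfer rules above reduce to two small decisions: whether to attempt a transfer, and what to do when it fails. The sketch below assumes hypothetical inputs (`qualified`, `salesAvailable`, `bookingEnabled`, `callbackHours`) that your platform or config would supply; the spoken lines are placeholders.

```javascript
// Rule from the text: transfer only when the prospect is qualified
// AND sales is available.
function shouldAttemptTransfer({ qualified, salesAvailable }) {
  return Boolean(qualified && salesAvailable);
}

// Fallback when a transfer fails (no answer, queue too long).
// bookingEnabled and callbackHours are hypothetical config values.
function transferFallback({ bookingEnabled, callbackHours }) {
  if (bookingEnabled) {
    return {
      action: "book_meeting",
      say: "Nobody is free right now, so let's get a time on the calendar.",
    };
  }
  return {
    action: "promise_callback",
    say: `Sales will call you back within ${callbackHours} business hours.`,
  };
}

console.log(transferFallback({ bookingEnabled: false, callbackHours: 4 }).say);
```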

Step 7: QA rubric + labeling (make it operational)

Build a 10-point rubric you can score quickly:

Category | Score | What “2” means
Accuracy | 0–2 | No invented claims; correct next steps
Control | 0–2 | One question at a time, no rambling
Data capture | 0–2 | Captured required fields or documented why not
Routing | 0–2 | Correct disposition given rubric
Experience | 0–2 | Polite, concise, handles interruptions

Sampling plan (pilot):

  • Review first 20 calls manually
  • Then 20% of calls weekly, plus all escalations/transfers
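
The rubric and sampling plan can be encoded so reviewers and dashboards agree on the arithmetic. This is a sketch with assumed names: the category keys, the `callIndex` field (0-based order of the call in the pilot), and the injectable `random` function are mine, not from any QA tool.

```javascript
// The five rubric categories, each scored 0-2 for a 10-point total.
const QA_CATEGORIES = ["accuracy", "control", "data_capture", "routing", "experience"];

function scoreCall(scores) {
  let total = 0;
  for (const category of QA_CATEGORIES) {
    const value = scores[category];
    if (!Number.isInteger(value) || value < 0 || value > 2) {
      throw new RangeError(`${category} must be 0, 1, or 2`);
    }
    total += value;
  }
  return total; // 0 to 10
}

// Sampling plan: first 20 calls, every escalation/transfer, then ~20% of the rest.
// `random` is injectable so the sampling can be tested deterministically.
function needsReview({ callIndex, escalated, transferred }, random = Math.random) {
  if (callIndex < 20) return true;
  if (escalated || transferred) return true;
  return random() < 0.2;
}

console.log(scoreCall({ accuracy: 2, control: 1, data_capture: 2, routing: 2, experience: 1 }));
```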

Copy-pasteable prompts for each stage

Prompt: Build your qualification rubric from your ICP

Use this with ChatGPT/Claude internally to draft your rubric, then edit it.

You are my growth ops assistant. Create a lead qualification rubric for an AI voice agent.

Context:
- Company: {Company}
- Product: {1 sentence}
- ICP: {Describe firmographics + roles}
- Non-ICP: {Who you don’t serve}
- Regions served: {List}
- Sales routing: {SDR/AE structure + business hours}
- Booking link rules: {If any}

Output format:
1) Required fields (6-10 max)
2) Qualification questions (ordered, one at a time)
3) Disqualifiers (hard no)
4) ICP fit scoring rules (high/medium/low)
5) Routing matrix: fit x urgency x region -> action
6) CRM fields mapping (field name -> description)
Keep it tight and operational.

Prompt: Generate call flow branches + fallback handling

Design a state-machine call flow for an AI voice agent doing inbound lead qualification for {Company}.

Include:
- States: greeting/consent, intent, qualification, objection handling, routing, wrap-up
- For each state: entry criteria, exit criteria, and 2 example utterances
- Branches for: (a) caller wants pricing, (b) caller is not decision maker, (c) caller refuses email, (d) caller asks for human, (e) background noise / can't hear
- Fallback rules: after 2 failed attempts to collect a field, move on and mark as unknown
- Final output as a table plus a short JSON-like state diagram
Do not include marketing fluff. Optimize for short call duration and high data accuracy.
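
The fallback rule in that prompt (two failed attempts, then move on and mark unknown) is simple enough to enforce in code rather than trusting the model to count. A minimal sketch, assuming a hypothetical `createFieldCollector` helper that your orchestration layer would call after each caller answer:

```javascript
const MAX_ATTEMPTS = 2;

// Tracks failed attempts per field and applies the "move on after two
// failures, mark as unknown" rule from the call-flow prompt.
function createFieldCollector() {
  const attempts = {};
  const values = {};
  return {
    // Pass null/undefined/"" as value when the answer could not be understood.
    record(field, value) {
      if (value !== null && value !== undefined && value !== "") {
        values[field] = value;
        return "captured";
      }
      attempts[field] = (attempts[field] ?? 0) + 1;
      if (attempts[field] >= MAX_ATTEMPTS) {
        values[field] = "unknown";
        return "move_on";
      }
      return "retry";
    },
    // Snapshot of everything captured so far.
    values: () => ({ ...values }),
  };
}

const collector = createFieldCollector();
console.log(collector.record("email", null));
console.log(collector.record("email", null));
```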

Measurement framework (what to track + target benchmarks)

You need two layers: funnel KPIs and quality KPIs. Benchmarks vary heavily by industry, traffic source, and call type, so I’m giving targets as operating goals for a healthy pilot, not “market stats.”

Funnel KPIs (core)

Track per call cohort (inbound vs outbound, campaign, hour of day):

  1. Connection rate (outbound only)
    Target (pilot): improve week-over-week; diagnose by list quality and dialing windows.

  2. Qualification completion rate = calls where required fields captured / total connected calls
    Target (pilot): 60–80% completion for inbound; if lower, your questions are too long or too early.

  3. Qualified lead rate = calls with ICP fit high or medium / total connected calls
    Target: depends on traffic quality; use this to pressure-test targeting, not the agent.

  4. Handoff success rate

    • Warm transfer success = answered transfers / transfer attempts
      Target (pilot): >50% during business hours with a staffed queue.
    • Booking success = meetings booked / qualified calls
      Target (pilot): 30–60% if you offer booking during the call.
  5. Sales acceptance rate = sales-marked “good lead” / AI-qualified leads
    Target (pilot): >70%. If sales rejects leads, your rubric or routing is off.
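
The ratios above can be computed directly from a call log. The sketch below assumes a hypothetical per-call record shape (`connected`, `requiredFieldsCaptured`, `icpFit`, `transferAttempted`, `transferAnswered`, `meetingBooked`, `salesAccepted`); map these to whatever your CRM or logging actually stores. A rate is `null` when its denominator is zero.

```javascript
function ratio(numerator, denominator) {
  return denominator === 0 ? null : numerator / denominator;
}

// Computes the funnel KPIs defined above from an array of call records.
function funnelKpis(calls) {
  const connected = calls.filter((c) => c.connected);
  const qualified = connected.filter(
    (c) => c.icpFit === "high" || c.icpFit === "medium"
  );
  const transfers = connected.filter((c) => c.transferAttempted);
  const count = (list, pred) => list.filter(pred).length;
  return {
    qualificationCompletionRate: ratio(
      count(connected, (c) => c.requiredFieldsCaptured),
      connected.length
    ),
    qualifiedLeadRate: ratio(qualified.length, connected.length),
    warmTransferSuccessRate: ratio(
      count(transfers, (c) => c.transferAnswered),
      transfers.length
    ),
    bookingSuccessRate: ratio(count(qualified, (c) => c.meetingBooked), qualified.length),
    salesAcceptanceRate: ratio(count(qualified, (c) => c.salesAccepted), qualified.length),
  };
}

console.log(funnelKpis([{ connected: true, requiredFieldsCaptured: true, icpFit: "high" }]));
```

Run it per cohort (source, direction, hour of day) rather than on the whole log, so a weak campaign does not hide behind a strong one.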

Quality KPIs (trust + safety)

  1. Hallucination/incorrect claim rate
    Target: 0 tolerated for regulated or pricing/legal topics. Treat as Sev-1.

  2. Customer complaint rate (angry calls, “stop calling,” brand risk)
    Target: near-zero on inbound; investigate every complaint.

  3. Avg handle time (AHT)
    Target: keep calls tight. Most qualification calls should finish in 2–6 minutes unless complex.

Instrumentation (minimum viable)

  • Log every call with:
    • campaign/source, timestamp, agent version, prompt version
    • disposition, fields captured count, transcript, recording URL
  • Create a weekly dashboard:
    • funnel conversion by source
    • top 10 failure reasons (from QA labels)
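
For the "top 10 failure reasons" view, a simple tally over QA labels is enough to start. This sketch assumes each reviewed call contributes zero or more free-text or enumerated labels; the label strings in the usage line are made up for illustration.

```javascript
// Counts QA failure labels and returns the most frequent ones.
// Ties are broken alphabetically so the output is stable week to week.
function topFailureReasons(labels, limit = 10) {
  const counts = new Map();
  for (const label of labels) {
    counts.set(label, (counts.get(label) ?? 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit)
    .map(([reason, count]) => ({ reason, count }));
}

console.log(topFailureReasons(["wrong_email", "bad_transfer", "wrong_email"], 2));
```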

Scaling guide (pilot → production)

Phase 1 (Week 1–2): Pilot safely

  • One phone number
  • One language
  • One ICP segment
  • Business hours only
  • Human fallback always available (voicemail + callback SLA)

Ship requirement: CRM writeback works end-to-end.

Phase 2 (Week 3–6): Expand coverage

  • Add after-hours handling
  • Add 1–2 additional routing paths
  • Add SMS confirmation (meeting details + “reply STOP” if required)
  • Add LLM-based QA pre-scoring to reduce manual review volume

Phase 3 (Week 7+): Production hardening

  • Vendor redundancy plan (failover number or fallback to voicemail)
  • Load testing for webhook receiver
  • Structured versioning:
    • agent_version, prompt_version, rubric_version
  • Security:
    • signed webhooks, secret rotation, PII minimization
  • Continuous improvement loop:
    • weekly prompt updates with changelog
    • monthly rubric review with sales leadership

Common pitfalls and how to avoid them

  1. The agent “sounds good” but captures bad data

    • Fix: force confirmation for email/company, add a “repeat back” step, and validate emails server-side.
  2. Over-qualifying too early

    • Fix: ask intent first, then 2–3 high-signal questions, then only go deeper if fit looks good.
  3. Broken transfers ruin trust

    • Fix: only transfer when the queue is staffed; otherwise book. Add a transfer timeout and graceful fallback.
  4. Sales rejects leads because definitions are fuzzy

    • Fix: write the rubric with sales, then put it in the CRM as a visible field (“AI Fit: High/Med/Low” + reason).
  5. No iteration loop

    • Fix: schedule a 30-minute weekly “call review” with marketing + sales. Listen to 5 calls together. Update the prompt the same day.
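
For pitfall 1, the server-side email check can be a few lines in the webhook receiver. This is a deliberately loose sketch, not full RFC validation: it only catches the typical speech-to-text failures (spaces, a missing "@", a missing domain dot). The name `normalizeEmail` is hypothetical, and lowercasing the whole address is a common convention I am assuming your CRM tolerates.

```javascript
// Trims, lowercases, and shape-checks an email captured by voice.
// Returns the cleaned address, or null when it does not look usable.
function normalizeEmail(raw) {
  if (typeof raw !== "string") return null;
  const email = raw.trim().toLowerCase();
  // Loose check: exactly one "@", no whitespace, a dot in the domain part.
  const looksValid = /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
  return looksValid ? email : null;
}

console.log(normalizeEmail("  Jane@Acme.com "));
console.log(normalizeEmail("jane at acme dot com"));
```

When this returns `null`, store the raw transcript value alongside an "email unverified" flag so a human or an SMS follow-up can fix it.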

Practical example workflow (inbound “call me now”)

  1. Website form submission triggers immediate call (or user calls a tracked number).
  2. AI voice agent answers within seconds.
  3. Agent captures: use case, role, company, email, timeline.
  4. If fit high: warm transfer to AE.
  5. If fit medium: book SDR meeting.
  6. If low: send resource link via SMS/email, mark as nurture.
  7. All calls logged with transcript + recording in CRM.

That’s the system. Everything else is tuning.

Frequently Asked Questions

Should I start with Bland.ai, Vapi, or Retell?

Start with the platform that makes it easiest to ship your first working call flow plus webhooks. If you need deep control over function calling and logging, I usually pick Vapi. If you want faster guided setup, Retell is often quicker to iterate.

Do I let the agent answer product questions?

Let it answer only what you can guarantee is accurate and current. For everything else, capture the question verbatim, tag it in CRM, and route to a human follow-up.

How do I keep the agent from rambling?

Enforce “one question at a time,” set a hard max sentence length in the prompt, and add a control rule: if the caller interrupts, the agent stops and asks a clarifying question. Then QA-score “control” and iterate weekly.

What’s the fastest CRM integration for a pilot?

Use Zapier/Make to prove the data model and routing. Once you have stable fields, move to a webhook receiver service with retries and structured logging so sales can trust the records.

How do I handle accents, noisy environments, or bad audio?

Add a “can you hear me clearly?” checkpoint early. After two failed attempts to collect a field, move on and mark unknown, then offer SMS/email follow-up to complete details.
