The Complete AI Outbound Playbook
You can ship an AI-led outbound engine today by standardizing your ICP, automating enrichment, generating a pain hypothesis per account, then producing an ai cold email sequence with deterministic prompts and QA gates before sending through Smartlead/Instantly. This runbook gives you the exact tool stack, prompts, schemas, QA rubric, and iteration loop.
Key takeaways:
- Enrichment first, copy second: your sequence quality is capped by data quality.
- Force determinism with fixed schemas, capped word counts, and explicit “only use provided fields” rules.
- QA the output like code: factuality, relevance, deliverability, and compliance gates before you hit send.
I’ve run outbound-adjacent growth at Uber and Postmates, where “spray and pray” fails fast because brand damage and deliverability penalties compound. What works is systems: tight ICP definitions, consistent data enrichment, a repeatable pain hypothesis per segment, and messaging that’s short, specific, and provable.
AI makes this workflow faster, but only if you constrain it. Most teams point Claude/ChatGPT at a lead list and get generic emails, invented facts, and sequences that trigger spam filters. The fix is to treat AI like a junior growth operator: give it clean inputs, a strict schema, a scoring rubric, and a loop that improves the prompts based on measured outcomes.
This runbook is designed for a CEO, VP Growth, or growth engineer who needs an AI outbound workflow working right now. You’ll set up: (1) enrichment in Clay (or alternative), (2) vertical-specific pain hypothesis generation, (3) ai cold email sequence generation with personalization tokens, (4) objection-handling snippets, and (5) a QA gate that blocks hallucinations and deliverability issues before launch.
The Complete AI Outbound Playbook (AI Runbook)
1. Objective
Produce a ready-to-send, QA-approved ai cold email outbound campaign (lead list + enrichment + vertical pain hypotheses + 4–6 email sequence + objection handling snippets) that can be launched in Smartlead/Instantly today.
2. Inputs Required
- ICP definition (one-page): target roles, company size, industries, trigger events, exclusions.
- Offer definition: what you sell, primary value prop, proof points you can legally claim (case studies, logos you have permission to reference, quantified outcomes you can cite).
- Outbound domains + inboxes: at minimum 3–10 inboxes warmed (Google Workspace/M365).
- Sending platform access: Smartlead or Instantly (or Apollo Sequencing).
- Enrichment access: Clay (preferred) or Apollo/Clearbit/ZoomInfo.
- A seed lead list: 200–2,000 accounts/contacts to start (CSV is fine).
- Compliance constraints: CAN-SPAM basics, your company’s policy on personalization and claims, region constraints (US/EU).
- A “do-not-invent” list: fields AI is forbidden to fabricate (revenue, funding, headcount, customers, tech stack) unless present in enrichment data.
Assumptions:
- You can tolerate early iteration. Your first campaign won’t be perfect.
- You will measure outcomes by domain/inbox and by segment (vertical + persona).
3. Tool Stack
Primary stack (what I’d ship first):
- Clay for enrichment and workflow tables
- Alternatives: Apollo, ZoomInfo, Clearbit, People Data Labs (PDL)
- Claude (Sonnet/Opus) for longer reasoning + writing variants
- Alternatives: GPT-4.1 / GPT-4o, Gemini
- OpenAI API (optional) for batch generation at scale
- Alternatives: Anthropic API
- Smartlead for sending + inbox rotation + basic warm-up
- Alternatives: Instantly, Apollo, Outreach (enterprise)
- Cursor (or VS Code) for prompt/version control + scripts
- Alternatives: Windsurf, plain VS Code
- Google Sheets (or Airtable) for quick ops review
- Alternatives: Notion database, Coda
Nice-to-have:
- MxToolbox / Google Postmaster Tools for deliverability monitoring
- Alternatives: GlockApps (paid), Mailreach monitoring
- Webhook/Zapier/Make for moving rows from “QA Passed” to “Ready to Send”
- Alternatives: n8n (self-hosted)
4. Prompt Pack
Use these prompts exactly. They’re written to be deterministic, minimize hallucinations, and output in strict schemas.
# Prompt 1 (Claude / ChatGPT): ICP → Segment Map + Pain Hypotheses
You are my outbound strategy operator. Use ONLY the inputs I provide. Do not invent facts, metrics, customer names, or company details.
INPUTS:
- Company product: {{PRODUCT_DESC}}
- ICP: {{ICP_ONE_PAGER}}
- Offer: {{OFFER}}
- Proof points I can claim: {{PROOF_POINTS}}
- Segments I want to target (if any): {{SEGMENT_LIST_OR_EMPTY}}
- Constraints:
- No invented personalization.
- Each pain must map to a plausible trigger + observable signal.
- Output must follow the JSON schema exactly.
TASK:
1) Propose 3–6 ICP segments (vertical + persona) if not provided.
2) For each segment, generate:
- Top 5 pain hypotheses (each as a falsifiable statement)
- Triggers (events that increase likelihood of pain)
- Observable signals (data points we can enrich for)
- Messaging angle (1 sentence)
- Disqualifiers (who NOT to email)
OUTPUT FORMAT (JSON only):
{
"segments": [
{
"segment_id": "string",
"vertical": "string",
"persona": "string",
"pains": [
{
"pain_id": "string",
"hypothesis": "string",
"triggers": ["string"],
"observable_signals": ["string"],
"message_angle": "string"
}
],
"disqualifiers": ["string"]
}
]
}
# Prompt 2 (Claude / ChatGPT): Account-Level Personalization Brief (No Hallucinations)
You are generating an outbound personalization brief. Use ONLY the provided enrichment fields. If a field is empty, write "UNKNOWN" and do not guess.
ENRICHMENT JSON:
{{LEAD_ENRICHMENT_JSON}}
TASK:
Create a 1-page personalization brief that an SDR would trust. It must include:
- A single best-fit segment_id from our segment map
- 3 ranked pain hypotheses with evidence from fields
- 1–2 specific openers that reference ONLY the data we have
- A "safe claim list": statements we can say without risk
- A "do not say list": anything that would require guessing
- A suggested CTA type: {calendar_link, quick_question, value_asset}
OUTPUT FORMAT (JSON only):
{
"lead_id": "string",
"segment_id": "string",
"confidence": 0-100,
"ranked_pains": [
{
"pain_id": "string",
"reasoning_from_fields": ["string"]
}
],
"safe_openers": ["string"],
"safe_claims": ["string"],
"do_not_say": ["string"],
"cta_type": "calendar_link|quick_question|value_asset"
}
# Prompt 3 (Claude / ChatGPT): Generate the AI Cold Email Sequence (Deliverability-Safe)
You are writing a B2B ai cold email sequence. Use ONLY the personalization brief + offer inputs. No invented facts. Keep it plain text, no HTML, no images, no links except optional calendar link token.
INPUTS:
- Personalization brief JSON: {{PERSONALIZATION_BRIEF_JSON}}
- Offer: {{OFFER}}
- Proof points I can claim: {{PROOF_POINTS}}
- CTA rules:
- Email 1 CTA = quick yes/no question
- Email 2 CTA = ask for 15-min, optional calendar token {{CAL_LINK}}
- Email 3 CTA = offer a value asset (template/checklist) with no gate
- Email 4 CTA = breakup / permission to close file
- Style rules:
- 60–120 words each
- 1–2 short sentences per paragraph
- No buzzwords, no exclamation points
- Subject lines: 2 options per email, 2–5 words each
- Use tokens exactly: {{first_name}}, {{company}}, {{role}}, {{personalized_opener}}
- Include one optional PS line max in the entire sequence (choose best email)
OUTPUT FORMAT (YAML only):
sequence:
- step: 1
subjects: ["", ""]
body: |
...
- step: 2
subjects: ["", ""]
body: |
...
- step: 3
subjects: ["", ""]
body: |
...
- step: 4
subjects: ["", ""]
body: |
...
# Prompt 4 (Claude / ChatGPT): Objection-Handling Snippet Library (Reusable)
You are building an objection-handling library for replies to ai cold email. Use ONLY the offer + proof points provided. No invented metrics.
INPUTS:
- Offer: {{OFFER}}
- Proof points: {{PROOF_POINTS}}
- Common objections list (if any): {{OBJECTIONS_OR_EMPTY}}
TASK:
Generate 10 objections and responses. Responses must be:
- <= 70 words
- End with a single question
- Contain zero attachments and zero links (calendar token allowed)
- Avoid "just checking in" language
OUTPUT FORMAT (JSON only):
{
"snippets": [
{
"objection": "string",
"response": "string",
"best_next_step": "ask_question|offer_call|send_asset"
}
]
}
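If you script the prompt calls yourself (e.g. from Cursor/VS Code, per the tool stack), it helps to enforce the “do-not-invent” rule before the model ever sees the data: whitelist the fields and mark every gap as the literal string UNKNOWN. A minimal Python sketch, assuming field names that mirror the Clay columns in Section 5:

```python
import json

# Fields the model is allowed to see. Anything missing or empty becomes the
# literal string "UNKNOWN", so the prompt's "do not guess" rule has an anchor.
ALLOWED_FIELDS = [
    "lead_id", "first_name", "title", "company", "domain",
    "industry", "employee_count", "country",
    "tech_stack_signals", "trigger_signals",
]

def build_enrichment_payload(row: dict) -> str:
    """Whitelist fields and mark gaps as UNKNOWN before prompting."""
    payload = {}
    for field in ALLOWED_FIELDS:
        value = row.get(field)
        payload[field] = value if value not in (None, "", []) else "UNKNOWN"
    return json.dumps(payload, indent=2)

row = {"lead_id": "L-000123", "first_name": "Ava", "company": "CompanyName"}
print(build_enrichment_payload(row))
```

The payload then drops straight into the `{{LEAD_ENRICHMENT_JSON}}` slot of Prompt 2; the function name is illustrative.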
5. Execution Steps
Follow this sequence exactly. Don’t skip steps. This is where most teams get sloppy and blame the model.
1. Lock the ICP one-pager (30 minutes)
   - Write: target industries, personas, employee ranges, geo, exclusions.
   - Define 3 trigger events you care about (examples: hiring for RevOps, new product launch, new funding round). Use only triggers you can actually enrich for.
2. Build your Clay table (60 minutes)
   - Columns (minimum): lead_id, first_name, last_name, title, email, company, domain, linkedin_url, industry, employee_count, country, tech_stack_signals (nullable), trigger_signals (nullable), personalized_opener (nullable), segment_id (nullable), qa_status (values: raw, enriched, briefed, sequence_generated, qa_passed, queued).
   - Import your CSV.
3. Enrich deterministically
   - In Clay, add enrichment steps in this order:
     - Validate email (if your provider supports it).
     - Company firmographics (industry, headcount, location).
     - Role/persona normalization (map titles into a controlled set like VP Growth, Head of RevOps, Founder).
     - Optional: tech stack signals only if your data source is reliable for your market.
   - Hard rule: if a field is missing, keep it blank. Do not “fill” with AI.
4. Generate segment map + pains (Prompt 1)
   - Store output JSON as segments.json in your repo (yes, treat it like code).
   - Add the segment_id options into Clay as a dropdown.
5. Create per-lead personalization briefs (Prompt 2)
   - Batch in groups of 25–50 leads to start.
   - Write the brief JSON back into Clay column personalization_brief_json.
   - Populate personalized_opener from safe_openers[0].
   - Set qa_status = briefed.
6. Generate sequence YAML per lead OR per segment
   - For speed, do per segment first: generate one “base sequence” per segment_id.
   - For quality, do per lead for your top 50–200 accounts: run Prompt 3 per lead using their personalization brief.
   - Save the output in Clay column sequence_yaml.
   - Set qa_status = sequence_generated.
7. Apply QA rubric (Section 7)
   - Reject anything with invented facts, spammy language, or long paragraphs.
   - Only move qa_status to qa_passed when it meets the pass threshold.
8. Push to Smartlead/Instantly
   - Map fields: {{first_name}}, {{company}}, {{role}}, {{personalized_opener}}.
   - Create a campaign per segment_id to keep performance attribution clean.
   - Start conservative: low daily volume per inbox, plain text, no attachments, minimal links.
9. Reply handling
   - Route replies to a shared inbox (or Slack) with lead context.
   - Use the objection snippet library (Prompt 4) to draft responses, then human-send.
10. Measure and iterate
    - Track by segment: positive reply rate (booked meetings + interested replies), negative replies (angry, spam complaints), bounce rate.
    - Update pain hypotheses and openers weekly.
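The qa_status values describe a one-way pipeline. If you move rows between statuses with a script or webhook (the Zapier/Make step in the tool stack), a small guard keeps leads from skipping the QA gate on their way to “queued”. A sketch, with illustrative function names:

```python
# Allowed qa_status transitions. Enforcing them in your sync script
# prevents a row from reaching "queued" without passing QA first.
TRANSITIONS = {
    "raw": "enriched",
    "enriched": "briefed",
    "briefed": "sequence_generated",
    "sequence_generated": "qa_passed",
    "qa_passed": "queued",
}

def advance(row: dict) -> dict:
    """Move a lead row to the next status, or fail loudly."""
    current = row["qa_status"]
    if current not in TRANSITIONS:
        raise ValueError(f"{row['lead_id']}: cannot advance from '{current}'")
    row["qa_status"] = TRANSITIONS[current]
    return row
```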
6. Output Schema
This is the strict schema your workflow should produce for each lead before sending.
{
"lead_id": "L-000123",
"contact": {
"first_name": "Ava",
"last_name": "Kim",
"title": "VP Growth",
"email": "ava@company.com",
"linkedin_url": "https://linkedin.com/in/...",
"company": "CompanyName",
"domain": "company.com"
},
"enrichment": {
"industry": "Fintech",
"employee_count": 250,
"country": "US",
"tech_stack_signals": ["UNKNOWN"],
"trigger_signals": ["Hiring: Lifecycle Marketing Manager"]
},
"strategy": {
"segment_id": "fintech_vp_growth",
"confidence": 82,
"ranked_pains": [
{
"pain_id": "activation_drop_mobile",
"reasoning_from_fields": [
"Role indicates ownership of growth funnel",
"Trigger: hiring lifecycle suggests focus on retention/activation"
]
}
],
"personalized_opener": "Saw you're hiring a Lifecycle Marketing Manager at {{company}}."
},
"sequence": {
"format": "yaml",
"content": "sequence:\n - step: 1\n subjects: ...\n body: |\n ..."
},
"qa": {
"qa_status": "qa_passed",
"checks": {
"no_hallucinations": true,
"deliverability_safe": true,
"cta_present_each_step": true,
"word_count_ok": true
},
"score": 9
}
}
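A pre-send validator for this schema can be a few lines of Python. This is a minimal sketch that checks only the fields shown in the example above; extend the required-field lists to match your own table:

```python
def validate_lead_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is sendable."""
    problems = []
    # Top-level shape from the output schema.
    for key in ("lead_id", "contact", "enrichment", "strategy", "sequence", "qa"):
        if key not in record:
            problems.append(f"missing top-level key: {key}")
    # Minimum contact fields needed for the sequence tokens.
    contact = record.get("contact", {})
    for key in ("first_name", "email", "company"):
        if not contact.get(key):
            problems.append(f"missing contact field: {key}")
    # Never queue a record that has not cleared QA.
    qa = record.get("qa", {})
    if qa.get("qa_status") != "qa_passed":
        problems.append("qa_status is not qa_passed")
    return problems
```

Run it as the last step before pushing rows to Smartlead/Instantly, and block the push if the list is non-empty.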
7. QA Rubric
Score every lead (or every segment template) before sending.
| Category | Test (Pass/Fail) | How to Check | Points (fail/pass) |
|---|---|---|---|
| Factuality | No invented claims about the prospect | Every specific claim traces to an enrichment field | 0/3 |
| Personalization | Opener references only known data | If any “guessing” appears, fail | 0/2 |
| Deliverability | Plain text, short paragraphs, no spam terms | Scan for hype, multiple links, excessive punctuation | 0/2 |
| CTA Quality | Single clear CTA per email matches rules | Email 1 yes/no, Email 4 breakup | 0/1 |
| Relevance | Pain matches persona + segment | Would VP Growth care? If not, fail | 0/2 |
Passing threshold:
- Per-lead sequence: 8/10 minimum and Factuality must pass.
- Per-segment base sequence: 9/10 minimum (because it scales).
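The deliverability and word-count rows of the rubric are mechanical, so you can automate them and save human review for factuality and relevance. A minimal sketch; the spam-term list is illustrative and should be tuned to your market:

```python
import re

# Illustrative starter list; extend with terms that hurt in your niche.
SPAM_TERMS = {"guarantee", "free money", "act now", "limited time"}

def precheck_email(body: str) -> dict:
    """Mechanical QA checks; factuality and relevance still need review."""
    words = body.split()
    links = re.findall(r"https?://", body)
    return {
        "word_count_ok": 60 <= len(words) <= 120,   # style rule from Prompt 3
        "no_spam_terms": not any(t in body.lower() for t in SPAM_TERMS),
        "max_one_link": len(links) <= 1,            # calendar token only
        "no_exclamations": "!" not in body,
    }
```

Any False value sends the draft back for a rewrite before it can reach qa_passed.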
8. Failure Modes
These are the ones I see in real teams, and the exact fixes.
- Hallucinated personalization (“Congrats on the Series B…”)
  - Cause: model tries to be helpful; enrichment didn’t include funding.
  - Fix: In Prompts 2 and 3, keep “Use ONLY provided enrichment fields” and add “If unknown, write UNKNOWN.” Reject any output with external claims.
- Generic pain statements that could fit anyone
  - Cause: ICP not tight; pain hypotheses not falsifiable.
  - Fix: In Prompt 1, require “observable signals” and “triggers.” If a pain can’t map to a signal you can enrich, delete it.
- Sequences too long or over-written
  - Cause: unconstrained word count and paragraph rules.
  - Fix: Hard cap at 120 words, 1–2 sentences per paragraph, max 2 commas per sentence. Enforce in QA.
- Spammy formatting that tanks deliverability
  - Cause: too many links, too many variables, marketing phrasing.
  - Fix: No HTML, no images, no attachments; limit to one link (calendar token) in step 2 only.
- Mismatch between CTA and buying motion
  - Cause: asking for a meeting before establishing relevance.
  - Fix: Step 1 is a yes/no question tied to the pain. Step 3 offers a free template/checklist (ungated) so the exchange feels fair.
- Segment contamination (performance data becomes noisy)
  - Cause: mixing multiple personas in one campaign.
  - Fix: One campaign per segment_id. Keep subject lines and pain angles consistent.
- Over-personalization that doesn’t scale
  - Cause: trying to write bespoke emails for every lead from day one.
  - Fix: Start with a per-segment base sequence, then only personalize top accounts. At Postmates, we learned to standardize first, then add complexity after signal appears.
9. Iteration Loop
Run this weekly. It’s short, mechanical, and it compounds.
- Collect outcomes by segment_id
  - Tags: positive reply, neutral, objection, unsubscribe, bounce.
  - Store examples of best/worst threads.
- Update pain hypotheses
  - For segments with weak replies, rewrite the top 2 pains to be more specific and tied to a trigger you can enrich.
- Tighten the opener policy
  - Add new “approved opener patterns” that performed well.
  - Ban patterns that caused skepticism (“noticed you’re scaling fast”).
- Prompt versioning
  - Keep prompts in a repo with versions (prompt_v3.md).
  - Record what changed and why. Treat it like growth experiments.
- Expand enrichment only after messaging works
  - Add one new signal column at a time (job posts, tech install, etc.).
  - If it doesn’t change reply quality, remove it. Extra enrichment steps often add latency and failure points.
- Scale volume slowly
  - Increase daily send only when bounce/spam signals are stable and copy is passing QA consistently.
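Computing outcomes by segment_id is a few lines once you export reply tags from Smartlead/Instantly to CSV. A sketch, assuming one record per sent lead with an outcome tag (tag names are illustrative):

```python
from collections import defaultdict

def reply_rates(events: list) -> dict:
    """events: one dict per sent lead with 'segment_id' and 'outcome' keys."""
    sent = defaultdict(int)
    positive = defaultdict(int)
    for e in events:
        sent[e["segment_id"]] += 1
        # Count booked meetings and interested replies as positive.
        if e["outcome"] in ("positive_reply", "meeting_booked"):
            positive[e["segment_id"]] += 1
    return {seg: positive[seg] / sent[seg] for seg in sent}
```

Review the resulting rates per segment weekly; segments below your threshold get their top pains rewritten first.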
Frequently Asked Questions
Should I generate one ai cold email per lead or one per segment?
Start with one per segment so you can launch today and learn faster. Then personalize per lead for your top accounts where a better opener materially changes outcomes.
What’s the minimum enrichment I need before I send?
First name, company, title/persona, and one credible opener signal (hiring, role scope, industry, or a verified trigger). If you don’t have a real opener, send a segment-level opener and keep it honest.
How do I stop the model from inventing facts?
Force “ONLY use provided fields,” require UNKNOWN for missing fields, and fail QA if any claim can’t be traced to enrichment. Most hallucinations come from open-ended prompts, not from the model “being bad.”
Smartlead vs Instantly?
Both work for getting campaigns out fast. Pick the one your team already knows, then focus on segmentation and QA; tooling differences matter less than input quality and consistency.
What does “pain hypothesis” mean in practice?
A falsifiable statement about what the prospect likely struggles with, tied to a trigger and a signal you can observe. If you can’t specify how you’d detect it, it’s not usable for outbound.
How many follow-ups should I send?
Run 4 steps to start because it forces discipline: initial, meeting ask, value asset, breakup. Add steps only after you have evidence that your list quality and deliverability can handle more volume.
Ready to build your AI growth engine?
I help CEOs use AI to build the growth engine their board is asking for.
Talk to Isaac