The Complete AI Outbound Playbook
You can ship an AI-led outbound engine today by standardizing your ICP, automating enrichment, generating a pain hypothesis per account, then producing an ai cold email sequence with deterministic prompts and QA gates before sending through Smartlead/Instantly. This runbook gives you the exact tool stack, prompts, schemas, QA rubric, and iteration loop.
Key takeaways:
- Enrichment first, copy second: your sequence quality is capped by data quality.
- Force determinism with fixed schemas, capped word counts, and explicit “only use provided fields” rules.
- QA the output like code: factuality, relevance, deliverability, and compliance gates before you hit send.
I’ve run outbound-adjacent growth at Uber and Postmates, where “spray and pray” fails fast because brand damage and deliverability penalties compound. What works is systems: tight ICP definitions, consistent data enrichment, a repeatable pain hypothesis per segment, and messaging that’s short, specific, and provable.
AI makes this workflow faster, but only if you constrain it. Most teams point Claude/ChatGPT at a lead list and get generic emails, invented facts, and sequences that trigger spam filters. The fix is to treat AI like a junior growth operator: give it clean inputs, a strict schema, a scoring rubric, and a loop that improves the prompts based on measured outcomes.
This runbook is designed for a CEO, VP Growth, or growth engineer who needs an AI outbound workflow working right now. You’ll set up: (1) enrichment in Clay (or alternative), (2) vertical-specific pain hypothesis generation, (3) ai cold email sequence generation with personalization tokens, (4) objection-handling snippets, and (5) a QA gate that blocks hallucinations and deliverability issues before launch.
The Complete AI Outbound Playbook (AI Runbook)
1. Objective
Produce a ready-to-send, QA-approved ai cold email outbound campaign (lead list + enrichment + vertical pain hypotheses + 4–6 email sequence + objection handling snippets) that can be launched in Smartlead/Instantly today.
2. Inputs Required
- ICP definition (one-page): target roles, company size, industries, trigger events, exclusions.
- Offer definition: what you sell, primary value prop, proof points you can legally claim (case studies, logos you have permission to reference, quantified outcomes you can cite).
- Outbound domains + inboxes: at minimum 3–10 inboxes warmed (Google Workspace/M365).
- Sending platform access: Smartlead or Instantly (or Apollo Sequencing).
- Enrichment access: Clay (preferred) or Apollo/Clearbit/ZoomInfo.
- A seed lead list: 200–2,000 accounts/contacts to start (CSV is fine).
- Compliance constraints: CAN-SPAM basics, your company’s policy on personalization and claims, region constraints (US/EU).
- A “do-not-invent” list: fields AI is forbidden to fabricate (revenue, funding, headcount, customers, tech stack) unless present in enrichment data.
Assumptions:
- You can tolerate early iteration. Your first campaign won’t be perfect.
- You will measure outcomes by domain/inbox and by segment (vertical + persona).
3. Tool Stack
Primary stack (what I’d ship first):
- Clay for enrichment and workflow tables
- Alternatives: Apollo, ZoomInfo, Clearbit, People Data Labs (PDL)
- Claude (Sonnet/Opus) for longer reasoning + writing variants
- Alternatives: GPT-4.1 / GPT-4o, Gemini
- OpenAI API (optional) for batch generation at scale
- Alternatives: Anthropic API
- Smartlead for sending + inbox rotation + basic warm-up
- Alternatives: Instantly, Apollo, Outreach (enterprise)
- Cursor (or VS Code) for prompt/version control + scripts
- Alternatives: Windsurf, plain VS Code
- Google Sheets (or Airtable) for quick ops review
- Alternatives: Notion database, Coda
Nice-to-have:
- MxToolbox / Google Postmaster Tools for deliverability monitoring
- Alternatives: GlockApps (paid), Mailreach monitoring
- Webhook/Zapier/Make for moving rows from “QA Passed” to “Ready to Send”
- Alternatives: n8n (self-hosted)
4. Prompt Pack
Use these prompts exactly. They’re written to be deterministic, minimize hallucinations, and output in strict schemas.
# Prompt 1 (Claude / ChatGPT): ICP → Segment Map + Pain Hypotheses
You are my outbound strategy operator. Use ONLY the inputs I provide. Do not invent facts, metrics, customer names, or company details.
INPUTS:
- Company product: {{PRODUCT_DESC}}
- ICP: {{ICP_ONE_PAGER}}
- Offer: {{OFFER}}
- Proof points I can claim: {{PROOF_POINTS}}
- Segments I want to target (if any): {{SEGMENT_LIST_OR_EMPTY}}
- Constraints:
- No invented personalization.
- Each pain must map to a plausible trigger + observable signal.
- Output must follow the JSON schema exactly.
TASK:
1) Propose 3–6 ICP segments (vertical + persona) if not provided.
2) For each segment, generate:
- Top 5 pain hypotheses (each as a falsifiable statement)
- Triggers (events that increase likelihood of pain)
- Observable signals (data points we can enrich for)
- Messaging angle (1 sentence)
- Disqualifiers (who NOT to email)
OUTPUT FORMAT (JSON only):
{
"segments": [
{
"segment_id": "string",
"vertical": "string",
"persona": "string",
"pains": [
{
"pain_id": "string",
"hypothesis": "string",
"triggers": ["string"],
"observable_signals": ["string"],
"message_angle": "string"
}
],
"disqualifiers": ["string"]
}
]
}
# Prompt 2 (Claude / ChatGPT): Account-Level Personalization Brief (No Hallucinations)
You are generating an outbound personalization brief. Use ONLY the provided enrichment fields. If a field is empty, write "UNKNOWN" and do not guess.
ENRICHMENT JSON:
{{LEAD_ENRICHMENT_JSON}}
TASK:
Create a 1-page personalization brief that an SDR would trust. It must include:
- A single best-fit segment_id from our segment map
- 3 ranked pain hypotheses with evidence from fields
- 1–2 specific openers that reference ONLY the data we have
- A "safe claim list": statements we can say without risk
- A "do not say list": anything that would require guessing
- A suggested CTA type: {calendar_link, quick_question, value_asset}
OUTPUT FORMAT (JSON only):
{
"lead_id": "string",
"segment_id": "string",
"confidence": 0-100,
"ranked_pains": [
{
"pain_id": "string",
"reasoning_from_fields": ["string"]
}
],
"safe_openers": ["string"],
"safe_claims": ["string"],
"do_not_say": ["string"],
"cta_type": "calendar_link|quick_question|value_asset"
}
# Prompt 3 (Claude / ChatGPT): Generate the AI Cold Email Sequence (Deliverability-Safe)
You are writing a B2B ai cold email sequence. Use ONLY the personalization brief + offer inputs. No invented facts. Keep it plain text, no HTML, no images, no links except optional calendar link token.
INPUTS:
- Personalization brief JSON: {{PERSONALIZATION_BRIEF_JSON}}
- Offer: {{OFFER}}
- Proof points I can claim: {{PROOF_POINTS}}
- CTA rules:
- Email 1 CTA = quick yes/no question
- Email 2 CTA = ask for 15-min, optional calendar token {{CAL_LINK}}
- Email 3 CTA = offer a value asset (template/checklist) with no gate
- Email 4 CTA = breakup / permission to close file
- Style rules:
- 60–120 words each
- 1–2 short sentences per paragraph
- No buzzwords, no exclamation points
- Subject lines: 2 options per email, 2–5 words each
- Use tokens exactly: {{first_name}}, {{company}}, {{role}}, {{personalized_opener}}
- Include one optional PS line max in the entire sequence (choose best email)
OUTPUT FORMAT (YAML only):
sequence:
- step: 1
subjects: ["", ""]
body: |
...
- step: 2
subjects: ["", ""]
body: |
...
- step: 3
subjects: ["", ""]
body: |
...
- step: 4
subjects: ["", ""]
body: |
...
# Prompt 4 (Claude / ChatGPT): Objection-Handling Snippet Library (Reusable)
You are building an objection-handling library for replies to ai cold email. Use ONLY the offer + proof points provided. No invented metrics.
INPUTS:
- Offer: {{OFFER}}
- Proof points: {{PROOF_POINTS}}
- Common objections list (if any): {{OBJECTIONS_OR_EMPTY}}
TASK:
Generate 10 objections and responses. Responses must be:
- <= 70 words
- End with a single question
- Contain zero attachments and zero links (calendar token allowed)
- Avoid "just checking in" language
OUTPUT FORMAT (JSON only):
{
"snippets": [
{
"objection": "string",
"response": "string",
"best_next_step": "ask_question|offer_call|send_asset"
}
]
}
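If you script the prompt calls yourself (e.g. from Cursor/VS Code, per the tool stack), it helps to enforce the “do-not-invent” rule before the model ever sees the data: whitelist the fields and mark every gap as the literal string UNKNOWN. A minimal Python sketch, assuming field names that mirror the Clay columns in Section 5:

```python
import json

# Fields the model is allowed to see. Anything missing or empty becomes the
# literal string "UNKNOWN", so the prompt's "do not guess" rule has an anchor.
ALLOWED_FIELDS = [
    "lead_id", "first_name", "title", "company", "domain",
    "industry", "employee_count", "country",
    "tech_stack_signals", "trigger_signals",
]

def build_enrichment_payload(row: dict) -> str:
    """Whitelist fields and mark gaps as UNKNOWN before prompting."""
    payload = {}
    for field in ALLOWED_FIELDS:
        value = row.get(field)
        payload[field] = value if value not in (None, "", []) else "UNKNOWN"
    return json.dumps(payload, indent=2)

row = {"lead_id": "L-000123", "first_name": "Ava", "company": "CompanyName"}
print(build_enrichment_payload(row))
```

The payload then drops straight into the `{{LEAD_ENRICHMENT_JSON}}` slot of Prompt 2; the function name is illustrative.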
5. Execution Steps
Follow this sequence exactly. Don’t skip steps. This is where most teams get sloppy and blame the model.
1. Lock the ICP one-pager (30 minutes)
   - Write: target industries, personas, employee ranges, geo, exclusions.
   - Define 3 trigger events you care about (examples: hiring for RevOps, new product launch, new funding round). Use only triggers you can actually enrich for.
2. Build your Clay table (60 minutes)
   - Columns (minimum): lead_id, first_name, last_name, title, email, company, domain, linkedin_url, industry, employee_count, country, tech_stack_signals (nullable), trigger_signals (nullable), personalized_opener (nullable), segment_id (nullable), qa_status (values: raw, enriched, briefed, sequence_generated, qa_passed, queued).
   - Import your CSV.
3. Enrich deterministically
   - In Clay, add enrichment steps in this order:
     - Validate email (if your provider supports it).
     - Company firmographics (industry, headcount, location).
     - Role/persona normalization (map titles into a controlled set like VP Growth, Head of RevOps, Founder).
     - Optional: tech stack signals only if your data source is reliable for your market.
   - Hard rule: if a field is missing, keep it blank. Do not “fill” with AI.
4. Generate segment map + pains (Prompt 1)
   - Store output JSON as segments.json in your repo (yes, treat it like code).
   - Add the segment_id options into Clay as a dropdown.
5. Create per-lead personalization briefs (Prompt 2)
   - Batch in groups of 25–50 leads to start.
   - Write the brief JSON back into Clay column personalization_brief_json.
   - Populate personalized_opener from safe_openers[0].
   - Set qa_status = briefed.
6. Generate sequence YAML per lead OR per segment
   - For speed, do per segment first: generate one “base sequence” per segment_id.
   - For quality, do per lead for your top 50–200 accounts: run Prompt 3 per lead using their personalization brief.
   - Save the output in Clay column sequence_yaml.
   - Set qa_status = sequence_generated.
7. Apply QA rubric (Section 7)
   - Reject anything with invented facts, spammy language, or long paragraphs.
   - Only move qa_status to qa_passed when it meets the pass threshold.
8. Push to Smartlead/Instantly
   - Map fields: {{first_name}}, {{company}}, {{role}}, {{personalized_opener}}.
   - Create a campaign per segment_id to keep performance attribution clean.
   - Start conservative: low daily volume per inbox, plain text, no attachments, minimal links.
9. Reply handling
   - Route replies to a shared inbox (or Slack) with lead context.
   - Use the objection snippet library (Prompt 4) to draft responses, then human-send.
10. Measure and iterate
    - Track by segment: positive reply rate (booked meetings + interested replies), negative replies (angry, spam complaints), bounce rate.
    - Update pain hypotheses and openers weekly.
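The qa_status values describe a one-way pipeline. If you move rows between statuses with a script or webhook (the Zapier/Make step in the tool stack), a small guard keeps leads from skipping the QA gate on their way to “queued”. A sketch, with illustrative function names:

```python
# Allowed qa_status transitions. Enforcing them in your sync script
# prevents a row from reaching "queued" without passing QA first.
TRANSITIONS = {
    "raw": "enriched",
    "enriched": "briefed",
    "briefed": "sequence_generated",
    "sequence_generated": "qa_passed",
    "qa_passed": "queued",
}

def advance(row: dict) -> dict:
    """Move a lead row to the next status, or fail loudly."""
    current = row["qa_status"]
    if current not in TRANSITIONS:
        raise ValueError(f"{row['lead_id']}: cannot advance from '{current}'")
    row["qa_status"] = TRANSITIONS[current]
    return row
```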
6. Output Schema
This is the strict schema your workflow should produce for each lead before sending.
{
"lead_id": "L-000123",
"contact": {
"first_name": "Ava",
"last_name": "Kim",
"title": "VP Growth",
"email": "ava@company.com",
"linkedin_url": "https://linkedin.com/in/...",
"company": "CompanyName",
"domain": "company.com"
},
"enrichment": {
"industry": "Fintech",
"employee_count": 250,
"country": "US",
"tech_stack_signals": ["UNKNOWN"],
"trigger_signals": ["Hiring: Lifecycle Marketing Manager"]
},
"strategy": {
"segment_id": "fintech_vp_growth",
"confidence": 82,
"ranked_pains": [
{
"pain_id": "activation_drop_mobile",
"reasoning_from_fields": [
"Role indicates ownership of growth funnel",
"Trigger: hiring lifecycle suggests focus on retention/activation"
]
}
],
"personalized_opener": "Saw you're hiring a Lifecycle Marketing Manager at {{company}}."
},
"sequence": {
"format": "yaml",
"content": "sequence:\n - step: 1\n subjects: ...\n body: |\n ..."
},
"qa": {
"qa_status": "qa_passed",
"checks": {
"no_hallucinations": true,
"deliverability_safe": true,
"cta_present_each_step": true,
"word_count_ok": true
},
"score": 9
}
}
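A pre-send validator for this schema can be a few lines of Python. This is a minimal sketch that checks only the fields shown in the example above; extend the required-field lists to match your own table:

```python
def validate_lead_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is sendable."""
    problems = []
    # Top-level shape from the output schema.
    for key in ("lead_id", "contact", "enrichment", "strategy", "sequence", "qa"):
        if key not in record:
            problems.append(f"missing top-level key: {key}")
    # Minimum contact fields needed for the sequence tokens.
    contact = record.get("contact", {})
    for key in ("first_name", "email", "company"):
        if not contact.get(key):
            problems.append(f"missing contact field: {key}")
    # Never queue a record that has not cleared QA.
    qa = record.get("qa", {})
    if qa.get("qa_status") != "qa_passed":
        problems.append("qa_status is not qa_passed")
    return problems
```

Run it as the last step before pushing rows to Smartlead/Instantly, and block the push if the list is non-empty.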
7. QA Rubric
Score every lead (or every segment template) before sending.
| Category | Test (Pass/Fail) | How to Check | Points (fail/pass) |
|---|---|---|---|
| Factuality | No invented claims about the prospect | Every specific claim traces to an enrichment field | 0/3 |
| Personalization | Opener references only known data | If any “guessing” appears, fail | 0/2 |
| Deliverability | Plain text, short paragraphs, no spam terms | Scan for hype, multiple links, excessive punctuation | 0/2 |
| CTA Quality | Single clear CTA per email matches rules | Email 1 yes/no, Email 4 breakup | 0/1 |
| Relevance | Pain matches persona + segment | Would VP Growth care? If not, fail | 0/2 |
Passing threshold:
- Per-lead sequence: 8/10 minimum and Factuality must pass.
- Per-segment base sequence: 9/10 minimum (because it scales).
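The deliverability and word-count rows of the rubric are mechanical, so you can automate them and save human review for factuality and relevance. A minimal sketch; the spam-term list is illustrative and should be tuned to your market:

```python
import re

# Illustrative starter list; extend with terms that hurt in your niche.
SPAM_TERMS = {"guarantee", "free money", "act now", "limited time"}

def precheck_email(body: str) -> dict:
    """Mechanical QA checks; factuality and relevance still need review."""
    words = body.split()
    links = re.findall(r"https?://", body)
    return {
        "word_count_ok": 60 <= len(words) <= 120,   # style rule from Prompt 3
        "no_spam_terms": not any(t in body.lower() for t in SPAM_TERMS),
        "max_one_link": len(links) <= 1,            # calendar token only
        "no_exclamations": "!" not in body,
    }
```

Any False value sends the draft back for a rewrite before it can reach qa_passed.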
8. Failure Modes
These are the ones I see in real teams, and the exact fixes.
- Hallucinated personalization (“Congrats on the Series B…”)
  - Cause: model tries to be helpful; enrichment didn’t include funding.
  - Fix: In Prompts 2 and 3, keep “Use ONLY provided enrichment fields” and add “If unknown, write UNKNOWN.” Reject any output with external claims.
- Generic pain statements that could fit anyone
  - Cause: ICP not tight; pain hypotheses not falsifiable.
  - Fix: In Prompt 1, require “observable signals” and “triggers.” If a pain can’t map to a signal you can enrich, delete it.
- Sequences too long or over-written
  - Cause: unconstrained word count and paragraph rules.
  - Fix: Hard cap at 120 words, 1–2 sentences per paragraph, max 2 commas per sentence. Enforce in QA.
- Spammy formatting that tanks deliverability
  - Cause: too many links, too many variables, marketing phrasing.
  - Fix: No HTML, no images, no attachments; limit to one link (calendar token) in step 2 only.
- Mismatch between CTA and buying motion
  - Cause: asking for a meeting before establishing relevance.
  - Fix: Step 1 is a yes/no question tied to the pain. Step 3 offers a free template/checklist (ungated) so the exchange feels fair.
- Segment contamination (performance data becomes noisy)
  - Cause: mixing multiple personas in one campaign.
  - Fix: One campaign per segment_id. Keep subject lines and pain angles consistent.
- Over-personalization that doesn’t scale
  - Cause: trying to write bespoke emails for every lead from day one.
  - Fix: Start with a per-segment base sequence, then only personalize top accounts. At Postmates, we learned to standardize first, then add complexity after signal appears.
9. Iteration Loop
Run this weekly. It’s short, mechanical, and it compounds.
- Collect outcomes by segment_id
  - Tags: positive reply, neutral, objection, unsubscribe, bounce.
  - Store examples of best/worst threads.
- Update pain hypotheses
  - For segments with weak replies, rewrite the top 2 pains to be more specific and tied to a trigger you can enrich.
- Tighten the opener policy
  - Add new “approved opener patterns” that performed well.
  - Ban patterns that caused skepticism (“noticed you’re scaling fast”).
- Prompt versioning
  - Keep prompts in a repo with versions (prompt_v3.md).
  - Record what changed and why. Treat it like growth experiments.
- Expand enrichment only after messaging works
  - Add one new signal column at a time (job posts, tech install, etc.).
  - If it doesn’t change reply quality, remove it. Extra enrichment steps often add latency and failure points.
- Scale volume slowly
  - Increase daily send only when bounce/spam signals are stable and copy is passing QA consistently.
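Computing outcomes by segment_id is a few lines once you export reply tags from Smartlead/Instantly to CSV. A sketch, assuming one record per sent lead with an outcome tag (tag names are illustrative):

```python
from collections import defaultdict

def reply_rates(events: list) -> dict:
    """events: one dict per sent lead with 'segment_id' and 'outcome' keys."""
    sent = defaultdict(int)
    positive = defaultdict(int)
    for e in events:
        sent[e["segment_id"]] += 1
        # Count booked meetings and interested replies as positive.
        if e["outcome"] in ("positive_reply", "meeting_booked"):
            positive[e["segment_id"]] += 1
    return {seg: positive[seg] / sent[seg] for seg in sent}
```

Review the resulting rates per segment weekly; segments below your threshold get their top pains rewritten first.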
Frequently Asked Questions
Should I generate one ai cold email per lead or one per segment?
Start with one per segment so you can launch today and learn faster. Then personalize per lead for your top accounts where a better opener materially changes outcomes.
What’s the minimum enrichment I need before I send?
First name, company, title/persona, and one credible opener signal (hiring, role scope, industry, or a verified trigger). If you don’t have a real opener, send a segment-level opener and keep it honest.
How do I stop the model from inventing facts?
Force “ONLY use provided fields,” require UNKNOWN for missing fields, and fail QA if any claim can’t be traced to enrichment. Most hallucinations come from open-ended prompts, not from the model “being bad.”
Smartlead vs Instantly?
Both work for getting campaigns out fast. Pick the one your team already knows, then focus on segmentation and QA; tooling differences matter less than input quality and consistency.
What does “pain hypothesis” mean in practice?
A falsifiable statement about what the prospect likely struggles with, tied to a trigger and a signal you can observe. If you can’t specify how you’d detect it, it’s not usable for outbound.
How many follow-ups should I send?
Run 4 steps to start because it forces discipline: initial, meeting ask, value asset, breakup. Add steps only after you have evidence that your list quality and deliverability can handle more volume.
Ready to build your AI growth engine?
I help CEOs use AI to build the growth engine their board is asking for.
Talk to Isaac