How to Build a pSEO Engine with Claude Code
Build a programmatic seo with ai engine using Claude Code: generate a keyword and data set, scaffold a Next.js page template, produce validated JSON content per page, and ship it to Vercel with a sitemap and Search Console wiring. This runbook gives you deterministic prompts, schemas, QA gates, and code to publish your first 100 pages today.
Key takeaways:
- You need a strict content schema + automated QA before you scale page volume.
- Claude Code can scaffold the repo, routes, sitemap, and generation scripts in one pass.
- Ship a small batch, validate in GSC, then expand via the iteration loop.
I’ve run growth teams where “ship velocity” beats “perfect strategy.” At Uber and Postmates, the teams that won were the ones that turned a hypothesis into a working system quickly, then instrumented it hard. Programmatic SEO is the same: you don’t need a giant content org to publish hundreds (or thousands) of high-intent pages. You need a reliable pipeline: data in, pages out, quality enforced, deployed, and indexed.
This runbook is written for a CEO, VP Growth, or growth engineer who needs programmatic seo with ai working right now, not as a concept. The workflow below uses Claude Code as the coding agent to scaffold the entire pSEO system: keyword set generation, a normalized page dataset, a Next.js template, page rendering, internal linking rules, sitemap generation, and a QA gate that blocks thin or broken pages.
I’ll be opinionated. Determinism beats creativity in pSEO. Your goal is repeatability: the same inputs create the same site structure and quality every time. Once you have that, scaling page count becomes a controlled variable instead of a gamble.
1. Objective
Launch a production pSEO site that programmatically generates, validates, and deploys SEO landing pages (Next.js + Vercel) from a structured dataset using Claude Code.
2. Inputs Required
- A clear pSEO “unit page” concept (example: “Best {tool} for {use_case}”, “{city} {service} pricing”, “{integration} with {product}”).
- A seed list of:
- 20–200 entities (tools, cities, categories, integrations, industries, templates, etc.)
- 20–200 modifiers (use cases, intents, comparison terms, “near me”, “pricing”, etc.)
- Your domain + Vercel account access.
- Google Search Console access for the domain.
- A repo host (GitHub) and a local dev environment (Node 18+).
- Assumptions (lock these to stay deterministic):
- You will publish in batches (start with 50–100 pages).
- Every page must pass automated QA (schema validity, content length bounds, internal links min, no duplicate titles).
- You will not index pages that fail QA.
3. Tool Stack
Primary path (fastest)
- Claude Code (coding agent)
- Alternative: Cursor agent, GitHub Copilot Workspace
- Next.js (App Router)
- Alternative: Astro, SvelteKit
- Vercel hosting + cron (optional)
- Alternative: Netlify, Cloudflare Pages
- OpenAI API for embeddings (optional for internal linking / clustering)
- Alternative: Cohere, Voyage, local embeddings
- Google Search Console (indexing + coverage)
- Alternative: Bing Webmaster Tools (in addition, not as a replacement)
Validation / QA
- Zod schema validation in Node
- Alternative: Ajv (JSON schema)
- Playwright for smoke tests
- Alternative: Cypress
4. Prompt Pack
Use these prompts exactly. They’re designed to force schema-first outputs and deterministic code generation.
# Prompt 1 (Claude Code): Scaffold the pSEO repo (Next.js + dataset + routes + sitemap)
You are Claude Code acting as a senior growth engineer.
Goal: Create a Next.js (App Router) project that generates programmatic SEO pages from a JSON dataset, with strict schema validation and sitemap generation.
Constraints:
- Deterministic output. Do not ask me questions; make reasonable defaults and encode them in code comments.
- Use TypeScript.
- Use Zod to validate the dataset at build time.
- Pages must be generated from /data/pages.json.
- Route pattern: /p/[slug]
- Each page must render: H1, intro, 5 sections with H2s, FAQ (4 Qs), and a Related Pages list (internal links). These counts must match the QA gate exactly.
- Implement a sitemap at /sitemap.xml that includes all /p/[slug] pages.
- Implement robots.txt that allows crawling and points to sitemap.
- Add a script: "npm run validate:data" that fails if any page violates schema or QA rules (min words, min internal links, unique title/slug).
- Add a script: "npm run build:pages" that can generate pages.json from a source seeds file at /data/seeds.csv (create a placeholder CSV + parser).
- Include README with exact commands.
Output:
- Make all code changes directly.
- Show me the file tree and the key files in full.
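Prompt 1 leaves the sitemap implementation to Claude Code, but the core of it is just mapping slugs to `<url>` entries. A minimal sketch for reference (the `SITE_URL` constant and `PageEntry` shape are assumptions; swap in your domain and dataset):

```typescript
// Sketch: build sitemap.xml content from the validated dataset.
// SITE_URL is a placeholder; Claude Code will wire this into the
// Next.js build however Prompt 1 directs it.
type PageEntry = { slug: string };

const SITE_URL = "https://example.com"; // assumption: replace with your domain

function buildSitemap(pages: PageEntry[]): string {
  const urls = pages
    .map((p) => `  <url><loc>${SITE_URL}/p/${p.slug}</loc></url>`)
    .join("\n");
  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${urls}\n</urlset>`
  );
}
```

Note the URLs are absolute; relative URLs in sitemaps are a common cause of the indexing failures covered later.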
# Prompt 2 (ChatGPT or Claude): Generate the initial seeds.csv deterministically
You are an SEO growth operator. Create a seeds.csv with 100 rows for a programmatic SEO engine.
Rules:
- The page concept: "Best {category} software for {use_case}"
- Columns: category,use_case
- Categories must be B2B SaaS categories (examples: CRM, data warehouse, customer support).
- Use cases must be concrete (examples: "series A fundraising", "SOC 2 compliance", "sales forecasting").
- No duplicate pairs.
- Avoid nonsense or overly broad use cases like "business growth".
- Output only valid CSV with a header row and 100 data rows.
Do not include any commentary.
# Prompt 3 (Claude or ChatGPT): Generate page JSON objects that match the schema exactly
You are generating content for programmatic seo with ai.
Task: For each row in seeds.csv, produce a JSON array of page objects matching this schema exactly:
- slug (kebab-case, unique)
- title (<= 65 chars)
- metaDescription (120-160 chars)
- h1 (close variant of title)
- intro (90-140 words)
- sections: array of 5 objects { heading, body } where body is 140-220 words
- faqs: array of 4 objects { q, a } where a is 40-80 words
- relatedSlugs: array of 6 slugs (must exist in this same batch)
- canonicalPath: "/p/{slug}"
Hard constraints:
- Each page must contain the exact phrase "programmatic seo with ai" once in either the intro or one section body (not in title).
- Do not mention specific company names, pricing, or statistics.
- Write in a direct operator tone.
- Ensure relatedSlugs are topically close (same category or use_case adjacency).
Output only JSON. No markdown.
# Prompt 4 (Claude Code): Add an automated QA gate + fix failures
You are Claude Code. Add a deterministic QA step for /data/pages.json.
QA rules:
- Validate schema via Zod.
- title unique; slug unique.
- intro word count 90-140.
- each section body word count 140-220.
- metaDescription 120-160 characters.
- relatedSlugs length exactly 6 and each slug exists.
- total internal links on page: at least 5 (count the Related Pages links plus any links in FAQ answers; add links if needed).
- Phrase "programmatic seo with ai" appears exactly once per page across all fields.
Implement:
- /scripts/qa.ts prints a report and exits non-zero on any failure.
- "npm run validate:data" runs qa.ts.
Then run through the dataset mentally and adjust the code to be robust (handle missing fields, print actionable errors).
Output the modified files in full.
5. Execution Steps
Follow this sequence exactly. If you skip steps, you’ll ship broken pages or pages that don’t index.
1. Pick one pSEO pattern and lock it
   - Example used here: `Best {category} software for {use_case}`
   - Write it in your README and don’t change it mid-run.
2. Generate your first batch of seeds
   - Use Prompt 2 to produce `data/seeds.csv` with 100 rows.
   - Save it exactly as `data/seeds.csv`.
3. Scaffold the repo with Claude Code
   - Run Claude Code Prompt 1.
   - You should end with a working Next.js repo that expects `data/pages.json`.
4. Generate the page JSON
   - Use Prompt 3 (Claude/ChatGPT) with your `seeds.csv` content.
   - Save the output as `data/pages.json`.
5. Install, validate, and fail fast
   - Run: `npm i` then `npm run validate:data`
   - If it fails, do not “hand wave.” Fix the dataset or tighten the generator prompt.
6. Add the QA gate if it’s missing or weak
   - Run Prompt 4 in Claude Code to harden `/scripts/qa.ts` until failures are actionable.
7. Render locally and smoke test
   - Run: `npm run dev`
   - Manually open 5 random pages at `/p/<slug>` and confirm the title, H1, sections, FAQ, and related links are visible.
   - Run the Playwright smoke suite if present: `npm run test`
8. Deploy to Vercel
   - Connect the GitHub repo to Vercel.
   - Set the build command to `npm run build`.
   - Deploy.
9. Wire Google Search Console
   - Verify the domain.
   - Submit `/sitemap.xml`.
   - Inspect 5 URLs and request indexing.
10. Expand page volume only after QA is stable
    - The gating condition is deterministic: `npm run validate:data` must pass on the full batch.
    - If you want a numeric gate, set your own operational standard. I usually won’t expand batches until I see repeated clean runs and no templating regressions across a full batch.
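Prompt 1 also asks for a `build:pages` parser that turns `data/seeds.csv` rows into page stubs. For reference, a minimal sketch of that step (names are illustrative, and the naive `split(",")` assumes no quoted commas in the CSV; Claude Code will generate its own version):

```typescript
// Sketch: parse seeds.csv into typed rows and derive deterministic slugs.
// Assumption: two-column CSV with header "category,use_case" and no
// quoted fields. Use a real CSV parser if your seeds contain commas.
type Seed = { category: string; use_case: string };

function parseSeeds(csv: string): Seed[] {
  const [header, ...rows] = csv.trim().split("\n");
  if (header.trim() !== "category,use_case") throw new Error("unexpected header");
  return rows.map((r) => {
    const [category, use_case] = r.split(",").map((s) => s.trim());
    return { category, use_case };
  });
}

// Slug formula mirrors the locked page pattern; same input, same slug.
function toSlug(seed: Seed): string {
  return `best-${seed.category}-software-for-${seed.use_case}`
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-|-$/g, "");
}
```

Because the slug is a pure function of the seed row, re-running the build never silently changes URLs.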
6. Output Schema
This is the contract. If your content generator deviates, your build should fail.
[
{
"slug": "best-crm-software-for-sales-forecasting",
"title": "Best CRM Software for Sales Forecasting",
"metaDescription": "Compare CRM options for sales forecasting. Use a structured evaluation checklist, key features, and pitfalls to avoid before you commit.",
"h1": "Best CRM Software for Sales Forecasting",
"intro": "You’re not shopping for a CRM, you’re shopping for forecast reliability. This page breaks down how to evaluate CRM software specifically for sales forecasting, with a practical checklist and implementation notes. The goal is simple: reduce spreadsheet drift, tighten stage definitions, and create a pipeline view you can trust. This is programmatic seo with ai applied to repeatable decision pages: consistent structure, consistent criteria, and pages that answer the same intent without fluff.",
"sections": [
{ "heading": "What to automate vs. what to keep manual", "body": "..." },
{ "heading": "Data model requirements for forecasting", "body": "..." },
{ "heading": "Workflow fit: stages, handoffs, and hygiene", "body": "..." },
{ "heading": "Reporting: leading indicators you should track", "body": "..." },
{ "heading": "Implementation plan for week 1 to week 4", "body": "..." }
],
"faqs": [
{ "q": "How do I compare CRMs for forecasting accuracy?", "a": "..." },
{ "q": "What integrations matter most for forecasting?", "a": "..." },
{ "q": "How long does a CRM forecasting rollout take?", "a": "..." },
{ "q": "What breaks forecasting after launch?", "a": "..." }
],
"relatedSlugs": [
"best-crm-software-for-soc-2-compliance",
"best-sales-engagement-software-for-sales-forecasting",
"best-revenue-analytics-software-for-sales-forecasting",
"best-customer-data-platform-software-for-sales-forecasting",
"best-data-warehouse-software-for-sales-forecasting",
"best-sales-enablement-software-for-sales-forecasting"
],
"canonicalPath": "/p/best-crm-software-for-sales-forecasting"
}
]
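The contract above can also be checked without Zod. A minimal sketch for reference (whitespace-based word counting is an assumption; the real repo uses Zod per Prompt 1, and this only illustrates the bounds):

```typescript
// Sketch: enforce the page contract with plain TypeScript checks.
// Word counting splits on whitespace — an assumption; pick one counting
// rule and use it in both the generator prompt and the QA script.
type Section = { heading: string; body: string };
type Faq = { q: string; a: string };
type Page = {
  slug: string; title: string; metaDescription: string; h1: string;
  intro: string; sections: Section[]; faqs: Faq[];
  relatedSlugs: string[]; canonicalPath: string;
};

const words = (s: string) => s.trim().split(/\s+/).filter(Boolean).length;

function pageErrors(p: Page): string[] {
  const errs: string[] = [];
  if (p.title.length > 65) errs.push("title > 65 chars");
  if (p.metaDescription.length < 120 || p.metaDescription.length > 160)
    errs.push("metaDescription outside 120-160 chars");
  if (words(p.intro) < 90 || words(p.intro) > 140) errs.push("intro outside 90-140 words");
  if (p.sections.length !== 5) errs.push("need exactly 5 sections");
  for (const s of p.sections)
    if (words(s.body) < 140 || words(s.body) > 220)
      errs.push(`section "${s.heading}" outside 140-220 words`);
  if (p.faqs.length !== 4) errs.push("need exactly 4 FAQs");
  if (p.relatedSlugs.length !== 6) errs.push("need exactly 6 relatedSlugs");
  if (p.canonicalPath !== `/p/${p.slug}`) errs.push("canonicalPath mismatch");
  return errs;
}
```

Any non-empty error list should fail the build; that is the whole point of the contract.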
7. QA Rubric
| Category | What “Pass” Means | Score (0-5) | Fail Conditions |
|---|---|---|---|
| Schema validity | All required fields present; types correct; arrays correct length | 5 | Missing fields, wrong types, empty arrays |
| Uniqueness | `slug` and `title` unique across dataset | 5 | Any duplicates |
| Meta description | 120–160 characters | 5 | Outside bounds |
| Intro length | 90–140 words | 5 | Outside bounds |
| Section compliance | Exactly 5 sections; each body 140–220 words | 5 | Wrong count or word bounds |
| FAQ compliance | Exactly 4 FAQs; each answer 40–80 words | 5 | Wrong count or word bounds |
| Phrase control | Exact phrase “programmatic seo with ai” appears exactly once per page (across all fields) | 5 | 0 occurrences or >1 occurrences |
| Internal links | At least 5 internal links rendered per page (Related Pages counts toward this) | 5 | Fewer than 5 |
| Related integrity | `relatedSlugs` length 6; all exist in dataset | 5 | Missing slugs or wrong length |
| Render sanity | Page builds and renders without runtime errors | 5 | Build fail, route errors |
Release gate (deterministic):
- Pass if all categories score 5 for every page in the batch.
- Fail if any single page fails any single category.
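The internal-links row cares about what actually renders, not what sits in JSON. A sketch of a rendered-HTML check (the regex assumes simple `<a href="/p/...">` anchors; a DOM parser is more robust for complex markup):

```typescript
// Sketch: count internal links in a rendered page's HTML.
// Assumption: internal links are plain <a> tags with href="/p/...".
function countInternalLinks(html: string): number {
  const matches = html.match(/<a\s[^>]*href="\/p\/[^"]+"/g);
  return matches ? matches.length : 0;
}
```

Run this against the built HTML (e.g. inside the Playwright smoke test), so a JSON field that never renders as a link still fails the gate.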
8. Failure Modes
These are the problems I’ve seen repeatedly shipping templated SEO pages quickly (and fixing them under pressure).
- Duplicate or near-duplicate titles
  - Symptom: QA fails uniqueness; SERP cannibalization later.
  - Fix: Make the title formula include both variables explicitly and force case normalization.
  - Implementation: In your generator, compute `title = Best {Category} Software for {Use Case}` and reject it if already used.
- Related links point to non-existent pages
  - Symptom: Broken internal links; QA fail.
  - Fix: Build `relatedSlugs` only from the set of generated slugs in memory, then write the JSON.
  - Practical tip: Generate all slugs first, then assign related.
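That in-memory rule can be sketched like this (the scoring heuristic — same category first, then same use case — is an assumption; the key point is that candidates come only from the batch's own slugs, so a broken link is impossible):

```typescript
// Sketch: assign relatedSlugs from the in-memory batch only.
// Assumption: page stubs carry category/use_case for adjacency scoring.
type Stub = { slug: string; category: string; use_case: string };

function score(candidate: Stub, page: Stub): number {
  let s = 0;
  if (candidate.category === page.category) s += 2; // same category ranks highest
  if (candidate.use_case === page.use_case) s += 1; // then same use case
  return s;
}

function assignRelated(page: Stub, all: Stub[], n = 6): string[] {
  return all
    .filter((p) => p.slug !== page.slug)      // never link to self
    .sort((a, b) => score(b, page) - score(a, page))
    .slice(0, n)
    .map((p) => p.slug);
}
```

Since every returned slug was drawn from `all`, the "relatedSlugs must exist" QA rule passes by construction.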
- Phrase frequency violations (“programmatic seo with ai” appears 0 or 2+ times)
  - Symptom: QA fail; the content generator repeats the phrase in sections and FAQs.
  - Fix: Put the phrase in exactly one controlled field (I prefer the intro), then add a QA check that counts occurrences across the serialized page object.
  - Prompt fix: “Exact phrase appears once, and only in intro.”
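A sketch of that serialized-object count (case-insensitive matching is an assumption; drop the lowercasing if you want strict casing):

```typescript
// Sketch: count exact-phrase occurrences across every field by
// serializing the page object. Assumption: case-insensitive match.
function phraseCount(page: unknown, phrase: string): number {
  const blob = JSON.stringify(page).toLowerCase();
  const needle = phrase.toLowerCase();
  let count = 0;
  let i = blob.indexOf(needle);
  while (i !== -1) {
    count++;
    i = blob.indexOf(needle, i + needle.length);
  }
  return count;
}
```

The QA gate then asserts `phraseCount(page, "programmatic seo with ai") === 1` per page.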
- Thin pages that “look fine” but fail word-count constraints
  - Symptom: Sections under 140 words; intros too short.
  - Fix: Enforce word counts via script, not human review. Most teams try to eyeball this and ship bad batches.
  - Prompt fix: Use hard bounds per field and forbid bullets if the model tends to shorten prose.
- Pages render but aren’t indexable
  - Symptom: Crawled but not indexed, or “Discovered, currently not indexed.”
  - Fix: Confirm:
    - `robots.txt` allows crawling
    - the sitemap contains full absolute URLs (or correct relative handling)
    - canonical tags are consistent
    - no accidental `noindex`
  - Operator note: I’ve seen teams accidentally ship `noindex` from a staging config and waste weeks.
- Programmatic template footprint too obvious
  - Symptom: Pages feel like clones; lower engagement; higher bounce.
  - Fix: Keep the same structure, but vary:
    - section headings slightly based on use case
    - example “implementation plan” specifics
    - decision checklist items
  - Deterministic approach: Use conditional heading maps per category, not “be creative.”
- Internal links below minimum
  - Symptom: QA fails “at least 5 internal links.”
  - Fix: Always render:
    - 6 related links (already meets the requirement)
    - plus a “Browse by category” component (optional)
  - Common mistake: “Related pages” exists in the JSON but is never actually rendered as `<a href>` links.
9. Iteration Loop
This is how you turn the first batch into an engine you can trust.
1. Run batch 1 (100 pages)
   - Ship, submit the sitemap, and request indexing for a handful of URLs.
2. Collect failure signals
   - From the QA script: which constraints fail most often (word counts, duplicates, related integrity).
   - From GSC: coverage issues, canonical mismatches, crawl anomalies.
3. Tighten the generator prompt, not the human process
   - Update Prompt 3 with the exact failure you saw:
     - If intros run too long: reduce the bounds and add “do not exceed.”
     - If related slugs are invalid: require selection from the provided slug list.
4. Add deterministic enrichment only after stability
   - Once QA is consistently clean:
     - Add embeddings-based related linking (optional).
     - Add category hubs (`/c/[category]`) that list pages and improve crawl paths.
5. Expand by controlled batches
   - Batch 2: 300 pages
   - Batch 3: 1,000 pages
   - Each batch must pass the same QA gate. No exceptions.
6. Evolve the template carefully
   - Change one thing at a time: title formula, section map, internal link module, FAQ logic.
   - Re-run the full batch QA after every template change.
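If you do add embeddings-based related linking, the linking step reduces to a nearest-neighbor lookup over page vectors from whatever embedding provider you choose. The similarity measure itself is standard cosine similarity:

```typescript
// Sketch: cosine similarity between two embedding vectors.
// The vectors come from any provider (OpenAI, Cohere, local models);
// this function is provider-agnostic.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```

Rank candidate pages by similarity to the current page and take the top 6, still restricted to slugs that exist in the batch.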
Frequently Asked Questions
Can I do programmatic seo with ai without Next.js?
Yes. Astro or SvelteKit work fine. The non-negotiable piece is the schema + QA gate that fails the build when content breaks constraints.
Do I need embeddings for related links?
Not at the start. A deterministic related rule (same category, adjacent use case) gets you live fast. Add embeddings once you want tighter topical clusters and better click paths.
How many pages should I publish first?
Publish a small batch you can manually spot-check plus fully validate in QA. I default to 50–100 pages to start, then expand in controlled batches.
What’s the fastest way to break a pSEO rollout?
Shipping without a build-breaking QA gate. You’ll end up with silent failures: duplicates, thin sections, broken internal links, and a sitemap full of pages you don’t want indexed.
How do I keep AI content from sounding generic?
Force specificity via section requirements (implementation steps, decision criteria, failure modes) and ban empty claims in the prompt. Keep structure constant, vary details deterministically by category/use case.
Ready to build your AI growth engine?
I help CEOs use AI to build the growth engine their board is asking for.
Talk to Isaac