Generative Engine Optimization: Get Cited by AI
Execute generative engine optimization by shipping an “AI-citable” content cluster: a hub page with tight definitions and sourced facts near the top, plus spoke pages that answer specific questions and link back to the hub early. Add schema (FAQPage, DefinedTerm, HowTo), enforce entity consistency, and verify citations via AI query audits.
Key takeaways:
- Publish a hub-spoke cluster built for AI citation: definitions first, facts early, consistent entities, and explicit internal links.
- Increase “citable density” with primary-source references, tables, and FAQ blocks that models can quote cleanly.
- Run a deterministic audit loop: prompt AI for citations, compare answers to your pages, patch gaps, re-test, repeat.
If you want ChatGPT, Perplexity, or Google AI Overviews to cite your company, you need pages that behave like “quotable reference objects,” not brand narratives. In practice, that means: clear definitions, stable terminology, high fact density with named sources, and page structures that map to the way models extract answers (headers, lists, tables, and FAQs).
I learned this pattern the hard way running growth at Uber and Postmates: if your content can’t be parsed fast, it won’t win distribution. AI distribution is the same story with different mechanics. You’re competing for inclusion in a synthesized answer, and the model’s incentives push it toward pages that are specific, source-backed, and internally consistent.
This runbook is designed for a CEO, VP Growth, or growth engineer who needs an executable workflow today. It’s deterministic: you’ll produce a GEO spec, a hub-spoke outline, schema blocks, an internal linking plan, and an AI citation audit script you can run repeatedly. The target keyword is generative engine optimization and the outputs are set up so you can hand them directly to Claude/ChatGPT, a writer, or a dev to publish.
1. Objective
Publish and validate a generative engine optimization content cluster that gets your pages cited by AI answer engines for a defined set of target queries.
2. Inputs Required
- Your domain + CMS access (Webflow/WordPress/Next.js/Headless)
- Google Search Console access (or at minimum, a list of your top organic landing pages)
- A list of 20–50 target queries you want AI to cite you for (can be manual)
- Your product’s canonical positioning statement (1–2 sentences)
- A list of entities you must be consistent on (brand name, product name, category terms, key competitor names)
- 3–10 credible sources you’re allowed to cite (docs, standards bodies, government sites, major industry references). Avoid affiliate blogs.
- A dev or technical editor who can add schema markup and adjust templates
- Assumption: you can publish at least 1 hub page + 6 spoke pages within 2 weeks (my recommended minimum for observable movement)
3. Tool Stack
- LLM for drafting + audits
  - Primary: Claude 3.5 Sonnet
  - Alternative: ChatGPT (GPT-4.1/4o), Gemini Advanced
- Programmatic AI query testing
  - Primary: OpenAI API (Responses API) + lightweight script
  - Alternative: Anthropic API, Perplexity API (if available in your environment)
- Editing + implementation
  - Primary: Cursor (for code + content ops)
  - Alternative: VS Code + Copilot, or a CMS editor
- Schema validation
  - Primary: Google Rich Results Test + Schema.org validator
  - Alternative: Screaming Frog (structured data extraction)
- Crawl + internal linking checks
  - Primary: Screaming Frog SEO Spider
  - Alternative: Sitebulb
- Rank + visibility proxies
  - Primary: GSC + manual AI queries (logged)
  - Alternative: Ahrefs/Semrush for traditional SERP monitoring (useful but not sufficient for GEO)
4. Prompt Pack
# Prompt 1 (Claude / ChatGPT): GEO Hub-Spoke Blueprint Generator
You are my AI Growth Architect. Build a Generative Engine Optimization (GEO) content cluster that maximizes the chance AI answer engines (ChatGPT, Perplexity, Google AI Overviews) will cite our site.
INPUTS:
- Company: {{company_name}}
- Domain: {{domain}}
- Product: {{1-2 sentence description}}
- Category / main entity: {{primary_entity_term}}
- Secondary entities to stay consistent on: {{list}}
- Target queries (paste 20–50): {{queries}}
- Allowed citation sources (paste URLs/domains): {{sources}}
- Constraints: No invented stats; every number must include (Source: Name, Year). Prefer definitions, tables, checklists, and FAQs.
OUTPUT:
1) Hub page spec:
- Title (include keyword: generative engine optimization)
- First 200 words (must contain: 1 definition + 2 named-source facts + 1 internal link placeholder to a spoke)
- H2/H3 outline with “AI-citable blocks” (definitions, bullet lists, tables)
2) 6–12 spoke page specs:
- Each with a single intent query, title, slug suggestion, and a 120-word “answer-first” intro
- Each spoke must link back to the hub within the first 150 words using exact-match anchor text: “generative engine optimization”
3) Entity dictionary:
- Canonical names, synonyms allowed, synonyms forbidden
4) Internal linking map:
- Hub -> all spokes
- Each spoke -> hub + 2 other spokes
Return in structured Markdown with clear sections.
# Prompt 2 (Claude / ChatGPT): Citable Fact Pack + Source Mapping
Create a “Citable Fact Pack” for our GEO hub page. Only include facts you can attribute to a named source and year. If a claim cannot be sourced, omit it.
INPUTS:
- Topic: generative engine optimization / AEO
- Our product angle: {{product_angle}}
- Allowed sources: {{sources}}
OUTPUT FORMAT:
- 12–20 facts, each as:
- Fact statement (<= 25 words, quotable)
- Source (Organization/Document, Year)
- URL
- Where to place it on the hub page (Intro / Section name / FAQ)
- 5 definitions:
- Term: ...
- Definition: ... (<= 40 words)
- Preferred phrasing (exact sentence to reuse across pages)
# Prompt 3 (ChatGPT / Claude): Schema Markup Builder (FAQPage + DefinedTerm + HowTo)
Generate JSON-LD schema for the following page drafts. Follow Schema.org. Keep it valid and minimal.
INPUTS:
- Page type: {{Hub or Spoke}}
- Page title: {{title}}
- Canonical URL: {{url}}
- Primary entity: {{entity}}
- Definitions (paste): {{definitions}}
- FAQs (paste Q/A): {{faqs}}
- If HowTo applies, paste steps: {{steps}}
OUTPUT:
1) JSON-LD for DefinedTermSet or DefinedTerm (if definitions exist)
2) JSON-LD for FAQPage (if FAQs exist)
3) JSON-LD for HowTo (if steps exist)
Return 3 separate JSON blocks and state where to embed them in HTML.
# Prompt 4 (OpenAI/Anthropic): AI Citation Audit Prompt (copy-paste into model UI)
You are evaluating whether AI answer engines should cite our page.
TASK:
For each query below:
1) Provide the best 6–10 sentence answer.
2) Provide 3–7 citations as clickable URLs. Prefer primary sources; include our site only if it deserves citation.
3) Explain why each citation was chosen (1 sentence each).
QUERIES:
{{paste queries}}
OUR PAGES TO CONSIDER:
{{paste published URLs}}
CONSTRAINTS:
- Do not cite our page unless it contains a clear, quotable definition, a specific checklist/table, or a sourced fact.
- If our page is missing information needed to cite it, list the missing blocks precisely.
5. Execution Steps
1. Lock your entity dictionary (30 minutes). Create a single source of truth for: your brand name, product name, category term, and 5–15 related terms. Decide which synonyms are allowed. You’re preventing “entity drift,” which causes models to treat your content as multiple different things.
2. Pick your GEO hub topic and query set (60 minutes). Your hub should map to a durable concept you want to own (here: generative engine optimization). Your spoke pages map to narrow intents (e.g., “GEO vs SEO,” “FAQPage schema for AI Overviews,” “How to get cited by Perplexity”).
3. Generate the hub-spoke blueprint (Prompt 1). Do not start writing freeform. Start with structure and linking rules so every page has a deterministic job.
4. Build a Citable Fact Pack (Prompt 2). You are collecting “quotable atoms.” In my experience, teams fail here because they write good advice but fail to anchor it to named sources, so models downgrade it versus Wikipedia, standards bodies, and platform docs.
5. Draft the hub page with “citation-first” ordering. Put these blocks near the top of the body content:
   - A one-sentence definition of generative engine optimization
   - 2–3 sourced facts with named sources and years (only if you can truthfully source them)
   - A short table or checklist that can be quoted
   Keep intros short. AI systems often weight early sections heavily because they look like summary blocks.
6. Draft 6–12 spokes with narrow intent and an early hub backlink. Each spoke:
   - Answers one query directly in the first 120 words
   - Links back to the hub within the first 150 words with the anchor: “generative engine optimization”
   - Includes 1–2 unique citable blocks (table, checklist, definition, steps)
   - Includes FAQs (3–6) aligned to the query’s follow-ups
7. Add schema: FAQPage + DefinedTerm/DefinedTermSet + HowTo (Prompt 3). Embed the JSON-LD in the page HTML (head or body, consistent across your site). Validate with the Google Rich Results Test and the Schema.org validator, and fix warnings that affect eligibility. A minimal FAQPage example follows this list.
8. Implement hub-spoke internal linking exactly:
   - The hub links to every spoke in a visible “Cluster index” section.
   - Each spoke links to the hub early, plus two other spokes in a “Related” block.
   This creates a crawlable semantic cluster and gives models repeated co-occurrence signals between your hub entity and the spokes.
9. Run the AI citation audit (Prompt 4) before and after publish:
   - Before publish: use draft URLs (staging) or paste text.
   - After publish: use live URLs. Log whether your pages get cited, and why not.
10. Patch gaps in a controlled way (no random edits). When the audit says “missing X,” add a specific block:
    - Add a definition
    - Add a comparison table
    - Add a sourced fact (only if you can cite it)
    - Add an FAQ that matches the phrasing of the query
11. Re-run audits on a schedule: weekly for the first month, then biweekly. GEO compounds as models and indexers re-crawl and retrain.
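Prompt 3 will generate page-specific JSON-LD, but it helps to know the target shape for step 7. Here is a minimal sketch in Python (matching the scripts later in this runbook) that assembles a FAQPage block and prints the script tag to embed; the question/answer pair is a placeholder, and your CMS may have its own way to inject head markup.

# faq_schema.py: minimal FAQPage JSON-LD builder (sketch; adapt to your CMS templates)
import json

def faq_jsonld(faqs):
    # faqs: list of (question, answer) pairs copied from the page's visible FAQ block
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": q,
                "acceptedAnswer": {"@type": "Answer", "text": a},
            }
            for q, a in faqs
        ],
    }

faqs = [
    ("What is generative engine optimization?",
     "Generative engine optimization is the practice of structuring and sourcing content "
     "so AI answer engines can quote and cite it."),
]

# Embed the printed <script> block in the page HTML (head or body, consistent across the site),
# then validate with the Rich Results Test before publishing.
print('<script type="application/ld+json">')
print(json.dumps(faq_jsonld(faqs), indent=2))
print("</script>")

Keep the JSON-LD answers identical to the visible FAQ copy on the page; FAQ markup is meant to mirror on-page content, not replace it.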
6. Output Schema
Use this schema to keep your workflow deterministic and repeatable across clusters.
{
  "cluster": {
    "primary_keyword": "generative engine optimization",
    "hub": {
      "title": "Generative Engine Optimization (GEO): How to Get Cited by AI",
      "slug": "/generative-engine-optimization/",
      "primary_entity": "Generative Engine Optimization",
      "intro_requirements": {
        "max_words": 200,
        "must_include": [
          "1-sentence definition",
          "2 sourced facts with (Source: Name, Year)",
          "1 internal link to a spoke"
        ]
      },
      "citable_blocks": [
        {
          "block_type": "definition",
          "text": "..."
        },
        {
          "block_type": "table",
          "title": "GEO vs SEO vs AEO",
          "rows": [["...", "..."]]
        }
      ],
      "schema": ["DefinedTermSet", "FAQPage", "HowTo"],
      "internal_links_out": [
        { "to": "/geo-faqpage-schema/", "anchor": "FAQPage schema for AI answers" }
      ]
    },
    "spokes": [
      {
        "intent_query": "What is generative engine optimization?",
        "title": "What Is Generative Engine Optimization (GEO)?",
        "slug": "/what-is-generative-engine-optimization/",
        "early_link_rule": {
          "to_hub_slug": "/generative-engine-optimization/",
          "anchor_text_exact": "generative engine optimization",
          "within_first_words": 150
        },
        "required_sections": ["Answer", "Checklist/Table", "FAQs"],
        "schema": ["FAQPage", "DefinedTerm"]
      }
    ],
    "entity_dictionary": {
      "canonical": [
        { "entity": "Generative Engine Optimization", "short": "GEO" }
      ],
      "allowed_synonyms": ["Answer Engine Optimization", "AEO"],
      "forbidden_synonyms": ["Generative search optimization"]
    },
    "audit_plan": {
      "queries": [],
      "models": ["ChatGPT", "Claude", "Perplexity"],
      "pass_condition": ">=30% of target queries cite hub or a spoke within top citations list"
    }
  }
}
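If you save the spec as a file (cluster.json is an assumed name), you can lint it against the deterministic rules before any drafting starts. A minimal sketch:

# cluster_lint.py: sketch that checks a saved cluster spec against the rules above
import json, sys

spec = json.load(open("cluster.json"))["cluster"]
kw = spec["primary_keyword"].lower()
errors = []

# hub title must carry the primary keyword
if kw not in spec["hub"]["title"].lower():
    errors.append("hub title is missing the primary keyword")

# every spoke must link back to the hub with the exact-match anchor
for spoke in spec["spokes"]:
    rule = spoke.get("early_link_rule", {})
    if rule.get("anchor_text_exact", "").lower() != kw:
        errors.append(f'{spoke["slug"]}: early link anchor is not the exact keyword')
    if rule.get("to_hub_slug") != spec["hub"]["slug"]:
        errors.append(f'{spoke["slug"]}: early link does not point to the hub')

if errors:
    print("FAIL:")
    for e in errors:
        print(" -", e)
    sys.exit(1)
print("PASS: cluster spec follows the hub-spoke rules.")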
7. QA Rubric
| Category | Test | Pass/Fail Rule | Score (0-5) |
|---|---|---|---|
| Entity consistency | Canonical entity term used consistently in title/H1/first definition | Pass if identical phrasing appears on hub + all spokes | 0-5 |
| Answer-first structure | First 120 words answer the query directly | Pass if no brand preamble and includes definition or direct steps | 0-5 |
| Fact density (sourced) | Named-source facts per 500 words | Pass if at least 3 sourced facts per 500 words when making factual claims; otherwise no numbers | 0-5 |
| Citation placement | Sourced facts appear early | Pass if at least 1–2 sourced facts appear in first 200 words of hub | 0-5 |
| Schema validity | JSON-LD validates | Pass if no errors in Rich Results Test / Schema validator | 0-5 |
| Internal linking | Hub-spoke links implemented | Pass if hub links to all spokes and each spoke links back to hub within first 150 words | 0-5 |
| Citable blocks | Tables/checklists/definitions are quotable | Pass if each page contains at least 2 blocks that can be quoted verbatim | 0-5 |
| AI citation audit | Models cite your pages | Pass if hub/spokes are cited for target queries in at least 30% of runs | 0-5 |
Threshold: ship only if the total score is ≥ 28/40 and schema + internal linking are hard-pass. A quick programmatic check for the internal-linking rule is sketched below.
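The internal-linking row is the easiest one to automate. Below is a rough sketch with hypothetical URLs; it fetches live pages with requests, so run it after publish, and note that the word count includes nav/header text, so scope the parsing to your article container if your template is heavy.

# link_check.py: rough check for the internal-linking rubric row (sketch)
import re, sys
import requests

HUB = "https://yourdomain.com/generative-engine-optimization/"  # hypothetical URLs
SPOKES = ["https://yourdomain.com/what-is-generative-engine-optimization/"]
ANCHOR = "generative engine optimization"

def visible_text(html):
    # crude tag/script stripper; good enough for a word-position estimate
    html = re.sub(r"(?s)<(script|style).*?</\1>", " ", html)
    return re.sub(r"<[^>]+>", " ", html)

failures = []
hub_html = requests.get(HUB, timeout=30).text
for spoke in SPOKES:
    if spoke not in hub_html:  # use relative paths here if that is how your CMS renders links
        failures.append(f"hub does not link to {spoke}")
    spoke_html = requests.get(spoke, timeout=30).text
    # does an exact-anchor link to the hub appear before word 150?
    m = re.search(
        rf'<a[^>]+href="{re.escape(HUB)}"[^>]*>\s*{re.escape(ANCHOR)}\s*</a>',
        spoke_html, re.IGNORECASE,
    )
    words_before = len(visible_text(spoke_html[: m.start()]).split()) if m else None
    if words_before is None or words_before > 150:
        failures.append(f"{spoke}: no exact-anchor hub link within the first 150 words")

if failures:
    print("FAIL:")
    for f in failures:
        print(" -", f)
    sys.exit(1)
print("PASS: hub-spoke linking rules satisfied.")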
8. Failure Modes
- You publish “thought leadership” intros and bury the definition.
  Fix: rewrite the first 120–200 words as: definition → scoped promise → list of steps → link to the relevant spoke. I’ve watched teams at scale lose distribution because the first screen was brand story, not the answer.
- Unsourced numbers sneak in through AI drafting.
  Fix: enforce a rule in your editor: any %, $, or “X times” claim must include “(Source: Name, Year)” or be removed. Add a CI-style content check (simple regex) before publish, like the one under “Executable config” below.
- Schema is added but invalid, so it’s ignored.
  Fix: run the Rich Results Test on every template change. Common offenders: missing mainEntity in FAQPage, invalid URL formats, or HowTo steps missing required fields.
- Entity drift across pages (“GEO,” “Generative SEO,” “AI SEO” used interchangeably).
  Fix: lock an entity dictionary and add a lint step: search/replace forbidden synonyms and ensure the first definition sentence is identical across pages. A sample lint sketch follows this list.
- Spokes don’t link back early, so the hub doesn’t accrue authority in the cluster.
  Fix: enforce the deterministic rule: hub backlink within the first 150 words with the exact anchor text “generative engine optimization.” Add it right after the opening definition paragraph.
- Your pages are good, but not uniquely citable.
  Fix: add “reference objects” that general web pages lack:
  - comparison tables (GEO vs SEO vs AEO)
  - implementation checklists
  - copy-paste prompts
  - schema examples
  Models cite concrete artifacts.
- AI audit prompts are inconsistent, so results flap.
  Fix: pin the same query list, same instruction format, and same evaluation rubric, and run 3 trials per model. Log outputs so you can compare deltas after edits.
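For the entity-drift fix, the lint step can be as small as this sketch. The content/ directory and the forbidden list are assumptions; mirror your own entity dictionary (the forbidden_synonyms field in the output schema).

# entity_lint.py: sketch that flags forbidden synonyms in drafts before publish
import pathlib, sys

# mirror the forbidden_synonyms list from your entity dictionary
FORBIDDEN = ["generative search optimization", "generative seo", "ai seo"]

bad = []
for path in pathlib.Path("content").glob("**/*.md"):  # adjust to wherever drafts live
    text = path.read_text().lower()
    for term in FORBIDDEN:
        if term in text:
            bad.append(f"{path}: uses forbidden synonym '{term}'")

if bad:
    print("FAIL: entity drift detected:")
    for b in bad:
        print(" -", b)
    sys.exit(1)
print("PASS: no forbidden synonyms found.")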
9. Iteration Loop
- Weekly query audit (same prompts, same list). Track: cited URLs, citation position, and “missing blocks.” A tally sketch follows this list.
- Patch only what the audit requested. Add one new citable block per failing query (definition/table/FAQ/sourced fact).
- Expand spokes based on “adjacent questions.” If the model answers a follow-up you don’t cover, that’s your next spoke.
- Tighten entity consistency every cycle. Drift increases as more people touch content.
- Refresh sources quarterly. Replace stale citations with updated docs where applicable, but do not churn URLs. Stable URLs get cited more reliably.
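To make the weekly audit comparable run over run, tally the logged outputs against the 30% pass condition in the output schema. A minimal sketch, assuming the audit runner in the next section has written audit-*.json files and the model returned the requested JSON array (some models wrap it in markdown fences, hence the strip):

# audit_tally.py: sketch of citation rate across logged audit runs
import glob, json, re

for path in sorted(glob.glob("audit-*.json")):
    raw = open(path).read().strip()
    raw = re.sub(r"^```(json)?\s*|\s*```$", "", raw)  # strip markdown fences if present
    rows = json.loads(raw)
    cited = sum(1 for r in rows if r.get("our_site_cited"))
    rate = cited / max(len(rows), 1)
    status = "PASS" if rate >= 0.30 else "FAIL"
    print(f"{path}: {cited}/{len(rows)} target queries cite us ({rate:.0%}) -> {status}")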
Executable code: AI citation audit runner (OpenAI API)
Use this to run the same audit prompt across your query set and store results as JSON. Replace model name as needed.
# geo_audit.py
import json, os
from datetime import datetime
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
MODEL = os.environ.get("OPENAI_MODEL", "gpt-4.1-mini")

def run_audit(queries, urls):
    # braces in the output spec are doubled so the f-string does not try to evaluate them
    prompt = f"""
You are evaluating whether AI answer engines should cite our page.
For each query:
1) Provide the best 6–10 sentence answer.
2) Provide 3–7 citations as clickable URLs. Prefer primary sources; include our site only if it deserves citation.
3) Explain why each citation was chosen (1 sentence each).
QUERIES:
{chr(10).join("- " + q for q in queries)}
OUR PAGES TO CONSIDER:
{chr(10).join("- " + u for u in urls)}
CONSTRAINTS:
- Do not cite our page unless it contains a clear, quotable definition, a specific checklist/table, or a sourced fact.
- If our page is missing information needed to cite it, list the missing blocks precisely.
Return a JSON array with fields:
query, answer, citations[{{url, reason}}], our_site_cited(boolean), missing_blocks[]
""".strip()
    # temperature=0 keeps runs comparable across weeks
    resp = client.responses.create(
        model=MODEL,
        input=prompt,
        temperature=0
    )
    # raw model text; some models wrap the JSON array in markdown fences
    return resp.output_text

if __name__ == "__main__":
    queries = json.load(open("queries.json"))
    urls = json.load(open("urls.json"))
    out = run_audit(queries, urls)
    ts = datetime.utcnow().strftime("%Y%m%d-%H%M%S")
    with open(f"audit-{ts}.json", "w") as f:
        f.write(out)
    print("Wrote:", f"audit-{ts}.json")
queries.json example:
[
"generative engine optimization definition",
"how to get cited by chatgpt",
"FAQPage schema for AI Overviews"
]
urls.json example:
[
"https://yourdomain.com/generative-engine-optimization/",
"https://yourdomain.com/geo-faqpage-schema/"
]
Executable config: lightweight “no unsourced numbers” content check
Run this on markdown/html before publishing.
# flags lines containing %, $, or "x" multipliers without a "(Source:" pattern on the same line
python - << 'PY'
import re, sys, pathlib

# checks draft.md by default; pass another path via: python - yourfile.md << 'PY' ...
path = sys.argv[1] if len(sys.argv) > 1 else "draft.md"
text = pathlib.Path(path).read_text()

bad = []
for i, line in enumerate(text.splitlines(), 1):
    # numeric claim (percent, dollar amount, or "Nx" multiplier) with no inline source
    if re.search(r'(\d+%|\$[\d,]+|\b\d+(\.\d+)?x\b)', line) and "(Source:" not in line:
        bad.append((i, line.strip()))

if bad:
    print("FAIL: Unsourced numeric claims found:")
    for i, l in bad:
        print(f"{i}: {l}")
    sys.exit(1)
else:
    print("PASS: No unsourced numeric claims detected.")
PY
Frequently Asked Questions
What is generative engine optimization?
Generative engine optimization is the practice of structuring and sourcing content so AI answer engines can quote and cite it. It focuses on definitions, citable artifacts (tables/checklists), schema, and entity consistency.
How is GEO different from SEO?
SEO targets ranking in a list of links, while GEO targets inclusion as a cited source inside an AI-generated answer. GEO cares more about quotable blocks, explicit definitions, and source-backed statements.
What schema matters most for GEO?
FAQPage is the fastest win for query-shaped pages, DefinedTerm/DefinedTermSet helps stabilize definitions, and HowTo works for step-based queries. Validate schema so engines can trust it.
How do I know if AI engines will cite my page?
Run a repeatable audit prompt across a fixed query set and log whether your URLs show up as citations. If you’re not cited, the audit should tell you exactly what “missing blocks” to add.
How many pages do I need for a GEO cluster?
Start with 1 hub and 6 spokes so you can cover the core intents and create a real internal link graph. Expand spokes based on what the AI audit reveals as unanswered follow-ups.
Why do spokes need to link back to the hub early?
Early hub links reinforce the hub as the canonical entity page for the cluster, both for crawlers and for models that learn associations from repeated co-occurrence. Put the link right after the opening definition.
Ready to build your AI growth engine?
I help CEOs use AI to build the growth engine their board is asking for.
Talk to Isaac