Optimize for LLM citation
ChatGPT, Claude, Perplexity, Bing Chat, and Google’s AI Mode all browse the open web and cite a small set of pages per answer. Getting cited there — Generative Engine Optimization, or GEO — is its own distribution channel, and the patterns that win there are not the same as traditional SEO. This page covers what Essel already does on the writing side and what to add on your own site so the blogs you publish from our consumer API are as cite-able as possible.
What Essel already does for you
Every released blog runs through a geo rule category enforced by the
draft and revision agents. You don’t need to brief any of this — it is
automatic on every article ≥ 800 words:
- Named source. At least one specific study, report, framework, standards body, or quoted expert by its actual name. Generic “studies show” or “experts agree” phrasing is stripped by the revision agent.
- Attributed statistic. At least one specific number with a real, named source and year, formatted either inline (72% (Source, 2024)) or as a GFM footnote with a URL.
- Recency anchor. An explicit “as of <Month Year>” or dated lead in the intro. LLM browsing agents de-prioritize content that reads as undated.
- Extractable answers. Each H2 opens with a self-contained sentence that works as a standalone quote — the structural pattern LLMs lift most reliably.
- Key takeaways block. Articles ≥ 1,200 words include a 3–5 bullet “Key takeaways” block right after the intro, where each bullet is a complete declarative sentence that can be cited in isolation.
- No fabricated sources. Inventing the number, the source, or the date is a critical failure that blocks release.
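If you want a rough local pre-check for some of these signals before publishing, the sketch below may help. It is not Essel's rule engine: the function name, the regular expressions, and the result shape are all assumptions made for illustration, and it only pattern-matches the inline statistic format and the "as of" anchor described above.

```typescript
// Hypothetical local pre-check. This is NOT how the Essel agents enforce the rules.
interface GeoCheck {
  hasRecencyAnchor: boolean;
  hasAttributedStat: boolean;
  hasVaguePhrasing: boolean;
}

const MONTHS =
  "January|February|March|April|May|June|July|August|September|October|November|December";

function checkGeoSignals(markdown: string): GeoCheck {
  // Matches an anchor such as "as of May 2026" anywhere in the text.
  const recency = new RegExp(`\\bas of (${MONTHS}) \\d{4}\\b`, "i");
  // Matches an inline attributed statistic such as "72% (Source, 2024)".
  const stat = /\d+(\.\d+)?%\s*\([^()]+,\s*\d{4}\)/;
  // Matches the generic phrasing the guide says the revision agent strips.
  const vague = /\b(studies show|experts agree)\b/i;

  return {
    hasRecencyAnchor: recency.test(markdown),
    hasAttributedStat: stat.test(markdown),
    hasVaguePhrasing: vague.test(markdown),
  };
}
```

A check like this cannot tell whether a source is real, so it complements the agents' review and does not replace it.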
The rest of this guide is the render-and-distribution side — the patterns that only matter once the markdown is published to a real URL. Those live on your domain, so they’re yours to add.
What to add on your site
1. JSON-LD structured data
LLM browsing agents read schema.org markup as a high-confidence signal
of what a page is and what it claims. The Essel consumer API
already returns everything you need to emit Article markup — add it
to the <head> of every blog page.
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Title from blog.title",
  "description": "Excerpt from blog.excerpt",
  "datePublished": "2026-05-28T09:00:00Z",
  "dateModified": "2026-05-28T09:00:00Z",
  "author": { "@type": "Person", "name": "blog.author.name" },
  "publisher": {
    "@type": "Organization",
    "name": "Your company",
    "logo": { "@type": "ImageObject", "url": "https://yoursite.com/logo.png" }
  },
  "image": "blog.coverImage.url",
  "mainEntityOfPage": "https://yoursite.com/blog/blog.slug"
}
</script>
When the article shape warrants it, layer additional schemas on the same page; they compound rather than conflict:
| Article shape | Add this schema | Why |
|---|---|---|
| tutorial | HowTo with each H2 step as HowToStep | Bing Chat lifts HowTo steps verbatim. |
| faq_driven or any article with a FAQ section | FAQPage with each Q/A pair | Perplexity reads FAQPage for quick-answer extraction. |
| explainer with a defined term in the intro | DefinedTerm on the first paragraph | LLM definition-lookups frequently cite DefinedTerm pages. |
| Anything cited by voice assistants | speakable selector pointing at the Key takeaways block | Marks the highest-confidence answer span. |
Use the Essel blog fields directly — the agents structure the
markdown so the H2/H3 hierarchy maps cleanly to HowToStep or Question blocks without rewriting.
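The markup above can be generated instead of hand-written. The following is a minimal sketch, not an official helper. The field names title, excerpt, slug, author.name, and coverImage.url follow the placeholders in the Article example, while publishedAt, updatedAt, the Site type, and every function name are assumptions for illustration. The HowTo mapper is a simple line scan and would need hardening for markdown that contains "## " inside code samples.

```typescript
// Hypothetical blog shape; publishedAt and updatedAt are assumed field names.
interface Blog {
  title: string;
  excerpt: string;
  slug: string;
  author: { name: string };
  coverImage: { url: string };
  publishedAt: string; // ISO 8601
  updatedAt?: string; // ISO 8601
}

// Hypothetical site settings that you supply yourself.
interface Site {
  name: string;
  baseUrl: string;
  logoUrl: string;
}

interface HowToStep {
  "@type": "HowToStep";
  name: string;
  text: string;
}

function buildArticleJsonLd(blog: Blog, site: Site): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "Article",
    headline: blog.title,
    description: blog.excerpt,
    datePublished: blog.publishedAt,
    // Fall back to the publish date when the post has never been edited.
    dateModified: blog.updatedAt ?? blog.publishedAt,
    author: { "@type": "Person", name: blog.author.name },
    publisher: {
      "@type": "Organization",
      name: site.name,
      logo: { "@type": "ImageObject", url: site.logoUrl },
    },
    image: blog.coverImage.url,
    mainEntityOfPage: `${site.baseUrl}/blog/${blog.slug}`,
  };
}

// Turn each "## Heading" and the text beneath it into one HowToStep.
function howToStepsFromMarkdown(markdown: string): HowToStep[] {
  const steps: HowToStep[] = [];
  for (const line of markdown.split("\n")) {
    const heading = /^##\s+(.+)$/.exec(line);
    if (heading) {
      steps.push({ "@type": "HowToStep", name: heading[1].trim(), text: "" });
    } else if (steps.length > 0 && line.trim() !== "") {
      // Drop the markers of deeper headings and fold the text into the step.
      const text = line.replace(/^#+\s*/, "").trim();
      const last = steps[steps.length - 1];
      last.text = last.text ? `${last.text} ${text}` : text;
    }
  }
  return steps;
}

function renderJsonLdScript(data: Record<string, unknown>): string {
  // Escape "<" so text such as "</script>" in a title cannot end the tag early.
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}
```

Calling renderJsonLdScript(buildArticleJsonLd(blog, site)) yields a string you can place in the page head. For a tutorial, wrap the steps in an object with "@type": "HowTo" and a step property before rendering.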
2. Recency markup the parsers can read
A literal “as of 2026” in the prose helps — but machines also want machine-readable timestamps. Two cheap wins:
<!-- Published date in the article header -->
<time datetime="2026-05-28T09:00:00Z">May 28, 2026</time>
<!-- Open Graph for crawlers that don't parse schema -->
<meta property="article:published_time" content="2026-05-28T09:00:00Z" />
<meta property="article:modified_time" content="2026-05-28T09:00:00Z" />
If you re-publish an Essel blog with edits, update dateModified in the JSON-LD and article:modified_time. LLM crawlers tend to favor recently modified pages.
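To keep the visible date and the meta tags from drifting apart, both can be rendered from the same timestamps. This is a small sketch under assumptions: recencyMarkup and its return shape are hypothetical names, and it expects ISO 8601 strings like the ones in the examples above.

```typescript
const MONTH_NAMES = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];

interface RecencyMarkup {
  timeTag: string;
  metaTags: string[];
}

// Hypothetical helper: builds the <time> element and both Open Graph tags.
function recencyMarkup(publishedIso: string, modifiedIso: string = publishedIso): RecencyMarkup {
  const published = new Date(publishedIso);
  const modified = new Date(modifiedIso);
  if (Number.isNaN(published.getTime()) || Number.isNaN(modified.getTime())) {
    throw new Error("Expected ISO 8601 timestamps");
  }
  if (modified.getTime() < published.getTime()) {
    throw new Error("modified_time must not be earlier than published_time");
  }

  // Format in UTC so the label does not shift with the server's time zone.
  const label =
    `${MONTH_NAMES[published.getUTCMonth()]} ${published.getUTCDate()}, ` +
    `${published.getUTCFullYear()}`;

  return {
    timeTag: `<time datetime="${publishedIso}">${label}</time>`,
    metaTags: [
      `<meta property="article:published_time" content="${publishedIso}" />`,
      `<meta property="article:modified_time" content="${modifiedIso}" />`,
    ],
  };
}
```

Pass the edit timestamp as the second argument when you re-publish, and reuse the same value for dateModified in the JSON-LD.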
3. Raw markdown access
LLM crawlers prefer clean markdown over rendered HTML — fewer tokens, no parsing ambiguity. Exposing a raw-markdown variant of every blog is one of the highest-leverage GEO moves you can make and takes a single route:
// Example: Next.js App Router
// app/blog/[slug].md/route.ts
export async function GET(_req: Request, { params }: { params: { slug: string } }) {
  // Encode the slug so it cannot inject extra query parameters.
  const slug = encodeURIComponent(params.slug);
  const res = await fetch(
    `https://api.contentpilot.uixlabs.co/api/consumer/blogs?slug=${slug}`,
    { headers: { "x-api-key": process.env.CONTENT_PILOT_API_KEY! } },
  );
  // Treat any upstream failure as not found rather than reading a missing body.
  if (!res.ok) {
    return new Response("Not found", { status: 404 });
  }
  const { data } = await res.json();
  return new Response(data.body.markdown, {
    headers: { "Content-Type": "text/markdown; charset=utf-8" },
  });
}
Then link to the markdown variant from the HTML page via a <link rel="alternate">:
<link
rel="alternate"
type="text/markdown"
href="https://yoursite.com/blog/your-slug.md"
/>
Crawlers that respect alternates (ChatGPT, Perplexity) will fetch the markdown form when it’s cheaper than the HTML.
4. llms.txt
llms.txt is an emerging convention: a file at the root of your domain that lists what you want LLMs to read, in priority order. It’s not a standard yet, but it’s already respected by several indexing pipelines and costs nothing to add.
Place at https://yoursite.com/llms.txt:
# Your Company
> One-sentence description of what you do.
## Blog
- [Title of post one](https://yoursite.com/blog/slug-one.md): one-line summary
- [Title of post two](https://yoursite.com/blog/slug-two.md): one-line summary
## Docs
- [Getting started](https://yoursite.com/docs.md): one-line summary
Two patterns to follow:
- Link to the markdown variant of each blog (from §3), not the HTML. The whole point of llms.txt is to give LLMs the clean form.
- Order entries by what you most want surfaced, not chronologically. Crawlers truncate. Pin your highest-value evergreen content at the top.
A larger llms-full.txt can include the full text of every blog inline
— Anthropic’s documentation publishes one, and it works well for
domains under ~500 pages.
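The llms.txt layout shown earlier can be generated from a list of posts so it stays in sync with what you publish. The sketch below is one possible approach: the types and function names are hypothetical, entries are written in the order you pass them (so sort by priority first), and a small helper flags links that do not point at a markdown variant.

```typescript
// Hypothetical types for illustration; adapt them to your own content model.
interface LlmsEntry {
  title: string;
  url: string;
  summary: string;
}

interface LlmsSection {
  heading: string;
  entries: LlmsEntry[]; // Already sorted by priority, highest value first.
}

function buildLlmsTxt(name: string, description: string, sections: LlmsSection[]): string {
  const blocks: string[] = [`# ${name}`, `> ${description}`];
  for (const section of sections) {
    if (section.entries.length === 0) continue; // Skip empty sections.
    blocks.push(`## ${section.heading}`);
    blocks.push(
      section.entries.map((e) => `- [${e.title}](${e.url}): ${e.summary}`).join("\n"),
    );
  }
  // Separate blocks with a blank line, as in ordinary markdown.
  return blocks.join("\n\n") + "\n";
}

// Returns entries whose link does not end in ".md", which are likely HTML pages.
function nonMarkdownLinks(sections: LlmsSection[]): LlmsEntry[] {
  return sections.flatMap((s) => s.entries).filter((e) => !e.url.endsWith(".md"));
}
```

Serve the returned string at /llms.txt with a plain-text content type, and run nonMarkdownLinks in a build step if you want a warning when an HTML URL slips in.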
5. robots.txt for LLM crawlers
If your robots.txt uses an explicit allowlist (rather than the
default-allow User-Agent: *), name the LLM bots directly so they
don’t get caught by a generic catch-all deny. The relevant agents as
of 2026:
# Explicit allows for major LLM crawlers
User-Agent: GPTBot
Allow: /blog/
Allow: /docs/
User-Agent: ClaudeBot
Allow: /blog/
Allow: /docs/
User-Agent: PerplexityBot
Allow: /blog/
Allow: /docs/
User-Agent: OAI-SearchBot
Allow: /blog/
Allow: /docs/
User-Agent: Google-Extended
Allow: /blog/
Allow: /docs/
# Always include your sitemap
Sitemap: https://yoursite.com/sitemap.xml
GPTBot collects content for the OpenAI training corpus. OAI-SearchBot is the dedicated search-index agent and is treated separately. User-initiated ChatGPT browsing uses a third agent, ChatGPT-User. Google-Extended controls inclusion in Gemini and Google’s AI Mode without affecting your regular Google rank.
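If your stance is “yes to citation, no to training,” OpenAI is the clearest case: it distinguishes the two via separate user agents (OAI-SearchBot for search, GPTBot for training). Other vendors may bundle both intents under one token, so check their current crawler documentation before writing deny rules.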
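Per-agent rules like these can be generated from a small policy list, which makes a "citation yes, training no" stance easier to review. This is a sketch only: AgentRule and buildRobotsTxt are hypothetical names, and the example policy covers just the two OpenAI agents named above.

```typescript
// Hypothetical policy shape: one entry per crawler token.
interface AgentRule {
  agent: string;
  allow?: string[];
  disallow?: string[];
}

function buildRobotsTxt(rules: AgentRule[], sitemapUrl: string): string {
  const groups = rules.map((rule) =>
    [
      `User-Agent: ${rule.agent}`,
      ...(rule.allow ?? []).map((path) => `Allow: ${path}`),
      ...(rule.disallow ?? []).map((path) => `Disallow: ${path}`),
    ].join("\n"),
  );
  // One blank line between groups, with the sitemap listed last.
  return [...groups, `Sitemap: ${sitemapUrl}`].join("\n\n") + "\n";
}

// Example policy: block the training crawler, allow the search crawler.
const citationOnly: AgentRule[] = [
  { agent: "GPTBot", disallow: ["/"] },
  { agent: "OAI-SearchBot", allow: ["/blog/", "/docs/"] },
];

const robotsTxt = buildRobotsTxt(citationOnly, "https://yoursite.com/sitemap.xml");
```

Add further entries for the other agents once you have confirmed in each vendor's documentation which token covers which use.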
Quick checklist
When you publish an Essel blog to your domain, the GEO surface area you control is:
- JSON-LD Article schema in <head>, plus HowTo / FAQPage / DefinedTerm where the shape warrants.
- <time datetime> and article:published_time / article:modified_time meta tags.
- A raw-markdown route (/blog/[slug].md) plus a <link rel="alternate"> on the HTML page.
- llms.txt at the domain root pointing at the markdown variants.
- Explicit GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, and Google-Extended allow rules in robots.txt.
The writing side — citations, statistics, recency, extractable answers, Key takeaways — is already enforced by the draft and revision agents. Items 1–5 above are the rest of the loop.
Further reading
- Princeton GEO paper: “GEO: Generative Engine Optimization” (Aggarwal et al., 2024). The original study identifying citations, statistics, and quotes as the highest-leverage tactics.
- Schema.org: the canonical reference for Article, HowTo, FAQPage, DefinedTerm, and Speakable markup.
- llms.txt specification: llmstxt.org.
- OpenAI’s bot documentation: platform.openai.com/docs/bots for the up-to-date list of GPTBot / OAI-SearchBot user agents.
- Anthropic’s crawler documentation: support.anthropic.com for the current ClaudeBot and claude-user documentation.