
Query fan-out: how to find out what AI search engines actually look for

Owen Steer 14 min read

How do I find out what AI search engines actually look for when answering my customers' questions?

Query fan-out helps you uncover multiple sub-queries used by AI search engines. By focusing on these sub-queries, you can optimise your content for better visibility and citations from AI sources.

Query fan-out is how you find out what AI search engines actually look for. When someone asks ChatGPT, Perplexity, or Google AI Mode a question, the AI doesn’t just match keywords against a database. It generates multiple sub-queries, searches for each one separately, then synthesises the results into a single answer. Query fan-out analysis lets you extract those sub-queries so you can build content that matches what AI actually searches for, not just what humans type into Google.

Google AI Mode fires 8-12 sub-queries for a standard prompt. ChatGPT generates 4-20 depending on complexity. Each sub-query is a separate retrieval path, and your content either appears in those paths or it doesn’t. If you’re only optimising for the original question, you’re visible to one retrieval path out of a dozen. That’s why pages with strong fan-out coverage are significantly more likely to be cited by AI engines.

Every page in the AI search optimisation cluster at Fifty Five and Five was planned and structured using query fan-out analysis. This page explains the methodology: what it is, how to extract the sub-queries using real APIs, and how to turn that data into content structure and topic cluster decisions. This isn’t theory. It’s the process I run on every piece of content we produce.

How semantic SEO connects to what AI engines actually search for

Semantic SEO is the reason query fan-out data looks the way it does. Traditional SEO targets keywords: specific search terms with specific volumes. Semantic SEO targets meaning: entities, relationships between concepts, and topical depth. When you run query fan-out analysis, the sub-queries AI engines generate map to semantic components, not keyword variations.

Take a real example from this cluster. When I ran QFO on the question “How to build content that AI search engines actually cite?”, the sub-queries Gemini fired included “E-E-A-T and AI content ranking”, “structured data for AI search”, “semantic SEO for AI”, and “how to improve brand visibility for AI models.” Those aren’t keyword variations of the original question. They’re the semantic components AI needs to answer it: expertise signals, technical infrastructure, content structure, and offsite presence. Each one is a concept the AI researches independently.

AI engines process meaning through entity relationships, not keyword matching. When Gemini decomposes a question, it identifies the entities involved (E-E-A-T, schema markup, AI Overviews), the attributes it needs to evaluate (how they work, why they matter, what the data shows), and the comparisons it needs to make (SEO vs GEO, old approach vs new). The sub-queries it fires reflect that decomposition directly. A question about “AI content marketing” doesn’t generate sub-queries like “AI content marketing tips” or “AI content marketing examples” (keyword variations). It generates sub-queries like “E-E-A-T for AI content”, “author profiles content marketing”, and “content structure for AI citation” (semantic components).

This is why semantic SEO (building topical depth, covering entity relationships, linking related concepts) directly maps to what QFO reveals. You’re not guessing at what “topical authority” means anymore. The QFO data shows you exactly which concepts AI considers part of the topic.

The practical implication: if your content only matches the surface-level question but doesn’t cover the semantic components AI researches underneath it, you’re invisible to most of the retrieval paths. Semantic SEO and QFO are two sides of the same coin. Semantic SEO builds the depth. QFO reveals what that depth should cover.

AI keyword research starts with understanding how AI decomposes your customer’s question

AI keyword research is fundamentally different from the keyword research most marketers are used to. Traditional keyword research asks: “What terms do humans type into Google, and how much volume does each one have?” AI keyword research asks: “When a human asks AI a question, what does the AI search for to answer it?”

The difference in search behaviour is stark. AI search queries average 70-80 words, compared to just 3-4 on traditional Google (iPullRank / SimilarWeb). People don’t type keywords into ChatGPT. They ask full questions with context, constraints, and follow-ups. The AI then decomposes that long, conversational query into component sub-queries that each target a specific facet of the answer.

Traditional keyword research gives you the demand side: what people search for and how often. QFO gives you the supply side: what AI searches for when it goes to find the answer. The keywords that appear in both (what I call “synergy” keywords) are the most valuable because they have search volume for traditional SEO AND appear in the sub-queries AI engines fire. They rank in Google AND get cited by AI.

The classification system I use when running QFO makes this actionable:

  • Cross-platform: Sub-queries that appear in both Gemini AND OpenAI results. These are the strongest signals because both AI systems independently decided this topic matters.
  • Stable: Sub-queries that appear in 2+ runs on at least one platform. Reliable but not yet confirmed across platforms.
  • Weak: Sub-queries that appear only once. Could be noise. Not worth building content around unless other evidence supports them.

When I ran QFO on the SEO vs GEO page in this cluster, the cross-platform themes included “difference between SEO and GEO”, “GEO vs SEO marketing budget”, and “is SEO dying because of AI search.” Those became the foundation for the page’s H2 structure. The stable themes that didn’t have enough volume for their own sections became passage briefs woven into the body copy. Nothing was guessed. Everything was data-driven.

92-94% of AI Mode searches produce zero clicks (Ekamoira). The value from AI search isn’t in click-throughs. It’s in citations. And citations go to the content that matches what AI actually searches for, which is what QFO reveals.


How to extract what AI engines search for using Gemini and OpenAI APIs

The extraction method is simpler than most people expect. Both Google (via Gemini) and OpenAI expose the search queries their models fire when answering a question. You don’t need proprietary tools. You need API access and a structured process.

Step 1: Generate prompt variants. Start with your customer question and write 10-15 variations of how real people would phrase it. Vary the specificity, the phrasing style, and the context. Include at least two with role/industry context (“I’m a CMO at a SaaS company and…”) and at least one comparison-style variant. Select the 5 most distinct for extraction.

Step 2: Run through Gemini API with Google Search grounding. For each variant, call the Gemini API with Google Search grounding enabled. The response includes a groundingMetadata object containing webSearchQueries: an array of the actual Google searches the model fired to answer your question. Run each variant twice (10 calls total) to capture variation. Gemini typically fires 4-8 sub-queries per call.
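The Gemini step can be sketched with nothing but the Python standard library. The endpoint, model name, and response field names here are assumptions based on Google's grounding documentation, so check them against the current API reference before running live calls:

```python
# Sketch of Step 2 against the Gemini REST API, stdlib only.
# The endpoint, model name, and response shape are assumptions based on
# Google's Search-grounding docs; verify against the current API reference.
import json
import urllib.request

GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-2.0-flash:generateContent"
)

def build_grounded_request(prompt: str) -> dict:
    """Request body that enables Google Search grounding for one variant."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "tools": [{"google_search": {}}],
    }

def extract_web_search_queries(response: dict) -> list[str]:
    """Pull the sub-queries Gemini fired from groundingMetadata."""
    queries = []
    for candidate in response.get("candidates", []):
        meta = candidate.get("groundingMetadata", {})
        queries.extend(meta.get("webSearchQueries", []))
    return queries

def run_variant(prompt: str, api_key: str) -> list[str]:
    """Live call: POST one prompt variant, return the fired sub-queries."""
    req = urllib.request.Request(
        f"{GEMINI_URL}?key={api_key}",
        data=json.dumps(build_grounded_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_web_search_queries(json.load(resp))
```

Calling `run_variant` twice for each of the five selected variants produces the ten-call dataset this step describes.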

When AI models like Gemini generate these queries, they’re not randomly selecting search terms. The model analyses the semantic intent of the prompt, identifies the component topics it needs to research, and constructs targeted searches for each one. Gemini’s grounding metadata gives you a direct window into that decomposition process. Each sub-query represents a topic the AI decided was necessary to answer the question properly. OpenAI’s Responses API provides the same window through its web_search_call action items, though OpenAI typically fires fewer, broader queries compared to Gemini’s more granular decomposition. The difference between the two is itself useful signal: themes that both platforms search for independently (cross-platform) are the strongest indicators of what matters for a topic. That’s the data you’re extracting, and it’s accessible to anyone with API keys.

Step 3: Run through OpenAI Responses API with web search. For each variant, call the Responses API with web_search_preview enabled. Extract items where type == "web_search_call" and capture the action.query. Run each variant twice (10 calls total). OpenAI typically fires just 1 search per call, but across 10 calls you get a useful signal of which broad themes it prioritises.
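The OpenAI side follows the same pattern. The `web_search_call` item type and `action.query` field come from the description above; the model name and endpoint are assumptions to verify against OpenAI's Responses API reference:

```python
# Sketch of Step 3 against the OpenAI Responses API, stdlib only.
# Output-item shape (type == "web_search_call", action.query) follows the
# extraction step above; model name and endpoint are assumptions to verify.
import json
import urllib.request

def build_responses_request(prompt: str) -> dict:
    """Request body enabling web search for one prompt variant."""
    return {
        "model": "gpt-4o",
        "input": prompt,
        "tools": [{"type": "web_search_preview"}],
    }

def extract_search_queries(response: dict) -> list[str]:
    """Capture action.query from each web_search_call output item."""
    return [
        item["action"]["query"]
        for item in response.get("output", [])
        if item.get("type") == "web_search_call" and "action" in item
    ]

def run_variant(prompt: str, api_key: str) -> list[str]:
    """Live call: POST one prompt variant, return the queries OpenAI fired."""
    req = urllib.request.Request(
        "https://api.openai.com/v1/responses",
        data=json.dumps(build_responses_request(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return extract_search_queries(json.load(resp))
```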

Step 4: Classify. Collect all sub-queries from both APIs. Normalise them (lowercase, remove minor phrasing differences). Group by theme. Classify each theme as cross-platform (both APIs), stable (2+ runs on one platform), or weak (single appearance).
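The classification step is pure logic, so it can be automated directly from the rules above. This sketch groups by exact match after normalisation; in practice you would also merge near-duplicate phrasings by hand:

```python
# Sketch of Step 4: normalise collected sub-queries, then classify each
# theme as cross-platform, stable, or weak per the rules above.
from collections import defaultdict

def normalise(query: str) -> str:
    """Lowercase and collapse whitespace to merge minor phrasing differences."""
    return " ".join(query.lower().split())

def classify(runs: list[tuple[str, str]]) -> dict[str, str]:
    """runs is a list of (platform, sub-query) pairs across all API calls.

    Returns one label per normalised theme:
      cross-platform - appeared on both platforms
      stable         - appeared in 2+ runs on at least one platform
      weak           - a single appearance (could be noise)
    """
    counts = defaultdict(lambda: defaultdict(int))
    for platform, query in runs:
        counts[normalise(query)][platform] += 1

    labels = {}
    for theme, per_platform in counts.items():
        if len(per_platform) > 1:
            labels[theme] = "cross-platform"
        elif max(per_platform.values()) >= 2:
            labels[theme] = "stable"
        else:
            labels[theme] = "weak"
    return labels
```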

Only 27% of fan-out queries remain stable across repeated searches (Ekamoira). That instability is why running multiple variants across multiple runs matters. A single API call gives you a snapshot. Ten calls across five variants give you a pattern. The cross-platform themes that appear consistently are the ones worth building content around.

The platform differences are themselves useful data. When I ran QFO on the SEO vs GEO topic, Gemini frequently confused “GEO” with geotargeting (6 out of 10 queries), while OpenAI consistently interpreted it as Generative Engine Optimisation. That finding became a disambiguation passage in the published page, because real users searching for “GEO” encounter the same confusion.


Using query fan-out data to decide what your content should cover

QFO data becomes actionable when you map it to content decisions. The classification system from the extraction step feeds directly into three types of content decisions: H2 keywords, passage briefs, and cluster architecture.

Synergy keywords become H2s. When a QFO theme has both search volume (from DataForSEO or similar) AND appears as a cross-platform or stable sub-query, it’s a synergy keyword. These are the strongest picks for H2 sections because they rank in traditional search AND match what AI engines look for. On this very page, all five secondary keywords have synergy: semantic SEO (1,000 volume, stable), AI keyword research (170, cross-platform), topic clusters SEO (140, stable), and the rest. The same was true for the SEO vs GEO page, where every H2 keyword was validated by QFO data. That’s not a coincidence. It’s QFO guiding keyword selection.

AI-only themes become passage briefs. QFO themes that appear cross-platform or stable but have no search volume don’t justify their own H2 section. Instead, they become passage briefs: self-contained answer blocks of 130-170 words woven into relevant sections. Each passage brief directly answers a specific question the AI is asking, without requiring context from the rest of the page.
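The H2-versus-passage-brief mapping reduces to a small decision rule. This sketch assumes you already have each theme's QFO label and a monthly volume figure (from a tool such as DataForSEO); the `volume > 0` cut-off is illustrative, not an exact rule from the methodology:

```python
# Sketch of the content-decision mapping. Inputs: a QFO label per theme
# (cross-platform / stable / weak) and search volume. The volume threshold
# is an illustrative assumption, not a fixed rule.

def content_decision(label: str, volume: int) -> str:
    """Decide what a QFO theme becomes in the content plan."""
    if label == "weak":
        return "ignore"           # single appearance: likely noise
    if volume > 0:
        return "h2"               # synergy keyword: volume + QFO signal
    return "passage-brief"        # AI-only theme: 130-170 word answer block

# Example themes and figures drawn from this cluster's QFO runs.
decisions = {
    theme: content_decision(label, vol)
    for theme, (label, vol) in {
        "semantic seo": ("stable", 1000),
        "ai keyword research": ("cross-platform", 170),
        "content structure for ai citation": ("stable", 0),
        "one-off phrasing": ("weak", 0),
    }.items()
}
```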

Cluster decisions come from the pillar QFO. When you run QFO on a pillar page question, the resulting themes tell you which sub-topics are substantial enough to be their own cluster pages and which should be passage briefs within other pages. This AI search optimisation cluster started with four pages. The pillar QFO revealed that “SEO vs GEO” (1,600 volume, cross-platform) deserved its own page, and “query fan-out” (now 320 volume, cross-platform) warranted a methodology page. Both were added to the cluster plan based on QFO data, not guesswork.

What AI search engines weigh when deciding which sources to cite

The ranking factors AI engines use for citation decisions are distinct from traditional Google ranking factors. QFO data reveals which factors matter for a specific topic, because the sub-queries tell you what the AI is evaluating. Across the topics I’ve analysed in this cluster, four factors consistently appear: content extractability (can AI pull a clean answer from the first two sentences of a section?), E-E-A-T strength (is there a named author with verifiable expertise?), source freshness (was the content recently updated?), and cross-referenceability (can the AI verify claims against other sources?). Content that matches the sub-queries AI fires gets cited more than content optimised only for the original query. The QFO data doesn’t just tell you what topics to cover. It tells you what the AI is evaluating when it decides whether to cite you.

Topic clusters built on query fan-out, not just keyword volume

Traditional topic clusters are built on keyword volume. You pick a broad topic for the pillar, find related keywords with volume for cluster pages, and link them together. It works for Google. It doesn’t account for what AI engines actually search for.

Topic clusters built on QFO data add a layer. The pillar QFO tells you which themes AI engines consider essential to the topic. Some of those themes have volume (synergy) and become cluster pages. Some don’t have volume but appear consistently in AI sub-queries, and become passage briefs distributed across the cluster. The cluster plan becomes a living document, updated every time you run QFO on a new page.

This AI search optimisation cluster demonstrates the approach. It started with a pillar and three pre-existing cluster pages (AI content marketing, AI citations, GEO audit). The pillar QFO revealed two additional pages worth building (SEO vs GEO, query fan-out) and 13 passage briefs distributed across all pages. Each time I ran QFO on a cluster page, new passage briefs emerged and got added to the cluster plan. The architecture evolved from the data, not from a content calendar brainstorm.

The AI content marketing page received 4 passage briefs from its QFO run. The AI citations page received 3. The GEO audit page received 3. Each set of briefs was unique to that page’s topic, informed by what AI engines actually search for when answering questions in that specific domain. The cluster plan tracked all of it.

The difference between a keyword-driven cluster and a QFO-driven cluster is precision. Both cover the topic. But the QFO-driven cluster covers the specific angles AI engines look for, with self-contained passage briefs that AI can extract and cite independently. A keyword-driven cluster asks “what do people search for?” A QFO-driven cluster asks “what do people search for AND what does AI search for when answering them?” The second question produces content that works in both channels. That’s the edge, and it’s measurable: pages with strong fan-out coverage are 161% more likely to be cited by AI engines (Ekamoira).

How do you find out what AI search engines actually look for?

The question was how to find out what AI search engines actually look for when answering your customers’ questions. The answer is query fan-out analysis: extract the sub-queries AI engines fire, classify them by strength, and use them to structure your content and plan your topic clusters.

Query fan-out reveals the semantic components AI needs to answer a question. Traditional keyword research tells you what humans search for. QFO tells you what AI searches for to answer those humans. The extraction method uses real APIs (Gemini grounding metadata, OpenAI Responses API web search) and a classification system (cross-platform, stable, weak) that turns raw data into content decisions.

Synergy keywords (volume + QFO signal) become your H2s. AI-only themes become passage briefs. Pillar QFO data shapes the cluster architecture. Every decision is traceable to the data.

The methodology works. Every page in this AI search optimisation cluster was built using the process described here. The passage briefs, the H2 structures, the cluster decisions: all QFO-informed, all traceable.

If you want help running query fan-out analysis on your own content, get in touch . I’ll walk you through how we do it.

