What is query fan-out and why does it matter?

Query fan-out is the mechanism AI systems use to decompose a single user query into 10 to 20 subqueries before retrieval. A query like 'best CRM for mid-market SaaS' fans out into subqueries about specific feature areas, pricing tiers, integration depth, deployment time, and many other dimensions. The model retrieves candidates against each subquery and synthesizes from the combined retrieval set. Brands with deep topical entity coverage win across the subquery branches; brands with thin coverage win on the headline query at best and lose the long-tail subqueries that drive comprehensive citation share.

Why does topical depth beat topical breadth for entity work?

AI retrieval is entity-aware. A brand demonstrating authoritative coverage across the full topical neighborhood (definitions, sub-aspects, comparisons, edge cases, methodology, common mistakes, industry variants) becomes the recognized topical entity for that neighborhood. A brand with scattered coverage across many topics has authoritative coverage on none. The depth-before-breadth pattern from Chapter 6 applies directly to topical entity work: 5 to 10 pages on one topic outperform 50 pages across 10 unrelated topics for entity-recognition purposes.

What is the role of Wikipedia and Wikidata for entity optimization?

Wikipedia inclusion is one of the strongest entity signals for AI Search; Wills's March 2026 research measured a Spearman correlation of 0.577 between Wikipedia citation density and AI Overview visibility in CRM software, with similar correlations in adjacent B2B categories. Wikidata is the structured-data backbone that flows into Wikipedia and into the Google Knowledge Graph. The brand should have a Wikidata entity record at minimum. Wikipedia inclusion depends on independent third-party coverage (research, analyst coverage, news features) that meets Wikipedia's notability standards, which is why the Mentions pillar work compounds with entity optimization.

How does Person entity work compound with brand entity work?

Brand and Person entities reinforce each other multiplicatively. A brand with one publishing operator (founder, CEO, senior SME) builds entity authority through both the Organization and the Person; the Person entity inherits some of the brand's authority while building independent signal through contributed pieces, podcasts, and named-author content. A mature brand eventually operates with two to four publishing Person entities whose overlapping authority signals produce category-defining recognition. The investment pattern is sequential: establish the brand entity first, then build the founding operator's Person entity, then add additional Persons as the program matures.

Entity Optimization for AI Search

AI systems pull and tag content by entity, not by keyword. The retrieval index treats a brand, a person, a product, or a topic as one entity and tracks that entity's authority signals across the open web. Entity work compounds because the same entity surfaces across many subqueries within a single user prompt. The work turns four entity types (brand, people, products and services, and topical) into clean retrieval targets. This chapter covers six core moves. First, how entity recognition works. Second, query fan-out, which makes entity work pay off across many queries at once. Third, depth before breadth for topical entities. Fourth, the Knowledge Panel claim. Fifth, Wikidata and Wikipedia work. Sixth, the named-expert pattern, which turns founders into entities that can be cited.

Why This Technique Matters

AI Search pulls by entity, not by keyword. When a user asks about a brand, a product, a topic, or a named expert, the model first resolves the entity in its index and then pulls the content that entity is known for, because retrieval keys on the entity rather than the words. Pages tied to the right entity get pulled. Pages tied to vague IDs do not. The entity is the unit of authority. The content is what the entity earns citations for.

The four entity types run in parallel. Brand entity: the firm itself, with canonical name, logo, location, and IDs. People entities: founders, CEOs, execs, and named SMEs who serve as the public face of the brand. Product and service entities: the offerings the brand sells. Each has its own features and buyer groups. Topical entities: the subject domains the brand covers. These are areas like project management, AI SEO, marketing automation, or the category the brand wants to own.

The work is multi-faceted because the four types back each other up. A strong brand entity with weak people entities makes faceless authority. It earns lower citation rates than brand-plus-named-expert pairs. A strong people entity with weak topical coverage makes a known operator. But the operator has no canonical topic to anchor to. The compounding effect is large. Brands that invest in all four entity types through a sustained multi-cycle program earn category-defining lift. Brands that invest in one or two plateau early.

Weak entity work compounds with brand age. Mentions and Evidence work make citations that go to whatever entity the model thinks the brand is. Vague tagging scatters citation share across many entities. These might be the firm, a rival with a close name, or another firm that shares part of the name. Strong entity work focuses citations on the right entity. Weak entity work leaks share to whatever the model picks as the closest match.

The Query Fan-Out Mechanism

AI Search does not retrieve against a single query. The user's query gets broken into 10 to 20 subqueries before retrieval. The model then synthesizes from the combined results. A query like "best CRM for mid-market SaaS" fans out. Subqueries cover feature needs, pricing, integrations, support quality, deployment time, scale, and partner ecosystems. Each subquery pulls its own candidate set. Each set feeds the final response.

The implication for entity work is significant. A brand with deep coverage of the topic wins across many subquery branches. A brand with one strong page on "best CRM for mid-market SaaS" and no support pages on feature comparison, pricing, integrations, or deployment loses. It wins the headline query at best and loses every long-tail subquery that drives full citation share.
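The fan-out effect can be sketched as a simple coverage count. The example below is a minimal illustration, not a model of any real retrieval system: the subquery labels, page paths, and the `fanout_coverage` helper are all hypothetical, and a page is assumed to "win" a branch simply by addressing that branch's topic.

```python
def fanout_coverage(subqueries, pages):
    """Return the subquery branches that at least one page addresses.

    `pages` maps a page path to the set of branch topics it covers.
    """
    return [
        sub for sub in subqueries
        if any(sub in topics for topics in pages.values())
    ]

# Hypothetical branches for the query "best CRM for mid-market SaaS".
SUBQUERIES = [
    "features", "pricing", "integrations", "support",
    "deployment", "scale", "partners",
]

# Brand with a single headline page.
thin_brand = {"/best-crm": {"features"}}

# Brand with a small cluster of support pages around the same topic.
deep_brand = {
    "/best-crm": {"features"},
    "/crm-pricing": {"pricing"},
    "/crm-integrations": {"integrations", "partners"},
    "/crm-deployment": {"deployment", "scale"},
}

thin_hits = fanout_coverage(SUBQUERIES, thin_brand)
deep_hits = fanout_coverage(SUBQUERIES, deep_brand)
print(f"thin brand covers {len(thin_hits)} of {len(SUBQUERIES)} branches")
print(f"deep brand covers {len(deep_hits)} of {len(SUBQUERIES)} branches")
```

Under these assumptions the single-page brand is present in one branch while the clustered brand is present in six of seven, which is the gap the fan-out mechanism turns into citation share.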

The mechanism rewards brands that build topical entities to full depth. A brand that has covered project management across 15 to 25 pages wins across many subquery branches. This holds even for queries the brand did not target. The compounding effect is what creates the "best in category" tag AI systems give to mature topical entities.

Current retrieval research points to the same depth effect. GraphRAG improves global question answering by indexing entities and relationships explicitly, which is consistent with the view that broad topic ownership built as a connected entity graph outperforms flat keyword coverage for multi-part prompts (Edge et al., Microsoft Research 2024).

The fan-out mechanism also shows why scattered breadth fails. A brand with one page each across 25 unrelated topics has no topical entity to pull against. Each query's fan-out pulls from a different topic. The brand wins on none of them. It has not built deep coverage in any neighborhood. Depth before breadth is not a preference. It is the only setup that works with query fan-out.

Figure 1. The query fan-out mechanism. AI systems break one query into 10 to 20 subqueries, which is why depth-before-breadth is the only entity setup that earns citation share across the long-tail branches.

The Four Entity Types

Figure 2. The four entity types. They run in parallel and reinforce each other, so a brand that invests in only one or two plateaus while a brand investing in all four earns category-defining lift.

Brand Entities

The company itself. Anchored by Organization schema from Chapter 9. The schema carries canonical name, logo, address, contact info, and deep sameAs arrays. The sameAs links point to Wikipedia, Wikidata, Crunchbase, LinkedIn, X, and industry directories. The brand entity is the parent ID all other content and entities link back to. Teams new to the markup itself should start with a working guide to structured data before layering on the entity discipline.
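A minimal sketch of the brand-entity record described above, built as a Python dict and serialized to JSON-LD. The brand name, URLs, and the `Q0000000` Wikidata placeholder are hypothetical; the `@id` convention is one common choice rather than a requirement, and address and contact fields are left out for brevity.

```python
import json

def organization_jsonld(name, url, logo, same_as):
    """Build a Schema.org Organization record as a JSON-LD dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "@id": f"{url}#organization",    # stable ID other records can point to
        "name": name,                    # canonical name, exact casing
        "url": url,
        "logo": logo,
        "sameAs": sorted(set(same_as)),  # de-duplicated, stable order
    }

# All values below are placeholders for illustration only.
record = organization_jsonld(
    name="Example Co",
    url="https://www.example.com/",
    logo="https://www.example.com/logo.png",
    same_as=[
        "https://www.wikidata.org/wiki/Q0000000",
        "https://www.linkedin.com/company/example-co",
        "https://www.crunchbase.com/organization/example-co",
    ],
)

# The output would be embedded in a <script type="application/ld+json"> tag.
print(json.dumps(record, indent=2))
```

Generating the record from one function keeps the canonical name and the sameAs list in a single place, so every page that embeds the markup carries the same identifiers.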

The brand entity work has three parts beyond schema. First, canonical name consistency. The brand uses the same name, casing, and styling on every surface (site, social, directories, contributed pieces). Variations scatter the entity signal. These include uppercase vs lowercase, with vs without trailing marks, and abbreviated vs full. Designating an anchor set of authoritative pages that hold the canonical definition of the brand entity gives the rest of the corpus a fixed point to reference. Second, descriptive consistency. The brand's one-sentence pitch and category line stay stable across surfaces. Third, identifier consistency. The brand's Crunchbase ID, LinkedIn ID, and Wikidata Q-number all stay tied to the same canonical record.
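Canonical name consistency can be spot-checked with a small script. This is a sketch under stated assumptions: the surface names and observed strings are hypothetical, and the `normalize` rule only catches casing, spacing, and punctuation variants. Abbreviated forms would need a separate alias list.

```python
import re

def normalize(name):
    """Collapse casing, punctuation, and spacing so variants compare equal."""
    return re.sub(r"[^a-z0-9]+", "", name.lower())

def name_variants(canonical, observed):
    """Return surfaces that use the brand name in a non-canonical form.

    `observed` maps a surface label to the name string found there.
    Only strings that normalize to the canonical name are flagged.
    """
    key = normalize(canonical)
    return {
        surface: name
        for surface, name in observed.items()
        if normalize(name) == key and name != canonical
    }

# Hypothetical audit input.
observed = {
    "site": "Example Co",
    "linkedin": "EXAMPLE CO",
    "directory": "Example Co.",
    "podcast": "Example",  # abbreviation: not caught by this check
}

for surface, name in name_variants("Example Co", observed).items():
    print(f"{surface}: found {name!r}, expected 'Example Co'")
```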

Mid-market brands often have legacy issues. Name changes, mergers, or rebrands leave multiple entity records in retrieval indexes. The cleanup work is operational. Merge records via canonical references. Update sameAs arrays to drop old IDs. Use the current canonical name in all new content.

People Entities

Founders, CEOs, executives, and named SMEs serve as the public face of the brand. Each named publishing operator gets a Person entity. The build uses Chapter 9 schema plus broader profile work that builds independent authority.

Person entity parts: a canonical bio page on owned domain with Person schema, a LinkedIn profile with steady bio detail, an X profile, a personal site or speaker profile, a Wikipedia article where it fits, a Crunchbase profile (for execs at funded firms), and authoritative directory listings. Directory listings include industry awards, conference speaker pages, and podcast guest pages. Each surface backs up the same Person entity through steady bio detail and sameAs cross-links.

The Person entity also gathers content-level authority. Articles by the operator, podcasts with the operator, contributed pieces, and conference talks all attach to the Person entity. They attach through author and creator schema properties. The compounding effect is large. A Person entity with a hundred attributed pieces across third-party sources becomes a known authority. It earns citations apart from the brand's content.

Product and Service Entities

The offerings the brand sells. Each product or service line becomes a Product entity (for products) or Service entity (for services). Each gets Schema.org markup, a dedicated owned-domain page, and support content. Support content covers features, pricing, integrations, and customer outcomes.

Product-entity work pays off most for brands selling many distinct products or services. A brand with three product lines should have three distinct Product entities. Each gets its own canonical page, schema, and support content. The retrieval mechanic attributes citations for each product line's queries to the right product entity. A single merged entity scatters the citation share. For retail and catalog brands, this entity discipline is one layer of broader ecommerce SEO work, where clean product-level structure drives both organic and AI retrieval.

The exception is brands where product distinctions are fuzzy. Or the lines share too many features to keep clean boundaries. These brands should merge to one Product entity covering the full offering. They use support content with structured comparisons instead of separate entities.

Topical Entities

The subject domains the brand covers. Topical entities pay off the most of the four types. They decide which queries the brand gets retrieved for at scale. A topical entity is not a single page or a keyword. It is the cluster of content covering a topic at enough depth and breadth that the brand becomes the known authority. Building that depth at category scale is the core of a national SEO strategy, where topical authority across a full subject domain is what separates a category leader from a follower.

Topical-entity work overlaps a lot with Chapter 6's citation reinforcement work. The same cluster pattern builds the topical entity. The pattern is one hub plus four to nine spokes covering the topic's sub-aspects. Wills's March 2026 research measured topical-entity recognition through citation density across the topic. Brands with established topical entities earn citations at a multiple of the rate of brands with scattered coverage.

The depth-before-breadth rule is non-negotiable for topical entities. A brand that picks one topical entity and builds 15 to 25 pages of strong coverage beats a brand spreading the same budget across five topics. The reason is query fan-out. One topical entity covered fully wins across many subquery branches. Five topical entities each covered thinly win across none.

The Knowledge Panel Claim Process

Google Knowledge Panels appear when brand or person entities hit enough prominence. The triggers are demand, search volume, and citation density. The panel itself is built by Google's knowledge graph systems. You cannot create one by adding schema or by request. The work is to build the entity prominence that triggers panel creation. Then claim the panel when it shows up.

The prerequisites for panel eligibility are clear. First, steady search volume for the brand name (often a sustained climb in brand queries). Second, third-party coverage that establishes the entity (news mentions, analyst coverage, third-party encyclopedia entries). Third, steady entity signals across the open web. These include sameAs depth, canonical name consistency, and structured-data setup. Fourth, a Wikidata entity record. This is often the most direct path to a Knowledge Panel for mid-market brands without Wikipedia inclusion.

Eligibility prerequisite	What it requires
Brand-name search volume	A steady, often climbing level of branded queries over time.
Third-party coverage	News mentions, analyst coverage, and third-party encyclopedia entries that establish the entity.
Steady entity signals	sameAs depth, canonical name consistency, and structured-data setup across the open web.
A Wikidata entity record	The most direct path for mid-market brands without Wikipedia inclusion.

The claim process kicks in when a panel appears. Google shows a "Claim this knowledge panel" link in the panel for unclaimed entities. The flow needs proof of authority over the entity. The most common methods are Google Business Profile domain verification, social-profile verification, or email at the brand's domain. The check takes time to clear. The wait depends on the method and Google's review backlog.

The post-claim work matters. Claimed panels let the brand suggest edits and updates. These include fixed bio details, new images, current exec bios, and recent news. The brand should review the panel each quarter and send fixes as needed. Unclaimed panels are at risk. A rival might claim it. A former employee might claim a person panel. Or it just goes stale with info that turns wrong over time. The claim is not optional once a panel appears for the brand.

Wikidata and Wikipedia Entity Work

Wikidata is the structured-data backbone. It flows into Wikipedia, Google Knowledge Graph, and many other AI retrieval surfaces. Wikipedia inclusion is a strong entity signal in its own right. Wills's March 2026 research found Spearman correlations as high as 0.577 between Wikipedia citation density and AI Overview visibility in some B2B categories.

The Wikidata Setup

Every brand should have a Wikidata entity record. The work is operational, not political. First, create an account on Wikidata. Second, create an item for the brand with canonical name, description, and IDs. The IDs are statements that use Wikidata properties (P-numbers) to link to Crunchbase, LinkedIn, X, the brand's domain, and other trusted IDs. Third, add supporting statements for the brand's category, HQ, founding date, and other factual properties, each with a reference. Wikidata has no pre-publication approval step: items go live when created, and well-referenced records are far less likely to be nominated for deletion under Wikidata's notability policy.
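The setup steps can be planned as a checklist before anything is entered on Wikidata. The sketch below is illustrative only: the `REQUIRED` set is an assumed minimum for this example, not a Wikidata rule, and the draft values are placeholders. The property IDs shown (P31 instance of, P856 official website, P571 inception, P159 headquarters location, P452 industry) should be confirmed on Wikidata before use; platform identifiers such as LinkedIn and Crunchbase have their own properties, which are omitted here.

```python
# Assumed minimum statement set for a brand item (illustrative, not a Wikidata rule).
REQUIRED = {
    "P31": "instance of",
    "P856": "official website",
    "P571": "inception",
    "P159": "headquarters location",
    "P452": "industry",
}

def missing_statements(draft):
    """List required properties that the draft item does not yet fill in."""
    return [
        f"{pid} ({label})"
        for pid, label in REQUIRED.items()
        if not draft.get(pid)
    ]

# Hypothetical draft for a placeholder brand.
draft_item = {
    "label": "Example Co",
    "description": "software company",
    "P31": "business",
    "P856": "https://www.example.com/",
    "P571": "2015",
}

for gap in missing_statements(draft_item):
    print("still needed:", gap)
```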

The downstream effect of Wikidata inclusion matters even before Wikipedia inclusion. Google's Knowledge Graph pulls heavily from Wikidata for entity disambiguation. AI Search systems pull content tagged to Wikidata IDs with more confidence than content tagged to vague IDs. The Wikidata record is the entity backbone for retrieval beyond Google. iPullRank's GEO chapter of the AI Search Manual covers the same mechanism in technical depth: how structured signals and entity disambiguation help generative engines select content for synthesis.

The retrieval-grounding side supports the same workflow. REALM showed that retrieval-aware pretraining improves factual QA and uses salient span masking focused on named entities and dates, which offers a plausible technical reason that entity-rich records with clean IDs improve inclusion quality over generic pages (Guu et al., ICML 2020).

Wikipedia Inclusion Standards

Wikipedia notability standards need third-party coverage of real depth in trusted sources. The standards rule out most direct self-publication. Brands cannot create their own Wikipedia articles in good faith. The path is indirect. Build the third-party coverage from Chapter 3. This means contributed pieces, analyst coverage, and research-driven mentions that meet notability. Then let independent Wikipedia editors decide whether to create the article. Direct self-creation often gets rejected. The brand may be blocked from future fair inclusion.

Once an article exists, the brand can suggest factual fixes and updates. The path is through the talk page, not direct edits. Direct editing of one's own brand article is frowned on by Wikipedia community standards. It can lead to conflict-of-interest tags that cut the article's authority. The pattern is to make the third-party coverage strong enough. Wikipedia editors will then maintain accuracy and recency on their own.

Why Wikipedia Citation Density Compounds

The Wills correlation is the empirical floor. The mechanism behind it is the more useful question, and Searchbloom's reading is that Wikipedia citation density correlates with AI Overview visibility because Wikipedia is the only widely-indexed surface that simultaneously provides three properties AI retrieval needs. First, a stable canonical entity identifier (the Wikidata QID plus article URL the model uses to disambiguate the entity). Second, an editorially-filtered notability gate (the editor consensus on what counts as a verifiable secondary source). Third, a cross-reference graph linking entities to other entities (the network of inline citations and "see also" links across articles). News sites have editorial gates but no entity graph. Wikidata has the graph but no notability gate at the page level. Trade directories carry brand mentions but no editorial filter. No other indexable surface combines all three properties. That combination is what makes Wikipedia citation density a measurable correlation rather than a noisy proxy.

0.577: the Spearman correlation between Wikipedia citation density and AI Overview visibility in CRM software, the strongest entity signal Wills's March 2026 research measured in that category.

Correlation strength varies across the 145 industries Wills measured, and the variance is explained by which of the three properties is missing in the category. B2B software, technology, and science show strong correlations (Spearman 0.45 to 0.60) because all three properties are intact. Consumer brands and retail show positive but lower correlations because the cross-reference graph in consumer category articles tends to be thin. Local services, professional services, and small B2B niches show low correlations because most relevant entities sit below Wikipedia's notability threshold and the substrate is missing entirely. Cross-check your category before allocating Wikipedia-focused effort; the correlation is not uniform.
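For teams that want to cross-check their own category, Spearman correlation is straightforward to compute from two paired lists. The sketch below uses only the standard library and assigns average ranks to ties. The six data points are invented toy values for illustration; they are not drawn from Wills's research or any real measurement.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Toy data: Wikipedia citation counts and AI Overview visibility per brand.
wiki_citations = [12, 3, 45, 7, 0, 22]
ai_visibility = [0.30, 0.10, 0.55, 0.25, 0.05, 0.20]

print(f"rho = {spearman(wiki_citations, ai_visibility):.3f}")
```

Because the measure works on ranks, it captures whether brands with more citations tend to be more visible without assuming the relationship is linear.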

Three other research bodies refine the reading. Kevin Indig's ghost-citation work separates being cited from being mentioned: brands can be heavily referenced across Wikipedia without being named in AI-generated responses that quote those articles. Track both surfaces; they respond to different operational moves. Cyrus Shepard's 54-study synthesis weighted Wikipedia presence as a top-tier factor, corroborating Wills's finding across an independent methodology. The convergence across two methods is stronger evidence than either alone. Lily Ray's algorithmically-demoted-subfolder study measured AI citation decline of roughly 22.5 percent on average when organic visibility dropped; Wikipedia density is a partial buffer because it is a separate signal layer AI reads even when organic rank moves.

The single largest operational mistake on this signal is reading Wikipedia work as a marketing campaign. It is a sustained compounding investment in entity disambiguation infrastructure, and the sequence is non-negotiable. The work runs in stages. First comes the substrate (Wikidata entity, third-party coverage from Chapter 3, schema sameAs). Next comes article creation by an independent editor once notability is satisfied, or edit requests on the Talk page with the conflict of interest disclosed. Then comes cross-reference accumulation, where the brand appears as a referenced entity in adjacent Wikipedia articles. The AI citation correlation Wills measured shows up at that final stage, not at publication. Brands that respect the sequence capture the signal; brands that try to compress the curve through paid editing services or self-creation forfeit it entirely.

The Named-Expert Pattern

Named-expert (Person) entities compound multiplicatively with brand entities. Here is the pattern that scales. First, build the brand entity. Then build the founding operator's Person entity as a sustained effort. Then add more Persons as the program matures.

The Person entity build order overlaps with Chapter 6's attribution-network plan. The foundation work builds the entity. This covers canonical bio, LinkedIn presence, monthly contributed-piece cadence, and podcast spots. The densification phase that follows compounds the entity. This adds industry publication ties, conference speaking, and named-author content on the owned domain. By the end of that phase, the Person entity has its own authority. It earns citations on the operator's topical expertise.

Many Persons in the same brand back each other up rather than split apart when set up right. The boost happens through steady affiliation. Every Person attributes to the same Organization via worksFor. The Persons share the same content corpus. The brand's content names many Persons by topic, where each owns a clear area. The split happens when Persons publish across unrelated topics. Or when the affiliation refs are uneven across surfaces.
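The affiliation check described above can be automated against the site's Person markup. This is a minimal sketch: the names, topics, and `@id` values are hypothetical, and the `affiliation_gaps` helper assumes each Person record references the Organization by `@id` inside `worksFor`.

```python
def affiliation_gaps(org_id, persons):
    """Return names of Person records whose worksFor does not point at org_id."""
    gaps = []
    for person in persons:
        works_for = person.get("worksFor") or {}
        if works_for.get("@id") != org_id:
            gaps.append(person["name"])
    return gaps

# Hypothetical canonical Organization ID and Person records.
ORG_ID = "https://www.example.com/#organization"

persons = [
    {   # consistent: points at the canonical Organization
        "@type": "Person",
        "name": "Alex Rivera",
        "knowsAbout": ["CRM integrations"],
        "worksFor": {"@id": ORG_ID},
    },
    {   # uneven reference: a different ID for the same company
        "@type": "Person",
        "name": "Sam Chen",
        "knowsAbout": ["CRM pricing"],
        "worksFor": {"@id": "https://example.com/#org"},
    },
    {   # missing affiliation entirely
        "@type": "Person",
        "name": "Jo Park",
        "knowsAbout": ["CRM deployment"],
    },
]

for name in affiliation_gaps(ORG_ID, persons):
    print(f"fix worksFor for: {name}")
```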

The mature operating shape for established brands is three to four publishing Person entities. Each has its own topical authority in nearby areas, and all attribute to the same Organization. The brand's total entity authority is more than the sum of the Person authorities because the brand entity inherits the reflected authority across the network. This is the ceiling target for serious entity programs.

Defending the Brand Entity Against Confused Mentions

Entity work focuses citation share on the right brand only when AI Search can tell the brand apart from other entities. The conflation problem is the failure mode. AI Search returns mixed results because the brand collides with another entity in the model's index. The collision can hit many targets: a different-industry firm with a similar name, a place, a historical figure, a generic word, or another category brand close enough in name to confuse the model. A brand named "Origin" might collide with the dev tools company Origin, the EA gaming platform, the geographic concept of a point of origin, and the dictionary word for source or start. Each collision cuts citation share for the right brand on queries where the model cannot tell which Origin the user means.

The detection workflow is monthly sampling. The brand runs a small set of brand-name queries across the major AI platforms. The platforms are ChatGPT, Claude, Perplexity, Google AI Overviews, Gemini, and Copilot. The brand logs the responses. The sample covers three types. Bare brand-name queries ("what is Origin", "tell me about Origin"). Brand-name plus category queries ("Origin developer tools", "Origin the company"). And rival-comparison queries ("Origin vs" followed by each name that shows up in the confusion pattern). The conflation pattern shows up three ways. Responses switch entities mid-paragraph. Or they mix facts from the wrong entity. Or they describe the right entity with traits from the wrong one. The brand logs each conflation by query, platform, and pattern. That way the fix work can target the queries that matter most.
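The logging step can be kept as a small structured record so the monthly samples are comparable over time. The sketch below is one possible shape, not a prescribed tool: the `Conflation` fields mirror the query, platform, and pattern dimensions above, the pattern labels are invented shorthand, and the log entries are made-up examples.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Conflation:
    query: str
    platform: str
    pattern: str  # "entity_switch", "mixed_facts", or "wrong_traits"

def worst_queries(log, top=3):
    """Rank queries by how many conflations were logged across platforms."""
    return Counter(entry.query for entry in log).most_common(top)

def by_pattern(log):
    """Count logged conflations per pattern type."""
    return Counter(entry.pattern for entry in log)

# Hypothetical entries from one monthly sample.
log = [
    Conflation("what is Origin", "ChatGPT", "entity_switch"),
    Conflation("what is Origin", "Perplexity", "mixed_facts"),
    Conflation("Origin developer tools", "Gemini", "wrong_traits"),
    Conflation("what is Origin", "Copilot", "mixed_facts"),
]

for query, count in worst_queries(log):
    print(f"{count} conflations: {query}")
```

Sorting by count puts the bare brand-name query at the top in this toy log, which matches the expectation that queries without a category tag are the most exposed.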

The disambiguation toolkit at the brand level has four main moves. The first is Organization schema with rich sameAs to trusted directories that name the right entity. Every trusted directory listing serves as a disambiguator. The deeper the sameAs array, the more signals point the model to the right ID. The sameAs work for disambiguation overlaps with general entity work. But it leans on directories where the colliding entity does not appear. Those listings carry more weight than directories where both entities appear and the model has to choose.

The second move is a Wikidata record with a category-explicit class. The record uses instance of (P31) to place the brand in a class that splits it from the colliding entity; subclass of (P279) is meant for classes, not for individual organizations. Origin the dev tools company is classified as an instance of software company, and its Wikidata statements name the software developer tools category. Origin the gaming platform is classified as a video game distribution platform. The category-explicit class gives AI retrieval the signal it needs. It can now attribute queries about software developer tools to the right Origin without confusion.

The third move is steady canonical name use with category tags across all surfaces. The brand writes its canonical name with the category attached. So "Origin, the developer tools platform" rather than just "Origin" in places where the rival entity is likely to show up in the user's mental model. The category tag appears in title tags, meta descriptions, JSON-LD name and description fields, contributed-piece bylines, podcast guest descriptions, and conference speaker bios. The repetition trains the retrieval index. It learns to link the category tag to the canonical name. Over time, the model pulls the right entity even when the user query drops the category tag.

The fourth move is the brand-name-plus-category query audit. The brand spots the queries that conflate brand-name with the rival entity. Then it builds owned-domain content tuned for the disambiguating versions of those queries. A page titled "Origin developer tools: what the platform does" gives a direct retrieval target. The target serves queries that pair the brand name with the category. The disambiguating page does not have to be a marketing page. It can be a docs page, a glossary entry, or a comparison page that hits the conflation head-on.

Brand-level move	How it disambiguates
Organization schema with rich sameAs	Leans on trusted directories where the colliding entity does not appear, so each listing points the model to the right ID.
Wikidata record with category-explicit class	Uses instance of (P31) to place the brand in a class that splits it from the colliding entity.
Canonical name plus category tag	Repeats the name with its category across titles, schema, bylines, and bios so the index learns the link.
Brand-name-plus-category page coverage	Builds owned-domain pages tuned to the disambiguating versions of the conflated queries.

The Person-level disambiguation toolkit mirrors the brand toolkit with three tweaks. First, Person schema with sameAs depth. The schema tells the named expert apart from other public figures who share part of the name. The sameAs array carries IDs unique to the person. These include a specific LinkedIn URL, a specific conference speaker profile, and a specific podcast guest profile. The array drops IDs that the colliding person also holds.

Second, job title and affiliation are clear in all schema and external bios. The affiliation links the Person to the right Organization. It rules out other people with similar names but different affiliations. Third, photographic identity uses a steady professional headshot. The model's multimodal retrieval treats the image as a stable ID for the Person. It degrades when different headshots appear for what should be the same entity. The named expert's headshot is the same on LinkedIn, the brand site, contributed pieces, podcast guest pages, and conference speaker pages.

The "claim the longer name" pattern is the last resort for severe collision cases. The conflation may not solve at the entity level. The colliding entity might be too prominent. Or the categories might be too close. In those cases, brands sometimes adopt a longer legal name. The longer name adds a corporate or category modifier. Apple Inc. uses the longer corporate name in legal contexts. It splits the brand from Apple Corps, the company founded by the Beatles. The pattern is rare and costly. It touches trademark filings, contracts, and the brand's visual identity. The option exists for collision cases bad enough to justify the cost. But most brands solve disambiguation through the entity-graph work alone.

The rival-confusion pattern is a variant of the conflation problem. The colliding entity is a rival with a similar name rather than an unrelated entity. The fix work runs the same way. It uses sameAs depth, Wikidata category statements, canonical name with category tags, and brand-plus-category page coverage. But the brand can also use the Chapter 1 review-platform work to position the brand against the confused rival. Comparison pages that name both entities and spell out the differences give AI retrieval a clear signal. The signal tells the model which brand the user is asking about. The comparison page also helps the user choose. That produces good engagement signals that back up the entity work.

The maintenance question is whether disambiguation is a one-time fix or an ongoing program. The answer is ongoing. Disambiguation is not a fix-and-forget job. The retrieval index keeps pulling in new content about both entities. The relative prominence of the colliding entities can shift over time. The brand should re-sample the disambiguation queries each quarter to catch drift. The entity-graph maintenance work has a similar cadence and shape to the narrative-drift detection work in Chapter 14. But the focus here is entity conflation, not narrative drift.

The quarterly re-sample catches three common drift patterns. First, the colliding entity publishes new content that briefly outweighs the brand's disambiguation signals. Second, a third entity with a similar name shows up in the index and makes a three-way conflation. Third, the brand's own content stops using the canonical name with category tag, and the model's recognition decays. This last pattern is a form of semantic-relationship drift: the associations the model holds between the entity and its category loosen as the corpus stops reinforcing them. Each drift pattern has a matching fix step that the maintenance program triggers.

The Entity Authority Score

The four entity types compound when each is strong. The Entity Authority Score is a Searchbloom-coined composite that measures the brand's entity setup across all four types on a 0 to 100 scale. The score lets brands compare year-over-year progress and benchmark against rivals. Each entity type contributes up to 25 points.

Brand Authority (0 to 25 points). Score the Organization schema completeness (canonical name, address, contact, sameAs depth), the Wikidata record presence and ID coverage, the Wikipedia article existence and quality, and the Knowledge Panel status (unclaimed, claimed, or absent). Full Wikipedia + claimed Knowledge Panel + 10+ sameAs links + Wikidata record with full ID coverage scores 25. Schema only with no third-party entity records scores 0 to 5.
Person Authority (0 to 25 points). Score the named-expert entity setup for the brand's primary publishing operator. Person schema, LinkedIn presence with steady publishing, contributed-piece volume, podcast/conference appearance count, Wikidata record, Wikipedia article, and claimed Knowledge Panel each contribute. Mature operator entities (full build complete, Wikipedia article, 30+ contributed pieces) score 22 to 25. Bare LinkedIn profile only scores 0 to 5.
Product Authority (0 to 25 points). Score the Product or Service schema coverage across the brand's offerings. Distinct Product entities per line with full schema, sameAs links to integration partner marketplaces, and AggregateRating coverage scores 20 to 25. Merged or schema-thin product entities score 0 to 10.
Topical Authority (0 to 25 points). Score the brand's topical entity work. A mature cluster (15 to 25 pages on one topic, CCDS above 3, claimed topical territory) scores 20 to 25. Scattered topical coverage with no cluster scores 0 to 5.

Reading bands. Composite EAS above 80 indicates category-leading entity work. The brand wins query fan-out across most subquery branches. Composite 60 to 80 is solid but has at least one weak entity type that limits compounding. The diagnostic is which component scored lowest. Composite 40 to 60 indicates patchwork entity work. Some surfaces are strong, but enough gaps remain that retrieval attribution leaks. Below 40 indicates foundational gaps. The brand needs entity-backbone work before any new content investment will compound.
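The composite and its reading bands reduce to simple arithmetic, sketched below in Python. This is an illustration under stated assumptions rather than an official scoring tool. The class and method names are hypothetical, and the component values are taken as already scored by a reviewer. The band edges are also an assumption: the text does not say where exactly 80, 60, or 40 falls, so this sketch places 80 and 60 in the solid band and 40 in the patchwork band.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityAuthorityScore:
    """Four component scores, each on the 0 to 25 scale."""
    brand: int
    person: int
    product: int
    topical: int

    def __post_init__(self):
        # Reject component scores outside the 0 to 25 range.
        for name, value in self.components().items():
            if not 0 <= value <= 25:
                raise ValueError(f"{name} must be between 0 and 25, got {value}")

    def components(self):
        return {
            "brand": self.brand,
            "person": self.person,
            "product": self.product,
            "topical": self.topical,
        }

    @property
    def composite(self):
        """Sum of the four components, on a 0 to 100 scale."""
        return sum(self.components().values())

    def band(self):
        """Map the composite to a reading band (boundary handling assumed)."""
        score = self.composite
        if score > 80:
            return "category-leading"
        if score >= 60:
            return "solid"
        if score >= 40:
            return "patchwork"
        return "foundational gaps"

    def weakest(self):
        """Name the lowest-scoring component, the diagnostic for the solid band."""
        parts = self.components()
        return min(parts, key=parts.get)


if __name__ == "__main__":
    # Placeholder component scores for illustration only.
    eas = EntityAuthorityScore(brand=22, person=12, product=18, topical=20)
    print(eas.composite, eas.band(), eas.weakest())
```

Keeping the weakest component next to the composite matters because two brands with the same total can need very different work.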

Figure 3. The Entity Authority Score. Brand, Person, Product, and Topical authority each contribute up to 25 points, and the composite band shows whether the entity backbone is leading the category or still has foundational gaps.

Track EAS annually. The score moves slowly. Brand and Person authority components shift gradually. Product authority shifts somewhat faster. Topical authority shifts in line with cluster build cadence (Chapter 6). An annual review is frequent enough to catch the trajectory without adding quarterly noise.

The Entity Recognition Lag

Entity-graph work has a delay between setup and AI retrieval picking up the canonical entity. Brands that do not name the lag get pulled into premature panic when the citation share does not move right away. The lag varies by surface and by signal strength. Different surfaces recognize an entity at different rates, and they tend to do so in a rough order. The sections below describe that order from fastest to slowest.

Wikidata recognition: fastest. Wikidata is the fastest entity surface to update. Approved records show up in Google Knowledge Graph signals first, and AI Search retrieval starts using the canonical entity ID soon after. The quick cycle is why Wikidata is the first move in any entity program.
Schema-driven entity disambiguation: fast. Organization and Person schema with deep sameAs propagates through the discovery layer at a steady pace. Rich-result eligibility shows up early when validation is clean. The deeper entity-attribution shift settles after that.
Knowledge Panel triggering: slower. Google creates Knowledge Panels when entities hit prominence thresholds. The triggering process is opaque. Brands cannot force the panel. Across measured engagements, brand panels in active categories surface sooner; Person panels and panels in slower-moving categories take longer.
Topical entity recognition: slower still. AI systems learn topical-entity associations as the cluster compounds (Chapter 6's Cluster Maturity Curve). Topical entity recognition becomes measurable across category queries once the cluster reaches its later compounding stages.
Wikipedia inclusion: slowest. Wikipedia depends on third-party coverage meeting notability. The Mentions pillar work from Chapter 3 builds that coverage. Independent editor creation lands well after the third-party coverage volume crosses notability. Some categories take longer because of slower analyst-tier coverage cycles.

The cumulative effect compounds. A brand starting from zero entity work sees Wikidata-driven retrieval shifts first, then schema-driven disambiguation, then Knowledge Panels, then topical entity recognition consolidating, and Wikipedia coverage landing last. The full entity-graph maturity is a sustained, multi-cycle horizon. Executive expectations should be calibrated against that horizon. Programs that promise category-leading entity recognition almost immediately overstate what the entity work can deliver. Programs that allocate budget for the full horizon and pace stakeholder expectations to the recognition order earn the compounding lift the entity work was set up to produce.

Figure 4. The Entity Recognition Lag. Different surfaces pick up a new canonical entity at different rates, and naming the order keeps a program from premature panic when citation share does not move right away. The figure orders five entity surfaces along a recognition-speed axis with no fixed durations: Wikidata recognition fastest, then schema-driven disambiguation, then Knowledge Panel triggering, then topical entity recognition, with Wikipedia inclusion slowest.

Platform-Specific Considerations

ChatGPT. Heavy weight on brand and product entities with strong Wikipedia and Wikidata signals. Named-expert recognition is strong for operators with steady attribution across owned and third-party content.
Claude. Weights academic-style entity signals heavily. Wikipedia inclusion, method-cited Person entities, and analyst-tier coverage over-index. Brand entities with deep sameAs to academic surfaces over-index. These include university partnerships and research collaborations.
Perplexity. Mixes entity recognition with community context. Brand and Person entities with strong Reddit and Hacker News mentions over-index. The community surfaces feed entity confirmation next to the structured-data signals.
Google AI Overviews. Native tie to Knowledge Graph. Entities with claimed Knowledge Panels, deep Wikidata records, and Wikipedia inclusion over-index. The pattern that 97% of AI Overviews cite at least one top-20 organic result reflects strong entity-recognition input.
Gemini. Native multimodal entity recognition. Brand and product entities with strong image attribution over-index next to text signals. These include logos in image search and product photos on indexed retailer sites.
Microsoft Copilot. LinkedIn-heavy entity recognition. Brand entities with strong LinkedIn company-page presence over-index. Person entities with active LinkedIn publishing over-index.

Industry Variants

Ben Wills's March 2026 research measured category-specific entity-signal correlations.

Wikidata-dominant categories (accounting software, CRM software, baby care brands, budget hotel chains). Brand and product entity work flows right into category leadership. Wikidata depth and steady ID coverage produce strong lift.
Wikipedia-citation-dominant categories (CRM software at rho=0.577). Wikipedia inclusion is the top entity signal. The Mentions pillar work from Chapter 3 builds the third-party coverage that meets Wikipedia notability. That work centers on contributed pieces with named authors.
Harmonic-centrality-dominant categories (affiliate marketing, auto insurance). Entity work compounds through embeddable tools and assets that earn citations across the link graph. Product entities with calculator companions beat Product entities without.
Best-search-rank-dominant categories (most B2B SaaS at moderate correlation). Entity work compounds with organic ranking. The Person entity and topical entity work pair with strong organic SEO. Together they produce the steadiest citation lift.
SE-outbound-link-dominant categories (agricultural equipment, beauty retail, beer brands). Entity work shows up as breadth of trusted directory inclusion. The sameAs array depth on Organization schema matters most here.

Common Mistakes That Defeat Entity Optimization

1. Schema without entity discipline. The most common failure mode. Brands add schema markup but use uneven canonical names, shallow sameAs arrays, and no entity-graph @id references. The schema passes validators, but it produces fragmented entity signals. Counter-test. Do all your content schemas point to the same Organization and Person @ids?

2. Trying to create Wikipedia articles directly. Wikipedia's notability standards require third-party coverage. Direct self-creation often gets rejected. It also creates lasting conflict-of-interest concerns. Counter-test. When did you last try to create a Wikipedia article for the brand directly?

3. Brand-byline-only content with no named experts. All content is attributed to the brand without named Person attribution. The Person-entity layer is missing. Counter-test. Of your last 20 published pages, how many carry a named-author byline with matching Person schema?

4. Topical-entity scatter instead of depth. Content spread across 10 topics builds no topical entity in any of them. Focusing on 2 to 3 topical entities at strong depth beats scatter. Counter-test. Can you name the 2 to 3 topical entities your brand has built to known depth?

5. Ignoring the Knowledge Panel claim. The panel appears. The brand never claims it. A rival or former employee claims it instead. Recovery is messy. Counter-test. When did you last check whether a Knowledge Panel exists for your brand and key people?

6. Uneven canonical-name usage. The brand uses different casing, short forms, or stylings across content. The entity signal fragments. Counter-test. How does the brand name appear in your title tags, JSON-LD name properties, body content, and footer copyright across the top 20 pages?

7. Missing Wikidata record. The structured-data backbone is missing. AI systems pull content but cannot link it to a canonical entity ID. Counter-test. Search Wikidata for your brand. Does an entity record exist with full ID coverage?

8. Person entities with no independent authority signals. The named operator has a bio page on the owned domain, but there are no contributed pieces, podcast spots, conference talks, or third-party references. The Person entity is flat. Counter-test. For your founding Person entity, how many third-party references exist that link back to the Person?
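The counter-tests for mistakes 1 and 6 can be partly automated once the JSON-LD blocks from a set of pages are in hand. The sketch below is a minimal illustration in Python using only the standard library. It assumes the JSON-LD has already been extracted from each page as a string. The function names and the example.com values are hypothetical. It counts only nodes that declare the entity type, so bare reference nodes that carry an @id with no @type are not tallied.

```python
import json
from collections import Counter


def iter_nodes(value):
    """Yield every dict nested anywhere inside a parsed JSON-LD value."""
    if isinstance(value, dict):
        yield value
        for child in value.values():
            yield from iter_nodes(child)
    elif isinstance(value, list):
        for child in value:
            yield from iter_nodes(child)


def audit_entity(jsonld_blocks, entity_type="Organization"):
    """Tally the @id and name values used by nodes of one entity type.

    A key of None in the @id tally means a node declared the type
    without any @id at all.
    """
    ids, names = Counter(), Counter()
    for raw in jsonld_blocks:
        for node in iter_nodes(json.loads(raw)):
            types = node.get("@type", [])
            if isinstance(types, str):
                types = [types]
            if entity_type not in types:
                continue
            ids[node.get("@id")] += 1
            name = node.get("name")
            if isinstance(name, str):
                names[name] += 1
    return ids, names


def is_consistent(ids, names):
    """True when every typed node shares one @id and at most one name."""
    return len(ids) == 1 and None not in ids and len(names) <= 1


if __name__ == "__main__":
    # Hypothetical blocks: two pages that disagree on both @id and name.
    page_a = json.dumps({
        "@context": "https://schema.org",
        "@type": "Article",
        "publisher": {"@type": "Organization",
                      "@id": "https://example.com/#org",
                      "name": "Example Co"},
    })
    page_b = json.dumps({
        "@context": "https://schema.org",
        "@type": "Organization",
        "@id": "https://example.com/#organization",
        "name": "Example",
    })
    ids, names = audit_entity([page_a, page_b])
    print(dict(ids), dict(names), is_consistent(ids, names))
```

Running the same audit with entity_type="Person" covers the Person half of the first counter-test. Title tags, body copy, and footer text still need a separate check, since this sketch reads only the JSON-LD name property.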

Questions & Answers

What is an entity in AI Search? A thing the model can pick out by ID. It can be a brand, a named person, a product, or a topic. AI systems pull and attribute content by entity, not by keyword. Strong entity work focuses attribution. Weak work scatters it.

What is query fan-out? The mechanism AI systems use to break a single query into 10 to 20 subqueries before retrieval. Brands with deep topical-entity coverage win across the subquery branches. Brands with thin coverage lose the long-tail subqueries.

Why does topical depth beat breadth? AI retrieval is entity-aware. Strong coverage across the full topic neighborhood builds a recognized topical entity. Scattered coverage across many topics yields authoritative coverage on none.

How do I claim a Google Knowledge Panel? Google creates panels when entities hit enough prominence. You cannot create one by adding schema. When a panel appears, claim through Google's claim flow. The methods are domain or social verification. The check takes time to clear. Unclaimed panels are at risk of wrong claims or staleness.

What is the role of Wikipedia and Wikidata? Wikidata is the structured-data backbone. It flows into Wikipedia and the Google Knowledge Graph. Wikipedia citation density correlates with AI Overview visibility (Spearman rho=0.577 in CRM software per Wills, March 2026). The brand should have a Wikidata record at minimum. Wikipedia inclusion depends on third-party coverage meeting notability.

How does Person work compound with brand work? Multiplicatively. The brand inherits Person authority. The Person builds its own signal through third-party references. At maturity, brands run with three to four publishing Persons. Their overlapping authority builds category recognition.

What entity work should mid-market brands start with? In order. First, Organization and Person schema with deep sameAs. Second, Wikidata and trusted directory profiles. Third, one topical entity at strong depth via the cluster pattern from Chapter 6.

Should we have separate entities for product lines? Yes, when product lines serve distinct buyer groups and can be discussed independently. Each becomes a Product entity with its own Wikidata record, schema, and support content. The exception is brands where product distinctions are fuzzy or where shared features block clean entity boundaries.