Corpus Engineering vs Relevance Engineering

Updated May 13, 2026

"Credit to Michael King for the foundation. The distinction is on scope, not on substance. Both disciplines are needed. One sits inside the other."

~ Cody C. Jensen, CEO & Founder, Searchbloom

Relevance Engineering, introduced by Michael King at iPullRank and articulated in the AI Search Manual, is the first practitioner discipline in the SEO industry to take embedding-driven, retrieval-driven, AI-generated visibility seriously and give it a working vocabulary. Corpus Engineering, introduced inside the MERIT Framework, is the broader systems-level discipline. Relevance Engineering sits inside Corpus Engineering as one of its six components.

This article is a head-to-head comparison: where the two disciplines overlap, where they diverge, and how they fit together. It is not an argument that Relevance Engineering is wrong. It is an argument that a broader, corpus-level frame is needed to organize the temporal, lifecycle, and measurement work that surrounds the relevance optimization Relevance Engineering describes.

TL;DR

Both disciplines are new to the SEO space. Relevance Engineering came first. Corpus Engineering is the broader synthesis.
Relevance Engineering already works across accessibility, semantic structure, retrieval precision, and entity systems. Corpus Engineering re-organizes that same surface around the corpus as the unit of analysis, and takes standing ownership of lifecycle, drift, and expansion that Relevance Engineering addresses as inputs rather than as co-equal components. Corpus Engineering names six co-equal components: accessibility, semantic structure, information gain, expansion, retrieval optimization, and lifecycle maintenance.
The distinction is on scope, not on substance. The retrieval optimization component of Corpus Engineering is the same work Relevance Engineering describes. What Corpus Engineering adds is explicit lifecycle and drift monitoring at the corpus level, a corpus-wide measurement cadence, and standing ownership of accessibility, expansion, and infrastructure as co-equal components, not checklists or per-engagement work.
Failure modes differ in scope. Relevance Engineering fails when content does not match, is structurally inaccessible, or is not extractable. Corpus Engineering names those plus drift, lifecycle decay, and entity evolution across the corpus as failure modes with their own remediation playbook.
Deliverables overlap heavily. A Relevance Engineering engagement produces semantic alignment audits, embedding analysis, query fan-out mapping, passage optimization, and entity work. A Corpus Engineering engagement adds an ecosystem-level accessibility audit, a corpus expansion roadmap, an information gain assessment scored at the corpus level, and a drift monitoring cadence as standing components, not per-engagement work.
Both fit together cleanly. They work the same surface; Corpus Engineering draws the boundary wider, treating the corpus as the unit of analysis and taking standing ownership of lifecycle, drift, and expansion. A practitioner doing Corpus Engineering necessarily does the relevance work Relevance Engineering describes.

Credit Where Credit Is Due

Michael King saw the shift before most of the field was paying attention. He named it. He gave practitioners working concepts for it. As practitioner concerns in this industry, embeddings, query fan-out, retrieval precision, semantic alignment, entity systems, internal knowledge graphs, and channel-agnostic semantic positioning all trace back to his published work and that of the team at iPullRank.

King positions Relevance Engineering as "a new operating system for visibility," grounded at the intersection of information retrieval, artificial intelligence, user experience, content strategy, and digital PR. The discipline is explicitly channel-agnostic and explicitly broader than on-page semantic optimization. That positioning matters: Corpus Engineering is built on top of a Relevance Engineering that is itself already systems-level, not a narrow technique.

That contribution is foundational. Without it, the conversation about AI search and AI visibility would still largely be stuck in keyword-density arguments and on-page checklists.

This article is not an attempt to one-up Relevance Engineering. The terminology and implementation work King has put into the field stands. Corpus Engineering builds on top of it.

Where I respectfully extend the discipline is at the corpus lifecycle.

Relevance Engineering, as articulated, organizes practitioner work around the semantic match. Accessibility, internal architecture, entity systems, and expansion all appear as work to be done. The corpus itself is treated as the implicit substrate, not the unit of analysis. My argument is that the corpus is the unit. Vector drift across model updates, semantic-relationship drift across entity evolution, and lifecycle decay across content age are not adequately covered by a per-query, per-retrieval frame. They need their own cadence, their own measurement, and their own standing ownership.

Corpus Engineering is the framework I propose to elevate that work. Relevance Engineering sits inside it as the retrieval optimization component. The disagreement is not with what King has built. It is with where the line of the discipline gets drawn, and where the temporal, lifecycle, and corpus-wide measurement work belongs.

What Relevance Engineering Is

Relevance Engineering, as introduced and developed by Michael King at iPullRank, is the practice of systematically positioning content within information systems to deliver pertinent and valuable results. King frames it as "a new operating system for visibility" and an explicitly channel-agnostic discipline.

The discipline is grounded at the intersection of:

information retrieval
artificial intelligence
user experience
content strategy
digital PR

Practitioner work covers semantic alignment between queries and content, embedding-aware optimization, retrieval precision in dense and hybrid retrieval systems, passage-level and chunk-level relevance, query fan-out and sub-query coverage, internal knowledge graphs and custom ontologies, domain-level embedding shaping, entity linking, and LLM and AI Overview retrieval alignment.

What organizes the work is the semantic match between content and query. The question it answers:

When a retrieval system evaluates content for a query, does ours surface as the most semantically aligned match for the underlying query, sub-query, and intent?

Relevance Engineering is a genuine evolution beyond keyword-driven SEO. It operationalizes embedding behavior, vector similarity, entity linking, and retrieval-system selection logic into practitioner work in a way no prior SEO discipline has done.
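
To make the idea of a semantic match concrete, here is a minimal sketch of ranking passages against a query by cosine similarity. It is an illustration under stated assumptions, not King's method or any retrieval system's actual scoring: the three-dimensional vectors, the passage names, and the `rank_passages` helper are all hypothetical, and real embedding models emit hundreds or thousands of dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passages):
    """Return (passage_id, score) pairs, most aligned first."""
    scored = [(pid, cosine(query_vec, vec)) for pid, vec in passages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical toy embeddings, invented for illustration only.
query = [0.9, 0.1, 0.0]
passages = {
    "pricing-page": [0.1, 0.9, 0.2],
    "how-it-works": [0.8, 0.2, 0.1],
    "about-us": [0.0, 0.2, 0.9],
}

ranking = rank_passages(query, passages)
for passage_id, score in ranking:
    print(f"{passage_id}: {score:.3f}")
```

The passage whose vector points in nearly the same direction as the query ranks first. In practice the vectors would come from whichever embedding model the practitioner is evaluating against.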

What Corpus Engineering Is

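Corpus Engineering is the systems-level discipline of designing, structuring, expanding, maintaining, and optimizing a corpus for retrieval, semantic understanding, citation, ranking, and AI generation across modern search and language systems.

The discipline is grounded in six components: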

Corpus accessibility (rendering, crawlability, machine readability)
Semantic structure (entity relationships, topical organization, semantic adjacency)
Information gain (originality, uniqueness, citation-worthiness)
Corpus expansion (semantic breadth, supporting evidence, external corroboration)
Retrieval optimization (chunk structure, passage clarity, semantic precision)
Corpus maintenance (drift management, freshness, entity evolution)

Corpus Engineering treats the corpus as the unit of analysis. The question it answers:

Across the entire information ecosystem we control or influence, is the corpus retrievable, accessible, semantically complete, citation-worthy, and durable across model evolution?

Relevance Engineering works across accessibility, semantic structure, retrieval precision, and entity systems. Corpus Engineering re-organizes that same surface around the corpus as the unit of analysis: accessibility, structure, information gain, expansion, retrieval optimization, and lifecycle maintenance all evaluated together against the same ecosystem.

Head-to-Head Comparison

Dimension	Relevance Engineering	Corpus Engineering
Unit of analysis	The semantic match at the passage, page, cluster, or domain-embedding level	The corpus as a system
Primary question	Is this the most relevant match?	Is the corpus retrievable, accessible, complete, and durable?
Time horizon	Per-retrieval moment up through citation-stability and domain-embedding cycles	Full lifecycle of the corpus, including model-version transitions and entity evolution
Core focus	Semantic alignment, retrieval precision	Semantic infrastructure, corpus quality, lifecycle
Embedding work	Query alignment, passage similarity, entity embedding, domain-level embedding shaping	All of the above plus measured embedding stability across model upgrades, vector drift monitoring, cross-domain ecosystem coordination
Accessibility	Addressed inside the content audit (readability, extractability, structure)	First-class co-equal component with continuous evaluation
Maintenance	Freshness as a content attribute; iteration through testing, not a standing lifecycle discipline	Continuous, scheduled practitioner discipline with explicit ownership
Information gain	Source-level imperative (be the definitive reference)	Source-level imperative with corpus-wide measurement
Entity work	Entity systems, internal knowledge graphs, ontologies, entity-query co-occurrence mapping	All of the above plus entity-evolution monitoring across the corpus lifecycle
Failure mode	Content fails to match, is structurally inaccessible, or is not extractable	Corpus fails to be retrievable, complete, durable, or maintained across model evolution
Practitioner deliverable	Relevance audit, query fan-out alignment, passage optimization, entity linking, internal knowledge graph design	Corpus audit, drift monitoring, semantic infrastructure design, expansion roadmap, lifecycle cadence
Relationship to SEO	Channel-agnostic discipline that extends on-page optimization and reframes content as positioned within retrieval systems	Channel-agnostic discipline at the ecosystem level, with explicit ownership of accessibility, lifecycle, infrastructure, and expansion alongside retrieval

Where They Overlap

Retrieval optimization sits at the intersection. Both disciplines:

treat embeddings as a primary signal
treat passage and chunk retrieval as critical
focus on semantic alignment over keyword density
recognize that retrieval systems behave differently from ranking systems
accept that AI and search systems increasingly evaluate semantic ecosystems, not isolated pages

The retrieval optimization component of Corpus Engineering is, in practical terms, the same work Relevance Engineering describes.

Where They Diverge

1. Accessibility

Relevance Engineering, in King's own framing, treats accessibility as a precondition that practitioners must enforce: clean semantic HTML, open robots.txt, sitemaps, and parser-friendly rendering all appear in the manual's GEO inclusion checklist. Corpus Engineering keeps that work but elevates it from a checklist item to a first-class component on equal footing with retrieval. The shift is one of weighting and standing, not introduction. Accessibility moves from a step on the inclusion checklist to a co-equal dimension of the corpus, evaluated continuously, not just at publication.
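
One way to picture accessibility evaluated continuously is a recurring check of which paths a given crawler may fetch. The sketch below uses Python's standard-library `urllib.robotparser` on an in-memory robots.txt. The `ExampleAIBot` and `OtherBot` user agents, the rules, and the `blocked_paths` helper are hypothetical. A real check would also need to cover rendering and machine readability, which this does not attempt.

```python
from urllib.robotparser import RobotFileParser

def blocked_paths(robots_txt, user_agent, paths):
    """Return the paths that the given user agent is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]

# Hypothetical robots.txt: everyone is kept out of /drafts/,
# and one made-up crawler is blocked from the whole site.
ROBOTS = """\
User-agent: *
Disallow: /drafts/

User-agent: ExampleAIBot
Disallow: /
"""

paths = ["/guide", "/drafts/post"]
print(blocked_paths(ROBOTS, "ExampleAIBot", paths))
print(blocked_paths(ROBOTS, "OtherBot", paths))
```

Run on a schedule against the live file, a check like this turns a publication-time checklist item into a standing report.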

2. Lifecycle and Drift

Relevance Engineering treats freshness as a content attribute and includes citation-stability tracking in its measurement frame. Corpus Engineering extends this work into an explicit temporal discipline. Three kinds of drift become first-class concerns, each with its own measurement cadence and remediation playbook. Corpus drift means the informational layer evolves over months. Vector drift means embedding representations change as models are upgraded. Semantic-relationship drift means entity relationships and salience shift over time.
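
Vector drift can be hard to picture, so here is one possible way to measure it, offered as a sketch and not as a prescribed method. Vectors produced by two different model versions generally live in different spaces, so comparing them directly is not meaningful. Instead, this compares each document's nearest neighbors before and after the upgrade and reports the Jaccard overlap. The document ids, the toy two-dimensional vectors, and the `neighbor_overlap` helper are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k_neighbors(doc_id, vectors, k):
    """Ids of the k documents most similar to doc_id within one embedding space."""
    others = [
        (other, cosine(vectors[doc_id], vec))
        for other, vec in vectors.items()
        if other != doc_id
    ]
    others.sort(key=lambda item: item[1], reverse=True)
    return {other for other, _ in others[:k]}

def neighbor_overlap(doc_id, old_vectors, new_vectors, k=2):
    """Jaccard overlap of a document's neighbor sets across two model versions."""
    old = top_k_neighbors(doc_id, old_vectors, k)
    new = top_k_neighbors(doc_id, new_vectors, k)
    return len(old & new) / len(old | new)

# Hypothetical embeddings of the same four documents under two model versions.
old_model = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.8, 0.3], "d": [0.0, 1.0]}
new_model = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.1, 0.9], "d": [0.7, 0.4]}

overlap = neighbor_overlap("a", old_model, new_model)
print(f"neighbor overlap for 'a': {overlap:.2f}")
```

An overlap of 1.0 means the document kept the same semantic neighborhood. A low overlap flags a document whose position in the corpus shifted with the model upgrade and may deserve review.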

3. Semantic Infrastructure

Relevance Engineering, as articulated, includes internal linking, topic clustering, ontologies, internal knowledge graphs, and entity work at the page and cluster level. Corpus Engineering extends this same instinct to the full information ecosystem: every property a brand controls or influences, treated as a single semantic system. Internal linking patterns, entity relationship design, semantic clustering, and topical hierarchy become corpus-wide architecture, not per-cluster work, evaluated for consistency across domains, languages, and modalities.

4. Corpus Expansion

Relevance Engineering already calls for adjacent-intent coverage, topical clusters, and third-party corroboration; King names digital PR as one of its foundational disciplines. Corpus Engineering formalizes that work as an expansion program with explicit ownership: a roadmap for semantic breadth, supporting evidence, related-entity coverage, and external corroboration. The program sets a measurable target for ecosystem completeness, not ad-hoc cluster build-outs. The breadth that target measures is Corpus Coverage, covered in depth in a separate Searchbloom article.
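
As a rough illustration of what a measurable completeness target could look like, the sketch below compares a list of target topics against the topics a corpus already covers. This is not the definition of Corpus Coverage, which is documented separately. The topic lists and the `coverage_report` helper are hypothetical, and a real program would need a more careful way of deciding whether a topic counts as covered.

```python
def coverage_report(target_topics, covered_topics):
    """Return (share of targets covered, sorted list of missing targets)."""
    targets = set(target_topics)
    covered = targets & set(covered_topics)
    missing = sorted(targets - covered)
    ratio = len(covered) / len(targets) if targets else 0.0
    return ratio, missing

# Hypothetical roadmap targets and current coverage.
targets = ["local seo", "schema markup", "query fan-out", "vector drift"]
covered = ["schema markup", "local seo", "link building"]

ratio, missing = coverage_report(targets, covered)
print(f"covered: {ratio:.0%}")
print(f"missing: {missing}")
```

Topics outside the target list, such as "link building" here, do not raise the score. The ratio only measures progress against the roadmap.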

5. Information Gain

Both disciplines name information gain as a primary concern. King frames it at the brand and source level: be the definitive reference, publish what only you can publish, corroborate beyond your own site. Corpus Engineering builds on that framing with a per-corpus measurement discipline: does the ecosystem as a whole contribute non-commodity, citation-worthy information at scale, evaluated against retrieval evidence, not asserted at the page level?

6. Maintenance

Relevance Engineering recognizes freshness as a selection-filter input and includes a term-freshness KPI in King's measurement framework. Corpus Engineering builds an explicit maintenance discipline on top: cadenced drift monitoring, scheduled refresh, entity-evolution tracking, and lifecycle ownership of every corpus surface, not freshness as a per-page attribute.
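
A scheduled refresh cadence can be as simple as flagging every surface whose last review is older than an agreed interval. The sketch below assumes a 90-day cadence and a hand-maintained map of review dates. The URLs, the dates, and the `due_for_review` helper are hypothetical, and the interval is an example value, not a recommendation.

```python
from datetime import date, timedelta

def due_for_review(last_reviewed, cadence_days, today):
    """Return URLs last reviewed before the cutoff, oldest first."""
    cutoff = today - timedelta(days=cadence_days)
    stale = [(url, reviewed) for url, reviewed in last_reviewed.items() if reviewed < cutoff]
    return [url for url, _ in sorted(stale, key=lambda item: item[1])]

# Hypothetical review log for three corpus surfaces.
review_log = {
    "/guide": date(2025, 11, 1),
    "/pricing": date(2026, 4, 20),
    "/faq": date(2026, 1, 15),
}

queue = due_for_review(review_log, cadence_days=90, today=date(2026, 5, 13))
print(queue)
```

The same pattern extends to drift checks and entity reviews: each gets its own interval and its own queue, which is what separates a cadence from freshness treated as a per-page attribute.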

How the Two Fit Together

The cleanest way to describe the relationship: Corpus Engineering is the parent discipline. Relevance Engineering is one of its six components, alongside accessibility, semantic structure, information gain, corpus expansion, and corpus maintenance.

The hierarchy:

MERIT Framework → Corpus Engineering → [Accessibility, Semantic Structure, Information Gain, Expansion, Retrieval Optimization (Relevance Engineering), Maintenance] → Implementation

They work the same surface; Corpus Engineering draws the boundary wider, treating the corpus as the unit of analysis. A practitioner doing Corpus Engineering necessarily does the relevance work that Relevance Engineering describes.

Why the Distinction Matters in Practice

The two disciplines lead practitioners toward overlapping deliverables organized by different principles.

Relevance Engineering engagements typically produce:

semantic alignment audits
embedding analysis at the passage, page, and domain level
query fan-out mapping
passage-level optimization
entity linking and internal knowledge graph design
relevance scoring against retrieval systems

Corpus Engineering engagements include all of the above, organized as the retrieval-optimization component of a broader package, plus:

corpus accessibility audit as a standing report
cross-domain semantic infrastructure design
entity-evolution tracking across the corpus
information gain assessment scored at the corpus level
corpus expansion roadmap with measurable ecosystem-completeness targets
drift monitoring and corpus maintenance cadence

The distinction also matters for organizations evaluating their visibility programs. A program built on Relevance Engineering alone covers the retrieval-side optimization well but typically does not include standing lifecycle, drift, and corpus-level measurement disciplines. A program built on Corpus Engineering organizes the relevance optimization and the lifecycle work under a single operating frame.

Both Disciplines Are New

Both Relevance Engineering and Corpus Engineering are new to the SEO, search, and digital marketing space. The two emerged in response to the same shift: retrieval systems and AI generation systems now evaluate semantic ecosystems, not just isolated documents. Each is grounded in established information retrieval concepts. Neither replaces SEO; they extend it. Google's own May 2026 guidance agrees: optimizing for generative AI search is still SEO, rooted in core ranking and quality systems.

The terminology is new. The underlying mechanics are not. Embedding research, corpus linguistics, and retrieval evaluation have decades of academic history. What is new is the synthesis of those concepts into practitioner disciplines built specifically for the AI-retrieval era.

Relevance Engineering came first in this industry. King was the practitioner who recognized the shift, named it, and put a working vocabulary around it. Every serious AI-search practitioner today is operating with concepts King helped surface.

Corpus Engineering is the broader synthesis I propose to extend that work. It absorbs Relevance Engineering as one component within a systems-level discipline that also addresses accessibility, expansion, infrastructure, and lifecycle. The goal is not to compete with what King has built. The goal is to give practitioners a complete frame for the work that surrounds and supports the relevance optimization Relevance Engineering describes.

Frequently Asked Questions

What is Relevance Engineering?

Relevance Engineering, introduced by Michael King at iPullRank, is the practice of systematically positioning content within information systems to deliver pertinent and valuable results. King frames it as a channel-agnostic operating system for visibility, grounded at the intersection of information retrieval, artificial intelligence, user experience, content strategy, and digital PR. What organizes the work is the semantic match between content and query.

What is Corpus Engineering?

Corpus Engineering is the systems-level discipline of designing, structuring, expanding, maintaining, and optimizing a corpus for retrieval, semantic understanding, citation, ranking, and AI generation across modern search and language systems. It addresses six components: corpus accessibility, semantic structure, information gain, corpus expansion, retrieval optimization, and corpus maintenance. Corpus Engineering treats the corpus as the unit of analysis.

Who introduced Relevance Engineering?

Michael King at iPullRank introduced and developed Relevance Engineering. He was the first practitioner in the SEO industry to take embedding-driven, retrieval-driven, AI-generated visibility seriously and give it a working vocabulary. The conversation about AI search and AI visibility in this industry traces back to his work and that of the team at iPullRank.

How are Relevance Engineering and Corpus Engineering different?

Relevance Engineering works across accessibility, semantic structure, retrieval precision, and entity systems. Corpus Engineering re-organizes that same surface around the corpus as the unit of analysis, with six co-equal components: accessibility, semantic structure, information gain, corpus expansion, retrieval optimization, and lifecycle maintenance. Relevance Engineering sits inside Corpus Engineering as the retrieval optimization component. The distinction is on scope, not on substance.

Where do the two disciplines overlap?

Retrieval optimization is the intersection. Both disciplines treat embeddings as a primary signal, treat passage and chunk retrieval as critical, focus on semantic alignment over keyword density, and recognize that AI and search systems increasingly evaluate semantic ecosystems, not isolated pages. The retrieval optimization component of Corpus Engineering is the same work Relevance Engineering describes.

Is Relevance Engineering wrong?

No. Relevance Engineering is foundational work that gave the SEO industry a working vocabulary for retrieval-aware practice. The argument is not that Relevance Engineering is wrong; it is that a broader, corpus-level frame is needed to organize the temporal, lifecycle, and measurement work that surrounds the relevance optimization Relevance Engineering describes.

Should I use Relevance Engineering or Corpus Engineering?

Both. They work the same surface; Corpus Engineering draws the boundary wider, treating the corpus as the unit of analysis. A practitioner doing Corpus Engineering necessarily does the relevance work that Relevance Engineering describes. The distinction matters for evaluating visibility programs: a program built on Relevance Engineering alone covers the retrieval-side optimization well but typically does not include standing lifecycle and corpus-wide measurement disciplines. A Corpus Engineering program organizes both under a single operating frame.

What deliverables differ between the two?

A Relevance Engineering engagement typically produces semantic alignment audits, embedding analysis, query fan-out mapping, passage-level optimization, entity linking, and internal knowledge graph design. A Corpus Engineering engagement includes all of those, organized as the retrieval-optimization component of a broader package. It adds a corpus accessibility audit, cross-domain semantic infrastructure design, entity-evolution tracking, information gain assessment scored at the corpus level, a corpus expansion roadmap, and drift monitoring with corpus maintenance cadence.

Are Relevance Engineering and Corpus Engineering replacing SEO?

No. Both extend SEO; they do not replace it. Traditional SEO continues to matter for ranking eligibility. Relevance Engineering and Corpus Engineering cover the parts of modern visibility that traditional SEO does not: embedding behavior, retrieval precision, semantic infrastructure, and corpus lifecycle.

Where does Corpus Engineering fit inside the MERIT Framework?

The MERIT Framework defines what AI visibility requires across five pillars: Mentions, Evidence, Relevance, Inclusion, and Transformation. Corpus Engineering is the operating discipline beneath MERIT that engineers the corpus to satisfy those pillars. Relevance Engineering, in turn, sits inside Corpus Engineering as the retrieval optimization component.

The Bottom Line

If Relevance Engineering asks:

Is our content the most semantically aligned match for the query?

Then Corpus Engineering asks:

Is our corpus retrievable, accessible, complete, durable, and worth citing?

Both questions matter. The first sits inside the second.

Modern visibility is no longer about winning a single query. It is about engineering the conditions under which a corpus is retrievable, citation-worthy, and durable across the systems that increasingly mediate how information is found, cited, and generated.

Michael King opened the door on this conversation in the SEO industry by naming Relevance Engineering. The work I am doing with Corpus Engineering is to organize that conversation around the corpus as the unit of analysis, with explicit cadence for lifecycle, drift, and corpus-wide measurement built in alongside the retrieval optimization.

Credit to King for the foundation. The distinction is on scope, not on substance. Both disciplines are needed. One sits inside the other.