Schema Markup for AI Overviews: A Practical 2026 Guide

23 April 2026 · Eugene

Two pages rank #3 for the same query. One gets cited in the AI Overview; the other doesn’t. The difference, often, is whether Google can cleanly resolve the entities on the page — and whether the structured data confirms what the prose is already saying.

Schema markup has shifted purpose. For a decade we treated it as a rich-snippet lever: get the stars, get the FAQ accordion, win the SERP real estate. That framing is outdated. In 2026, the practical value of schema is entity disambiguation and machine-readable claims that language models can verify cheaply when assembling an AI Overview or AI Mode answer. Google hasn’t published a definitive list of schema types that influence citation, and anyone who tells you they have one is guessing. What we have instead is observed correlation, patent literature, and defensible reasoning about how retrieval pipelines work.

This post is the working playbook we use at Sovereign SEO when advising clients on structured data for answer engine optimisation (AEO). Which schema types earn their keep. Which are theatre. How to write JSON-LD that actually helps an LLM understand your page, not just pass the Rich Results Test. And the honest limitation upfront: schema won’t save a page that isn’t otherwise ranking. It amplifies authority and clarity; it does not manufacture them.

Why Schema Matters Differently in the AI Overview Era

Classic rich results were a visual contract. You marked up a recipe, Google rendered a carousel, users clicked. The AI Overview layer operates differently. When Google assembles a generative answer, a retrieval system picks candidate passages, and a reasoning layer decides which sources to cite. Schema shows up at both stages — first as a signal that helps retrieval understand what the page is about, and second as machine-readable confirmation of claims the LLM is about to quote.

In practical terms, this changes what “good” schema looks like.

Entity linking over decoration. An Article with a named author who has a sameAs pointing to Wikipedia, LinkedIn, and ORCID is easier to trust than a generic Article block with a string author.
Claims anchored to entities. A Product block that references an Organization @id (not just a brand string) is retrievable as part of a connected graph.
Consistency with prose. If the page says “founded in 2012” and the schema says “2014,” the LLM has a reason to discount the whole source. Structured data should mirror on-page copy exactly.

This is why structured data is now genuinely part of technical SEO services again, not just a checkbox item. It’s doing cognitive work for the retrieval layer.
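
As a minimal illustration of the consistency point, assume a hypothetical company page whose copy reads “Example Co was founded in 2012.” The matching Organization block would carry the same year in foundingDate. The name, URLs, and date below are placeholders, not taken from a real site.

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://example.com/#organization",
  "name": "Example Co",
  "url": "https://example.com/",
  "foundingDate": "2012"
}
```

If the copy is later corrected to a different year, the markup needs to change in the same release so the two never disagree.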

The correlation evidence, honestly stated

We tracked AI Overview citations across roughly 180 client pages during 2025. Pages with complete Article + Organization + sameAs entity linking were cited at roughly 2.3x the rate of pages at matched rank without it. FAQ-marked pages whose question/answer pairs closely matched voice-search queries were cited at roughly 1.8x. This is correlation, not causation: the same editorial teams writing better schema also tend to write clearer prose. Treat the numbers as directional, not gospel.

Schema Types That Actually Move the Needle

Not every schema type is equal. Here’s the working hierarchy, ranked by the signal we’ve observed in AI Overview citations and AI Mode responses.

Article and NewsArticle

The backbone of editorial and blog content. The fields that matter are not the ones most SEOs emphasise.

author as an object (not a string), with @type: Person, a name, and a sameAs array pointing to the author’s professional identity (LinkedIn, Wikipedia, Muck Rack, ORCID for academics).
dateModified, updated only when the content genuinely changes. In our observation, recency carries real weight for evergreen-but-updated posts.
about — an array of entity references describing what the article is actually about. This is where you declare “this page is about [Wikipedia entity for ‘structured data’]” explicitly.
publisher linked to your Organization @id so authorship chains back to the brand entity.

{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Schema Markup for AI Overviews: A Practical 2026 Guide",
  "datePublished": "2026-04-23",
  "dateModified": "2026-04-23",
  "author": {
    "@type": "Person",
    "@id": "https://freelanceseo.sg/seo-consultant-singapore-eugene-leow/#person",
    "name": "Eugene Leow",
    "sameAs": [
      "https://www.linkedin.com/in/eugeneleow/",
      "https://ahrefs.com/digest/eugene-leow-seo-conference-2023/"
    ]
  },
  "publisher": {
    "@id": "https://freelanceseo.sg/#organization"
  },
  "about": [
    { "@type": "Thing", "name": "Structured data", "sameAs": "https://en.wikipedia.org/wiki/Structured_data" },
    { "@type": "Thing", "name": "Search engine optimization", "sameAs": "https://en.wikipedia.org/wiki/Search_engine_optimization" }
  ]
}

FAQPage

Yes, still. Google deprecated the FAQ rich result for most sites in August 2023, and a lot of SEOs wrongly concluded FAQ schema was dead. It isn’t — the rich result went away, but the retrieval layer still parses FAQ blocks and maps them to conversational queries cleanly. We’ve watched client pages with well-written FAQ schema get cited in AI Overviews for long-tail questions that exactly mirror the marked-up questions. Write questions as real users phrase them.

{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "Does FAQ schema still work for AI Overviews in 2026?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Yes. Google retired the FAQ rich result for most sites in 2023, but the structured data is still parsed and used by retrieval systems that assemble AI Overview answers. Pages with FAQ schema matching conversational queries still get cited."
    }
  }]
}

This overlaps directly with what we cover in “What is AEO”: structuring content so answer engines can extract atomic answers. FAQ schema is arguably the cheapest AEO win available.

HowTo

For genuinely procedural queries (“how to set up X,” “steps to do Y”), HowTo schema remains valuable. Google retired the HowTo rich result in 2023 as well, but step-level structure helps LLMs cite specific steps rather than paraphrase whole sections. Don’t fake HowTo schema on non-procedural content: markup that doesn’t match the prose costs you trust.
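
A minimal HowTo sketch for a genuinely procedural page might look like the following. The page topic, step names, and step text are hypothetical placeholders; each step should correspond to a step that is visible in the prose.

```json
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to add JSON-LD to a page template",
  "step": [
    {
      "@type": "HowToStep",
      "position": 1,
      "name": "Write the JSON-LD block",
      "text": "Draft the markup for the page type and check that every value matches the visible copy."
    },
    {
      "@type": "HowToStep",
      "position": 2,
      "name": "Embed it in the template",
      "text": "Place the block in a script element of type application/ld+json in the page template."
    },
    {
      "@type": "HowToStep",
      "position": 3,
      "name": "Validate the rendered page",
      "text": "Run the published URL through the Schema.org Validator and fix any errors it reports."
    }
  ]
}
```

Keeping one HowToStep per visible step, in the same order as the prose, is what lets a step be quoted on its own.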

Organization with sameAs

This is the single most underused schema type for AEO. A complete Organization block with sameAs linking to Wikipedia (if you have an article), Wikidata, LinkedIn, Crunchbase, industry databases, and any authoritative registry is how you tell Google “this brand string refers to this specific entity in the knowledge graph.”

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://freelanceseo.sg/#organization",
  "name": "Sovereign SEO",
  "url": "https://freelanceseo.sg/",
  "logo": "https://freelanceseo.sg/logo.png",
  "founder": { "@id": "https://freelanceseo.sg/seo-consultant-singapore-eugene-leow/#person" },
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "152 Beach Road #02-08, GB Gateway East",
    "addressLocality": "Singapore",
    "postalCode": "189721",
    "addressCountry": "SG"
  },
  "sameAs": [
    "https://www.linkedin.com/company/sovereign-seo/",
    "https://www.crunchbase.com/organization/sovereign-seo"
  ]
}

Every page on the site should reference this @id as its publisher, creating a single authoritative entity that the whole content graph points back to.

Product with Review and AggregateRating

For commercial and comparison queries, Product schema with genuine review and aggregateRating fields appears to carry significant weight. In our tracking, AI Overviews surfacing “best X for Y” results lean on structured rating data almost without exception. Critical caveat: fake reviews get flagged, and Google’s recent spam updates have punished sites that stuff aggregate ratings without corresponding on-page review content. If you don’t have real reviews, don’t mark them up.

For e-commerce, combine Product with Offer and Brand (linked to your Organization @id), and make sure the price in the markup matches the price shown on the page.
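
A Product block along these lines is one way to express that combination. Everything here is a placeholder: the product, price, rating figures, and review are hypothetical, and the brand reference assumes an Organization declared elsewhere on the site with the @id https://example.com/#organization. Include the review and aggregateRating fields only when matching reviews are visible on the page.

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "@id": "https://example.com/products/standing-desk/#product",
  "name": "Example Standing Desk",
  "brand": { "@id": "https://example.com/#organization" },
  "offers": {
    "@type": "Offer",
    "price": "499.00",
    "priceCurrency": "SGD",
    "availability": "https://schema.org/InStock",
    "url": "https://example.com/products/standing-desk/"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "37"
  },
  "review": [{
    "@type": "Review",
    "author": { "@type": "Person", "name": "Example Reviewer" },
    "datePublished": "2026-03-02",
    "reviewRating": { "@type": "Rating", "ratingValue": "5" },
    "reviewBody": "Placeholder text that would repeat a review shown on the page."
  }]
}
```

The price and priceCurrency values are the ones to check against the rendered page whenever pricing changes.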

Schema Types That Are Marginal or Theatre

Not everything in the Schema.org vocabulary earns its place on your pages.

BreadcrumbList: harmless, low signal. Add it if your CMS does it automatically; don’t spend engineering time.
Speakable: designed for voice assistants, and we have seen no meaningful link to AI Overview citations in our tracking. Niche.
ClaimReview: powerful for fact-checkers and news organisations; irrelevant for 99% of commercial sites.
WebSite with SearchAction: used to power the sitelinks search box, which Google retired in November 2024. No observable AEO impact.
Duplicate/redundant markup: marking up the same content three different ways doesn’t multiply the signal; it just creates validation warnings.

Time spent implementing these is time not spent on the Organization, Article, and entity linking work that actually moves citations.

Entity-First Thinking: The Part Most SEOs Skip

The schema shift that separates 2026 practitioners from 2020 practitioners is treating structured data as entity declarations rather than rich-snippet requests.

What entity-first schema looks like in practice:

One canonical Organization entity with a stable @id (URL fragment like #organization), referenced by every page as publisher.
Author entities as Person objects with @id, sameAs to Wikipedia/LinkedIn/professional profiles, and jobTitle / worksFor linking back to the Organization.
about arrays on content pages that point to Wikipedia and Wikidata entities for the core topics. This is the closest we get to telling Google explicitly “this page is about THIS concept, not a homonym.”
Consistent @id referencing across the site — the same author URL on every post, the same organisation fragment everywhere.

This is the infrastructure layer of generative engine optimisation (GEO). Without it, you’re relying entirely on prose for entity resolution. With it, you’re giving the retrieval layer a graph it can traverse.
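
One way to express this is a single @graph in which each entity is declared once and referenced by @id everywhere else. The sketch below uses a hypothetical site at example.com; the names, job title, profile URLs, and headline are placeholders.

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://example.com/#organization",
      "name": "Example Co",
      "url": "https://example.com/",
      "sameAs": ["https://www.linkedin.com/company/example-co/"]
    },
    {
      "@type": "Person",
      "@id": "https://example.com/team/jane-tan/#person",
      "name": "Jane Tan",
      "jobTitle": "Head of Content",
      "worksFor": { "@id": "https://example.com/#organization" },
      "sameAs": ["https://www.linkedin.com/in/jane-tan-example/"]
    },
    {
      "@type": "Article",
      "@id": "https://example.com/blog/example-post/#article",
      "headline": "Example Post",
      "author": { "@id": "https://example.com/team/jane-tan/#person" },
      "publisher": { "@id": "https://example.com/#organization" },
      "about": [
        { "@type": "Thing", "name": "Search engine optimization", "sameAs": "https://en.wikipedia.org/wiki/Search_engine_optimization" }
      ]
    }
  ]
}
```

The point of the pattern is that the author and publisher values are references to one node each, not repeated copies that can drift apart between templates.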

Validation and Debugging

Markup that fails validation is worse than no markup. Three checks belong in your workflow:

Schema.org Validator (validator.schema.org): strictest, catches vocabulary errors and type mismatches Google’s tool won’t flag.
Google Rich Results Test: confirms Google can parse your markup and tells you which rich result types are eligible. Note: passing this doesn’t mean the markup influences AI Overviews, only that Google can parse it and that it meets the requirements for those rich result types.
Schema.org’s own examples: when in doubt, copy the structure from the official examples page for each type. Don’t improvise property names.

For entity linking specifically, there’s no automated checker. Spot-check by Googling your author’s name with site:wikipedia.org and confirming the sameAs URLs you’ve declared actually resolve. Broken sameAs links damage trust.

A debugging discipline worth adopting: when a page isn’t getting cited in AI Overviews despite ranking, render the JSON-LD, paste it into the Schema.org validator, and check three things. Are entity references consistent? Are dates accurate? Does every claim in the schema match the on-page copy verbatim?

Where Schema Fits in the Broader AEO Picture

Honest framing: schema is amplification, not foundation. If your page isn’t ranking in the top 10, no amount of perfect JSON-LD will surface it in an AI Overview — the retrieval layer picks from already-ranking candidates. Schema decides between close competitors, not between competitors and also-rans. This is a point we repeat often in AI Overviews optimisation engagements: get the ranking fundamentals first, then layer schema.

The practical sequence we follow:

1. On-page clarity: genuine topical authority, clean H-tag hierarchy, answer-first paragraphs. This is the core of on-page SEO services and it’s where most AEO wins actually originate.
2. Entity infrastructure: canonical Organization, consistent author entities, sameAs hygiene.
3. Page-type schema: Article, FAQPage, HowTo, Product as appropriate to content type.
4. Ongoing validation: schema breaks quietly when CMS templates change. Quarterly audits catch drift.

Skipping steps 1 and 2 to jump to step 3 is the most common mistake we see. It’s also why “I implemented all the schema and nothing happened” is such a common complaint.

FAQ — Schema Markup for AI Overviews

Does FAQ schema still work for AI Overviews in 2026?
Yes. Google retired the visible FAQ rich result for most sites in August 2023, but the structured data is still parsed by the retrieval systems that assemble AI Overviews and AI Mode responses. Pages with FAQ schema matching conversational queries continue to be cited, especially for long-tail informational queries. Don’t remove existing FAQ markup because the rich result disappeared.

Is schema enough to get my pages cited in AI Overviews?
No. Schema amplifies ranking and clarity signals that already exist; it doesn’t manufacture them. If a page isn’t ranking in the top 10 organic results, schema won’t surface it in an AI Overview — the retrieval layer selects from candidates that are already visible. Treat schema as a tiebreaker between competitive pages, not as a substitute for topical authority.

Which schema type matters most for AEO?
If forced to pick one, a complete Organization block with thorough sameAs entity linking. It anchors every other piece of content on the site to a single authoritative entity and is the biggest underused lever we see in audits. After that, Article with proper author entities, then FAQPage, then Product for commercial pages.

Do I really need sameAs linking to Wikipedia and Wikidata?
If your organisation or author has a Wikipedia article, absolutely link to it — this is the cleanest entity disambiguation signal available. If you don’t have a Wikipedia page, link to the next strongest authoritative sources: Wikidata (you can often create an entry), LinkedIn, Crunchbase, industry registries, professional licensing bodies. The principle is connecting your entity to established knowledge graph nodes.

Will Google penalise me for over-marking-up a page?
Not for volume — for inaccuracy. Marking up content with schema that doesn’t match the visible page (fake reviews, inflated ratings, HowTo steps that don’t exist) is a structured data spam violation and can trigger manual actions. Marking up a page with three different but accurate schema types is fine; stacking them redundantly just wastes effort.

Does JSON-LD work better than Microdata or RDFa?
Yes, mainly for maintainability. Google supports all three formats and explicitly recommends JSON-LD; it’s easier to maintain, and it keeps structured data separate from visible markup (which matters when the two need to stay synchronised). Convert any remaining Microdata to JSON-LD when you can.

How do I know if my schema is actually helping?
Imperfectly. There’s no Google Search Console report for “AI Overview citations driven by your structured data.” The practical proxies are: Rich Results Test eligibility, tracked citation rates in tools like Ahrefs’ Brand Radar or AthenaHQ, and before/after comparison when you ship significant schema changes on identified pages. Expect a noisy signal, and isolate one variable at a time.

Should I mark up every page or just key pages?
Every page should have an Organization publisher reference and page-type schema (Article, Product, FAQPage as appropriate). Reserve heavier markup (HowTo, detailed review data, custom entity about arrays) for pages where it genuinely matches the content. Uniform minimum baseline, targeted depth on pages that warrant it.

Discuss Your Schema Strategy

If you’re auditing your structured data with AI Overview citations in mind — or rebuilding entity infrastructure after years of ad hoc markup — and want a strategic conversation, reach out.

Book a free 30-minute consultation or email [email protected].