| Internet-Draft | Semantic Anchor | June 2026 |
| Popov | Expires 7 December 2026 | [Page] |
Automated clients, including Large Language Model (LLM) crawlers and Retrieval-Augmented Generation (RAG) systems, currently lack a deterministic mechanism to verify the canonical identity of a web domain's operator. This "Identity Gap" results in attribution loss and prevents the automated verification of authority and expertise signals. This document defines the Semantic Anchor: a mechanism that binds a domain to a machine-readable JSON-LD identity node, hosted at the domain root and discoverable via predictable endpoints. It establishes a stable identity layer and a "Root of Trust" for AI-to-site interactions.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 7 December 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Current AI discovery protocols, such as llms.txt, provide human-readable summaries of site content but function as "unverifiable text surfaces." They describe what is on a site but fail to prove who is making the declaration.¶
This document addresses the structural "Identity Gap" first identified on April 7, 2026. It proposes a "Semantic Handshake" to move from probabilistic interpretation to deterministic verification of publisher identity.¶
The mechanism described herein was proven functional on April 20, 2026, when a major LLM retrieval system (Gemini) autonomously discovered, fetched, and parsed a Semantic Anchor implementation at 1Euroseo.com. The system incorporated the verified identity node into its reasoning without human prompting, demonstrating backward compatibility with existing retrieval architectures.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].¶
Origin/Publisher: The legal entity or individual responsible for the operation of the domain.¶
Semantic Anchor: The binding mechanism (header or directive) linking a protocol file to an Identity Node.¶
Identity Node: A machine-readable JSON-LD document providing verifiable credentials of the Origin.¶
Triangular Authority Chain: A structural pattern linking Organization, Person, and Service entities to provide programmatic E-E-A-T (Experience, Expertise, Authoritativeness, and Trustworthiness) signals.¶
The Identity Node MUST be expressed in JSON-LD using the Schema.org vocabulary to ensure interoperability with the global Knowledge Graph.¶
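The JSON-LD and Schema.org requirement above can be checked mechanically. The following is a minimal sketch, not part of this specification: the function name `uses_schema_org` and the set of accepted context strings are assumptions made for illustration, and the check only covers a string or list `@context` (an object-valued `@context` is treated as non-matching).

```python
import json

# Assumption: these spellings of the Schema.org context are treated as equivalent.
SCHEMA_ORG_CONTEXTS = {
    "https://schema.org",
    "https://schema.org/",
    "http://schema.org",
    "http://schema.org/",
}


def uses_schema_org(document_text: str) -> bool:
    """Return True if the text is a JSON object whose @context names Schema.org."""
    try:
        node = json.loads(document_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(node, dict):
        return False
    context = node.get("@context")
    # @context may be a single string or a list of contexts.
    candidates = context if isinstance(context, list) else [context]
    return any(isinstance(c, str) and c in SCHEMA_ORG_CONTEXTS for c in candidates)
```

A client could run such a check before attempting any further interpretation of a fetched Identity Node.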
A conforming Identity Node MUST include:¶
To enable autonomous trust-scoring, the node SHOULD include:¶
Discovery MUST be predictable for automated clients. This specification defines three orchestration layers:¶
The llms.txt file MUST include an Identity header in the first three lines of the document:
Identity: https://<domain>/identity.jsonld¶
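A client-side reading of the llms.txt rule above might look like the following sketch. The function name `find_identity_header` is hypothetical, and the exact, case-sensitive match on the field name `Identity` is an assumption, since this document does not state whether the name is case-sensitive.

```python
from typing import Optional


def find_identity_header(llms_txt: str) -> Optional[str]:
    """Return the URL from an 'Identity:' line in the first three lines, else None."""
    for line in llms_txt.splitlines()[:3]:
        # Split on the first colon only, so the colon in "https://" stays in the value.
        name, sep, value = line.partition(":")
        if sep and name.strip() == "Identity":
            return value.strip() or None
    return None
```

An `Identity` line that appears on the fourth line or later is ignored, which mirrors the "first three lines" requirement.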
For clients not using llms.txt, the Identity Node SHOULD be accessible at:
https://<domain>/.well-known/identity.jsonld¶
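As an illustration of the fallback location above, a client could derive the URL from the domain alone. This is a sketch under assumptions: the function name `well_known_identity_url` is hypothetical, and lowercasing the domain and dropping a trailing dot are normalization choices not specified by this document.

```python
def well_known_identity_url(domain: str) -> str:
    """Build the well-known Identity Node URL for a bare domain name."""
    # Assumption: normalize case and strip a trailing root dot before building the URL.
    host = domain.strip().lower().rstrip(".")
    return f"https://{host}/.well-known/identity.jsonld"
```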
Servers MAY advertise the anchor via a standard HTTP header to facilitate discovery during initial crawl:
Origin-Identity-Anchor: https://<domain>/identity.jsonld¶
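The header-based layer above can be read from any response already obtained during a crawl. The sketch below assumes the response headers are available as a plain name-to-value mapping; the function name `anchor_from_headers` is hypothetical. The name comparison is case-insensitive because HTTP field names are case-insensitive.

```python
from typing import Mapping, Optional

HEADER_NAME = "Origin-Identity-Anchor"


def anchor_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the advertised anchor URL, or None if the header is absent or empty."""
    wanted = HEADER_NAME.lower()
    for name, value in headers.items():
        if name.lower() == wanted:
            return value.strip() or None
    return None
```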
To move beyond simple entity mapping, the Semantic Anchor supports a three-node authority pattern:¶
Organization Node: Establishes corporate identity.¶
Person Node: Links content to credentialed human expertise (e.g., MSc, Professional Certifications).¶
Service/Offer Node: Explicitly connects site knowledge/tools to the qualified Person and Organization.¶
This orchestration prevents "Schema Islands" and provides the AI with a closed-loop graph of authority.¶
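One way to picture the three-node pattern above is as a single JSON-LD graph whose nodes reference each other by `@id`. The sketch below is illustrative only: the domain, names, and `@id` values are placeholders, the choice of the Schema.org properties `employee`, `worksFor`, and `provider` is an assumption rather than a requirement of this document, and `find_islands` encodes one possible reading of "Schema Island" (a node with no `@id` link to or from any other node in the graph).

```python
import json

BASE = "https://example.com"  # placeholder domain

IDENTITY_GRAPH = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Organization", "@id": f"{BASE}/#org", "name": "Example Org",
         "url": BASE, "employee": {"@id": f"{BASE}/#person"}},
        {"@type": "Person", "@id": f"{BASE}/#person", "name": "Jane Doe",
         "worksFor": {"@id": f"{BASE}/#org"}},
        {"@type": "Service", "@id": f"{BASE}/#service", "name": "Example Service",
         "provider": [{"@id": f"{BASE}/#org"}, {"@id": f"{BASE}/#person"}]},
    ],
}


def referenced_ids(value):
    """Yield every @id found inside a property value, however deeply nested."""
    if isinstance(value, dict):
        for key, inner in value.items():
            if key == "@id":
                yield inner
            else:
                yield from referenced_ids(inner)
    elif isinstance(value, list):
        for item in value:
            yield from referenced_ids(item)


def find_islands(document):
    """Return the sorted @ids of nodes not linked to any other node in @graph."""
    nodes = document["@graph"]
    ids = {node["@id"] for node in nodes}
    linked = set()
    for node in nodes:
        for prop, value in node.items():
            if prop == "@id":
                continue  # a node's own identifier is not a link
            for target in referenced_ids(value):
                if target in ids and target != node["@id"]:
                    linked.update({node["@id"], target})
    return sorted(ids - linked)


if __name__ == "__main__":
    print(json.dumps(IDENTITY_GRAPH, indent=2))
    print("islands:", find_islands(IDENTITY_GRAPH))
```

In this sketch every node links to at least one other node, so the island check returns an empty list; adding a node with no references would cause its `@id` to be reported.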
Hosting the Identity Node at the domain root provides implicit proof of Origin control. Clients MUST verify that the Anchor URI matches the domain being crawled. Future revisions of this document are expected to add support for cryptographic signing of the JSON-LD node to prevent identity spoofing and ensure non-repudiation.¶
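The domain-match requirement above can be sketched as a host comparison. The function name `anchor_matches_domain` is hypothetical, and two choices here are assumptions not stated in this document: only `https` URIs are accepted, and the host must match the crawled domain exactly (subdomains such as `www.` are not treated as equivalent).

```python
from urllib.parse import urlsplit


def anchor_matches_domain(anchor_url: str, crawled_domain: str) -> bool:
    """Return True if the anchor URL is https and its host equals the crawled domain."""
    try:
        parts = urlsplit(anchor_url)
        host = parts.hostname  # already lowercased by urllib
    except ValueError:
        return False  # malformed URL
    if parts.scheme != "https" or host is None:
        return False
    expected = crawled_domain.strip().lower().rstrip(".")
    return host.rstrip(".") == expected
```

Comparing parsed hosts rather than raw string prefixes avoids accepting look-alike URLs such as `https://example.com.evil.net/identity.jsonld` for the domain `example.com`.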
The architectural pattern and the identification of the "Identity Gap" were first publicly disclosed on April 7, 2026 (LinkedIn post: https://www.linkedin.com/posts/marin-popov_ai-llms-mco-activity-7447224077381042176-haQk/). A detailed technical rationale was published on April 8, 2026 (LinkedIn Pulse: https://www.linkedin.com/pulse/real-reason-llmstxt-adoption-stalling-what-our-tool-found-day-easfe/). The formal technical specification was released on April 9, 2026 (Semantic Anchor v1.0: https://github.com/marin-popov/semantic-anchor).¶
This document serves as the authoritative chronological record of the architectural lineage for domain-root discovery patterns in AI identity.¶
This document makes no IANA requests at this time.¶
MSc Telecommunications¶