<rfc xmlns:xi="http://www.w3.org/2001/XInclude" version="3"
     docName="draft-srijal-agents-policy-00"
     category="std"
     ipr="trust200902"
     submissionType="IETF">

  <front>
    <title abbrev="AGENTS.TXT">AGENTS.TXT: Strict Policy File for Automated Clients</title>
    <author fullname="Srijal Dutta" initials="S." surname="Dutta">
      <organization>Independent</organization>
      <address>
        <email>srijaldutta.official+agentstxt@gmail.com</email>
      </address>
    </author>
    <date day="07" month="October" year="2025"/>
    <abstract>
      <t>This document specifies the AGENTS.TXT protocol, a strict plaintext policy file for automated clients, bots, and crawlers. It defines directives, top-line hash verification, optional parameters, and mandatory failure behavior for malformed files. Malformed files are treated as fully restrictive to prevent unintended access.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
      <t>AGENTS.TXT is a strict policy file format for automated clients. It is similar in purpose to robots.txt <xref target="RFC9309"/> but gives site operators more control over client behavior. A malformed file is treated as completely restrictive.</t>
      <t>Validation of an AGENTS.TXT file is based on a SHA-256 hash <xref target="FIPS180-4"/> of the canonical directive content.</t>
    </section>

    <section title="File Location and Name">
      <t>The canonical path for the file is <tt>/agents.txt</tt>. The file must be served as UTF-8 with a Content-Type of <tt>text/plain</tt> <xref target="RFC7231"/>.</t>
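      <t>As an illustration of the rule above, the following Python sketch derives the policy URL for an origin and checks a Content-Type header value. The function names <tt>agents_txt_url</tt> and <tt>is_plain_utf8</tt> are hypothetical, and accepting a missing charset parameter as UTF-8 is an assumption that this document does not settle.</t>
      <sourcecode type="python"><![CDATA[
```python
from urllib.parse import urlsplit, urlunsplit


def agents_txt_url(origin: str) -> str:
    """Return the canonical /agents.txt URL for the origin of any URL."""
    parts = urlsplit(origin)
    # Keep scheme and authority; replace path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/agents.txt", "", ""))


def is_plain_utf8(content_type: str) -> bool:
    """Check a Content-Type header value for text/plain served as UTF-8."""
    media_type, _, params = content_type.partition(";")
    if media_type.strip().lower() != "text/plain":
        return False
    for param in params.split(";"):
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            return value.strip().strip('"').lower() == "utf-8"
    # Assumption: a missing charset parameter is accepted as UTF-8.
    return True
```
]]></sourcecode>
      <t>A client following this sketch would fetch the derived URL and reject the response before parsing if the header check fails.</t>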
    </section>

    <section title="File Format">
      <t>The first non-comment, non-empty line MUST be the hash line: a '*' followed by the lowercase hexadecimal SHA-256 digest of the file content, excluding the hash line itself, comments, and blank lines. (SHA-1 <xref target="RFC3174"/> is cited for historical reference only.) Subsequent lines are directives, for example:</t>

      <t><tt>/status ALLOW</tt></t>
      <t><tt>/dashboard ALLOW limit=50</tt></t>
      <t><tt>/admin DISALLOW</tt></t>
    </section>

    <section title="Comments and Metadata">
      <t>Lines starting with '#' are comments. They are ignored during both hash computation and directive parsing. Metadata such as version, generated-by, or grace-period may be carried in comment lines, for example <tt># version: 1.0</tt>.</t>
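      <t>This document does not define a grammar for metadata. The Python sketch below shows one possible reading, assuming a <tt># key: value</tt> layout like the <tt># version: 1.0</tt> line in the example file. The function name <tt>extract_metadata</tt> and the rule that keys contain no spaces are assumptions made for illustration only.</t>
      <sourcecode type="python"><![CDATA[
```python
def extract_metadata(text: str) -> dict[str, str]:
    """Collect 'key: value' pairs from comment lines (assumed layout)."""
    meta: dict[str, str] = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("#"):
            continue  # only comment lines can carry metadata
        key, sep, value = stripped[1:].partition(":")
        key = key.strip()
        # Assumption: free-text comments are skipped when they have no
        # colon or when the would-be key contains a space.
        if sep and key and " " not in key:
            meta[key] = value.strip()
    return meta
```
]]></sourcecode>
      <t>Because comments are excluded from the hash, metadata read this way is not covered by the integrity check and should be treated as advisory.</t>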
    </section>

    <section title="Agent Behavior on Malformed Files">
      <t>If the hash line is missing, the hash does not match, or any directive contains a syntax error, the client MUST treat the entire site as restricted and MUST invalidate any cached copy of the file. The key word "MUST" is to be interpreted as described in <xref target="RFC2119"/>.</t>
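      <t>The fail-closed behavior can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names <tt>is_well_formed</tt> and <tt>refresh_policy</tt>, the regular expressions, the trimming of surrounding whitespace, and the use of a plain dictionary keyed by origin as the cache are all assumptions.</t>
      <sourcecode type="python"><![CDATA[
```python
import hashlib
import re

# Assumed line shapes: '*' plus 64 lowercase hex digits, and
# '<path> <action> [key=value ...]' separated by whitespace.
HASH_LINE = re.compile(r"^\*[0-9a-f]{64}$")
DIRECTIVE = re.compile(r"^/\S*\s+(ALLOW|DISALLOW)(\s+[^\s=]+=\S+)*$")


def is_well_formed(text: str) -> bool:
    """Return True only if the hash line and every directive validate."""
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    if not lines or not HASH_LINE.match(lines[0]):
        return False  # hash line missing or not first
    directives = lines[1:]
    if not all(DIRECTIVE.match(d) for d in directives):
        return False  # directive syntax error
    payload = "\n".join(directives).encode("utf-8")
    expected = lines[0][1:]
    return hashlib.sha256(payload).hexdigest() == expected


def refresh_policy(cache: dict, origin: str, fetched_text: str) -> str:
    """Update the cache fail-closed; return 'ok' or 'restricted'."""
    if is_well_formed(fetched_text):
        cache[origin] = fetched_text
        return "ok"
    cache.pop(origin, None)  # invalidate any cached copy
    return "restricted"
```
]]></sourcecode>
      <t>In this sketch a previously valid cached copy is never used as a fallback: once a fetched file fails validation, the origin is treated as restricted until a valid file is retrieved.</t>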
    </section>

    <section title="Directive Syntax">
      <t>Each directive line has the format: <tt>&lt;path&gt; &lt;action&gt; [params...]</tt></t>
      <t>&lt;path&gt; starts with '/' (see <xref target="RFC3986"/> for URI syntax), &lt;action&gt; is either ALLOW or DISALLOW, and the optional params are key=value pairs.</t>
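      <t>A directive line of this shape could be parsed as in the following Python sketch. The <tt>Directive</tt> class and <tt>parse_directive</tt> function are hypothetical names. Splitting on any whitespace, matching actions case-sensitively, and rejecting parameters with an empty key or value are assumptions, since this document does not specify them.</t>
      <sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class Directive:
    path: str
    action: str
    params: dict[str, str] = field(default_factory=dict)


def parse_directive(line: str) -> Directive:
    """Parse '<path> <action> [params...]'; raise ValueError if malformed."""
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("directive needs a path and an action")
    path, action, *raw_params = tokens
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    if action not in ("ALLOW", "DISALLOW"):
        raise ValueError("action must be ALLOW or DISALLOW")
    params: dict[str, str] = {}
    for item in raw_params:
        key, sep, value = item.partition("=")
        # Assumption: both key and value must be non-empty.
        if not sep or not key or not value:
            raise ValueError(f"malformed parameter: {item!r}")
        params[key] = value
    return Directive(path, action, params)
```
]]></sourcecode>
      <t>Under the strict failure rule, a client would treat any <tt>ValueError</tt> raised here as a directive syntax error for the whole file.</t>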
    </section>

    <section title="Hash Computation">
      <t>Compute SHA-256 over the UTF-8 bytes of the file after removing the hash line, comment lines, and blank lines. Join the remaining lines with a single line feed ('\n') before hashing, and encode the digest as lowercase hexadecimal.</t>
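      <t>The computation can be sketched in Python using the standard <tt>hashlib</tt> module. The function names are hypothetical. Trimming surrounding whitespace, treating any line that starts with '*' as the hash line, and adding no trailing newline after the last directive are assumptions that this document does not state explicitly.</t>
      <sourcecode type="python"><![CDATA[
```python
import hashlib


def canonical_lines(text: str) -> list[str]:
    """Return directive lines: no hash line, comments, or blank lines."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()  # assumption: outer whitespace is ignored
        if not stripped:
            continue  # blank line
        if stripped.startswith("#") or stripped.startswith("*"):
            continue  # comment or hash line
        kept.append(stripped)
    return kept


def directive_digest(text: str) -> str:
    """Lowercase SHA-256 hex digest of the canonical directive content."""
    payload = "\n".join(canonical_lines(text)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```
]]></sourcecode>
      <t>A client would compare the result with the hex string that follows '*' on the hash line and treat any difference as a hash mismatch.</t>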
    </section>

    <section title="Security Considerations">
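      <t>Strict malformed-file behavior is intended to prevent accidental exposure: a file that cannot be validated never grants access. The cost is that an invalid file causes compliant clients to treat the whole site as restricted, so site operators must ensure that the published file is valid. For recommendations on securing the transport over which the file is served, see <xref target="RFC7525"/>.</t>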
    </section>

    <section title="Example agents.txt File">
      <t><tt># version: 1.0</tt></t>
      <t><tt>*e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855</tt></t>
      <t><tt># Sample only: the digest above is a placeholder (the SHA-256 of empty input), not the digest of the directives below.</tt></t>
      <t><tt>/status ALLOW</tt></t>
      <t><tt>/dashboard ALLOW limit=50</tt></t>
      <t><tt>/admin DISALLOW</tt></t>
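      <t>A site operator could generate a file like the one above, with a digest that matches its directives, along the lines of the following Python sketch. The function name <tt>render_agents_txt</tt> is hypothetical, and the sketch assumes the directives are already syntactically valid and are hashed exactly as written, joined by '\n' with no trailing newline.</t>
      <sourcecode type="python"><![CDATA[
```python
import hashlib
from typing import Sequence


def render_agents_txt(directives: Sequence[str],
                      comments: Sequence[str] = ()) -> str:
    """Build file text: comments, then the hash line, then directives."""
    payload = "\n".join(directives).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    lines = [f"# {c}" for c in comments]
    lines.append("*" + digest)  # first non-comment, non-empty line
    lines.extend(directives)
    return "\n".join(lines) + "\n"
```
]]></sourcecode>
      <t>Regenerating the file this way after every directive change avoids publishing a stale digest, which strict clients would treat as a malformed file.</t>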
    </section>

    <section title="Additional Guidance">
      <t>Clients SHOULD follow established best practices for HTTP clients and crawlers when interpreting AGENTS.TXT directives. Use of AGENTS.TXT aims to reduce accidental site disruption.</t>
    </section>

  </middle>

  <back>
    <references title="Normative References">
      <reference anchor="RFC2119" target="https://www.rfc-editor.org/rfc/rfc2119">
        <front>
          <title>Key words for use in RFCs to Indicate Requirement Levels</title>
          <author initials="S." surname="Bradner"/>
          <date year="1997"/>
        </front>
        <seriesInfo name="RFC" value="2119"/>
      </reference>

      <reference anchor="FIPS180-4" target="https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf">
        <front>
          <title>Secure Hash Standard (SHS)</title>
          <author>
            <organization>National Institute of Standards and Technology</organization>
          </author>
          <date year="2015"/>
        </front>
        <seriesInfo name="FIPS PUB" value="180-4"/>
      </reference>

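      <reference anchor="RFC9309" target="https://www.rfc-editor.org/rfc/rfc9309">
        <front>
          <title>Robots Exclusion Protocol</title>
          <author initials="M." surname="Koster"/>
          <author initials="G." surname="Illyes"/>
          <author initials="H." surname="Zeller"/>
          <author initials="L." surname="Sassman"/>
          <date year="2022"/>
        </front>
        <seriesInfo name="RFC" value="9309"/>
      </reference>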

      <reference anchor="RFC7231" target="https://www.rfc-editor.org/rfc/rfc7231">
        <front>
          <title>Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content</title>
          <author initials="R." surname="Fielding"/>
          <author initials="J." surname="Reschke"/>
          <date year="2014"/>
        </front>
        <seriesInfo name="RFC" value="7231"/>
      </reference>

      <reference anchor="RFC3174" target="https://www.rfc-editor.org/rfc/rfc3174">
        <front>
          <title>US Secure Hash Algorithm 1 (SHA1)</title>
          <author initials="D." surname="Eastlake"/>
          <author initials="P." surname="Jones"/>
          <date year="2001"/>
        </front>
        <seriesInfo name="RFC" value="3174"/>
      </reference>

      <reference anchor="RFC3986" target="https://www.rfc-editor.org/rfc/rfc3986">
        <front>
          <title>Uniform Resource Identifier (URI): Generic Syntax</title>
          <author initials="T." surname="Berners-Lee"/>
          <author initials="R." surname="Fielding"/>
          <author initials="L." surname="Masinter"/>
          <date year="2005"/>
        </front>
        <seriesInfo name="RFC" value="3986"/>
      </reference>

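      <reference anchor="RFC7525" target="https://www.rfc-editor.org/rfc/rfc7525">
        <front>
          <title>Recommendations for Secure Use of Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS)</title>
          <author initials="Y." surname="Sheffer"/>
          <author initials="R." surname="Holz"/>
          <author initials="P." surname="Saint-Andre"/>
          <date year="2015"/>
        </front>
        <seriesInfo name="RFC" value="7525"/>
      </reference>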

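      <reference anchor="RFC8309" target="https://www.rfc-editor.org/rfc/rfc8309">
        <front>
          <title>Service Models Explained</title>
          <author initials="Q." surname="Wu"/>
          <author initials="W." surname="Liu"/>
          <author initials="A." surname="Farrel"/>
          <date year="2018"/>
        </front>
        <seriesInfo name="RFC" value="8309"/>
      </reference>

      <reference anchor="RFC8792" target="https://www.rfc-editor.org/rfc/rfc8792">
        <front>
          <title>Handling Long Lines in Content of Internet-Drafts and RFCs</title>
          <author initials="K." surname="Watsen"/>
          <author initials="E." surname="Auerswald"/>
          <author initials="A." surname="Farrel"/>
          <author initials="Q." surname="Wu"/>
          <date year="2020"/>
        </front>
        <seriesInfo name="RFC" value="8792"/>
      </reference>

      <reference anchor="RFC8899" target="https://www.rfc-editor.org/rfc/rfc8899">
        <front>
          <title>Packetization Layer Path MTU Discovery for Datagram Transports</title>
          <author initials="G." surname="Fairhurst"/>
          <author initials="T." surname="Jones"/>
          <author initials="M." surname="Tüxen"/>
          <author initials="I." surname="Rüngeler"/>
          <author initials="T." surname="Völker"/>
          <date year="2020"/>
        </front>
        <seriesInfo name="RFC" value="8899"/>
      </reference>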
    </references>

  </back>

</rfc>
