Network Working Group                                        K. Williams
Internet-Draft                                    Independent Researcher
Intended status: Standards Track                             5 July 2025
Expires: 6 January 2026


         Hierarchical Topology for Language Model Coordination
             draft-williams-netmod-lm-hierarchy-topology-00

Abstract

   This document defines a YANG data model and reference architecture
   for a hierarchical topology of language models (LMs), where tiny,
   small, and large LMs cooperate to perform distributed inference,
   summarization, and decision-making. The model supports secure
   inter-node messaging, request escalation, token-based authorization,
   and decentralized validation using pluggable trust models.

   This architecture is designed for deployments where computational
   capabilities vary across nodes, such as edge-to-cloud environments
   or multi-tier AI systems. The goal is to provide a standards-based
   mechanism for orchestrating scalable, secure LM interactions across
   heterogeneous systems.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 6 January 2026.

Copyright Notice

   Copyright (c) 2025 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your
   rights and restrictions with respect to this document. Code
   Components extracted from this document must include Revised BSD
   License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Revised BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. Model Overview
   4. Use Case Flow
   5. Data Model Summary and Usage
   6. Implementation Status
   7. IANA Considerations
   8. Security Considerations
   9. References
   Appendix A. YANG Module
   Author's Address

1. Introduction

   Recent advancements in machine learning have enabled powerful
   language models (LMs) to perform complex inference, summarization,
   and contextual reasoning. However, most production deployments
   assume centralized access to a single large LM, which is unsuitable
   for many constrained or distributed environments.

   This document proposes a hierarchical model for distributed language
   model (LM) deployments. In this architecture, lightweight "tiny" LMs
   operate on constrained devices or at the edge, mid-tier "small" LMs
   act as aggregators or context enhancers, and a central "large" LM
   provides global reasoning and escalation handling.

   Communication between nodes is structured using a YANG data model
   that supports:

   *  Node roles and LM types
   *  Secure, token-based authorization
   *  Inter-node RPCs for inference, escalation, and validation
   *  Notifications for heartbeat and liveness reporting

   This topology enables a more scalable, privacy-preserving, and
   resilient LM deployment strategy by allowing computation and trust
   decisions to occur closer to data sources. It also provides a
   foundation for future interoperability between language model
   runtimes and network control systems.

   The model is inspired by hierarchical network topologies such as
   those defined in [RFC8345] and extends similar principles to
   LM-based processing pipelines.

2. Terminology

   This document uses the following terminology:

   Tiny LM:  A lightweight, constrained language model running on edge
      devices, microcontrollers, or embedded environments. Capable of
      basic classification, keyword spotting, or template-based NLP
      tasks.

   Small LM:  A mid-tier model with enhanced summarization or
      contextual capabilities. Often deployed in local gateways,
      routers, or fog compute nodes. Serves as an aggregator and
      intermediary between Tiny and Large LMs.

   Large LM:  A centralized, full-scale language model capable of
      complex inference, reasoning, multi-hop retrieval, and escalation
      handling. Typically deployed in cloud environments or centralized
      data centers.

   Escalation:  The process by which a lower-tier LM defers processing
      to a higher-tier LM when its local capabilities are insufficient.

   Auth Token:  A signed token (e.g., JWT or CWT) used to authenticate
      and authorize requests between nodes. Contains claims such as
      `iss`, `sub`, `scope`, `iat`, and optionally a `nonce`.

   Validate-Token:  A YANG-defined `action` that allows a node to
      verify the authenticity and authorization scope of a token
      received from a peer.

   Pluggable Token Validation:  A YANG `feature` that indicates support
      for extensible trust mechanisms (e.g., JWT verification,
      CBOR/COSE decoding, external introspection endpoints).

   Heartbeat:  A periodic `notification` sent by a Tiny LM to indicate
      liveness and continued operation.

   Hierarchy Topology:  A tree-like structure in which Tiny LMs connect
      to Small LMs, and Small LMs connect to a Large LM, forming a
      vertical path for data and escalation flow.

3. Model Overview

   The data model defined in this document represents a multi-tier
   topology of language model (LM) nodes, organized hierarchically into
   three layers: Tiny, Small, and Large. Each node type serves a
   specific role in the processing pipeline and communicates using a
   set of well-defined YANG-based interfaces.

   The core components of the model include:

   *  A topology container that describes nodes and their relationships
   *  RPCs for handling inference requests (`lm-request`) and
      escalation (`model-escalation`)
   *  An `action` for validating authentication tokens
      (`validate-token`)
   *  A `notification` stream for liveness and heartbeat
      (`lm-heartbeat`)
   *  Features such as `pluggable-token-validation` to support
      extensible security implementations

   Tiny LMs typically initiate requests but may escalate to a Small LM
   if the request exceeds local capacity. Small LMs may respond
   directly, or further escalate to a Large LM. All communication is
   authenticated using signed tokens, and authorization is enforced
   based on node roles, token scopes, and topology position.

   The model is inspired by the YANG network topology architecture
   defined in [RFC8345], but adapted to reflect the unique needs of
   language model interaction across a distributed system.
   It is designed to support:

   *  Flexible security policy enforcement
   *  Modular trust and validation strategies
   *  Constrained environments with limited resources
   *  Scalable coordination of inference and summarization workloads

4. Use Case Flow

   This section illustrates key operational behaviors of a hierarchical
   language model (LM) system using the data model defined in this
   document. Three common interaction patterns are described: inference
   escalation, heartbeat reporting, and summary aggregation.

4.1. Inference Request Escalation: Tiny LM > Small LM > Large LM

   Actors:

   *  tiny-lm-089: A constrained edge LM deployed in a local sensor
   *  small-lm-042: An intermediate aggregator LM with limited
      inference ability
   *  large-lm-001: A central high-capacity LM responsible for complex
      reasoning

   Scenario:

   A user inputs a query via a constrained device running tiny-lm-089.
   The device is unable to resolve the meaning of the input and
   escalates the request through its hierarchy.

   Step-by-Step Flow:

   1.  Tiny LM Initiates Request
       -  Calls lm-request RPC to small-lm-042
       -  Includes: auth-token, source-node, target-node, request-type,
          payload

   2.  Small LM Attempts to Resolve
       -  Verifies auth-token via validate-token
       -  If unable to respond, it prepares an escalation

   3.  Small LM Escalates to Large LM
       -  Calls model-escalation RPC to large-lm-001
       -  Includes original-payload, reason, and its own auth-token

   4.  Large LM Responds
       -  Performs inference and returns enriched result

   5.  Small LM Relays Result
       -  Responds to the original lm-request

   6.  Tiny LM Displays Output
       -  Presents the result to the user

   Security:

   *  Token validation at each hop
   *  Token scopes enforced (e.g., only Small LMs may invoke
      model-escalation)

4.2.
Heartbeat Broadcast: Tiny LM > Topology

   Actors:

   *  tiny-lm-089: An edge device LM
   *  small-lm-042: Its supervising node

   Scenario:

   To maintain system health, each tiny LM emits a periodic heartbeat
   signal to its parent.

   Step-by-Step Flow:

   1.  Tiny LM Sends Heartbeat
       -  Emits lm-heartbeat notification with timestamp and status

   2.  Small LM Receives Notification
       -  Subscribed to lm-heartbeat stream
       -  Updates health status table or triggers alert on timeout

   Security:

   Heartbeats are not signed by default, but implementations MAY
   correlate them with recent authenticated activity.

4.3. Summary Aggregation: Small LM > Large LM

   Actors:

   *  tiny-lm-089, tiny-lm-090, tiny-lm-091: Sensor LMs
   *  small-lm-042: Aggregator
   *  large-lm-001: Reasoning LM

   Scenario:

   Multiple tiny LMs submit observations. The small LM combines and
   escalates them.

   Step-by-Step Flow:

   1.  Tiny LMs Submit Observations
       -  Each sends lm-request to small-lm-042 with its own auth-token

   2.  Small LM Aggregates Input
       -  Locally summarizes data from tiny LMs

   3.  Optional Escalation
       -  Sends model-escalation RPC to large-lm-001 with the summary

   4.  Large LM Enhances Result
       -  Returns executive-level context or response

   5.  Small LM Caches and Responds
       -  Updates cache and optionally forwards summary or alert

   Security:

   *  All requests must carry valid, scoped auth-tokens
   *  Escalation privileges restricted to authorized node types

5. Data Model Summary and Usage

   The data model defined by this document describes a hierarchical
   topology of language model (LM) nodes and the interfaces through
   which they communicate, authorize, and escalate inference
   operations. It is expressed using the YANG 1.1 data modeling
   language [RFC7950].
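
   As a non-normative illustration of how these interfaces fit
   together, the Small LM behavior in the escalation flow of Section
   4.1 could be sketched in Python as follows. The class and function
   names, the three callables supplied by the caller (token
   validation, local resolution, and escalation transport), and the
   `status` strings are assumptions made for this sketch; the data
   model itself only defines the RPC leaf names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LmRequest:
    """Mirrors the input leaves of the lm-request RPC."""
    auth_token: str
    source_node: str
    target_node: str
    request_type: str  # "inference" or "summarization"
    payload: str


@dataclass
class Escalation:
    """Mirrors the input leaves of the model-escalation RPC."""
    auth_token: str
    original_payload: str
    reason: str


def handle_lm_request(
    req: LmRequest,
    own_token: str,
    validate_token: Callable[[str], bool],
    resolve_locally: Callable[[str], Optional[str]],
    escalate: Callable[[Escalation], str],
) -> dict:
    """Hypothetical Small LM handler; returns lm-request output leaves."""
    # Verify the caller's token before doing any work.
    if not validate_token(req.auth_token):
        return {"result": "", "status": "rejected"}  # assumed status value

    # Try to answer with local capabilities first.
    local = resolve_locally(req.payload)
    if local is not None:
        return {"result": local, "status": "ok"}  # assumed status value

    # Escalate with the original payload, a reason, and this node's own
    # token (not the token presented by the Tiny LM).
    resolution = escalate(
        Escalation(
            auth_token=own_token,
            original_payload=req.payload,
            reason="local capability insufficient",
        )
    )
    # Relay the upstream resolution to the original caller.
    return {"result": resolution, "status": "escalated"}  # assumed value


# Example wiring with stand-in callables in place of real transports.
reply = handle_lm_request(
    LmRequest("tok-tiny", "tiny-lm-089", "small-lm-042",
              "inference", "what is this?"),
    own_token="tok-small",
    validate_token=lambda token: token == "tok-tiny",
    resolve_locally=lambda payload: None,
    escalate=lambda esc: "enriched: " + esc.original_payload,
)
print(reply)
```

   Passing the transport and validation steps in as callables keeps
   the sketch independent of any particular protocol binding (NETCONF,
   RESTCONF, or a constrained transport) and of the token format in
   use.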

   The model includes the following key elements:

   *  An `lm-node` container with identity-based classification
      (`tiny-lm`, `small-lm`, `large-lm`)
   *  RPCs:
      -  `lm-request`: Initiates an inference or summarization task
      -  `model-escalation`: Forwards requests upward in the LM
         hierarchy
   *  Actions:
      -  `validate-token`: Verifies token authenticity and scope
   *  Notifications:
      -  `lm-heartbeat`: Indicates node liveness and status
   *  A reusable grouping for the `auth-token` leaf, and a `trust`
      container holding token validation settings
   *  A `feature` flag (`pluggable-token-validation`) to support
      modular trust infrastructure

   All inter-node requests include a signed `auth-token`, which may be
   validated locally via `validate-token`, or externally if the
   `pluggable-token-validation` feature is supported.

   The full YANG module is provided in Appendix A.

6. Implementation Status

   NOTE TO RFC EDITOR: This section is to be removed before
   publication.

   This section documents the current implementation efforts related to
   the YANG model and architecture described in this draft. It is
   included to inform reviewers and working group participants of the
   maturity and deployment experience of this specification.

   Title: UniLoRa Mesh LM Hierarchy Prototype
   Authors: Keenan Williams
   Maturity Level: Early Prototype
   Development Status: Active

   Description: A functional prototype of the hierarchical LM topology
   described in this document has been implemented as part of the
   UniLoRa Mesh project. The prototype demonstrates communication
   between Tiny, Small, and Large LM nodes using the defined YANG data
   model over a LoRa-based mesh transport layer.

   Key features supported:

   *  `lm-request`: Implemented on Tiny and Small LMs to initiate and
      forward inference requests
   *  `model-escalation`: Fully implemented for upward delegation to a
      central reasoning engine
   *  `validate-token`: Implemented on Small and Large LMs using
      JWT-based verification with public key validation
   *  `lm-heartbeat`: Actively used to track the liveness of edge nodes
   *  `pluggable-token-validation`: Enabled; token verification can be
      swapped between a local crypto module and a cloud introspection
      service
   *  YANG model validated with `pyang` and integrated with a prototype
      RESTCONF endpoint

   Deployment:

   *  Tiny LM: ESP32-based TTGO T-Beam devices running lightweight
      keyword spotting and rule-based summarization
   *  Small LM: Raspberry Pi 5 devices running a lightweight Python LM
      with summarization and caching logic
   *  Large LM: Central node hosted on a cloud container, running
      GPT-style inference with trust policy enforcement

   The system supports real-time inference routing from edge to core,
   token-authenticated message passing, and topology-driven trust
   enforcement as defined in this draft.

   The goal is to refine this prototype into a reference implementation
   that can be used for interoperability testing and NETCONF/RESTCONF
   YANG validation tooling.

   Source Code Repository: [to be published]
   License: Apache 2.0

   Feedback and collaboration are welcomed to further validate this
   model across constrained and distributed environments.

7. IANA Considerations

   This document registers one URI in the "IETF XML Registry" [RFC3688]
   and one YANG module name in the "YANG Module Names" registry
   [RFC6020].

7.1. XML Namespace Registration

   URI: urn:ietf:params:xml:ns:yang:ietf-lm-hierarchy
   Registrant Contact: IETF NETMOD Working Group
   XML: urn:ietf:params:xml:ns:yang:ietf-lm-hierarchy

7.2.
YANG Module Name Registration

   Name: ietf-lm-hierarchy
   Namespace: urn:ietf:params:xml:ns:yang:ietf-lm-hierarchy
   Prefix: lm
   Reference: This document
      (draft-williams-netmod-lm-hierarchy-topology)

8. Security Considerations

   This document defines a hierarchical topology model for distributed
   language models (LMs), where communication occurs between nodes of
   differing capabilities and privileges (Tiny LMs, Small LMs, and a
   central Large LM). To maintain the integrity, trustworthiness, and
   isolation of operations within such a topology, security is
   critical.

8.1. Authentication and Authorization

   All inter-node communication is required to include an `auth-token`,
   as defined in the data model. These tokens may be bearer tokens,
   CBOR Web Tokens (CWTs), or JWTs signed by trusted issuers (e.g.,
   Large LMs or centralized authorities). The model assumes a shared
   trust infrastructure wherein Large LMs issue or delegate trust
   tokens to downstream nodes.

   To support flexible, decentralized validation, this YANG module
   defines a node-level `validate-token` action. This action enables
   nodes to verify the authenticity and scope of received tokens at
   runtime. While token fields alone do not enforce authorization, this
   action provides a behavioral interface for systems to verify and act
   on trust decisions.

   This approach avoids reliance on centralized introspection services,
   which may not be suitable for bandwidth-constrained or
   delay-sensitive environments (e.g., when Tiny LMs operate at the
   edge).

8.2. Pluggable Trust Model

   A YANG `feature`, `pluggable-token-validation`, is defined to
   indicate support for extensible validation backends (e.g.,
   certificate chains, OAuth2 introspection endpoints, COSE/CBOR
   decoders, etc.). This allows implementations to declare advanced
   trust handling capabilities without forcing them into minimal
   deployments.

8.3.
Replay and Scope Protection

   Tokens should include `exp` (expiration) and `iat` (issued at)
   claims to protect against replay. Where possible, `nonce` or
   one-time identifiers should be used to detect message duplication.

   Token `scope` claims (e.g., `inference`, `summarization`) should be
   enforced by recipient nodes to prevent privilege escalation across
   the hierarchy.

8.4. Topological Access Controls

   Nodes should reject incoming requests from unauthorized peers based
   on:

   *  Node type (e.g., a Tiny LM may not issue requests to another Tiny
      LM),
   *  Issuer identity,
   *  Token scope.

   Large LMs SHOULD enforce policies on which nodes may act as
   intermediaries (e.g., only trusted Small LMs may escalate).

   This layered security model ensures each node in the hierarchy
   enforces local trust decisions, minimizing blast radius in the event
   of compromise and allowing granular control over inter-node
   permissions.

8.5. Token Format (Informative)

   This document assumes the use of signed tokens to authorize
   inter-node communication. While the token format is
   implementation-specific, systems are RECOMMENDED to use existing
   standards such as:

   *  JSON Web Tokens (JWT) [RFC7519]
   *  CBOR Web Tokens (CWT) [RFC8392]

   Example minimal JWT claims:

   {
     "iss": "large-lm-001",
     "sub": "tiny-lm-089",
     "scope": ["inference", "summarization"],
     "exp": 1751817600,
     "iat": 1751814000,
     "nonce": "3e8f5b5b-c21e-47a0-92a2-1f6ad919ef55"
   }

   The `exp` and `iat` claims are NumericDate values (seconds since the
   Unix epoch) as required by [RFC7519]; the values shown correspond to
   2025-07-06T16:00:00Z and 2025-07-06T15:00:00Z respectively.

   Tokens MAY be passed in cleartext (if signed) or encrypted (if
   confidentiality is required). Nonce tracking, token expiration, and
   scope enforcement SHOULD be implemented at all receiving nodes.

9. References

9.1. Normative References

   [RFC3688]  Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688,
              DOI 10.17487/RFC3688, January 2004.

   [RFC6020]  Bjorklund, M., Ed., "YANG - A Data Modeling Language for
              the Network Configuration Protocol (NETCONF)", RFC 6020,
              DOI 10.17487/RFC6020, October 2010.

   [RFC7519]  Jones, M., Bradley, J., and N. Sakimura, "JSON Web Token
              (JWT)", RFC 7519, DOI 10.17487/RFC7519, May 2015.

   [RFC7950]  Bjorklund, M., Ed., "The YANG 1.1 Data Modeling
              Language", RFC 7950, DOI 10.17487/RFC7950, August 2016.

   [RFC8345]  Clemm, A., Medved, J., Varga, R., Bahadur, N.,
              Ananthakrishnan, H., and X. Liu, "A YANG Data Model for
              Network Topologies", RFC 8345, DOI 10.17487/RFC8345,
              March 2018.

   [RFC8392]  Jones, M., Wahlstroem, E., Erdtman, S., and H.
              Tschofenig, "CBOR Web Token (CWT)", RFC 8392,
              DOI 10.17487/RFC8392, May 2018.

9.2. Informative References

   This document has no informative references.

Appendix A. YANG Module

   file "ietf-lm-hierarchy@2025-07-06.yang"

   module ietf-lm-hierarchy {
     yang-version 1.1;
     namespace "urn:ietf:params:xml:ns:yang:ietf-lm-hierarchy";
     prefix lm;

     import ietf-inet-types {
       prefix inet;
     }
     import ietf-yang-types {
       prefix yang;
     }

     organization
       "IETF Network Modeling (NETMOD) Working Group";
     contact
       "WG Web:
        WG List:
        Author: Keenan Williams";
     description
       "This module defines a hierarchical topology model for
        distributed language models (LMs), including request
        escalation, authentication, and inter-node coordination.

        Copyright (c) 2025 IETF Trust and the persons identified as
        authors of the code. All rights reserved.

        Redistribution and use in source and binary forms, with or
        without modification, is permitted pursuant to, and subject
        to the license terms contained in, the Revised BSD License
        set forth in Section 4.c of the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info).

        This version of this YANG module is part of RFC XXXX
        (https://www.rfc-editor.org/info/rfcXXXX); see the RFC itself
        for full legal notices.";

     revision 2025-07-06 {
       description
         "Initial version";
       reference
         "RFC XXXX: Hierarchical Topology for Language Model
          Coordination";
     }

     feature pluggable-token-validation {
       description
         "Indicates support for pluggable token validation (e.g.,
          JWTs, OIDC, or COSE)";
     }

     identity lm-node-type {
       description
         "Base identity for LM node types.";
     }

     identity tiny-lm {
       base lm-node-type;
       description
         "A lightweight edge-deployed language model.";
     }

     identity small-lm {
       base lm-node-type;
       description
         "A mid-tier aggregator or summarizer.";
     }

     identity large-lm {
       base lm-node-type;
       description
         "A central reasoning or escalation endpoint.";
     }

     grouping auth-token-grouping {
       description
         "Reusable auth-token structure.";
       leaf auth-token {
         type string;
         description
           "A signed authentication/authorization token.";
       }
     }

     container lm-node {
       description
         "Node-level configuration and operational state.";
       leaf node-id {
         type string;
         mandatory true;
         description
           "Unique identifier of this LM node.";
       }
       leaf node-type {
         type identityref {
           base lm-node-type;
         }
         mandatory true;
         description
           "Classification of this node (tiny, small, large).";
       }
       container trust {
         description
           "Token validation configuration.";
         if-feature pluggable-token-validation;
         leaf trust-anchor {
           type string;
           description
             "Root or public key used for token validation.";
         }
         leaf token-scope-enforced {
           type boolean;
           default true;
           description
             "Whether to enforce scope claims in tokens.";
         }
       }
       action validate-token {
         description
           "Validates a received authentication token.";
         input {
           leaf token {
             type string;
             mandatory true;
           }
         }
         output {
           leaf valid {
             type boolean;
           }
           leaf reason {
             type string;
           }
         }
       }
     }

     rpc lm-request {
       description
         "Submits an inference or summarization request.";
       input {
         uses auth-token-grouping;
         leaf source-node {
           type string;
           mandatory true;
         }
         leaf target-node {
           type string;
           mandatory true;
         }
         leaf request-type {
           type enumeration {
             enum inference;
             enum summarization;
           }
           mandatory true;
         }
         leaf payload {
           type string;
           mandatory true;
         }
       }
       output {
         leaf result {
           type string;
         }
         leaf status {
           type string;
         }
       }
     }

     rpc model-escalation {
       description
         "Forwards a request upward in the hierarchy.";
       input {
         uses auth-token-grouping;
         leaf original-payload {
           type string;
         }
         leaf reason {
           type string;
         }
       }
       output {
         leaf resolution {
           type string;
         }
         leaf downstream-directive {
           type string;
         }
       }
     }

     notification lm-heartbeat {
       description
         "Emitted to indicate liveness of this node.";
       leaf sender-node {
         type string;
       }
       leaf status {
         type enumeration {
           enum alive;
           enum degraded;
           enum unreachable;
         }
       }
       leaf timestamp {
         type yang:date-and-time;
       }
     }
   }

Author's Address

   Keenan Williams
   Independent Researcher
   Email: telesis001@icloud.com