Network Working Group K. Williams
Internet-Draft Independent Researcher
Intended status: Standards Track 5 July 2025
Expires: 6 January 2026
Hierarchical Topology for Language Model Coordination
draft-williams-netmod-lm-hierarchy-topology-00
Abstract
This document defines a YANG data model and reference architecture
for a hierarchical topology of language models (LMs), where tiny,
small, and large LMs cooperate to perform distributed inference,
summarization, and decision-making. The model supports secure
inter-node messaging, request escalation, token-based authorization,
and decentralized validation using pluggable trust models. This
architecture is designed for deployments where computational
capabilities vary across nodes, such as edge-to-cloud environments
or multi-tier AI systems. The goal is to provide a standards-based
mechanism for orchestrating scalable, secure LM interactions across
heterogeneous systems.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on 6 January 2026.
Copyright Notice
Copyright (c) 2025 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
Williams Expires 6 January 2026 [Page 1]
Internet-Draft LM Hierarchy YANG Model July 2025
document must include Revised BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Revised BSD License.
Table of Contents
1. Introduction
2. Terminology
3. Model Overview
4. Use Case Flow
5. Data Model Summary and Usage
6. Implementation Status
7. IANA Considerations
8. Security Considerations
9. References
Appendix A. YANG Module
Author's Address
1. Introduction
Recent advancements in machine learning have enabled powerful
language models (LMs) to perform complex inference, summarization,
and contextual reasoning. However, most production deployments
assume centralized access to a single large LM, which is unsuitable
for many constrained or distributed environments.
This document proposes a hierarchical model for distributed language
model (LM) deployments. In this architecture, lightweight "tiny"
LMs operate on constrained devices or at the edge, mid-tier "small"
LMs act as aggregators or context enhancers, and a central "large"
LM provides global reasoning and escalation handling.
Communication between nodes is structured using a YANG data model
that supports:
* Node roles and LM types
* Secure, token-based authorization
* Inter-node RPCs for inference, escalation, and validation
* Notifications for heartbeat and liveness reporting
This topology enables a more scalable, privacy-preserving, and
resilient LM deployment strategy by allowing computation and trust
decisions to occur closer to data sources. It also provides a
foundation for future interoperability between language model
runtimes and network control systems.
The model is inspired by hierarchical network topologies such as
those defined in [RFC8345] and extends similar principles to LM-
based processing pipelines.
2. Terminology
This document uses the following terminology:
Tiny LM: A lightweight, constrained language model running on edge
devices, microcontrollers, or embedded environments. Capable of
basic classification, keyword spotting, or template-based NLP
tasks.
Small LM: A mid-tier model with enhanced summarization or contextual
capabilities. Often deployed in local gateways, routers, or fog
compute nodes. Serves as an aggregator and intermediary between
Tiny and Large LMs.
Large LM: A centralized, full-scale language model capable of
complex inference, reasoning, multi-hop retrieval, and escalation
handling. Typically deployed in cloud environments or
centralized data centers.
Escalation: The process by which a lower-tier LM defers processing
to a higher-tier LM when its local capabilities are insufficient.
Auth Token: A signed token (e.g., JWT or CWT) used to authenticate
and authorize requests between nodes. Contains claims such as
`iss`, `sub`, `scope`, `iat`, `exp`, and optionally a `nonce`.
Validate-Token: A YANG-defined `action` that allows a node to
verify the authenticity and authorization scope of a token
received from a peer.
Pluggable Token Validation: A YANG `feature` that indicates support
for extensible trust mechanisms (e.g., JWT verification, CBOR/
COSE decoding, external introspection endpoints).
Heartbeat: A periodic `notification` sent by a Tiny LM to indicate
liveness and continued operation.
Hierarchy Topology: A tree-like structure in which Tiny LMs connect
to Small LMs, and Small LMs connect to a Large LM, forming a
vertical path for data and escalation flow.
3. Model Overview
The data model defined in this document represents a multi-tier
topology of language model (LM) nodes, organized hierarchically into
three layers: Tiny, Small, and Large. Each node type serves a
specific role in the processing pipeline and communicates using a
set of well-defined YANG-based interfaces.
The core components of the model include:
* A topology container that describes nodes and their relationships
* RPCs for handling inference requests (`lm-request`) and escalation
(`model-escalation`)
* An `action` for validating authentication tokens (`validate-
token`)
* A `notification` stream for liveness and heartbeat (`lm-
heartbeat`)
* Features such as `pluggable-token-validation` to support
extensible security implementations
Tiny LMs typically initiate requests and escalate to a Small LM when
a request exceeds local capacity. Small LMs may respond directly or
escalate further to a Large LM. All requests are authenticated using
signed tokens, and authorization is enforced based on node roles,
token scopes, and topology position.
The model is inspired by the YANG network topology architecture
defined in [RFC8345], but adapted to reflect the unique needs of
language model interaction across a distributed system. It is
designed to support:
* Flexible security policy enforcement
* Modular trust and validation strategies
* Constrained environments with limited resources
* Scalable coordination of inference and summarization workloads
4. Use Case Flow
This section illustrates key operational behaviors of a hierarchical
language model (LM) system using the data model defined in this
document. Three common interaction patterns are described:
inference escalation, heartbeat reporting, and summary aggregation.
4.1. Inference Request Escalation: Tiny LM > Small LM > Large LM
Actors:
* tiny-lm-089: A constrained edge LM deployed in a local sensor
* small-lm-042: An intermediate aggregator LM with limited inference
ability
* large-lm-001: A central high-capacity LM responsible for complex
reasoning
Scenario:
A user inputs a query via a constrained device running tiny-lm-089.
The device is unable to resolve the meaning of the input and
escalates the request through its hierarchy.
Step-by-Step Flow:
1. Tiny LM Initiates Request
   - Calls lm-request RPC to small-lm-042
   - Includes: auth-token, source-node, target-node, request-type,
     payload
2. Small LM Attempts to Resolve
   - Verifies auth-token via validate-token
   - If unable to respond, it prepares an escalation
3. Small LM Escalates to Large LM
   - Calls model-escalation RPC to large-lm-001
   - Includes original-payload, reason, and its own auth-token
4. Large LM Responds
   - Performs inference and returns enriched result
5. Small LM Relays Result
   - Responds to the original lm-request
6. Tiny LM Displays Output
   - Presents the result to the user
Security:
* Token validation at each hop
* Token scopes enforced (e.g., only small LMs can escalate)
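The Small LM side of this flow (steps 2 through 5) can be sketched in Python as follows. This is a minimal illustration, not part of the data model: the `LmRequest` class, the `handle_lm_request` function, the three callables, and the `status` strings are all hypothetical names chosen for the example. The `escalate` callable is assumed to attach the Small LM's own auth-token when it invokes `model-escalation`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class LmRequest:
    """Mirrors the input leaves of the lm-request RPC."""
    auth_token: str
    source_node: str
    target_node: str
    request_type: str  # "inference" or "summarization"
    payload: str


def handle_lm_request(
    req: LmRequest,
    validate_token: Callable[[str], bool],
    resolve_locally: Callable[[LmRequest], Optional[str]],
    escalate: Callable[[str, str], str],
) -> Dict[str, Optional[str]]:
    """Small LM logic: validate the token, try locally, else escalate."""
    # Step 2: reject the request outright if the token does not validate.
    if not validate_token(req.auth_token):
        return {"result": None, "status": "rejected"}
    local = resolve_locally(req)
    if local is not None:
        return {"result": local, "status": "resolved-locally"}
    # Step 3: forward the original payload upward together with a reason.
    resolution = escalate(req.payload, "insufficient local capability")
    # Step 5: relay the Large LM's answer as the lm-request output.
    return {"result": resolution, "status": "escalated"}


# Stubs standing in for validate-token, the local model, and the
# model-escalation RPC (all assumptions for this example).
req = LmRequest("token-abc", "tiny-lm-089", "small-lm-042",
                "inference", "what is this?")
reply = handle_lm_request(
    req,
    validate_token=lambda tok: tok == "token-abc",
    resolve_locally=lambda r: None,
    escalate=lambda payload, reason: f"large-lm-001 answer to: {payload}",
)
print(reply)
```

Passing the three behaviors in as callables keeps the routing decision separate from the transport and from the model runtime, which may differ per deployment.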
4.2. Heartbeat Broadcast: Tiny LM > Topology
Actors:
* tiny-lm-089: An edge device LM
* small-lm-042: Its supervising node
Scenario:
To maintain system health, each tiny LM emits a periodic heartbeat
signal to its parent.
Step-by-Step Flow:
1. Tiny LM Sends Heartbeat
   - Emits lm-heartbeat notification with timestamp and status
2. Small LM Receives Notification
   - Subscribed to lm-heartbeat stream
   - Updates health status table or triggers alert on timeout
Security:
Heartbeats are not signed by default, but implementations MAY
correlate them with recent authenticated activity from the same node.
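The health status table kept by the supervising node could look like the sketch below. The `HeartbeatTable` class and the 60-second timeout are assumptions made for illustration; this document does not define a timeout value or a table format.

```python
from typing import Dict, List, Tuple


class HeartbeatTable:
    """Health table kept by a supervising Small LM (hypothetical helper)."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        # node id -> (time the heartbeat was received, reported status)
        self._entries: Dict[str, Tuple[float, str]] = {}

    def record(self, sender_node: str, status: str, received_at: float) -> None:
        """Store the latest lm-heartbeat seen from a node."""
        self._entries[sender_node] = (received_at, status)

    def overdue(self, now: float) -> List[str]:
        """Return nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node for node, (seen, _status) in self._entries.items()
            if now - seen > self.timeout_s
        )


table = HeartbeatTable(timeout_s=60.0)
table.record("tiny-lm-089", "alive", received_at=0.0)
table.record("tiny-lm-090", "degraded", received_at=50.0)
print(table.overdue(now=100.0))  # ['tiny-lm-089']
```

Timestamps are passed in explicitly so that the timeout logic can be exercised without waiting on a real clock.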
4.3. Summary Aggregation: Small LM > Large LM
Actors:
* tiny-lm-089, tiny-lm-090, tiny-lm-091: Sensor LMs
* small-lm-042: Aggregator
* large-lm-001: Reasoning LM
Scenario:
Multiple tiny LMs submit observations. The small LM combines and
escalates them.
Step-by-Step Flow:
1. Tiny LMs Submit Observations
   - Each sends lm-request to small-lm-042 with its own auth-token
2. Small LM Aggregates Input
   - Locally summarizes data from tiny LMs
3. Optional Escalation
   - Sends model-escalation RPC to large-lm-001 with the summary
4. Large LM Enhances Result
   - Returns executive-level context or response
5. Small LM Caches + Responds
   - Updates cache and optionally forwards summary or alert
Security:
* All requests must carry valid, scoped auth-tokens
* Escalation privileges restricted to authorized node types
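A rough sketch of the aggregate, escalate, and cache steps is shown below. The `summarize` function is a deliberately trivial stand-in (it groups identical observations) for whatever summarization the Small LM actually performs, and the `Aggregator` class and its `escalate` callable are hypothetical names, not part of the YANG module.

```python
from typing import Callable, Dict, List, Optional


def summarize(observations: Dict[str, str]) -> str:
    """Stand-in summarizer: group identical observations by reporting node."""
    grouped: Dict[str, List[str]] = {}
    for node in sorted(observations):
        grouped.setdefault(observations[node], []).append(node)
    return "; ".join(
        f"{text} (from {', '.join(nodes)})" for text, nodes in grouped.items()
    )


class Aggregator:
    """Small LM aggregation with optional escalation and a result cache."""

    def __init__(self, escalate: Optional[Callable[[str, str], str]] = None) -> None:
        self.escalate = escalate  # stands in for the model-escalation RPC
        self.cache: Dict[str, str] = {}

    def flush(self, observations: Dict[str, str]) -> str:
        summary = summarize(observations)
        if summary in self.cache:
            return self.cache[summary]  # reuse an earlier response
        result = summary
        if self.escalate is not None:
            result = self.escalate(summary, "aggregated summary")
        self.cache[summary] = result
        return result


observations = {
    "tiny-lm-089": "temperature high",
    "tiny-lm-090": "temperature high",
    "tiny-lm-091": "door open",
}
print(summarize(observations))
```

Caching on the summary text means that repeated identical observations do not trigger repeated escalations, which matters on bandwidth-constrained links.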
5. Data Model Summary and Usage
The data model defined by this document describes a hierarchical
topology of language model (LM) nodes and the interfaces through
which they communicate, authorize, and escalate inference
operations. It is expressed using the YANG 1.1 data modeling
language [RFC7950].
The model includes the following key elements:
* An `lm-node` container with identity-based classification
(`tiny-lm`, `small-lm`, `large-lm`)
* RPCs:
- `lm-request`: Initiates an inference or summarization task
- `model-escalation`: Forwards requests upward in the LM
hierarchy
* Actions:
- `validate-token`: Verifies token authenticity and scope
* Notifications:
- `lm-heartbeat`: Indicates node liveness and status
* A grouping (`auth-token-grouping`) for the `auth-token` leaf, and
a `trust` container for token validation settings
* A `feature` flag (`pluggable-token-validation`) to support
modular trust infrastructure
All inter-node requests include a signed `auth-token`, which may be
validated locally via `validate-token` or by an external backend if
the `pluggable-token-validation` feature is supported.
The full YANG module is provided in Appendix A.
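As an illustration of how the `lm-request` input might appear on the wire, the sketch below builds a JSON request body, assuming the RESTCONF (RFC 8040) JSON encoding of RPC input, in which the `input` object is qualified by the module name. The helper name `lm_request_body` is hypothetical; the leaf names come from the YANG module in Appendix A.

```python
import json
from typing import Dict

# Values of the request-type enumeration in the YANG module.
REQUEST_TYPES = ("inference", "summarization")


def lm_request_body(auth_token: str, source_node: str, target_node: str,
                    request_type: str, payload: str) -> Dict[str, Dict[str, str]]:
    """Build the JSON body for the lm-request RPC (all input leaves)."""
    if request_type not in REQUEST_TYPES:
        raise ValueError(f"request-type must be one of {REQUEST_TYPES}")
    return {
        "ietf-lm-hierarchy:input": {
            "auth-token": auth_token,
            "source-node": source_node,
            "target-node": target_node,
            "request-type": request_type,
            "payload": payload,
        }
    }


body = lm_request_body("example-token", "tiny-lm-089", "small-lm-042",
                       "inference", "classify this reading")
print(json.dumps(body, indent=2))
```

Under the same assumption, a client would POST this body to the server's `operations/ietf-lm-hierarchy:lm-request` resource beneath its RESTCONF root.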
6. Implementation Status
NOTE TO RFC EDITOR: This section is to be removed before
publication.
This section documents the current implementation efforts related to
the YANG model and architecture described in this draft. It is
included to inform reviewers and working group participants of the
maturity and deployment experience of this specification.
Title: UniLoRa Mesh LM Hierarchy Prototype
Authors: Keenan Williams
Maturity Level: Early Prototype
Development Status: Active
Description:
A functional prototype of the hierarchical LM topology described in
this document has been implemented as part of the UniLoRa Mesh
project. The prototype demonstrates communication between Tiny,
Small, and Large LM nodes using the defined YANG data model over a
LoRa-based mesh transport layer.
Key features supported:
* `lm-request`: Implemented on Tiny and Small LMs to initiate and
forward inference requests
* `model-escalation`: Fully implemented for upward delegation to a
central reasoning engine
* `validate-token`: Implemented on Small and Large LMs using JWT-
based verification with public key validation
* `lm-heartbeat`: Actively used to track the liveness of edge
nodes
* `pluggable-token-validation`: Enabled; token verification can be
swapped between a local crypto module and a cloud introspection
service
* YANG model validated with `pyang` and integrated with a prototype
RESTCONF endpoint
Deployment:
* Tiny LM: ESP32-based TTGO T-Beam devices running lightweight
keyword spotting and rule-based summarization
* Small LM: Raspberry Pi 5 devices running a lightweight Python LM
with summarization and caching logic
* Large LM: Central node hosted on a cloud container, running GPT-
style inference with trust policy enforcement
The system supports real-time inference routing from edge to core,
token-authenticated message passing, and topology-driven trust
enforcement as defined in this draft. The goal is to refine this
prototype into a reference implementation that can be used for
interoperability testing and NETCONF/RESTCONF YANG validation
tooling.
Source Code Repository: [to be published]
License: Apache 2.0
Feedback and collaboration are welcomed to further validate this
model across constrained and distributed environments.
7. IANA Considerations
This document registers one URI in the "IETF XML Registry" [RFC3688]
and one YANG module name in the "YANG Module Names" registry
[RFC6020].
7.1. XML Namespace Registration
URI: urn:ietf:params:xml:ns:yang:ietf-lm-hierarchy
Registrant Contact: IETF NETMOD Working Group
XML:
urn:ietf:params:xml:ns:yang:ietf-lm-hierarchy
7.2. YANG Module Name Registration
Name: ietf-lm-hierarchy
Namespace: urn:ietf:params:xml:ns:yang:ietf-lm-hierarchy
Prefix: lm
Reference: This document (draft-williams-netmod-lm-hierarchy-
topology)
8. Security Considerations
This document defines a hierarchical topology model for distributed
language models (LMs), where communication occurs between nodes of
differing capabilities and privileges (Tiny LMs, Small LMs, and a
central Large LM). To maintain the integrity, trustworthiness, and
isolation of operations within such a topology, security is
critical.
8.1. Authentication and Authorization
All inter-node RPCs (`lm-request` and `model-escalation`) are
required to include an `auth-token`, as defined in the data model.
These tokens may be bearer tokens,
CBOR Web Tokens (CWTs), or JWTs signed by trusted issuers (e.g.,
Large LMs or centralized authorities). The model assumes a shared
trust infrastructure wherein Large LMs issue or delegate trust
tokens to downstream nodes.
To support flexible, decentralized validation, this YANG module
defines a node-level `validate-token` action. This action enables
nodes to verify the authenticity and scope of received tokens at
runtime. While token fields alone do not enforce authorization,
this action provides a behavioral interface for systems to verify
and act on trust decisions.
This approach avoids reliance on centralized introspection services,
which may not be suitable for bandwidth-constrained or delay-
sensitive environments (e.g., when Tiny LMs operate at the edge).
8.2. Pluggable Trust Model
A YANG `feature`, `pluggable-token-validation`, is defined to
indicate support for extensible validation backends (e.g.,
certificate chains, OAuth2 introspection endpoints, or COSE/CBOR
decoders). This allows implementations to declare advanced trust
handling capabilities without imposing them on minimal deployments.
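One way an implementation might structure such backends is sketched below. The `TokenValidator` class, the backend names, and the two stub backends are hypothetical; only the `valid` and `reason` outputs correspond to the `validate-token` action in the YANG module.

```python
from typing import Callable, Dict, Tuple

# A backend takes the raw token and returns (valid, reason), matching the
# output leaves of the validate-token action.
Backend = Callable[[str], Tuple[bool, str]]


class TokenValidator:
    """Dispatches validate-token calls to a selectable backend."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}
        self._active = ""

    def register(self, name: str, backend: Backend) -> None:
        self._backends[name] = backend
        if not self._active:
            self._active = name  # first registered backend is the default

    def use(self, name: str) -> None:
        if name not in self._backends:
            raise KeyError(f"unknown validation backend: {name}")
        self._active = name

    def validate_token(self, token: str) -> Dict[str, object]:
        if not self._active:
            return {"valid": False, "reason": "no validation backend configured"}
        valid, reason = self._backends[self._active](token)
        return {"valid": valid, "reason": reason}


# Stub backends; real ones would verify signatures or call an endpoint.
def local_crypto(token: str) -> Tuple[bool, str]:
    return token.startswith("signed:"), "checked against local trust anchor"


def introspection(token: str) -> Tuple[bool, str]:
    return token == "signed:known", "checked by introspection endpoint"


validator = TokenValidator()
validator.register("local-crypto", local_crypto)
validator.register("introspection", introspection)
print(validator.validate_token("signed:abc"))
```

Keeping the action's interface fixed while swapping the backend is what lets a constrained node use a local check and a better-connected node use remote introspection.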
8.3. Replay and Scope Protection
Tokens should include `exp` (expiration) and `iat` (issued at)
claims so that receivers can bound the window in which a captured
token can be replayed. Where possible, `nonce` or one-time
identifiers should be used to detect message duplication.
Token `scope` claims (e.g., `inference`, `summarization`) should be
enforced by recipient nodes to prevent privilege escalation across
the hierarchy.
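The checks described in this section can be sketched as a single function applied to already-verified token claims. It assumes that `exp` and `iat` are numeric seconds since the epoch (the NumericDate form used by RFC 7519) and that `scope` is a list of strings. The function name `check_claims`, the reason strings, and the in-memory nonce set are illustrative assumptions.

```python
from typing import Dict, Set, Tuple


def check_claims(claims: Dict[str, object], required_scope: str,
                 seen_nonces: Set[str], now: float) -> Tuple[bool, str]:
    """Apply expiry, issued-at, nonce, and scope checks to token claims."""
    exp = claims.get("exp")
    iat = claims.get("iat")
    if not isinstance(exp, (int, float)) or not isinstance(iat, (int, float)):
        return False, "missing exp or iat"
    if now >= exp:
        return False, "token expired"
    if iat > now:
        return False, "token issued in the future"
    nonce = claims.get("nonce")
    if isinstance(nonce, str) and nonce in seen_nonces:
        return False, "replayed nonce"
    scopes = claims.get("scope")
    if not isinstance(scopes, list) or required_scope not in scopes:
        return False, "scope not granted"
    # Remember the nonce only once the token has been accepted.
    if isinstance(nonce, str):
        seen_nonces.add(nonce)
    return True, "ok"


seen: Set[str] = set()
claims = {"iss": "large-lm-001", "sub": "tiny-lm-089",
          "scope": ["inference"], "iat": 100, "exp": 200, "nonce": "n-1"}
print(check_claims(claims, "inference", seen, now=150))  # (True, 'ok')
print(check_claims(claims, "inference", seen, now=150))  # (False, 'replayed nonce')
```

A deployed node would also need to expire old entries from the nonce set, for example once the corresponding `exp` has passed.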
8.4. Topological Access Controls
Nodes should reject incoming requests from unauthorized peers based
on:
* Node type (e.g., a Tiny LM may not issue requests to another Tiny
LM),
* Issuer identity,
* Token scope.
Large LMs SHOULD enforce policies on which nodes may act as
intermediaries (e.g., only trusted Small LMs may escalate).
This layered security model ensures each node in the hierarchy
enforces local trust decisions, minimizing blast radius in the event
of compromise and allowing granular control over inter-node
permissions.
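A receiving node's policy check along these lines might look like the following sketch. The `ALLOWED_CALLS` table encodes one possible policy consistent with the flows in Section 4 (Tiny LMs send `lm-request` to Small LMs, and Small LMs send `model-escalation` to Large LMs); the table, the `authorize` function, and the reason strings are assumptions, not normative rules.

```python
from typing import Collection, FrozenSet, Tuple

# (sender node type, receiver node type, operation) triples that are allowed.
ALLOWED_CALLS: FrozenSet[Tuple[str, str, str]] = frozenset({
    ("tiny-lm", "small-lm", "lm-request"),
    ("small-lm", "large-lm", "model-escalation"),
})


def authorize(sender_type: str, receiver_type: str, operation: str,
              issuer: str, trusted_issuers: Collection[str],
              token_scopes: Collection[str],
              required_scope: str) -> Tuple[bool, str]:
    """Check node type, issuer identity, and token scope, in that order."""
    if (sender_type, receiver_type, operation) not in ALLOWED_CALLS:
        return False, "node type not permitted"
    if issuer not in trusted_issuers:
        return False, "untrusted issuer"
    if required_scope not in token_scopes:
        return False, "scope not granted"
    return True, "ok"


decision = authorize("tiny-lm", "tiny-lm", "lm-request",
                     issuer="large-lm-001",
                     trusted_issuers={"large-lm-001"},
                     token_scopes=["inference"],
                     required_scope="inference")
print(decision)  # (False, 'node type not permitted')
```

Checking the topology rule first lets a node drop disallowed peers before spending effort on issuer or scope evaluation.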
8.5. Token Format (Informative)
This document assumes the use of signed tokens to authorize inter-
node communication. While the token format is implementation-
specific, systems are RECOMMENDED to use existing standards such as:
* JSON Web Tokens (JWT) [RFC7519]
* CBOR Web Tokens (CWT) [RFC8392]
Example minimal JWT claims:
{
  "iss": "large-lm-001",
  "sub": "tiny-lm-089",
  "scope": ["inference", "summarization"],
  "exp": 1751817600,
  "iat": 1751814000,
  "nonce": "3e8f5b5b-c21e-47a0-92a2-1f6ad919ef55"
}
Tokens MAY be passed in cleartext (if signed) or encrypted (if
confidentiality is required). Nonce tracking, token expiration, and
scope enforcement SHOULD be implemented at all receiving nodes.
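To make the signed-token idea concrete, the sketch below produces and verifies a compact JWT using only the Python standard library. It uses HS256 (HMAC with SHA-256 and a shared key) purely to keep the example self-contained; this is a simplification chosen for illustration, and deployments that rely on public-key verification would use an asymmetric algorithm and a maintained JWT library instead. The function names and the example key are hypothetical.

```python
import base64
import hashlib
import hmac
import json
from typing import Dict, Optional


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in compact JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring any stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: Dict[str, object], key: bytes) -> str:
    """Return header.payload.signature for the given claims (HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8")),
    ]
    signing_input = ".".join(parts)
    sig = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)


def verify_jwt(token: str, key: bytes) -> Optional[Dict[str, object]]:
    """Return the claims if the signature checks out, else None."""
    try:
        head_b64, body_b64, sig_b64 = token.split(".")
        header = json.loads(b64url_decode(head_b64))
        if not isinstance(header, dict) or header.get("alg") != "HS256":
            return None
        expected = hmac.new(key, f"{head_b64}.{body_b64}".encode("ascii"),
                            hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
            return None
        claims = json.loads(b64url_decode(body_b64))
        return claims if isinstance(claims, dict) else None
    except ValueError:
        # Covers malformed structure, bad base64, and bad JSON.
        return None


key = b"example-shared-key"  # illustrative only
token = sign_jwt({"iss": "large-lm-001", "sub": "tiny-lm-089",
                  "scope": ["inference"]}, key)
print(verify_jwt(token, key))
```

Signature verification only establishes authenticity; expiration, nonce, and scope checks still have to be applied to the returned claims.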
9. References
9.1. Normative References
[RFC3688] Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688,
DOI 10.17487/RFC3688, January 2004,
<https://www.rfc-editor.org/info/rfc3688>.
[RFC6020] Bjorklund, M., Ed., "YANG - A Data Modeling Language for
the Network Configuration Protocol (NETCONF)", RFC 6020,
DOI 10.17487/RFC6020, October 2010,
<https://www.rfc-editor.org/info/rfc6020>.
[RFC7519] Jones, M., Bradley, J., and N. Sakimura, "JSON Web Token
(JWT)", RFC 7519, DOI 10.17487/RFC7519, May 2015,
<https://www.rfc-editor.org/info/rfc7519>.
[RFC7950] Bjorklund, M., Ed., "The YANG 1.1 Data Modeling
Language", RFC 7950, DOI 10.17487/RFC7950, August 2016,
<https://www.rfc-editor.org/info/rfc7950>.
[RFC8345] Clemm, A., Medved, J., Varga, R., Bahadur, N.,
Ananthakrishnan, H., and X. Liu, "A YANG Data Model for
Network Topologies", RFC 8345, DOI 10.17487/RFC8345,
March 2018, <https://www.rfc-editor.org/info/rfc8345>.
[RFC8392] Jones, M., Wahlstroem, E., Erdtman, S., and H. Tschofenig,
"CBOR Web Token (CWT)", RFC 8392, DOI 10.17487/RFC8392,
May 2018, <https://www.rfc-editor.org/info/rfc8392>.
9.2. Informative References
This document has no informative references.
Appendix A. YANG Module
file "ietf-lm-hierarchy@2025-07-06.yang"
module ietf-lm-hierarchy {
  yang-version 1.1;
  namespace "urn:ietf:params:xml:ns:yang:ietf-lm-hierarchy";
  prefix lm;

  import ietf-inet-types {
    prefix inet;
  }
  import ietf-yang-types {
    prefix yang;
  }

  organization
    "IETF Network Modeling (NETMOD) Working Group";
  contact
    "WG Web:   <https://datatracker.ietf.org/wg/netmod/>
     WG List:  <mailto:netmod@ietf.org>

     Author:   Keenan Williams
               <mailto:telesis001@icloud.com>";
  description
    "This module defines a hierarchical topology model for
     distributed language models (LMs), including request
     escalation, authentication, and inter-node coordination.

     Copyright (c) 2025 IETF Trust and the persons identified as
     authors of the code. All rights reserved.

     Redistribution and use in source and binary forms, with or
     without modification, is permitted pursuant to, and subject to
     the license terms contained in, the Revised BSD License set
     forth in Section 4.c of the IETF Trust's Legal Provisions
     Relating to IETF Documents
     (https://trustee.ietf.org/license-info).

     This version of this YANG module is part of RFC XXXX
     (https://www.rfc-editor.org/info/rfcXXXX); see the RFC itself
     for full legal notices.";

  revision 2025-07-06 {
    description
      "Initial version";
    reference
      "RFC XXXX: Hierarchical Topology for Language Model
       Coordination";
  }

  feature pluggable-token-validation {
    description
      "Indicates support for pluggable token validation
       (e.g., JWTs, OIDC, or COSE)";
  }

  identity lm-node-type {
    description
      "Base identity for LM node types.";
  }

  identity tiny-lm {
    base lm-node-type;
    description
      "A lightweight edge-deployed language model.";
  }

  identity small-lm {
    base lm-node-type;
    description
      "A mid-tier aggregator or summarizer.";
  }

  identity large-lm {
    base lm-node-type;
    description
      "A central reasoning or escalation endpoint.";
  }

  grouping auth-token-grouping {
    description
      "Reusable auth-token structure.";
    leaf auth-token {
      type string;
      description
        "A signed authentication/authorization token.";
    }
  }

  container lm-node {
    description
      "Node-level configuration and operational state.";
    leaf node-id {
      type string;
      mandatory true;
      description
        "Unique identifier of this LM node.";
    }
    leaf node-type {
      type identityref {
        base lm-node-type;
      }
      mandatory true;
      description
        "Classification of this node (tiny, small, large).";
    }
    container trust {
      if-feature "pluggable-token-validation";
      description
        "Token validation configuration.";
      leaf trust-anchor {
        type string;
        description
          "Root or public key used for token validation.";
      }
      leaf token-scope-enforced {
        type boolean;
        default "true";
        description
          "Whether to enforce scope claims in tokens.";
      }
    }
    action validate-token {
      description
        "Validates a received authentication token.";
      input {
        leaf token {
          type string;
          mandatory true;
          description
            "The token to validate.";
        }
      }
      output {
        leaf valid {
          type boolean;
          description
            "True if the token passed validation.";
        }
        leaf reason {
          type string;
          description
            "Explanation of the validation result.";
        }
      }
    }
  }

  rpc lm-request {
    description
      "Submits an inference or summarization request.";
    input {
      uses auth-token-grouping;
      leaf source-node {
        type string;
        mandatory true;
        description
          "Identifier of the requesting node.";
      }
      leaf target-node {
        type string;
        mandatory true;
        description
          "Identifier of the node the request is addressed to.";
      }
      leaf request-type {
        type enumeration {
          enum inference;
          enum summarization;
        }
        mandatory true;
        description
          "Kind of task being requested.";
      }
      leaf payload {
        type string;
        mandatory true;
        description
          "Request content to be processed.";
      }
    }
    output {
      leaf result {
        type string;
        description
          "Result of the inference or summarization task.";
      }
      leaf status {
        type string;
        description
          "Outcome status of the request.";
      }
    }
  }

  rpc model-escalation {
    description
      "Forwards a request upward in the hierarchy.";
    input {
      uses auth-token-grouping;
      leaf original-payload {
        type string;
        description
          "Payload of the request being escalated.";
      }
      leaf reason {
        type string;
        description
          "Why the request is being escalated.";
      }
    }
    output {
      leaf resolution {
        type string;
        description
          "Response produced by the higher-tier LM.";
      }
      leaf downstream-directive {
        type string;
        description
          "Directive returned for the downstream node.";
      }
    }
  }

  notification lm-heartbeat {
    description
      "Emitted to indicate liveness of this node.";
    leaf sender-node {
      type string;
      description
        "Identifier of the node emitting the heartbeat.";
    }
    leaf status {
      type enumeration {
        enum alive;
        enum degraded;
        enum unreachable;
      }
      description
        "Reported health of the node.";
    }
    leaf timestamp {
      type yang:date-and-time;
      description
        "Time at which the heartbeat was emitted.";
    }
  }
}
Author's Address
Keenan Williams
Independent Researcher
Email: telesis001@icloud.com