TAO Paper Revision — OSDG Experiment Draft (v2.2)

This page contains ready-to-integrate text patches for the Holt (2026) TAO paper revision. It covers three sections: revised Section 2 (Related Work), revised Section 5 (Comparative Analysis), and new Section 4.7 (Experiment 6). Drop in empirical numbers as they come off the bench during H4–18.

<aside> 🔗

Canonical cluster anchors (previewable). Yennefer cluster · TAEX · Retraining loop · §6.17 · Integrations registry · Integrative Flow.

TAEX Intent Router — Agent-Based Routing with Dissonance Memory MCP

</aside>

PATCH 1 — Section 2 (Background and Related Work)

2.5 SDG Text Classification as a Validation Domain

The UN Sustainable Development Goals (SDGs) provide a globally recognized taxonomy of 17 goals for sustainable development [26]. Automated mapping of text to SDGs has practical applications in policy analysis, research evaluation, and open-government reporting. The OSDG Community Platform (osdg.ai) operationalizes this mapping through a citizen-science labeling pipeline: volunteers review paragraph-length excerpts and apply binary accept/reject judgments for individual SDG labels, producing a multi-label ground-truth corpus validated by geographically distributed annotators [27].

The v2022.07 public release (OSDG-CD) comprises 32,431 excerpts (mean length ~90 words) drawn from UN reports, policy papers, and academic abstracts, with more than 217,000 validated labels from ~1,000 annotators across 110 countries. Each excerpt is short and semantically dense, and processing it is I/O-bound relative to per-sample compute. These structural characteristics make the dataset a natural independent validation domain for Optimization Inversion analysis (Section 4.7).

The OSDG ML pipeline uses transformer-based sentence embeddings (SBERT family) for classification. Inference on short-text batches is CPU-dominant with near-zero GPU utilization at batch size 1, closely matching the overhead profile of the malware-triage benchmark (Section 4.1) and enabling direct comparison of overhead attribution across workload classes.

PATCH 2 — Section 5 (Comparative Analysis: TAO vs. Other Energy-Aware Schedulers)

Replace the existing Table 4 caption and add the following paragraph after the Key Distinctions bullet list:

Extended Validation Domain. The OSDG short-text classification workload (Section 4.7) provides an independent workload class for comparative evaluation. Unlike malware-triage, where I/O intensity comes from binary file parsing, OSDG inference is dominated by tokenization and embedding overhead on ~90-word excerpts with negligible GPU utilization. Both workloads exhibit the Optimization Inversion regime; the OSDG results quantify how far infrastructure overhead dominance (≥99%) and phase-transition behavior generalize beyond the original benchmark. This cross-workload comparison strengthens the claim that orchestration-layer optimization, rather than accelerator-level tuning, is the primary energy reduction lever across heterogeneous AI inference tasks.

Revised Table 4 row for TAO — update the Notes column:

Validated on malware-triage (binary classification, 2,000 samples) and OSDG short-text classification (1,003 stratified excerpts, 59 per SDG, 17 SDG heads); see Section 4.7.

PATCH 3 — Section 4.7 (New: Experiment 6)

4.7 Experiment 6: SDG Classification as Independent Workload Validation

We evaluate TAO on the OSDG-CD v2022.07 dataset [27] to test whether Optimization Inversion and phase-transition behavior generalize beyond the malware-triage benchmark to a structurally distinct I/O-bound inference workload.

4.7.1 Dataset and Workload Profile

The OSDG Community Dataset (v2022.07) comprises 32,431 paragraph-length text excerpts (mean ~90 words) with citizen-validated SDG labels from ~1,000 annotators across 110 countries. A stratified sample of 1,003 excerpts (59 per SDG × 17 SDGs) was drawn with pandas, using DataFrame.groupby('sdg').sample(n=59, random_state=42), so that every SDG label is equally represented. A secondary stress-test run applied 2× sampling weight to SDGs 3, 7, and 11 to maximize concurrent active classification heads and stress TAO's multi-objective admission control.
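The two sampling procedures described above can be sketched as follows. This is a minimal illustration, not the experiment script: the toy DataFrame, the text_id column, and the helper names stratified_sample and stress_sample are hypothetical, and reading "2× sampling weight" as a per-row weight passed to DataFrame.sample is an assumption about how the stress-test draw was implemented.

```python
import pandas as pd


def stratified_sample(df, per_sdg=59, seed=42):
    """Draw the same number of excerpts from every SDG label."""
    return df.groupby("sdg").sample(n=per_sdg, random_state=seed)


def stress_sample(df, total, boosted=(3, 7, 11), factor=2.0, seed=42):
    """Weighted draw without replacement.

    Assumption: rows whose SDG is in `boosted` get `factor` times the
    sampling weight of all other rows; pandas normalizes the weights.
    """
    weights = df["sdg"].map(lambda s: factor if s in boosted else 1.0)
    return df.sample(n=total, weights=weights, random_state=seed)


# Synthetic stand-in for OSDG-CD (hypothetical): 17 SDGs x 100 excerpts each.
toy = pd.DataFrame({
    "text_id": range(1700),
    "sdg": [s for s in range(1, 18) for _ in range(100)],
})

sample = stratified_sample(toy)                 # 59 rows per SDG
stress = stress_sample(toy, total=len(sample))  # same size, skewed toward SDGs 3, 7, 11
```

Fixing random_state in both helpers keeps the draws reproducible across runs, which matters when the same excerpts must be replayed under different scheduler configurations.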