A routing table is just a suggestion until it's backed by data. In Layer 20, we completed the silicon measurements that turn theory into verified practice. Every LLM operation now has a definitive answer to the question of where it should execute.
The Data Foundation
440 silicon samples across four theaters. Each sample captures throughput, power consumption, thermal output, and latency. This isn't benchmark folklore—it's measured hardware behavior under controlled conditions.
120 CPU samples. 100 iGPU samples. 100 dGPU samples. 120 RAM samples. All operations measured under various configurations.
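As a minimal sketch of what one such record might look like, here is a hypothetical sample schema in Python. The field and theater names are illustrative assumptions, not the project's actual data format; only the per-theater counts come from the text above.

```python
from dataclasses import dataclass

# Hypothetical shape of one silicon sample. Field names are illustrative,
# not the project's actual schema.
@dataclass(frozen=True)
class SiliconSample:
    theater: str            # "cpu", "igpu", "dgpu", or "ram"
    operation: str          # e.g. "layernorm" (hypothetical name)
    throughput_gops: float  # measured throughput
    power_watts: float      # power consumption under load
    thermal_celsius: float  # thermal output
    latency_ms: float       # end-to-end operation latency

# Per-theater sample counts as reported above: 120 + 100 + 100 + 120 = 440.
SAMPLE_COUNTS = {"cpu": 120, "igpu": 100, "dgpu": 100, "ram": 120}
assert sum(SAMPLE_COUNTS.values()) == 440
```

Keeping each sample immutable (`frozen=True`) reflects the intent: measurements are evidence, not mutable state.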
The Routing Architecture
Operations are classified by their computational profile. Memory-bound operations flow through one path. Compute-bound through another. The scheduler reads the operation signature and routes accordingly.
We've identified four distinct operation classes:
- Normalization ops — light compute, sequential memory access
- Projection ops — heavy compute, memory intensive
- Attention ops — matrix heavy, high bandwidth demand
- Memory ops — pure bandwidth, minimal compute
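The classification step can be sketched as a simple registry lookup. The operation names below are hypothetical examples, not the project's actual registry; the four class labels are the ones listed above.

```python
# Hypothetical registry mapping operation names to the four classes above.
# The operation names are illustrative assumptions.
OP_CLASSES = {
    "rmsnorm": "normalization",
    "layernorm": "normalization",
    "qkv_projection": "projection",
    "mlp_up": "projection",
    "attention_scores": "attention",
    "attention_output": "attention",
    "kv_cache_read": "memory",
    "embedding_lookup": "memory",
}

def classify(op_name: str) -> str:
    """Return the computational class for an operation, or fail loudly."""
    try:
        return OP_CLASSES[op_name]
    except KeyError:
        raise ValueError(f"unclassified operation: {op_name}")
```

Failing loudly on an unknown operation matches the scheduler's premise: every operation must have a known signature before it can be routed.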
Results
From Theory to Practice
The routing table was once a hypothesis. Now it's a lookup table backed by reproducible measurements. When the scheduler receives an operation, it doesn't guess—it routes based on verified silicon behavior.
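A sketch of that lookup, assuming one routing decision per operation class. The theater assignments here are illustrative placeholders, not the measured results; the verified table lives in the evidence file cited below.

```python
# Hypothetical routing table: each operation class maps to the theater that
# measured best for it. These assignments are illustrative, not the
# project's verified results.
ROUTING_TABLE = {
    "normalization": "cpu",
    "projection": "dgpu",
    "attention": "dgpu",
    "memory": "ram",
}

def route(op_class: str) -> str:
    """Pure table lookup: no heuristics, no guessing at dispatch time."""
    return ROUTING_TABLE[op_class]
```

The point of the design is that all judgment happens offline, during measurement; at dispatch time the scheduler does a constant-time lookup.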
Evidence file: evidence/phase2a/routing_table_verified.md — Complete routing table.