| Signal | What it optimizes |
|---|---|
| Geographic proximity | Lowest network latency |
| Device capability | Enough compute for the requested model |
| Current load | Avoids overloaded nodes |
| Model availability | Routes to nodes with the NLM already cached |

Every request is routed to the nearest capable device. If no suitable device is available, the request falls back to the cloud.

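As a rough illustration of how the four signals might combine, here is a minimal sketch. The `Node` fields, the `MAX_LOAD` threshold, the `route` function, and the ordering of the signals are all assumptions made for this example, not the actual routing implementation. It treats device capability and current load as hard filters, prefers nodes that already have the model cached, picks the nearest of those, and returns the cloud when no node qualifies.

```python
from dataclasses import dataclass, field

CLOUD = "cloud"   # hypothetical name for the cloud fallback target
MAX_LOAD = 0.8    # assumed load threshold above which a node counts as overloaded


@dataclass
class Node:
    """Hypothetical view of a device as seen by the router."""
    name: str
    distance_km: float          # stand-in for geographic proximity
    compute_units: int          # stand-in for device capability
    load: float                 # current load, from 0.0 to 1.0
    cached_models: set = field(default_factory=set)


def route(model: str, required_compute: int, nodes: list) -> str:
    """Return the name of the chosen node, or CLOUD if none qualifies."""
    # Hard filters: enough compute for the model, and not overloaded.
    candidates = [
        n for n in nodes
        if n.compute_units >= required_compute and n.load < MAX_LOAD
    ]
    if not candidates:
        return CLOUD
    # Prefer nodes with the model cached, then the shortest distance.
    # False sorts before True, so cached nodes come first.
    best = min(
        candidates,
        key=lambda n: (model not in n.cached_models, n.distance_km),
    )
    return best.name


nodes = [
    Node("phone", 0.1, 2, 0.3),
    Node("laptop", 1.0, 8, 0.5, {"nlm-small"}),
    Node("edge-box", 5.0, 16, 0.95, {"nlm-small"}),
]

print(route("nlm-small", 4, nodes))   # phone lacks compute, edge-box is overloaded
print(route("nlm-small", 32, nodes))  # no node has enough compute
```

In this sketch, swapping the two elements of the sort key would make distance dominate and use the cache only as a tie-breaker; which ordering is appropriate depends on how expensive it is to load a model that is not yet cached.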