AI on devices versus data centers: What Canadian leaders need to know now
Artificial Intelligence
Jan 9, 2026

Uncertain about how to get started with AI?
Evaluate your readiness, potential risks, and key priorities in less than an hour.
➔ Download Our Free AI Preparedness Pack
On-device AI could impact massive data centres—here’s how to strategize
The AI boom set off a global rush to build enormous, energy-hungry data centres. Perplexity CEO Aravind Srinivas has questioned that trend: if inference increasingly happens on device, the economics of centralized AI could ease over time. Whether or not you fully buy this view, it is a cue to diversify your architecture strategy now.
Why the argument holds weight
Efficiency gains: Smaller, instruction-optimized models are continuously improving, enabling useful tasks at lower computing costs.
Silicon roadmap: NPUs in laptops and phones enhance matrix operations locally, reducing latency and cloud data transfer.
Privacy & sovereignty: Local processing diminishes data movement, aiding with privacy regulations and sectoral controls.
Cost exposure: Cloud AI expenditure is unpredictable; shifting some workloads to device/edge can stabilize cost per unit.
Where on-device fits today
Summaries and translations of local documents/emails on laptops.
Contextual helpers in productivity apps with limited data access.
Field work: offline drafting, policy reference, and speech transcription on mobiles.
Sensitive notes: client or patient-side triage where data must not transit external clouds.
Where cloud remains superior (for now)
Large-context reasoning over extensive datasets.
Heavy multimodal (high-resolution video, complex tools) and operational coordination.
Team-wide grounding (RAG) against enterprise knowledge with robust observability.
Burst capacity for quick surges (earnings days, incidents).
Architecture options: hybrid, not binary
Device-first, cloud-support
Run a compact model on device; access a cloud model only when necessary.
Store embeddings locally; sync encrypted summaries when online.
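The device-first pattern above can be sketched as a simple router: try the local model, and escalate to the cloud only when the task exceeds what the device handles well. This is an illustrative sketch, not a production implementation; `run_local` and `run_cloud` are hypothetical stand-ins for your on-device runtime and cloud API client, and the thresholds are placeholders.

```python
LOCAL_CONTEXT_LIMIT = 4_000   # tokens the small on-device model handles well (illustrative)
CONFIDENCE_FLOOR = 0.6        # below this, escalate to the cloud model (illustrative)

def run_local(prompt: str) -> tuple[str, float]:
    # Placeholder: call the on-device model; return (answer, confidence).
    return f"[local] {prompt[:40]}", 0.8

def run_cloud(prompt: str) -> str:
    # Placeholder: call the cloud model over an encrypted channel.
    return f"[cloud] {prompt[:40]}"

def answer(prompt: str, sensitive: bool = False) -> str:
    # Sensitive prompts never leave the device, even at low confidence.
    if len(prompt) // 4 > LOCAL_CONTEXT_LIMIT and not sensitive:
        return run_cloud(prompt)
    text, confidence = run_local(prompt)
    if confidence < CONFIDENCE_FLOOR and not sensitive:
        return run_cloud(prompt)
    return text
```

The key design choice is that escalation is policy-driven: sensitivity flags override cost and quality heuristics, so governance constraints are enforced in code rather than left to user judgment.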
Edge/VPC inference
Host models in your VPC or colocation for sensitive prompts; maintain observability and policy control.
Cloud with smart client
Remain cloud-focused but delegate pre/post-processing and redaction to device NPUs to reduce tokens and risk.
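Client-side redaction is the core of the smart-client pattern: scrub obvious identifiers before a prompt leaves the device, which shrinks both token count and risk. The patterns below are a minimal sketch for illustration; production redaction should use a vetted PII-detection library, not three regexes.

```python
import re

# Illustrative device-side redaction pass. Pattern order matters:
# the phone pattern must run before the SIN pattern, since a SIN-like
# match could otherwise consume part of a phone number.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "sin":   re.compile(r"\b\d{3}[-\s]?\d{3}[-\s]?\d{3}\b"),  # Canadian SIN shape
}

def redact(text: str) -> str:
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text
```

For example, `redact("Email alice@example.com or call 416-555-1234")` replaces both identifiers with `[EMAIL]` and `[PHONE]` tokens, so the cloud model sees structure without the underlying data.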
Decision framework (CFO/CTO-friendly)
| Criterion | Device-first | Edge/VPC | Cloud-first |
|---|---|---|---|
| Latency | Best (local) | Good (nearby) | Variable |
| Unit cost | Low per task; fixed device CAPEX | Medium | Pay-as-you-go; can spike |
| Privacy | Strong (local data) | Strong (residency) | Managed via controls |
| Observability | Harder; requires client logging | Strong | Strong |
| Model size | Small/medium | Medium | Any |
Governance implications
DPIA/records of processing: document local versus remote paths; justify legal basis.
Content controls: exclude customer data from model training; lock versions for auditing.
Telemetry minimization: collect only necessary client logs for safety/QA; hash or aggregate sensitive fields.
Device posture: enforce OS version, disk encryption, secure enclaves, and remote wipe capabilities.
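Telemetry minimization can be enforced mechanically: keep only the fields needed for safety/QA, and salt-hash identifiers so logs can be correlated without exposing identity. This is a sketch under stated assumptions; the field names and salt value are illustrative, and in practice the salt would come from your device management configuration and be rotated per deployment.

```python
import hashlib

# Illustrative allow-list and salt; substitute your own.
SALT = b"rotate-me-per-deployment"
KEEP = {"timestamp", "model_version", "latency_ms", "outcome"}

def hash_field(value: str) -> str:
    # Salted SHA-256, truncated: enough to correlate events, not to reverse.
    return hashlib.sha256(SALT + value.encode()).hexdigest()[:16]

def minimise(event: dict) -> dict:
    # Drop everything not on the allow-list (prompts, free text, etc.).
    out = {k: v for k, v in event.items() if k in KEEP}
    if "user_id" in event:
        out["user_hash"] = hash_field(event["user_id"])
    return out
```

An allow-list (rather than a block-list) is the safer default: a new field added by an app update is dropped from telemetry until someone deliberately approves it.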
A 90-day evaluation plan
Weeks 1–2 – Discovery
Create an inventory of candidate workloads; categorize by sensitivity, latency, context size.
Select 3 use cases (e.g., local document summarization; mobile transcription; offline policy Q&A).
Weeks 3–6 – Iterative development
Launch device-first prototypes; integrate a cloud escalation path; measure latency, cost per task, override rate.
Weeks 7–12 – Compare & decide
Conduct A/B testing for device versus cloud for the same task; model Total Cost of Ownership over 12 months; establish guidelines for production.
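The 12-month TCO comparison in weeks 7–12 can start as a back-of-envelope model like the one below. Every figure here is an illustrative placeholder; substitute your own hardware quotes, fleet size, and measured cost per task from the pilot.

```python
def device_tco(device_capex: float, fleet_size: int,
               monthly_ops: float, months: int = 12) -> float:
    # Per-device hardware premium across the fleet, plus fleet
    # management overhead (model distribution, updates, support).
    return device_capex * fleet_size + monthly_ops * months

def cloud_tco(tasks_per_month: int, cost_per_task: float,
              months: int = 12) -> float:
    # Pure pay-as-you-go: volume times measured cost per task.
    return tasks_per_month * cost_per_task * months

# Hypothetical example: 200 NPU laptops at a $400 premium with $1,500/month
# ops overhead, versus 150k cloud tasks/month at $0.01 per task.
device = device_tco(device_capex=400, fleet_size=200, monthly_ops=1_500)
cloud = cloud_tco(tasks_per_month=150_000, cost_per_task=0.01)
```

Even this crude model makes the decision variables explicit: device-first wins only above a certain task volume per device, which is exactly what the A/B pilot should measure.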
Risks & realities (a balanced view)
Hype risk: Not all workloads fit device constraints; maintain cloud capacity for intensive tasks.
Operations overhead: Fleet model distribution/updates and NPU fragmentation require tooling.
Security trade-offs: Endpoints present attack surfaces; reinforce devices and verify model artifacts.
Vendor posture: Validate claims; prioritize benchmarks, energy profiles, and roadmaps over slogans.
Bottom line
On-device AI is gaining traction and will likely rebalance where inference happens. Don't bet on a single architecture: go hybrid, measure rigorously, and shift workloads to the most cost-effective, trustworthy path that meets your governance standards.
Next Steps: Need assistance in crafting a hybrid AI strategy? Generation Digital offers architecture planning, Total Cost of Ownership models, and pilot development for regulated industries.
FAQ
Q1. Will data centres really become obsolete?
A. Not likely in the short term. Expect rebalancing, with more inference happening on devices/edge and cloud for intensive or shared contexts.
Q2. What should be piloted first?
A. Initiate with low-risk, high-volume tasks: local document/email summarization, transcription, and offline Q&A with cloud escalation capabilities.
Q3. How do we maintain auditors’ trust with on-device AI?
A. Log prompts/results locally with frequent secure synchronization, lock model versions, and publish a data flow map.
Q4. Which hardware is important?
A. NPUs, memory bandwidth, and secure enclaves matter; ensure managed distribution of models and verified updates.
Q5. How do we measure success?
A. Focus on cost per task, latency, override rate, citation coverage (when using RAG), and user satisfaction.
Receive weekly AI news and advice straight to your inbox
By subscribing, you agree to allow Generation Digital to store and process your information according to our privacy policy. You can review the full policy at gend.co/privacy.
Upcoming Workshops and Webinars

Streamlined Operations for Canadian Businesses - Asana
Virtual Webinar
Wednesday, February 25, 2026
Online

Collaborate with AI Team Members - Asana
In-Person Workshop
Thursday, February 26, 2026
Toronto, Canada

From Concept to Prototype - AI in Miro
Online Webinar
Wednesday, February 18, 2026
Online
Generation
Digital

Business Number: 256 9431 77 | Copyright 2026 | Terms and Conditions | Privacy Policy