
IoT and edge AI: when devices start making decisions on their own

Small models now run on cheap silicon, in the field, with no round-trip. What that unlocks for industrial, retail, and logistics operations, and the new failure modes.

SDEN team · 10 min read

The premise

For most of the last decade, the interesting work in connected devices happened in the cloud. The device sent its data up; the cloud thought about it; a decision came back down. Latency, bandwidth, and connectivity all tilted in favour of centralised processing.

That balance has shifted. Small, capable models now run on cheap silicon, in the field, with no round-trip. A camera can decide what it is looking at; a meter can classify what it is reading; a sensor can detect an anomaly without ever calling home. The architectural posture of IoT and embedded systems is being rewritten in real time.

This article is about what edge intelligence has unlocked, what new failure modes have arrived with it, and how a senior team approaches IoT engagements in this new shape.

Why this matters now

Edge intelligence is no longer exotic

Models that matter for industrial use cases now fit on devices that cost tens of dollars.

Two trajectories crossed in the last two years. Models got smaller and faster at the same task, and edge hardware got cheaper for the same compute envelope. The result is that capabilities that required a server eighteen months ago (vision classification, anomaly detection, simple language understanding) now fit on devices that cost less than the sensor next to them.

This unlocks a different system shape. Industrial monitoring can decide locally what is worth sending. Retail cameras can do counts and dwell-time analytics without sending video upstream. Logistics sensors can flag a problem inside a container before the container reaches a depot. The cloud is still in the loop, but it is no longer the only place decisions happen.

The economics also change. Bandwidth costs fall because devices transmit decisions, not raw streams. Latency drops because decisions do not wait on a round-trip. And privacy improves, because data that never leaves the device cannot leak from a centralised store.

What the discipline still covers

Firmware, connectivity, and fleet operations

IoT engineering at SDEN still covers the work the discipline has always covered. Firmware development for the device itself, in C, C++, Rust, or a higher-level framework where the constraints allow. Connectivity: choosing the right radio (cellular, LoRaWAN, Wi-Fi, BLE) and the right protocol (MQTT, CoAP) for the use case. Secure provisioning: making sure each device has a verifiable identity from the moment it is manufactured. Edge-to-cloud pipelines: getting decisions, events, and telemetry to the cloud safely and cheaply. And fleet operations: software updates, observability, and incident response across thousands of devices in the field.

What is new is the AI layer that sits inside the firmware, between the sensor and the radio. The discipline absorbed it, but the operational realities (model updates, model drift, model evaluation in the field) are new responsibilities for an embedded team.

What changes in operational posture

When the device decides, the device has to be observable

A device that classifies its own sensor data is a device whose classification can be wrong. The operational layer has to let the team detect this without sending every input upstream, which would defeat the point of edge computing.

The pattern that works is statistical: each device samples a small fraction of its decisions, with the inputs and the model output, and uploads the sample for evaluation. The team monitors the sampled accuracy over time, the distribution of decisions, and the gap between what the device says and what the centralised system would have said. When the metrics drift, the team retrains, redeploys, and tracks the change like any other production change.

This is a new operational discipline for most embedded teams. It is the one that decides whether the AI features in the firmware are a stable capability or a slow-moving liability.
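The sampling pattern described above can be sketched in a few lines. This is a minimal illustration in Python for readability, not SDEN's implementation: the names `Sample`, `should_sample`, `agreement_rate`, `decision_distribution`, and `has_drifted`, the sampling rate, and the drift tolerance are all assumptions made for the example. On a real device the sampling step would live in the firmware.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass
class Sample:
    device_label: str      # what the edge model decided
    reference_label: str   # what the centralised model (or a reviewer) says


def should_sample(rng: random.Random, rate: float) -> bool:
    """Device side: keep roughly `rate` of all decisions for upload."""
    return rng.random() < rate


def agreement_rate(samples: list[Sample]) -> float:
    """Centre side: fraction of sampled decisions that match the reference."""
    if not samples:
        raise ValueError("no samples to evaluate")
    matches = sum(s.device_label == s.reference_label for s in samples)
    return matches / len(samples)


def decision_distribution(samples: list[Sample]) -> dict[str, float]:
    """Share of each label among the device's sampled decisions."""
    counts = Counter(s.device_label for s in samples)
    return {label: n / len(samples) for label, n in counts.items()}


def has_drifted(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """Flag a drop in agreement larger than `tolerance` (hypothetical threshold)."""
    return (baseline - current) > tolerance
```

A team would typically compute these per device and per firmware or model version, so that a drop in agreement can be tied to a specific rollout rather than to the fleet as a whole.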

Before / after

What edge intelligence unlocks in the field

Four operational shifts we have shipped or are shipping for industrial, retail, and logistics clients.

Before

An industrial camera streams full-resolution video to a central server, where a model runs the analytics. Bandwidth costs are real, and connectivity outages mean blind spots.

After

The camera runs the analytics locally, transmits the counts and events, and stores only the frames that triggered an alert. Bandwidth drops by an order of magnitude; outages stop producing blind spots.

Takeaway · Decisions move to where the data is. The cloud becomes the aggregator, not the bottleneck.

Before

A predictive-maintenance sensor sends every reading upstream; the cloud computes anomalies; the alert arrives minutes later.

After

The sensor runs the anomaly model locally, alerts the operator in seconds when something is off, and only sends contextual data when an event has actually happened.

Takeaway · Time-to-alert collapses. Cost-to-monitor drops, because telemetry is event-driven.
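As an illustration of what running the anomaly model locally can mean on a constrained device, here is a sketch of a running z-score detector. It is a stand-in chosen for simplicity, not the model used in any SDEN engagement: the class name `RunningAnomalyDetector`, the threshold of three standard deviations, and the warm-up length are assumptions. It uses Welford's online algorithm, so memory use stays constant however many readings arrive.

```python
import math


class RunningAnomalyDetector:
    """Flags readings far from the running mean, using constant memory."""

    def __init__(self, threshold: float = 3.0, warmup: int = 10):
        self.threshold = threshold  # allowed deviation, in standard deviations
        self.warmup = warmup        # readings to observe before judging any
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, value: float) -> bool:
        """Return True if `value` is anomalous relative to readings so far."""
        anomalous = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            # A perfectly flat history (std == 0) never flags; a real
            # deployment would want a minimum-spread floor here.
            if std > 0 and abs(value - self.mean) > self.threshold * std:
                anomalous = True
        if not anomalous:
            # Welford update. Anomalies are left out of the baseline so a
            # fault does not gradually become the new normal.
            self.count += 1
            delta = value - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (value - self.mean)
        return anomalous
```

The device would call `update` on each reading and raise a local alert, plus an upstream event with context, only when it returns True. Excluding anomalies from the baseline is a design choice: it keeps alerts firing during a sustained fault, at the cost of not adapting to a legitimate step change.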

Before

A retail counter sends footage to the cloud for occupancy analytics, and legal review delays the deployment by months.

After

The counter analyses occupancy on-device, transmits only aggregate counts, and never produces a frame anyone could legally object to. Privacy and operations align by architecture.

Takeaway · Privacy by design becomes cheaper than privacy by policy.

Before

A logistics tag uses cellular to phone home every reading, draining the battery in months.

After

The tag uses a lightweight model to decide which readings matter, transmits only those, and lasts two years on the same battery.

Takeaway · Edge intelligence is also a power story. Devices last longer because they think before they speak.
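The idea of thinking before speaking can be shown with the simplest possible filter: transmit a reading only when it has moved meaningfully since the last transmission, or when too long has passed without one. This is a rule-based stand-in for the lightweight model mentioned above, not a learned model; `DeadbandReporter`, the deadband width, and the heartbeat interval are hypothetical names and values.

```python
from typing import Optional


class DeadbandReporter:
    """Decides whether a reading is worth a radio transmission."""

    def __init__(self, deadband: float, heartbeat_s: float):
        self.deadband = deadband        # minimum change worth reporting
        self.heartbeat_s = heartbeat_s  # maximum silence, in seconds
        self._last_value: Optional[float] = None
        self._last_sent_at: Optional[float] = None

    def should_transmit(self, value: float, now_s: float) -> bool:
        first = self._last_value is None
        moved = not first and abs(value - self._last_value) >= self.deadband
        stale = not first and (now_s - self._last_sent_at) >= self.heartbeat_s
        if first or moved or stale:
            # Remember what was sent, so later changes are measured
            # against the last value the centre actually saw.
            self._last_value = value
            self._last_sent_at = now_s
            return True
        return False
```

The heartbeat matters operationally: without it, the centre cannot tell a stable reading from a dead tag. How much battery a filter like this saves depends on the radio, the duty cycle, and how often the readings change, so it needs measuring per deployment.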

How SDEN ships IoT

Three defaults across every device engagement

These are the practices we hold to across firmware, fleet operations, and edge-to-cloud pipelines.

Secure provisioning from manufacture

Every device leaves the factory with a unique, verifiable identity. Devices that cannot be provisioned this way are not deployed; the security debt is too expensive to retrofit at scale.
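To make "verifiable identity" concrete, here is a deliberately simplified sketch of a challenge-response check using only Python's standard library. It assumes a symmetric scheme in which each device key is derived from a factory secret; that is an assumption made for brevity, not a description of SDEN's provisioning. Production fleets commonly use per-device certificates with keys held in a secure element instead. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def derive_device_key(factory_secret: bytes, device_id: str) -> bytes:
    """Factory side: derive a unique key per device from one master secret."""
    return hmac.new(factory_secret, device_id.encode("utf-8"), hashlib.sha256).digest()


def new_challenge() -> bytes:
    """Centre side: a fresh random nonce, so old responses cannot be replayed."""
    return secrets.token_bytes(16)


def sign_challenge(device_key: bytes, challenge: bytes) -> bytes:
    """Device side: prove possession of the key without revealing it."""
    return hmac.new(device_key, challenge, hashlib.sha256).digest()


def verify_device(factory_secret: bytes, device_id: str,
                  challenge: bytes, response: bytes) -> bool:
    """Centre side: recompute the expected response and compare in constant time."""
    expected = sign_challenge(derive_device_key(factory_secret, device_id), challenge)
    return hmac.compare_digest(expected, response)
```

The weakness of this symmetric sketch is that the centre holds a secret from which every device key can be derived. Asymmetric, certificate-based identity avoids that, which is one reason it is the more common choice at scale.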

Updateable, observably so

Firmware is updateable over the air, including the AI models at the edge. The update telemetry is observable: we know which devices are on which version at any moment.
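Knowing which devices are on which version reduces, at its core, to a small aggregation over what devices report. The sketch below assumes each device reports a single version string (firmware or model) keyed by a device ID; the function name `version_rollout`, the report shape, and the example IDs are hypothetical.

```python
from collections import Counter


def version_rollout(reported: dict[str, str], target: str) -> dict:
    """Summarise a rollout from {device_id: reported_version}."""
    counts = Counter(reported.values())
    total = len(reported)
    return {
        "by_version": dict(counts),
        # Share of the fleet already on the target version.
        "on_target_fraction": counts[target] / total if total else 0.0,
        # Devices to chase, sorted for stable dashboards and diffs.
        "lagging_devices": sorted(d for d, v in reported.items() if v != target),
    }


fleet = {"cam-01": "1.4.0", "cam-02": "1.4.0", "cam-03": "1.3.2", "tag-07": "1.4.0"}
summary = version_rollout(fleet, target="1.4.0")
```

For edge AI, the same summary would be kept separately for firmware and for model versions, since the two can roll out on different schedules.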

Field-evaluated, not lab-trusted

Every edge model has a sampled evaluation loop in the field. We do not assume the lab numbers hold once the device is deployed; we measure.

What good looks like

A fleet that does the right thing, then tells you about it

A mature IoT deployment shows up as a quiet operations dashboard.

A working fleet is one where decisions happen at the device, anomalies surface at the centre, and the operations team trusts what they see. New devices come online cleanly. Updates roll out without surprises. Battery, signal, and accuracy stay within expected envelopes; when they do not, the dashboard says so before the customer does.

The technical artefact is a stack from firmware to cloud that one team can reason about. The cultural artefact is an operations team that does not flinch when the fleet grows by another order of magnitude.

When SDEN finishes an IoT engagement, the deliverable is the firmware, the OTA pipeline, the fleet observability, and the runbook for the next deployment. The handoff is the point.

