Build Notes: A Week of Local-First Smart-Home Research

Part of the build-in-public series. Everything here is something we actually tested, measured, or hit head-first last week — written up as findings, not a sales pitch.

The theme of the week was local-first home automation: a smart home that runs entirely on our own hardware — no vendor clouds, no subscriptions. Most of the work was research: figuring out what actually works before spending a dollar. Here's what we learned, with the technical details that matter.

Finding 1: A local LLM voice assistant fails silently if the model can't "think"

We wired Home Assistant's Assist pipeline to a local Ollama model (running on a GPU node over the LAN) so we could query the house with nothing leaving the network. Every request returned HTTP 500. Connectivity was fine — the real exception, buried in system_log, was:

ollama._types.ResponseError: "<model>" does not support thinking (status code: 400)

The conversation agent had "Think before responding" enabled against a non-reasoning model, so every call 400'd downstream and surfaced as a generic 500. A leftover ConnectionError from an earlier config attempt made it look like a network problem.

Takeaway: match Assist's feature flags to the model's real capabilities, and always read system_log/logbook for the underlying error — the surface message lies. For voice that controls devices (not just chats), you also need a tool-calling-capable model; small 3B/8B general models are weak at it.
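A pre-flight check along these lines would have caught the mismatch before Assist did. This is a minimal sketch, assuming an Ollama release whose /api/show response includes a "capabilities" list (newer releases report values such as "completion", "tools", and "thinking"; older ones may omit the field). The function names and the flag names in the returned dict are hypothetical, not Home Assistant option keys.

```python
import json
import urllib.request


def fetch_capabilities(base_url: str, model: str) -> list:
    """Ask Ollama's /api/show for a model's capability list (needs a live server)."""
    req = urllib.request.Request(
        f"{base_url}/api/show",
        data=json.dumps({"model": model}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        # Assumption: servers that predate the field simply return no list.
        return json.load(resp).get("capabilities", [])


def assist_flags(capabilities: list) -> dict:
    """Map reported capabilities to the options that look safe to enable."""
    caps = set(capabilities)
    return {
        "think_before_responding": "thinking" in caps,  # hypothetical flag name
        "device_control": "tools" in caps,              # hypothetical flag name
    }


# Example with a hand-typed capability list, so no network is needed:
print(assist_flags(["completion", "tools"]))
```

With a live server, the list would come from fetch_capabilities(base_url, model) instead, using whatever host and model name your setup has.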

Finding 2: One pinned model can starve a whole GPU

The assistant was also painfully slow. Cause: a 14B model had been loaded with an effectively infinite keep_alive (it showed an expiry decades out), pinning ~11GB of a 16GB card. The smaller assistant model couldn't fit in the remainder and crashed on load with a CUDA OOM. ollama ps made it obvious. Unloading the squatter dropped response time from "spinning forever" to ~2s warm.

Takeaway: on a shared GPU, audit keep_alive and ollama ps regularly. We also mapped every model across both AI nodes and found nine models doing the work of four — the same text job (summaries, rewrites, chat) was spread across four different models. Consolidating to one general text model per node frees both VRAM and disk.
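The audit can be scripted against the same data ollama ps shows. The sketch below assumes the JSON shape returned by Ollama's /api/ps endpoint (a "models" list with "name", "size_vram", and "expires_at"). The sample payload, model names, and the one-hour limit are made up for illustration.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_expiry(ts: str) -> datetime:
    """Parse an RFC 3339 timestamp, dropping fractional seconds for portability."""
    ts = re.sub(r"\.\d+", "", ts).replace("Z", "+00:00")
    return datetime.fromisoformat(ts)


def audit_ps(payload: dict, now: datetime, max_keep: timedelta):
    """Return (names of models kept longer than max_keep, total VRAM bytes)."""
    pinned = []
    total_vram = 0
    for m in payload.get("models", []):
        total_vram += m.get("size_vram", 0)
        if parse_expiry(m["expires_at"]) - now > max_keep:
            pinned.append(m["name"])
    return pinned, total_vram


# Hypothetical payload: one model pinned decades out, one with a normal expiry.
sample = {"models": [
    {"name": "big-14b", "size_vram": 11_000_000_000,
     "expires_at": "2075-01-01T00:00:00.123456789Z"},
    {"name": "assistant-small", "size_vram": 3_000_000_000,
     "expires_at": "2025-01-01T00:05:00Z"},
]}
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
pinned, vram = audit_ps(sample, now, max_keep=timedelta(hours=1))
print(pinned, round(vram / 1e9, 1))
```

Anything that lands in the pinned list is a candidate for unloading or for a shorter keep_alive.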

Finding 3: Cheap smart plugs have a hidden auth wall (KLAP)

We tried to adopt some older Wi-Fi smart plugs into HA. With correct credentials and 2FA off, local auth still failed. A direct protocol probe revealed the model family negotiates KLAP encryption, whose local handshake is keyed to a hash of the vendor cloud account's credentials. After a password reset, the plug keeps serving its old local credential hash until it re-syncs with the cloud.

Takeaway: "local control" is a spectrum. Cloud-auth Wi-Fi gadgets carry a dependency even when you only want LAN control. We're standardizing on a Zigbee 3.0 coordinator via ZHA (the SONOFF EFR32MG21 dongle). A key detail: mount the dongle on a shielded USB 2.0 extension cable, away from the host, and never plug it directly into a USB 3.0 port, because USB 3 radiates 2.4GHz noise that wrecks Zigbee range. Locks are the exception: the best HA-local locks are still Z-Wave, so that's a second small USB stick alongside the Zigbee one.

Finding 4: Appliance automation lives or dies on two specs

The "put a dumb appliance on a smart plug" trick only works if the appliance has a mechanical on/off switch and resumes its last state after power is restored. Digital touch-button devices boot to OFF and wait for a press, so power-cycling the plug does nothing. This bites on digital coffee makers and hardest on compressor appliances such as dehumidifiers: many reset to OFF, and short-cycling a compressor via a plug can also damage it.

The local path for a digital appliance is an ESP32 + relay wired across the button (ESPHome), not a plug. On the ESP32-C3 specifically, mind the boot/strapping pins (avoid GPIO2/8/9 and the USB pins 18/19 for relay outputs), run sensors at 3.3V logic, and wire the button through the relay's normally-open contacts so a dead or rebooting board leaves the button unpressed (fail-safe).
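The pin advice is easy to turn into a check you run before soldering. This is a small sketch that encodes only the avoid-list from this post. The function name is hypothetical, and a pin passing here does not mean it is free on your particular module (some boards reserve further pins for flash).

```python
# Pins this post says to keep relay outputs off on the ESP32-C3.
STRAPPING_PINS = {2, 8, 9}   # sampled at boot to select the boot mode
USB_PINS = {18, 19}          # native USB D-/D+


def check_relay_pin(gpio: int) -> str:
    """Classify a candidate relay GPIO against the avoid-list above."""
    if gpio in STRAPPING_PINS:
        return "avoid: boot/strapping pin"
    if gpio in USB_PINS:
        return "avoid: USB pin"
    return "not on the avoid-list"


for pin in (4, 8, 18):
    print(pin, check_relay_pin(pin))
```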

Takeaway: before automating any appliance, confirm (a) mechanical switch and (b) power-loss resume behavior. Those two specs pick the entire approach — plug vs. relay-mod vs. native integration.
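Written as a decision function, the rule looks roughly like this. It is a sketch of the logic in this section, not a complete selection guide. Preferring a native integration whenever one exists is our assumption, and the function and argument names are hypothetical.

```python
def pick_approach(mechanical_switch: bool,
                  resumes_after_power_loss: bool,
                  native_integration: bool = False) -> str:
    """Choose an automation path from the two specs discussed above."""
    if native_integration:
        # Assumption: a local native integration beats any hardware workaround.
        return "native integration"
    if mechanical_switch and resumes_after_power_loss:
        return "smart plug"
    # Digital buttons, or a device that boots to OFF: drive the button itself.
    return "relay mod"


print(pick_approach(mechanical_switch=True, resumes_after_power_loss=True))
print(pick_approach(mechanical_switch=False, resumes_after_power_loss=False))
```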

Finding 5: Sensor data needs sanity-checking before a dashboard trusts it

Takeaway: alerts need context, not just thresholds. The fix wasn't a higher global limit (that would miss a real infected phone) — it was segmenting the rule by device class.

Finding 6: The iOS notification history can't be trusted — so we kept our own

Pushes were delivering to the phone, but the companion app's notification history was empty (a known iOS-app limitation). Rather than fight it, we added an automation that triggers on the call_service event for the notify domain and mirrors every message into HA's logbook against a dedicated sensor. Now there's a permanent, searchable 7-day record rendered on the dashboard — independent of the phone app.

trigger: { platform: event, event_type: call_service, event_data: { domain: notify } }
action:
  service: logbook.log
  data:
    name: "{{ trigger.event.data.service_data.title | default('Notification') }}"
    message: "{{ trigger.event.data.service_data.message }}"

Takeaway: when a vendor feature is flaky, capture the data yourself at the source event. The logbook becomes the source of truth.

Finding 7: Big-capacity storage forces real decisions

Designing a media + camera storage backbone surfaced concrete constraints:
- A popular NAS line caps supported drive size at 24TB and increasingly requires the vendor's own branded drives on its current generation, which is a hard wall if you want 28TB+ disks. We chose an x86 NAS that accepts any drive (and can run TrueNAS Scale / ZFS) instead.
- At large drive sizes, single-parity RAID is genuinely risky: rebuilding a 28TB drive takes days, during which the array is degraded and the surviving drives are under maximum stress — exactly when a second failure is most likely, and with single parity that's total loss. Double parity (RAID-6 / RAIDZ2) is mandatory at this scale. Six 28TB drives in RAIDZ2 land ~112TB usable (4 data + 2 parity).
- Camera recording and media should not share a spindle. Frigate's sustained 24/7 writes go to a dedicated CMR drive in the compute host; the media library lives on the redundant array. And RAID is not backup — the irreplaceable slice (configs, photos) still needs an offsite copy.

Takeaway: "expandable" is about the drive policy, not the bay count — and RAID level must scale with drive size.
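The capacity and rebuild-time arithmetic behind those bullets can be checked in a few lines. This is a back-of-the-envelope sketch. It ignores ZFS metadata and padding overhead, and the 150 MB/s sustained rebuild rate is an assumed figure, not something we measured.

```python
def usable_tb(drives: int, size_tb: float, parity: int) -> float:
    """Raw usable capacity: data drives times drive size (no filesystem overhead)."""
    if drives <= parity:
        raise ValueError("need more drives than parity devices")
    return (drives - parity) * size_tb


def rebuild_hours(size_tb: float, mb_per_s: float) -> float:
    """Best-case time to rewrite one full drive at a sustained rate."""
    return size_tb * 1e12 / (mb_per_s * 1e6) / 3600


print(usable_tb(drives=6, size_tb=28, parity=2))   # RAIDZ2: 4 data + 2 parity
print(usable_tb(drives=6, size_tb=28, parity=1))   # single parity, for comparison
print(round(rebuild_hours(28, mb_per_s=150), 1))   # assumed 150 MB/s
```

Six drives give 112TB in RAIDZ2 versus 140TB with single parity, so the second parity drive costs 28TB. Even the best-case rebuild at the assumed rate runs to roughly two days, and a busy array will take longer.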

Bonus: small infrastructure truths we re-confirmed

The throughline

Almost every finding pointed the same way: the more local and open the device, the fewer surprises. Cloud-auth gadgets, vendor drive lock-in, and unreliable app features were the recurring friction; the pieces that "just worked" were the open ones — Zigbee/ZHA, ESPHome, ZFS, self-hosted services.

Next week: turning the research into an actual wiring plan and starting the foundation. We'll publish what works and what doesn’t.