Part of the build-in-public series. Everything here is something we actually tested, measured, or hit head-first last week — written up as findings, not a sales pitch.
The theme of the week was local-first home automation: a smart home that runs entirely on our own hardware — no vendor clouds, no subscriptions. Most of the work was research: figuring out what actually works before spending a dollar. Here's what we learned, with the technical details that matter.
Finding 1: A local LLM voice assistant fails silently if the model can't "think"
We wired Home Assistant's Assist pipeline to a local Ollama model (running on a GPU node over the LAN) so we could query the house with nothing leaving the network. Every request returned HTTP 500. Connectivity was fine — the real exception, buried in system_log, was:
ollama._types.ResponseError: "<model>" does not support thinking (status code: 400)
The conversation agent had "Think before responding" enabled against a non-reasoning model, so every call 400'd downstream and surfaced as a generic 500. A leftover ConnectionError from an earlier config attempt made it look like a network problem.
Takeaway: match Assist's feature flags to the model's real capabilities, and always read system_log/logbook for the underlying error — the surface message lies. For voice that controls devices (not just chats), you also need a tool-calling-capable model; small 3B/8B general models are weak at it.
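One way to catch this mismatch before flipping any switches is to ask Ollama what the model advertises. The sketch below assumes a recent Ollama release whose /api/show response includes a `capabilities` list (older builds may omit it, in which case everything reads as unsupported). The dictionary keys are labels made up for this example, not Home Assistant option names.

```python
import json
import urllib.request


def fetch_capabilities(base_url: str, model: str) -> set[str]:
    """POST to Ollama's /api/show and return the advertised capabilities."""
    req = urllib.request.Request(
        f"{base_url}/api/show",
        data=json.dumps({"model": model}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        # Assumption: the response carries a "capabilities" list.
        return set(json.load(resp).get("capabilities", []))


def safe_assist_flags(capabilities: set[str]) -> dict[str, bool]:
    """Map capabilities to the two Assist options discussed above."""
    return {
        # Hypothetical labels for this sketch, not real HA setting names.
        "think_before_responding": "thinking" in capabilities,
        "control_devices": "tools" in capabilities,
    }
```

Calling `safe_assist_flags(fetch_capabilities("http://<gpu-node>:11434", "<model>"))` against the GPU node would show whether "Think before responding" and device control are worth enabling for that model.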
Finding 2: One pinned model can starve a whole GPU
The assistant was also painfully slow. Cause: a 14B model had been loaded with an effectively infinite keep_alive (it showed an expiry decades out), pinning ~11GB of a 16GB card. The smaller assistant model couldn't fit in the remainder and crashed on load with a CUDA OOM. ollama ps made it obvious. Unloading the squatter dropped response time from "spinning forever" to ~2s warm.
Takeaway: on a shared GPU, audit keep_alive and ollama ps regularly. We also mapped every model across both AI nodes and found nine models doing the work of four — the same text job (summaries, rewrites, chat) was spread across four different models. Consolidating to one general text model per node frees both VRAM and disk.
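The same audit can be scripted against Ollama's /api/ps endpoint, which reports `expires_at` and `size_vram` per loaded model. This is a rough sketch, not what we ran: the one-day cutoff is an arbitrary assumption, and the timestamp parsing deliberately ignores sub-second digits and the UTC offset, so `now` should be local wall-clock time on the Ollama host.

```python
from datetime import datetime, timedelta


def find_squatters(
    ps_response: dict,
    now: datetime,
    max_keep: timedelta = timedelta(days=1),  # assumed cutoff
) -> list[tuple[str, float]]:
    """Return (model name, VRAM in GiB) for models pinned longer than max_keep."""
    squatters = []
    for m in ps_response.get("models", []):
        # Keep only "YYYY-MM-DDTHH:MM:SS"; fractional seconds and the
        # offset are dropped, which is fine for spotting far-out expiries.
        expires = datetime.strptime(m["expires_at"][:19], "%Y-%m-%dT%H:%M:%S")
        if expires - now > max_keep:
            squatters.append((m["name"], round(m["size_vram"] / 2**30, 1)))
    return squatters
```

Feed it the parsed JSON from `GET /api/ps`; anything it returns is a candidate for unloading or for a shorter keep_alive.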
Finding 3: Cheap smart plugs have a hidden auth wall (KLAP)
We tried to adopt some older Wi-Fi smart plugs into HA. With correct credentials and 2FA off, local auth still failed. A direct protocol probe revealed that the model family negotiates KLAP encryption, whose local handshake is keyed to a hash of the vendor cloud account's credentials. After a password reset, the plug keeps serving its old local credential hash until it re-syncs with the cloud.
Takeaway: "local control" is a spectrum. Cloud-auth Wi-Fi gadgets carry a dependency even when you only want LAN control. We're standardizing on a Zigbee 3.0 coordinator via ZHA (the SONOFF EFR32MG21 dongle) — and a key detail: mount the dongle on a shielded USB 2.0 extension, never a USB 3.0 port. USB 3 radiates 2.4GHz noise that wrecks Zigbee range. Locks are the exception — the best HA-local locks are still Z-Wave, so that's a second small USB stick alongside the Zigbee one.
Finding 4: Appliance automation lives or dies on two specs
The "put a dumb appliance on a smart plug" trick only works if the appliance has a mechanical on/off switch and resumes its last state after power is restored. Digital touch-button devices boot to OFF and wait for a press, so power-cycling the plug does nothing. This bites hardest on digital appliances such as coffee makers and dehumidifiers: many reset to OFF. Compressor appliances like dehumidifiers carry a second hazard, because short-cycling a compressor via a plug can damage it.
The local path for a digital appliance is an ESP32 + relay wired across the button (ESPHome), not a plug. On the ESP32-C3 specifically, mind the boot/strapping pins (avoid GPIO2/8/9 and the USB pins 18/19 for relay outputs), run sensors at 3.3V logic, and wire the button to the relay's normally-open contacts so that an unpowered or rebooting board leaves the button unpressed (fails safe).
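The pin guidance is easy to encode as a pre-flight check when planning a board. This sketch only knows the avoid list stated above; it does not know which pins a particular ESP32-C3 module actually breaks out, and the function name is made up for illustration.

```python
# Pins the text above says to keep relay outputs away from on an ESP32-C3.
C3_STRAPPING_PINS = {2, 8, 9}
C3_USB_PINS = {18, 19}


def check_relay_pin(gpio: int) -> str:
    """Classify a candidate relay GPIO against the avoid list."""
    if gpio in C3_STRAPPING_PINS:
        return f"GPIO{gpio}: strapping pin, avoid"
    if gpio in C3_USB_PINS:
        return f"GPIO{gpio}: USB pin, avoid"
    return f"GPIO{gpio}: not on the avoid list"
```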
Takeaway: before automating any appliance, confirm (a) mechanical switch and (b) power-loss resume behavior. Those two specs pick the entire approach — plug vs. relay-mod vs. native integration.
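As a sketch, the decision reduces to a few lines. The `has_local_integration` input and the choice to prefer a native integration whenever one exists are assumptions added for the example; the two spec checks come straight from the findings above.

```python
def pick_approach(
    mechanical_switch: bool,
    resumes_after_power_loss: bool,
    has_local_integration: bool = False,  # assumed third input
) -> str:
    """Choose plug vs. relay mod vs. native integration for one appliance."""
    if has_local_integration:
        # Assumption: a local native integration beats any hardware hack.
        return "native integration"
    if mechanical_switch and resumes_after_power_loss:
        return "smart plug"
    # Digital buttons or boot-to-OFF behavior: a plug cannot turn it on.
    return "relay mod"
```

For example, `pick_approach(mechanical_switch=False, resumes_after_power_loss=False)` returns "relay mod", which matches the touch-button case.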
Finding 5: Sensor data needs sanity-checking before a dashboard trusts it
- An Ookla speed-test integration reported a ping of 1,800,000 ms regardless of actual latency — a broken field. Download/upload were accurate, so we dropped the ping chip and surfaced the test server instead.
- A "single device spamming DNS: possible malware" alert fired… on our own server. A busy host legitimately bursts DNS: a headless browser restarting re-resolves hundreds of domains in one poll window, easily exceeding the 150 queries/min threshold meant for client devices. We tuned the automation's condition to exempt infrastructure hosts (by IP/hostname) while keeping the tight threshold on phones/TVs/IoT.
Takeaway: alerts need context, not just thresholds. The fix wasn't a higher global limit (that would miss a real infected phone) — it was segmenting the rule by device class.
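Both fixes boil down to small guard functions. The sketch below is illustrative rather than the automation we shipped: the exempt hosts and the 5,000 ms ping ceiling are placeholder assumptions, while the 150 queries/min limit is the client threshold mentioned above.

```python
CLIENT_DNS_LIMIT = 150  # queries/min, the client-device threshold from the text
INFRA_HOSTS = {"192.168.1.10", "server.lan"}  # hypothetical exemptions
MAX_PLAUSIBLE_PING_MS = 5_000  # assumed sanity bound, tune to taste


def plausible_ping(ping_ms: float) -> bool:
    """Reject readings that cannot be a real round-trip time."""
    return 0 < ping_ms <= MAX_PLAUSIBLE_PING_MS


def should_alert(host: str, queries_per_min: int) -> bool:
    """Apply the tight DNS threshold to clients only, never to infrastructure."""
    if host in INFRA_HOSTS:
        return False
    return queries_per_min > CLIENT_DNS_LIMIT
```

The global limit stays untouched, so a phone at 400 queries/min still trips the alert while the server's restart burst does not.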
Finding 6: The iOS notification history can't be trusted — so we kept our own
Pushes were delivering to the phone, but the companion app's notification history was empty (a known iOS-app limitation). Rather than fight it, we added an automation that triggers on the call_service event for the notify domain and mirrors every message into HA's logbook against a dedicated sensor. Now there's a searchable 7-day record rendered on the dashboard, independent of the phone app.
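trigger: { platform: event, event_type: call_service, event_data: { domain: notify } }
action:
  - service: logbook.log
    data:
      name: "{{ trigger.event.data.service_data.title | default('Notification') }}"
      message: "{{ trigger.event.data.service_data.message }}"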
Takeaway: when a vendor feature is flaky, capture the data yourself at the source event. The logbook becomes the source of truth.
Finding 7: Big-capacity storage forces real decisions
Designing a media + camera storage backbone surfaced concrete constraints:
- A popular NAS line caps supported drive size at 24TB and increasingly requires the vendor's own branded drives on its current generation, a hard wall if you want 28TB+ disks. We chose an x86 NAS that accepts any drive (and can run TrueNAS Scale / ZFS) instead.
- At large drive sizes, single-parity RAID is genuinely risky: rebuilding a 28TB drive takes days, during which the array is degraded and the surviving drives are under maximum stress — exactly when a second failure is most likely, and with single parity that's total loss. Double parity (RAID-6 / RAIDZ2) is mandatory at this scale. Six 28TB drives in RAIDZ2 land ~112TB usable (4 data + 2 parity).
- Camera recording and media should not share a spindle. Frigate's sustained 24/7 writes go to a dedicated CMR drive in the compute host; the media library lives on the redundant array. And RAID is not backup — the irreplaceable slice (configs, photos) still needs an offsite copy.
Takeaway: "expandable" is about the drive policy, not the bay count — and RAID level must scale with drive size.
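The capacity and rebuild-time arithmetic behind those bullets is simple enough to sketch. The usable figure is raw data-drive capacity before ZFS overhead, and the rebuild estimate assumes a constant sustained throughput, which is optimistic for an array that is still serving reads.

```python
def usable_tb(drives: int, drive_tb: float, parity: int) -> float:
    """Raw usable capacity: every drive that is not parity holds data."""
    if drives <= parity:
        raise ValueError("need more drives than parity devices")
    return (drives - parity) * drive_tb


def rebuild_hours(drive_tb: float, mb_per_s: float) -> float:
    """Best-case hours to rewrite one whole drive at a steady rate."""
    return drive_tb * 1e12 / (mb_per_s * 1e6) / 3600
```

`usable_tb(6, 28, 2)` gives 112.0, matching the RAIDZ2 figure above. At an assumed 100 MB/s sustained, `rebuild_hours(28, 100)` is roughly 78 hours, which is where "takes days" comes from.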
Bonus: small infrastructure truths we re-confirmed
- Run a neutral to every switch box you open. Older homes (no neutral at the switch) are the single biggest blocker to smart switches; pulling neutral now makes every future smart switch trivial.
- Solid vs stranded matters. In-wall structured cable is solid copper and must be punched down (keystone jacks + patch panel), not crimped into plugs — and avoid copper-clad-aluminum ("CCA") entirely; it fails PoE and fire code.
- PoE budget, not port count, is the real constraint. A pile of access points + cameras + a PoE light can exceed a small switch's wattage long before you run out of ports; size the switch by watts.
- An HA controller should boot from NVMe, not an SD card. HA's recorder database writes constantly and kills SD cards; moving to an M.2 SSD is the single biggest reliability upgrade for a Home Assistant host.
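For the PoE point in the list above, sizing by watts is a one-line sum. Every number in the usage example is a made-up placeholder; real budgets and per-device draws come from the switch and device datasheets.

```python
def poe_headroom(budget_w: float, loads_w: dict[str, float]) -> float:
    """Watts left on the switch after all powered devices; negative means over budget."""
    return budget_w - sum(loads_w.values())


# Hypothetical example: a 60 W switch with six powered devices.
example_loads = {
    "ap_upstairs": 13, "ap_downstairs": 13,
    "cam_front": 8, "cam_back": 8, "cam_garage": 8,
    "poe_light": 15,
}
```

`poe_headroom(60, example_loads)` comes out to -5: six devices, well short of filling an 8-port switch, yet already 5 W over budget.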
The throughline
Almost every finding pointed the same way: the more local and open the device, the fewer surprises. Cloud-auth gadgets, vendor drive lock-in, and unreliable app features were the recurring friction; the pieces that "just worked" were the open ones — Zigbee/ZHA, ESPHome, ZFS, self-hosted services.
Next week: turning the research into an actual wiring plan and starting the foundation. We'll publish what works and what doesn’t.