A $100 single-board computer, the Khadas Edge2, is running real-time drone detection on its built-in neural processing unit, hitting the camera's 46 FPS capture ceiling while consuming roughly 140 MB of resident memory per stream, with no GPU server in the building. The work, posted as a Show HN project on GitHub by user alebal123bal, reads less as a research contribution than as a reproducible blueprint for low-footprint, sensor-rate edge vision.
The Khadas Edge2 is built around Rockchip's RK3588S, a system-on-chip that integrates a CPU, a GPU, and a roughly 6 TOPS neural processing unit (NPU) on the same die. The pipeline runs inference on the NPU and hands pixel handling to the chip's image signal processor (ISP) and 2D raster graphics accelerator (RGA), fixed-function blocks designed for image work. The CPU is left to coordinate rather than crunch.
That division of labor is the spine of the project. The repository is a chain of small independent processes tied together with Unix-domain sockets, a local inter-process channel that avoids the overhead of network protocols. One process captures frames from a 1080p camera connected over MIPI CSI (the standard embedded-camera link used on smartphones and single-board computers) at up to 46 FPS, the camera's hardware ceiling. Another converts color space on the RGA. A third runs two variants of YOLOv8n, a compact object-detection model, in parallel across the NPU's three cores. The author's README claims this NPU parallelism lifts throughput from roughly 31 FPS to the 46 FPS the camera can deliver, with each stream holding about 140 MB of resident memory.
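The two throughput figures translate into per-frame time budgets, which is often the easier way to reason about a sensor-rate pipeline. The sketch below is plain arithmetic on the README's numbers. The function names are illustrative, and the worker estimate assumes ideal linear scaling across equal workers, which real NPU cores will not deliver exactly.

```python
import math

CAMERA_FPS = 46.0    # capture ceiling reported in the README
BASELINE_FPS = 31.0  # throughput reported before NPU parallelism


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def min_parallel_workers(target_fps: float, per_worker_fps: float) -> int:
    """Smallest number of equal workers that reaches target_fps.

    Assumes ideal linear scaling, an optimistic simplification.
    """
    return math.ceil(target_fps / per_worker_fps)


budget = frame_budget_ms(CAMERA_FPS)      # time per frame at the camera ceiling
baseline = frame_budget_ms(BASELINE_FPS)  # time per frame before parallelism
speedup = CAMERA_FPS / BASELINE_FPS       # throughput gain needed to keep up
workers = min_parallel_workers(CAMERA_FPS, BASELINE_FPS)

print(f"budget at 46 FPS: {budget:.1f} ms per frame")
print(f"baseline at 31 FPS: {baseline:.1f} ms per frame")
print(f"required speedup: {speedup:.2f}x, at least {workers} ideal workers")
```

Closing the gap means shaving roughly 10 ms off each frame, or overlapping inference so that a new frame starts before the previous one finishes.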
The 46 FPS figure deserves care. It is the camera's ceiling, not a claim about detection quality at that throughput. The repository does not publish mAP (mean average precision, the standard object-detection accuracy metric), miss rate, false-positive rate, or small-object performance, and a GitHub demo is not a deployed counter-UAV system. Real-world drone detection still has to contend with occlusion, range, weather, and adversarial robustness, and this code does not claim to solve any of that.
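To make concrete what a quality number would require, here is a minimal sketch of scoring one frame against hand-labeled ground truth. It is not the repository's code and it is not mAP, which additionally ranks predictions by confidence across a whole dataset. The boxes, the 0.5 IoU threshold, and the function names are all assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def score_frame(predictions, ground_truth, iou_threshold=0.5):
    """Greedy one-to-one matching. Returns (true_pos, false_pos, false_neg)."""
    unmatched = list(ground_truth)
    true_pos = 0
    for pred in predictions:
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            unmatched.remove(best)  # each labeled box can be matched once
            true_pos += 1
    false_pos = len(predictions) - true_pos
    false_neg = len(unmatched)
    return true_pos, false_pos, false_neg


# Hypothetical boxes: one good detection, one spurious box, one missed drone.
ground_truth = [(10, 10, 50, 50), (100, 100, 120, 120)]
predictions = [(12, 12, 52, 52), (200, 200, 220, 220)]

tp, fp, fn = score_frame(predictions, ground_truth)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
miss_rate = 1.0 - recall
print(tp, fp, fn, precision, recall, miss_rate)
```

Running counts like these over a labeled test set, split by object size and range, would yield the figures the repository currently leaves out.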
What it does claim is a pattern. By keeping capture, color conversion, and inference on fixed-function silicon and stitching them together with small processes over Unix-domain sockets rather than a monolithic application, the author turns the RK3588S into something closer to a tiny pipeline operator than a generic Linux box running a neural network. Each stage is small enough to be swapped or profiled independently, and the same architecture should travel to other small-model vision tasks: license-plate reading, traffic analytics, fall detection, industrial inspection, or any workload where the constraint is sensor rate, not model size.
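The stage-to-stage handoff can be sketched in a few lines. The example below is a stand-in, not the project's code: it assumes a simple length-prefixed message framing, uses dummy 1 KB payloads in place of camera frames, and runs the producer in a thread over `socket.socketpair` so that it is self-contained. Real stages would be separate processes connecting to a socket bound at a filesystem path.

```python
import socket
import struct
import threading

HEADER = struct.Struct("!I")  # 4-byte big-endian payload length (assumed framing)


def send_frame(sock, payload: bytes) -> None:
    """Send one length-prefixed message."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n: int) -> bytes:
    """Read exactly n bytes, since a stream socket may return short reads."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock) -> bytes:
    """Receive one length-prefixed message."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def capture_stage(sock, frame_count: int) -> None:
    """Stand-in for the capture process: emits dummy 1 KB frames."""
    for i in range(frame_count):
        send_frame(sock, bytes([i % 256]) * 1024)
    sock.close()


def run_demo(frame_count: int = 3) -> list:
    """Pass frames from a producer to a consumer over a Unix-domain socket."""
    producer, consumer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    worker = threading.Thread(target=capture_stage, args=(producer, frame_count))
    worker.start()
    sizes = [len(recv_frame(consumer)) for _ in range(frame_count)]
    worker.join()
    consumer.close()
    return sizes


print(run_demo())
```

Because each stage only sees bytes arriving on a socket, a stage can be replaced, restarted, or timed on its own without touching its neighbors, which is the property the paragraph above describes.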
There is one extra step the author chose to add. After the tracker produces detections, a small language model, Qwen2.5-0.5B, runs on-device and writes a short natural-language recap of what the camera just saw. It is a recap, not a classifier: the model summarizes bounding-box counts and rough trajectories, and it does not infer intent or threat level. In a hobbyist demo it is a quirk. In a fieldable system it would be a debugging and log-compression tool. The point is that an NPU-class edge board can host both a vision model and a small language model at the same time, with the CPU still largely free.
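One plausible way to feed such a model is to condense tracker output into a short text prompt first. The sketch below shows only that condensing step and makes no model call. The `(frame, track_id, cx, cy)` tuple layout, the prompt wording, and the function names are assumptions for illustration, not the repository's format.

```python
from collections import defaultdict


def summarize_tracks(detections):
    """Condense (frame, track_id, cx, cy) tuples into one line per track.

    Coordinates are assumed to be image pixels, with y increasing downward.
    """
    tracks = defaultdict(list)
    for frame, track_id, cx, cy in detections:
        tracks[track_id].append((frame, cx, cy))

    lines = []
    for track_id in sorted(tracks):
        points = sorted(tracks[track_id])  # order by frame index
        _, x0, y0 = points[0]
        _, x1, y1 = points[-1]
        dx, dy = x1 - x0, y1 - y0
        horiz = "right" if dx > 0 else "left" if dx < 0 else "steady"
        vert = "down" if dy > 0 else "up" if dy < 0 else "steady"
        lines.append(
            f"track {track_id}: {len(points)} detections, "
            f"horizontal {horiz}, vertical {vert}"
        )
    return lines


def build_recap_prompt(detections):
    """Build a text prompt that asks for a recap and nothing more."""
    lines = summarize_tracks(detections)
    header = (
        f"Summarize these {len(lines)} tracked objects in one or two "
        "sentences. Do not infer intent or threat level."
    )
    return "\n".join([header] + lines)


# Hypothetical tracker output: two tracks over two frames.
sample = [(0, 1, 100, 200), (1, 1, 120, 190), (0, 2, 50, 50)]
print(build_recap_prompt(sample))
```

Keeping the prompt to counts and coarse directions matches the recap-only role described above: the language model restates what the detector already measured.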
The repository itself is small. As of this writing, its GitHub page shows four commits, one star, and no forks, and the headline throughput figure is the author's own measurement rather than an independent benchmark. Public specs for Rockchip's RK3588S put the integrated NPU at roughly 6 TOPS, but no third-party review, paper, or vendor validation has surfaced to corroborate the project's throughput claim. The constructive read is that a researcher, a small lab, a school, or a town can stand up real-time UAV sensing on $80 to $150 of RK3588S hardware. What to watch is whether the same pattern lands on a different small-model vision task, because the architectural idea (independent processes over Unix-domain sockets, fixed-function acceleration, multi-core NPU parallelism) is the part that travels. The drone detector is the demo; the blueprint is what is worth replicating.