SafeVision — Industrial Computer Vision Safety Platform
Real-time PPE, restricted-zone and machinery-proximity violation detection for automotive plants — RTSP camera frames to email alerts in under 5 seconds, six microservices, YOLOv8 + ByteTrack on shared memory.
Demo · Live recording
Problem
Automotive plants run on cameras that nobody watches in real time. PPE compliance, restricted-zone discipline and safe distance from moving machinery are audited after the fact — typically when an incident has already happened. I wanted to see how far a single-developer build could go toward closing that loop: take the cameras a plant already has, surface violations on a dashboard within seconds, and route alerts the same way an operator already gets notified. The product had to behave like a serious industrial system — multi-camera, multi-rule, multi-plant — not a demo that falls over the moment two streams hit it at once.
Real-time computer vision on RTSP is a backpressure problem first and a model problem second. A single decoded 1080p frame is around 6 MB (1920 x 1080 pixels at 3 bytes each); pushing those through Redis would saturate the broker and starve every other consumer. Per-camera tracker state has to survive frame drops without re-identifying the same worker as a new entity. YAML safety rules, hot-reloaded on file change, must apply atomically so a half-written file never produces phantom violations. And the operators who write the rules are not engineers, so the system needed a way for them to express rules in plain language, not just YAML.
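The frame-size figure can be checked with quick arithmetic. The sketch below assumes 8-bit, 3-channel decoded frames; the frame rate and camera count are illustrative assumptions, not numbers from this project.

```python
# Back-of-envelope frame bandwidth.
# FPS and CAMERAS are assumed values chosen for illustration only.
WIDTH, HEIGHT, CHANNELS = 1920, 1080, 3  # decoded 1080p, one byte per channel

FPS = 15      # assumption
CAMERAS = 4   # assumption

frame_bytes = WIDTH * HEIGHT * CHANNELS
bytes_per_second = frame_bytes * FPS * CAMERAS

print(f"one frame: {frame_bytes / 1e6:.2f} MB")
# one frame: 6.22 MB
print(f"{CAMERAS} cameras at {FPS} fps: {bytes_per_second / 1e6:.0f} MB/s")
# 4 cameras at 15 fps: 373 MB/s
```

Even at these modest assumed settings the raw pixel traffic is hundreds of megabytes per second, which is why the frames themselves are kept off the message broker.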
Solution
Six microservices wired through Redis Streams with consumer groups for at-least-once delivery. Frame bytes travel out-of-band via POSIX shared memory: ingestion writes a UUID-named SHM block and publishes the reference; inference reads, runs YOLOv8 + ByteTrack, then unlinks. The rule engine watches the YAML directory with inotify and atomically swaps the in-memory rule set behind a lock. The Next.js 14 dashboard streams incidents over WebSocket and exposes a chat-based rule builder that calls an OpenRouter-hosted Llama to convert plain English into validated YAML. Postgres + TimescaleDB holds incident metadata; MinIO holds 20-second evidence clips (10s pre + 10s post, face-blurred when configured). Prometheus, Grafana and Loki provide the observability surface.
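One common way to get a clip with footage from before the trigger is a fixed-size ring buffer of recent frames. The sketch below is an assumption about how the 10s pre + 10s post window could be built, not the project's actual code; `ClipBuffer`, `push` and `trigger` are hypothetical names.

```python
from collections import deque


class ClipBuffer:
    """Keeps the last `pre_seconds` of frames and, once triggered,
    collects `post_seconds` more before handing back the whole clip."""

    def __init__(self, fps: int, pre_seconds: int = 10, post_seconds: int = 10):
        self._pre = deque(maxlen=fps * pre_seconds)  # oldest frames fall off automatically
        self._post_needed = fps * post_seconds
        self._clip = None       # frames of the clip being recorded, if any
        self._remaining = 0     # post-trigger frames still to collect

    def trigger(self) -> None:
        """Start a clip from the buffered pre-roll; ignored while one is recording."""
        if self._clip is None:
            self._clip = list(self._pre)
            self._remaining = self._post_needed

    def push(self, frame):
        """Feed one frame. Returns the finished clip as a list, or None."""
        self._pre.append(frame)  # always keep the pre-roll current
        if self._clip is None:
            return None
        self._clip.append(frame)
        self._remaining -= 1
        if self._remaining <= 0:
            clip, self._clip = self._clip, None
            return clip
        return None
```

In use, the frame loop would call `push` for every frame and `trigger` when a violation is reported; the returned list is what would be encoded and uploaded as the evidence clip.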
Architecture
RTSP camera → Ingestion (PyAV + SHM) → Inference (YOLOv8 + ByteTrack) → Rule Engine (YAML, hot reload) → Incident (FastAPI + Postgres + MinIO clips) → Notification (SMTP) → Email inbox. All inter-service hops are Redis Streams with consumer groups; OpenTelemetry trace IDs propagate through stream message headers.
Key decisions
Frames travel via POSIX shared memory, not Redis
A decoded 1080p frame is ~6 MB. Putting that on a Redis Stream saturates the broker and starves every other consumer. Writing the bytes to a UUID-named SHM block and publishing only the reference keeps Redis for what it is good at, coordination and small events, while the frame data stays on the same host, where the inference process maps the block directly instead of receiving a copy over the network. This is safe for single-host plant deployments; a remote inference fan-out would need a different transport.
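The handoff can be sketched with Python's standard `multiprocessing.shared_memory` module. This is a minimal illustration under assumptions: the function names and the shape of the reference dict are hypothetical, and the Redis publish step is left out so the example runs on its own.

```python
import uuid
from multiprocessing import shared_memory


def publish_frame(frame: bytes) -> dict:
    """Producer side: copy the frame into a new SHM block and return a
    small reference that could be sent over a stream instead of the pixels."""
    # Name kept short because some platforms cap shared-memory name length.
    name = "frm" + uuid.uuid4().hex[:24]
    shm = shared_memory.SharedMemory(name=name, create=True, size=len(frame))
    try:
        shm.buf[:len(frame)] = frame
    finally:
        shm.close()  # drops this mapping; the block lives on until unlink()
    # The block may be rounded up to a page size, so the true length travels too.
    return {"shm_name": name, "size": len(frame)}


def consume_frame(ref: dict) -> bytes:
    """Consumer side: attach by name, read the frame, then free the block."""
    shm = shared_memory.SharedMemory(name=ref["shm_name"])
    try:
        # Copied out here for simplicity; a real consumer could wrap shm.buf
        # in an array view and run inference without this copy.
        data = bytes(shm.buf[:ref["size"]])
    finally:
        shm.close()
        shm.unlink()  # the reader owns cleanup, matching "reads, runs, then unlinks"
    return data
```

Because the reader unlinks the block, each reference is meant to be consumed once; a message that is redelivered after the unlink would fail to attach and has to be treated as already handled.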
Hot-reload rules under a lock, no consumer restart
Operators iterate on YAML rules many times an hour while tuning a new camera. Forcing a service restart on every change would mean dropped frames during the gap. The rule engine watches the directory with inotify, re-parses the entire YAML set on any change, and swaps the in-memory rule reference under a lock — the next frame the consumer reads sees the new rules atomically, with zero downtime and zero half-loaded states.
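The swap itself can be sketched as follows. The example assumes a hypothetical `loader` callable that re-reads and parses the whole rule directory; the file watcher and the YAML parsing are left out so the snippet depends only on the standard library.

```python
import threading
from typing import Callable


class RuleStore:
    """Holds the active rule set; reload() replaces it in one step or not at all."""

    def __init__(self, loader: Callable[[], dict]):
        self._loader = loader          # hypothetical: parses every rule file
        self._lock = threading.Lock()
        self._rules = loader()

    def reload(self) -> bool:
        """Intended to be called from the file-watch callback.
        Returns False and keeps the old rules if parsing fails."""
        try:
            new_rules = self._loader()  # slow work happens outside the lock
        except Exception:
            return False                # e.g. a half-written file
        with self._lock:
            self._rules = new_rules     # only a reference swap is locked
        return True

    def snapshot(self) -> dict:
        """Rule set to evaluate the current frame against."""
        with self._lock:
            return self._rules
```

A consumer would call `snapshot()` once per frame and evaluate every rule against that one object, so a reload that lands mid-frame only takes effect on the next frame.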
Chat-based rule builder, validated YAML output
Safety officers know the policy; they do not write YAML. An OpenRouter-hosted Llama model converts plain English into a candidate rule, the output is validated against a Pydantic schema before the file is written, and the engine hot-loads it. The audit trail stays in YAML, which is diff-friendly and reviewable, but the authoring surface is conversation.
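The validate-before-write step can be illustrated with the standard library alone. In this sketch a plain dataclass stands in for the Pydantic model, and the field names (`name`, `kind`, `camera_id`), the allowed kinds and both function names are assumptions made for illustration. The temp-file-then-rename write is likewise an assumed technique for keeping a watcher from seeing a partial file.

```python
import os
import tempfile
from dataclasses import dataclass

# Assumed rule kinds, taken from the three violation types named in the summary.
ALLOWED_KINDS = {"ppe", "restricted_zone", "machinery_proximity"}


@dataclass(frozen=True)
class Rule:
    """Stand-in for the Pydantic schema; fields are hypothetical."""
    name: str
    kind: str
    camera_id: str

    def __post_init__(self):
        for field_name in ("name", "kind", "camera_id"):
            value = getattr(self, field_name)
            if not isinstance(value, str) or not value:
                raise ValueError(f"{field_name} must be a non-empty string")
        if self.kind not in ALLOWED_KINDS:
            raise ValueError(f"unknown rule kind: {self.kind!r}")


def validate_rule(candidate: dict) -> Rule:
    """Raises TypeError on missing or extra keys, ValueError on bad values."""
    return Rule(**candidate)


def save_rule_text(text: str, path: str) -> None:
    """Write to a temp file in the same directory, then rename into place,
    so the final path never holds a partially written rule."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    os.replace(tmp_path, path)
```

The intended order is `validate_rule` on the parsed model output first and `save_rule_text` only if it returns, so an invalid suggestion never reaches the rule directory.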
At-least-once over exactly-once, with idempotent consumers
Every Redis Stream uses consumer groups with explicit XACK and MAXLEN trim policies. Consumers are written to be idempotent on incident ID, so a redelivered message is a no-op. That is cheaper than coordinating exactly-once delivery across six services, and it tolerates broker restarts, network blips and the inference service catching up after a stall.
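Idempotency on incident ID can be sketched without a running broker. Here an in-memory dict stands in for the incidents table and a callback stands in for XACK; `IncidentStore`, `handle` and the message keys are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class IncidentStore:
    """In-memory stand-in for the incidents table, keyed on incident ID."""
    rows: dict = field(default_factory=dict)

    def insert_if_absent(self, incident_id: str, payload: dict) -> bool:
        """Returns True if a new row was created, False if it already existed.
        In SQL the same effect is commonly had with INSERT ... ON CONFLICT DO NOTHING."""
        if incident_id in self.rows:
            return False
        self.rows[incident_id] = payload
        return True


def handle(store: IncidentStore, message: dict, ack: Callable[[str], None]) -> bool:
    """Process one stream entry; safe to call again if the entry is redelivered."""
    created = store.insert_if_absent(message["incident_id"], message)
    # Acknowledge only after the write. A crash before this line means the
    # entry is delivered again, and the second pass finds the row already there.
    ack(message["entry_id"])
    return created
```

Delivering the same entry twice creates one row and two acknowledgements, which is the behaviour at-least-once delivery requires from a consumer.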
Reflection
The platform forced me to take real-time backpressure seriously. The first cut shipped frames through Redis and fell over at two cameras; moving to POSIX shared memory and accepting at-least-once delivery with idempotent consumers was the unlock. The chat-based rule builder turned out to be the feature operators reach for first — even when they could write YAML, they preferred describing the rule and reading back the diff. That validated keeping the LLM as an authoring surface, not a runtime dependency.