Computer vision systems
Detection, segmentation, tracking. Built around your data, evaluated on your metrics, deployed to your stack.
Production-grade AI systems engineered to run at scale. Eight years of ML in production — not in slides.
From a vague problem statement to a system running in production. The middle 80% is where most of the work lives.
VLMs for retrieval, captioning, document understanding. Fine-tuned where it matters — not just prompted.
End-to-end indexes from photo to product. Index design, ANN, rerankers, eval — the parts most demos skip.
Invoice, form, receipt and ID extraction. High recall, audited error modes, human-in-the-loop where required.
Quantize, profile, ship. Sub-100ms inference on CPU, GPU and embedded targets.
A selection of production work across retail, logistics, industrial automation and healthcare. Detection, retrieval, OCR, VLMs — and the unglamorous pipeline work that makes any of it real.
End-to-end pipeline combining monocular depth estimation (Depth Anything V3) and semantic segmentation (OneFormer) to auto-generate per-camera monitoring polygons — enabling people-in-distress detection across venue deployments.
Feature-matching pipeline using RANSAC to quantify field-of-view overlap between camera pairs. Produced per-pair overlap scores and camera-graph layouts that informed placement and eliminated blind spots.
Built and deployed a state-of-the-art visual product retrieval model powering real-time product discovery from a phone photo. Surpassed Google Vertex AI multimodal embeddings on the same eval set.
Explored Vision-Language Models to automate parcel-sorting validation in warehouses and extract structured text from customs labels. Replaced brittle template OCR with prompted VLM extraction.
Built and optimized a Vision Transformer-based pipeline for video analysis of airport luggage footage, extracting flight numbers and license plates with high accuracy in operational conditions.
Automated recognition of retail basket contents using feature-based matching. Layered consumer-behavior analysis on top of the recognition stream to surface buying patterns.
OCR and semantic analysis for sender/receiver address extraction. Optimized deep-learning pipelines for real-time barcode detection on small packages on a moving belt.
Siamese network trained on retinal fundus images to learn low-dimensional representations that minimize intra-class distance and maximize inter-class distance. One-shot retrieval at inference.
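For readers who want the idea behind the Siamese setup in concrete terms, here is a minimal sketch. It assumes a standard pairwise contrastive loss and nearest-reference lookup, since the loss and retrieval method used in the project are not stated here. The function names, the margin value and the class labels are all hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Pairwise contrastive loss (an assumed choice, not necessarily the one used).

    Same-class pairs are penalized by their squared distance, which pulls
    them together. Different-class pairs are penalized only while they are
    closer than `margin`, which pushes them apart.
    """
    d = euclidean(emb_a, emb_b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2

def one_shot_retrieve(query, gallery):
    """Return the label whose single reference embedding is nearest to the query.

    `gallery` maps each label to exactly one reference embedding (one shot).
    """
    return min(gallery, key=lambda label: euclidean(query, gallery[label]))

# Hypothetical 2-D embeddings, for illustration only.
gallery = {"class_a": [1.0, 0.0], "class_b": [0.0, 1.0]}
predicted = one_shot_retrieve([0.9, 0.1], gallery)
```

In practice the embeddings would come from the trained network, and the gallery would hold one labeled reference image per class.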
WeConverge is an applied AI engineering practice focused on production computer vision and machine-learning systems. We work with operators in retail, logistics, industrial automation and healthcare — the places where models have to actually run, not just demo.
Eight years of shipping ML systems in production: state-of-the-art visual product retrieval at marketplace scale, real-time barcode detection on moving belts, depth + segmentation pipelines that auto-generate camera monitoring zones at venue scale, and VLMs for parcel sorting and customs OCR.
We take one or two engagements at a time — long enough to ship the system and hand it back working. No agency layer, no slide decks standing in for evidence. Just a small team that builds.
If you have a problem worth solving, we'd like to hear about it.
Tell us about it. We'll reply within one business day — usually with a clarifying question or two.