We/Converge

Building intelligent vision systems for real-world impact.

Production-grade AI systems engineered to run at scale. Eight years of ML in production — not in slides.

Business problem · Converge · AI / CV solution

02/Services

Things we do, end-to-end.

From a vague problem statement to a system running in production. The middle 80% is where most of the work lives.

01

Computer vision systems

Detection, segmentation, tracking. Built around your data, evaluated on your metrics, deployed to your stack.

pytorch · detectron · mmdet

02

Vision-language models

VLMs for retrieval, captioning, document understanding. Fine-tuned where it matters — not just prompted.

clip · siglip · llava

03

Visual search & retrieval

End-to-end indexes from photo to product. Index design, ANN, rerankers, eval — the parts most demos skip.

faiss · qdrant · rerank

04

OCR & document AI

Invoice, form, receipt and ID extraction. High recall, audited error modes, human-in-the-loop where required.

layoutlm · donut · paddle

05

Edge AI & deployment

Quantize, profile, ship. Sub-100 ms inference on CPU, GPU and embedded targets.

onnx · tensorrt · openvino

03/Selected Work

Systems we've shipped.

A selection of production work across retail, logistics, industrial automation and healthcare. Detection, retrieval, OCR, VLMs — and the unglamorous pipeline work that makes any of it real.

01 · Industrial · Safety · Shipped · 2026

Automated monitoring zones at scale

End-to-end pipeline combining monocular depth estimation (Depth Anything V3) and semantic segmentation (OneFormer) to auto-generate per-camera monitoring polygons — enabling people-in-distress detection across venue deployments.

Impact: Auto-generated zones across hundreds of cameras · zero manual annotation

Depth Anything V3 · OneFormer · Segmentation · PyTorch

02 · Industrial · Camera networks · Shipped · 2026

Camera overlap analysis

Feature-matching pipeline using RANSAC to quantify field-of-view overlap between camera pairs. Produced per-pair overlap scores and camera-graph layouts that informed placement and eliminated blind spots.

Impact: Overlap graphs across venues · blind-spot elimination

RANSAC · Feature matching · OpenCV · Graph layout

03 · Retail · Visual search · Shipped · 2025

Visual product retrieval at marketplace scale

Built and deployed a state-of-the-art visual product retrieval model powering real-time product discovery from a phone photo. Surpassed Google Vertex AI multimodal embeddings on the same eval set.

Impact: +12.1% Recall@1 · +4.7% Recall@10 vs. Vertex AI multimodal

CLIP · Vertex AI · PyTorch · Embeddings · Retrieval

04 · Logistics · VLM · Shipped · 2024

VLMs for parcel sorting & customs OCR

Explored Vision-Language Models to automate parcel-sorting validation in warehouses and extract structured text from customs labels. Replaced brittle template OCR with prompted VLM extraction.

Impact: Structured customs-label extraction · sorting validation in production

VLM · LLaVA · Hugging Face · Unsloth · vLLM

05 · Aviation · OCR · Shipped · 2023

Vision Transformers for airport luggage

Built and optimized a Vision Transformer-based pipeline for video analysis of airport luggage footage, extracting flight numbers and license plates with high accuracy in operational conditions.

Impact: Flight number and plate extraction on live luggage video

ViT · OCR · TensorRT · Triton

06 · Retail · Scan & Go · Shipped · 2022

Seamless basket recognition

Automated product recognition using feature-based matching of basket contents in retail. Layered consumer-behavior analysis on top of the recognition stream to surface buying patterns.

Impact: Basket-scan automation · behavior insights

Feature matching · PyTorch · Retail

07 · Logistics · Edge · Shipped · 2021

Address OCR + real-time barcodes

OCR and semantic analysis for sender/receiver address extraction. Optimized deep-learning pipelines for real-time barcode detection on small packages on a moving belt.

Impact: Real-time barcode detection · address extraction in production

OCR · Detection · TensorRT · Edge

08 · Healthcare · Retrieval · Shipped · 2019

CBIR for retinal fundus images

Siamese network trained on retinal fundus images to learn lower-dimensional representations minimizing intra-class and maximizing inter-class distance. One-shot retrieval at inference.

Impact: Published · BVM 2020 (Springer Vieweg)

Siamese network · One-shot · Retrieval · TensorFlow

04/About

An applied AI engineering practice.

8 yrs · in production ML
4 · industries served
1–2 · engagements at a time
EU + US · remote-friendly

WeConverge is an applied AI engineering practice focused on production computer vision and machine-learning systems. We work with operators in retail, logistics, industrial automation and healthcare — the places where models have to actually run, not just demo.

Eight years of shipping ML systems in production: state-of-the-art visual product retrieval at marketplace scale, real-time barcode detection on moving belts, depth + segmentation pipelines that auto-generate camera monitoring zones at venue scale, and VLMs for parcel sorting and customs OCR.

We take one or two engagements at a time — long enough to ship the system and hand it back working. No agency layer, no slide decks standing in for evidence. Just a small team that builds.

/ Founder

Azeem Bootwala — AI engineer, computer vision. Sr. CV Engineer at Basic-Fit, previously Data Scientist at Bol.com (Spot & Shop visual search) and six years of CV at Prime Vision. M.Sc. Medical Engineering, Erlangen-Nürnberg; published siamese-network retrieval at BVM 2020.

If you have a problem worth solving, we'd like to hear about it.

05/Contact

Have a problem worth solving?

Tell us about it. We'll reply within one business day — usually with a clarifying question or two.

info@weconverge.nl

Delft · NL · UTC+1