Job Summary
We are looking for an engineer to build and run AI services that combine computer vision (CNNs), large language models (LLMs) with retrieval‑augmented generation (RAG), and autonomous agents.
You’ll take new ideas from our research group and turn them into fast, reliable microservices that run on both edge devices and the cloud, expanding what the deviceWISE AI platform can do.
Responsibilities
- Design and build GPU‑accelerated microservices for vision, LLM, and RAG workloads.
- Own the full model lifecycle: data capture, training, evaluation, packaging, and CI/CD deployment (Docker, Kubernetes, NVIDIA NIM).
- Optimize inference with TensorRT, ONNX Runtime, quantization, batching, and Triton.
- Develop agent orchestration logic (LangChain, CrewAI) that chains tools, prompts, and APIs.
- Build and maintain automated visual inspection systems using computer vision models for quality control, defect detection, and anomaly identification in manufacturing and IoT environments.
- Integrate high‑throughput camera or sensor feeds with cloud knowledge bases for multimodal insights.
- For senior‑level hires: mentor junior engineers, lead code reviews, and document best practices.
Qualifications
- 3+ years of software development experience with strong proficiency in Python and modern AI/ML frameworks (PyTorch, TensorFlow).
- Knowledge of:
  - Neural network architectures (CNNs, transformers, RNNs)
  - Model training, validation, and optimization techniques
  - Computer vision and natural language processing fundamentals
- Hands‑on experience with ML infrastructure tools:
  - Containerization and orchestration (Docker, Kubernetes)
  - Model serving with web frameworks (FastAPI, Flask, or equivalent)
  - GPU computing (CUDA, cuDNN) and performance optimization
- Proficiency in building and deploying microservice architectures.
- Experience with cloud platforms (AWS, GCP, or Azure) and MLOps practices.
- Strong foundation in software engineering principles, version control (Git), and collaborative development.
Preferred Qualifications
- Experience with edge AI deployment and optimization for resource‑constrained devices.
- Familiarity with IoT protocols and time‑series data processing.
- Contributions to open‑source AI/ML projects or research publications.
Location
Remote or on‑site in Boca Raton, Florida, or São Paulo, Brazil.