About Us
Zylon (https://www.zylon.ai) transforms company knowledge into immediate productivity gains through a secure, private AI workspace. Our on-premise deployment ensures complete data sovereignty, cost control, and full customization by running local AI models with no vendor lock-in. As creators and maintainers of the popular open-source project PrivateGPT (https://privategpt.dev), with more than 55K GitHub stars, we are committed to leveraging the latest AI technologies to drive innovation and deliver exceptional value to our customers. We are an early-stage startup serving clients in financial, manufacturing, engineering, and other regulated industries, and we're looking for talented individuals to help drive our growth. We celebrate diversity and are committed to creating an inclusive environment for all employees.
Role Overview
We're seeking an experienced AI Engineer to join our team and play a crucial role in developing and enhancing our Private AI platform. You'll work on the entire AI stack, from GPU management and inference serving to leveraging and optimizing open-source AI models for on-premise bare-metal deployments, ensuring high performance and security for our enterprise clients in regulated industries.
Key Responsibilities
- Design, develop, and deploy AI systems that operate efficiently in on-premise environments with local models, independent of third-party providers like OpenAI or Amazon Bedrock.
- Optimize Small Language Models (SLMs) for private deployments, focusing on performance under resource constraints.
- Contribute to the architecture of our AI platform, ensuring scalability and security.
- Implement advanced prompt engineering techniques and secure data processing pipelines for knowledge extraction and transformation.
- Research, design, and implement agentic strategies to enhance AI model interaction and user experience.
- Stay current with advancements in GenAI research to bring additional value to our product and deepen team understanding of AI models' capabilities and limitations.
- Collaborate with Product & Engineering teams on implementation, product definition, prioritization, scoping, and validation.
- Work with client success teams to troubleshoot and resolve technical issues.
Requirements
- 5+ years of experience in software or AI engineering roles, with a focus on GenAI in the last 2 years.
- Experience with on-premise deployment of AI systems.
- Strong background in integrating and fine-tuning open-source LLMs.
- Proficiency in Python and ML/AI frameworks (PyTorch, LlamaIndex, LangChain, etc.).
- Familiarity with agent strategies, RAG implementations, vector databases, and retrieval systems.
- Knowledge of containerization and deployment technologies (Docker, Kubernetes).
- Strong problem-solving skills, attention to detail, and excellent communication skills in English.
- Eagerness to stay updated with new techniques, models, and advances in GenAI.
- Creative and adaptable, with a lean mindset and a customer-focused approach.
Nice to Have
- Experience with vLLM + NVIDIA Triton architecture, CUDA, and NVIDIA drivers.
- Knowledge of evaluation and observability frameworks (LangSmith, Opik, Arize Phoenix, Ragas, etc.).
- Understanding of Model Context Protocol (MCP) for agentic systems.
- Experience with model quantization and optimization techniques.
- Background in startups or early-stage companies, especially in financial, manufacturing, or engineering domains.
What We Offer
- The opportunity to shape the future of private AI in regulated industries.
- Competitive salary and equity package.
- Work with cutting-edge AI technologies.
- Direct impact on product development and company growth.
- Collaborative, innovative team environment.
- Full-remote position within Europe, with a flexible schedule (40h/week).
- 23 days of PTO, necessary equipment, and periodic team-building events.
Location: Lugo, Spain