A leading AI firm is looking for an AI Engineer to shape its platform using large language models. In this fully remote role, you'll design prompts, develop features, and continuously evaluate model performance, working closely with product managers and engineers to deliver intelligent capabilities. The ideal candidate excels in a collaborative environment and has a strong background in TypeScript and Python.
As an AI Engineer at Reforge, you will play a key role in shaping how our platform uses large language models to deliver value to customers. You’ll focus on building and improving LLM-powered products and features, from crafting high-quality prompts to evaluating and tuning model outputs for optimal performance.
Who you'll work with
This is a highly collaborative, product-focused role where you’ll work closely with product managers and engineers to turn cutting-edge AI capabilities into practical features. This position is fully remote, so we’re looking for someone who is self-motivated and excels at clear, asynchronous communication within a distributed team.
What you'll do
Prompt Engineering: Design, experiment with, and refine prompts and system instructions to maximize the effectiveness and reliability of LLMs across Reforge’s products.
LLM-Powered Features: Collaborate with product and engineering teams to develop and integrate new backend features powered by LLMs, enhancing our Insight Analytics platform with intelligent capabilities.
Continuous Improvement: Continuously evaluate the quality of LLM outputs via internal testing and user feedback. Iterate on prompts, model settings, or data pipelines to improve performance over time and deliver a better user experience.
Model Evaluation: Design and implement evaluation frameworks to assess LLM performance across different use cases. Monitor key metrics and develop automated testing pipelines to ensure consistent quality.
Coding & Integration: Write and maintain code (primarily in TypeScript, with some Python) that supports LLM interactions, including building retrieval-augmented generation (RAG) pipelines, working with vector embeddings, and handling content chunking of large documents (a rough sketch of this kind of work follows below). The majority of your work will be shipping product code, not working in prompting tools or notebooks.
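To give a flavor of the Coding & Integration work, here is a minimal, hypothetical TypeScript sketch of the chunking, embedding, and retrieval steps a RAG pipeline involves. It is illustrative only, not Reforge's actual implementation: the chunk size, overlap, toy embed function, and cosine-similarity retrieval are all assumptions made for the example.

```typescript
// Hypothetical sketch: naive chunking plus embedding-based retrieval for a RAG pipeline.
// The embed() function is a stand-in for whatever embeddings provider the team uses.

type Chunk = { text: string; vector: number[] };

// Split a large document into overlapping chunks so each stays within a model's context budget.
function chunkDocument(text: string, chunkSize = 800, overlap = 100): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}

// Placeholder embedding; in practice this would call an embeddings API.
// Deterministic toy vector so the sketch runs without external services.
async function embed(text: string): Promise<number[]> {
  const vector = new Array(8).fill(0);
  for (let i = 0; i < text.length; i++) vector[i % 8] += text.charCodeAt(i) / 1000;
  return vector;
}

// Standard cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b) || 1);
}

// Index a document, then retrieve the chunks most relevant to a query.
async function retrieveRelevantChunks(document: string, query: string, topK = 3): Promise<string[]> {
  const chunks: Chunk[] = await Promise.all(
    chunkDocument(document).map(async (text) => ({ text, vector: await embed(text) }))
  );
  const queryVector = await embed(query);
  return chunks
    .map((c) => ({ ...c, score: cosineSimilarity(c.vector, queryVector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((c) => c.text);
}
```

In a production pipeline the retrieved chunks would be inserted into the prompt sent to the LLM; the sketch stops at retrieval to keep the example self-contained.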