With annual revenues of $400M, Vireo Health is transforming the healthcare landscape by putting data at the center of our business. We are building a data infrastructure that helps us scale through acquisitions, with the aim of growing from $400M to $1B+ in 2 years, and a highly sophisticated business operation that uses data for LLMs, ML, and analytics. Our Saigon AI and data platform team is expanding, and we are looking for two data engineers who can help us develop a well-governed data infrastructure from an early-stage system and architecture. We also use LLMs to inform strategic and operational decisions, and we are seeking candidates committed to learning and growth in this area.
Commitment to Talent: Excellent candidates may transfer to our US office after more than one year with us, if they wish.
Responsibilities
- Support our solutions architect and data platform leader in building a scalable data infrastructure to support acquisitions of 4-5 companies per year, deployment of AI/ML, and scaling delivery and retail sales from $400M to $1.5B.
Data Warehouse & Lakehouse:
- Design, implement, and optimize scalable data warehouse and lakehouse solutions (e.g., Delta Lake, Apache Iceberg, Snowflake, BigQuery).
- Develop ETL/ELT pipelines for efficient ingestion, transformation, and management of structured and unstructured data.
- Ensure data governance, lineage, and security best practices, including access controls and encryption.
- Optimize data storage, partitioning, and query performance for cost efficiency.
Data Platform, Cloud & System Reliability:
- Build and maintain a robust, scalable data platform supporting engineering, analytics, and AI workloads.
- Deploy, manage, and optimize data infrastructure across cloud platforms (AWS, GCP, Azure).
- Enhance system reliability, observability, and cost efficiency through monitoring, logging, and automation.
- Develop SDKs, APIs, and automation tools to improve data engineering workflows.
- Collaborate with Data Analytics, ML Engineers, and Software Engineers to enhance platform capabilities and scalability.
- Integrate data with AI workloads, leveraging LLM techniques such as retrieval-augmented generation (RAG) and fine-tuning.
- Index and manage data within Vector Stores for real-time AI applications.
- Develop external tools and APIs to enable LLMs and AI models to query and interact with data efficiently.
Your skills and experience
- 4+ years of experience in data platform engineering, backend development, or data infrastructure.
- Ability to craft complex SQL queries for data transformation, analysis, and optimization.
- Strong programming skills in Python, Go, or Java for building scalable data solutions and microservices.
- Deep knowledge of distributed data processing frameworks (e.g., Spark, Flink) and hands-on experience with at least one cloud data platform (AWS Glue + Athena, Google BigQuery, Snowflake, or Databricks).
- Proven experience in building reliable data pipelines and ETL workflows using transformation and workflow orchestration tools (e.g., dbt, Airflow, Dagster, Prefect).
- Strong experience in data integration across various sources, including RDBMS, Vector Stores, and real-time data streaming technologies (Kafka, Pulsar, Kinesis).
- Proficiency in containerization and orchestration with Docker and Kubernetes for scalable data and AI workloads.
- Experience designing and optimizing data warehouse and data lakehouse architectures (e.g., Delta Lake, Apache Iceberg).
- Strong problem-solving skills, a product-led mindset, and AI-first thinking, with experience using AI for coding.
- Excellent communication skills in English, with the ability to collaborate across technical and non-technical teams.
- Ability to quickly prototype, test, and iterate solutions and products.
Preferred Qualifications
- Familiarity with vector databases and similarity-search libraries (Pinecone, FAISS, Weaviate, ChromaDB) for AI-driven applications.
- Understanding of data versioning, lineage, and governance (e.g., Apache Atlas, Great Expectations).
Why you'll love working here
- Great compensation
- Stock options in a PUBLIC company (ESOP)
- 13th-month salary and more
- Opportunity to transfer to the US
- Work with LLMs