
A forward-thinking tech company in Mpumalanga is seeking a candidate to own the ETL pipeline processing raw PDFs into structured resources. The role involves working with ML pipelines, implementing content freshness logic, and collaborating closely with team leads on UX integration. Ideal applicants will have experience with semantic search and a passion for transforming unstructured data into valuable insights.
INFUSE is committed to complying with applicable data privacy and security laws and regulations. For more information, please see our Privacy Policy.
INKHUB is ingesting 10 million raw PDFs to build the Internet's richest catalog of marketing-grade B2B content — tagged, summarized, and searchable by topic, company, or intent.
Tech stack: Python, PyTorch, sentence-transformers, and OpenAI APIs or similar pretrained LLMs; FastAPI, Milvus or pgvector, PyPDF/Tika, and Airflow or Lambda for orchestration; Docker, GPU scheduling, and Athena/Redshift SQL.
Your models decide what gets found, how it’s tagged, and which content and companies stand out. You’ll help define what “relevance” and “freshness” mean for over a million resources and 50+ company pages, and you’ll make sure INKHUB stays ahead of the curve.