Overview
At Sweatcoin, we’re driven by a shared mission to make the world more active. We value creativity, collaboration, and solving complex problems. Our iOS and Android apps have more than 200M installs and 15M+ active users, with a growing set of partners and independent research demonstrating our impact on activity.
This is an exciting opportunity to make a huge impact on our data platform as we transition to a more scalable and efficient architecture. If you thrive on solving complex data challenges and want to be part of a high-performing team, we’d love to hear from you!
What we offer:
- A team of exceptional people who celebrate our community by being supportive and creative. Our head of data once worked on a mind-reading helmet, one of our software developers runs a certified psychological practice, our QA professionals come from unique backgrounds, and a number of musicians help us stay innovative. We multiply each other’s talents to develop a product we’re all proud of.
- A product that promotes health and fitness in 100 countries, with inspiring stories like this one: https://blog.sweatco.in/one-sweatcoiners-journey-to-100000-steps/
- A startup that actually works. We are self-sufficient and backed by investors to keep growing. We recently set a record of 10M new users in a single week.
- Models that verify steps, so coins can’t be earned by cheating.
- Automated analytics, a modern data stack (Snowplow, BigQuery, Airflow, Looker), and integrated tooling.
What you will do:
- Lead and execute the migration from the current Firebase-BigQuery-Looker stack to a self-hosted stack including Snowplow, Kafka, ClickHouse, Trino, Spark, S3, and Redash.
- Design, develop, and optimize scalable, high-performance data pipelines.
- Automate data processing and workflow orchestration.
- Enhance data infrastructure reliability, scalability, and cost-efficiency.
- Collaborate with engineers and analysts to define best practices for data processing, storage, and governance.
- Develop internal tools for data quality monitoring, lineage tracking, and debugging.
- Optimize query performance and ensure efficient data modeling.
What we expect from you:
- Expertise in Data Engineering: Strong experience building, maintaining, and optimizing ETL/ELT pipelines.
- Strong Coding Skills: Proficiency in Python and SQL for data processing and analytics.
- Distributed Systems Experience: Hands-on experience with Kafka, Trino, ClickHouse, Spark, or similar.
- Cloud & Storage: Experience with S3 or equivalent object storage solutions.
- Infrastructure & Tooling: Proficiency with Docker, Kubernetes, Git, and CI/CD pipelines.
- Orchestration & Automation: Familiarity with workflow orchestration and transformation tools such as Airflow or dbt.
- Analytical Thinking: Ability to optimize system performance and troubleshoot complex data issues.
- Self-Starter Mentality: Comfortable working in a fast-paced, evolving environment with minimal supervision.
- Strong Communication Skills: Fluent English to ensure smooth collaboration within the team.
Nice to have:
- Experience with Snowplow for event tracking and data collection.
- Knowledge of data governance and security best practices.
- Familiarity with machine learning pipelines and real-time analytics.
What you get in return:
- Remote-friendly and flexible working hours. Performance is measured by output, not hours spent; you can work from wherever you want.
- Apple devices for work
- Team-building trips abroad in exciting locations
- Health insurance coverage
- A WellBeing program that covers up to 2 counselling sessions per month
- Unlimited time-off policy
If you feel that this is a match, we’d be excited to have you on our team!