Requirements
Strong understanding of data architecture and data models, with experience leading data-driven projects.
Expertise in data modelling paradigms such as Kimball, Inmon, data marts, Data Vault, and Medallion, with well-formed opinions on when each applies.
Experience with cloud-based data strategies and big data technologies, preferably on AWS. The ability to build backend services in Python for data pipelines is required.
Experience designing data platforms on AWS for batch and stream processing pipelines.
Hands-on experience with AWS managed services and big data tools such as EMR, Glue, S3, Kinesis, DynamoDB, and ECS.
Strong understanding of Apache Spark.
Knowledge of data lake/lakehouse table formats such as Delta Lake, Apache Iceberg, and Apache Hudi.
Experience designing lakehouse architectures using the Medallion pattern is desirable.
Experience designing ETL pipelines covering ingestion, transformation, and data quality, with expertise in Python and SQL (a minimal sketch follows this list).
Proficiency in SQL, including experience integrating SQL with Python in ETL pipelines.
Familiarity with data manipulation libraries such as Pandas, Polars, and DuckDB is a plus.
Experience with data visualization tools such as Tableau and Power BI is desirable.
Familiarity with other data platforms such as Azure, Databricks, and Snowflake is a plus.
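As an illustration of the Python-and-SQL ETL work described above, here is a minimal sketch of a batch pipeline in PySpark: it ingests raw JSON from S3, transforms it with Spark SQL, applies a simple data-quality filter, and appends the result to a Delta table. The bucket, paths, schema, and app name are hypothetical placeholders, and Delta Lake support is assumed to be configured on the cluster (e.g. via EMR or Glue).

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("orders-etl")
    # Assumes Delta Lake is available on the cluster; these are the standard
    # Delta session configs.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Ingestion: read raw events from a hypothetical S3 landing zone.
raw = spark.read.json("s3://example-bucket/landing/orders/")
raw.createOrReplaceTempView("raw_orders")

# Transformation: SQL embedded in Python, as the role calls for.
clean = spark.sql("""
    SELECT order_id,
           customer_id,
           CAST(amount AS DOUBLE) AS amount,
           CAST(order_ts AS TIMESTAMP) AS order_ts
    FROM raw_orders
    WHERE order_id IS NOT NULL
""")

# Data quality: reject negative amounts before publishing.
valid = clean.filter("amount >= 0")

# Load: append into a Delta table in a hypothetical silver layer.
valid.write.format("delta").mode("append").save("s3://example-bucket/silver/orders/")
```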
Responsibilities
Participate in designing and developing features for the existing Data Warehouse.
Provide leadership in connecting Engineering, Product, and Analytics/Data Science teams.
Design and update batch ETL pipelines.
Define and implement data architecture.
Collaborate with engineers and data analysts to build reliable datasets for company-wide use.
Work with data orchestration tools such as Apache Airflow, Dagster, or Prefect (a minimal Airflow sketch follows this list).
Adapt to a fast-paced startup environment.
Be passionate about your work and enjoy working in an international environment.
Experience in the telecom industry is a plus but not required.
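As a reference for the orchestration work mentioned above, below is a minimal sketch of a daily batch ETL DAG, assuming Apache Airflow 2.4+ (where the `schedule` argument is available). The DAG id, schedule, and task bodies are hypothetical placeholders; in practice each callable would hold real ingestion, transformation, and load logic.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


# Placeholder task bodies; real pipelines would call out to Spark, Glue, etc.
def extract():
    print("pull raw data from the source system")


def transform():
    print("clean and reshape the extracted data")


def load():
    print("publish the transformed data to the warehouse")


with DAG(
    dag_id="daily_orders_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Ingestion -> transformation -> load, executed in order each day.
    extract_task >> transform_task >> load_task
```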