We are seeking a skilled and motivated Data Scientist with 2–3 years of hands‑on experience in machine learning and applied statistics to join our growing analytics team. The ideal candidate will have a strong foundation in statistical analysis, ML algorithms, model development, and deployment, with a passion for solving real‑world problems using data‑driven and statistically sound approaches.
Responsibilities
Machine Learning & Statistical Modeling
- Design, build, and evaluate supervised and unsupervised ML models (e.g., regression, classification, clustering, recommendation systems).
- Apply statistical modeling techniques such as linear/logistic regression, regularization, Bayesian methods, and time‑series analysis where appropriate.
- Perform feature engineering, model tuning, and validation using cross‑validation, statistical tests, and performance metrics.
Data Preparation, EDA & Statistical Analysis
- Clean, preprocess, and transform large datasets from multiple structured and unstructured sources.
- Conduct exploratory data analysis (EDA) using descriptive statistics, distributions, correlation analysis, and data visualization.
- Use inferential statistics (hypothesis testing, confidence intervals, A/B testing) to support modeling decisions and business insights.
- Evaluate models using both ML metrics (accuracy, precision‑recall, AUC, RMSE) and statistical measures.
- Deploy ML models into production using tools such as Flask, FastAPI, or cloud‑native services (experience here is a plus).
- Monitor model performance, data drift, and statistical stability; retrain or recalibrate models as required.
- Work closely with data engineers, product managers, and business stakeholders to translate business problems into statistically and analytically sound solutions.
- Communicate results, assumptions, and limitations of models clearly to both technical and non‑technical audiences.
Tools & Technologies
- Use Python and libraries such as NumPy, pandas, SciPy, scikit‑learn, XGBoost, TensorFlow, or PyTorch.
- Utilize visualization and analytics tools (Matplotlib, Seaborn, Plotly) for statistical reporting.
- Leverage version control (Git), Jupyter notebooks, and ML lifecycle tools (MLflow, DVC).
Preferred Qualifications
- Bachelor’s or Master’s degree in Computer Science, Data Science, Statistics, Mathematics, or a related field.
- 3–4 years of experience in building, evaluating, and deploying ML models.
- Strong programming skills in Python; working knowledge of SQL.
- Solid foundation in statistics, including probability theory, hypothesis testing, regression analysis, and experimental design.
- Exposure to cloud platforms (AWS, GCP, or Azure) and MLOps practices is advantageous.
- Excellent analytical thinking, problem‑solving, and communication skills.