Data Scientist II

Carpenter Technology Corporation

University City (PA)

On-site

USD 90,000 - 130,000

Full time

13 days ago

Job summary

A leading company is seeking a Data Scientist II to drive digital transformation through machine learning and data analysis. You'll work with complex datasets, develop actionable insights, and collaborate with cross-functional teams to enhance operations. The ideal candidate has a strong educational background and relevant experience in data science and analytics.

Benefits

Comprehensive benefits package

401k with company contributions

Flexible spending accounts

Qualifications

3-6 years of experience in data science and analytics.
Strong familiarity with big data frameworks and tools.
Relevant certifications in AI/ML preferred.

Responsibilities

Apply data science techniques to solve business problems.
Build machine learning models for process improvement.
Communicate model findings clearly to stakeholders.

Skills

Python

Machine Learning

Data Analysis

Statistical Techniques

Education

MS/PhD in Computer Science, Mathematics, Statistics

Tools

Hadoop

AWS

Azure

MATLAB

Carpenter Technology Corporation is a leading producer and distributor of premium specialty alloys, including titanium alloys, nickel and cobalt based superalloys, stainless steels, alloy steels and tool steels. Carpenter Technology's high-performance materials and advanced process solutions are an integral part of critical applications used within the aerospace, transportation, medical and energy markets, among other markets. Building on its history of innovation, Carpenter Technology's wrought and powder technology capabilities support a range of next-generation products and manufacturing techniques, including novel magnetic materials and additive manufacturing.

DATA SCIENTIST II

JOB DESCRIPTION SUMMARY - The data scientist trains, validates, and manages machine learning solutions, with a focus on Generative AI and Large Language Models (LLMs), to advance Carpenter Technology's digital transformation. Mines and analyzes complex and unstructured data sets using advanced statistical methods for use in data-driven decision making. Performs research, analysis, and modeling on organizational data. Develops and applies algorithms or models to key business metrics with the goal of improving operations or answering business questions. The position is responsible for the entire ML model suite that addresses production quality issues across all business lines. Builds ML simulation models to support R&D with product development.

PRIMARY RESPONSIBILITIES FOR THE DATA SCIENTIST II

Apply data science techniques to massive structured and unstructured data sets across multiple environments to discover patterns and solve strategic and tactical business problems in process improvement, yield improvement, and product development.
Build statistical and machine learning models for detecting root causes of process and yield variability. Machine learning algorithms to be used include logit, probit, and complementary log-log regression; Random Forest; GBMs such as XGBoost, AdaBoost, CatBoost, LightGBM, RUSBoost, AveBoost, ORBoost, SMOTEBoost, etc.; Support Vector Machines; KNNs; MLP Neural Nets; and Convolutional Neural Nets. Statistical models to be used include the General Linear Model, Generalized Linear Model, Multivariate Regression, Survival Models, Stepwise Logistic Regression, and Non-Parametric Models.
Develop prescriptions with actionable and controllable recipes for critical process variables from model parameters with baseline performance and estimated performance upon implementation of model prescriptions.
Design and conduct experiments on observational data to identify the factors associated with cost of poor quality and process variability, using randomized, randomized block, Latin square, and full factorial designs, and apply appropriate general linear models, such as fixed-effect, random-effect, and mixed-effect models, to derive ANOVA, ANCOVA, MANOVA, and MANCOVA.
Build process simulation models to identify the optimal critical process path using both chaotic dynamic and stochastic process simulation, such as hidden Gauss-Markov models and Monte Carlo simulation.
Develop anomaly detection models such as iForest, Local Outlier Factor, GMM, one-class SVM, etc. to identify anomalous behavior in critical process inputs for both batch and stream processing.
Design, train, and fine-tune large language models such as GPT-4, BERT, or similar, for various applications and conduct experiments to improve model performance and efficiency.
Create, test, and refine prompts to enhance the accuracy and relevance of model outputs and develop strategies for prompt optimization and customization based on specific use cases.
Ensure that AI models and solutions adhere to ethical guidelines and standards and address biases in data and models to promote fairness and inclusivity.
Manage the machine learning model life cycle through documentation, version control, model presentation, model audit, back testing, forward testing, and benchmarking against performance metrics.
Communicate and democratize model findings clearly and precisely to stakeholders such as market leads, metallurgists, R&D, and senior business leaders.
Communicate complex technical concepts to non-technical stakeholders through presentations and reports.
Drive the collection, cleansing, processing, and analysis of new and existing data sources.
Identify and correct errors, inconsistencies, and missing values in datasets using techniques such as deduplication, normalization, and imputation. Standardize data formats and structures to ensure consistency across datasets and implement data transformation techniques to prepare data for analysis and modeling.
Develop and apply data validation rules to ensure data integrity. Conduct regular audits of datasets to identify and rectify quality issues. Create and maintain documentation for data cleansing processes and standards.
Report findings through appropriate outputs and visualizations tailored for the intended audiences.
Learn and stay current on analytics developments in one or more business domains: Internet of Things, Manufacturing, Supply Chain, Forecasting, Marketing and Sales, Pricing, etc.
Learn and stay current on developments in one or more analytics domains: Operations Research, Machine Learning, Deep Learning / AI, Simulation, etc.
Generate innovative ideas, establish new research directions, and shape the information strategy in support of technical projects and new product developments.
Collaborate with new, cross-functional teams on accelerated projects to scale data architecture, build digital products, and execute data science solutions.
Work closely with cross-functional teams, including software engineers, product managers, and domain experts, to integrate AI solutions into new or existing digital products.

Perform all other duties and special projects as assigned.

REQUIRED FOR THE DATA SCIENTIST II

MS/PhD preferred in computer science, mathematics, statistics, operations research, or related field.
3-6 years of experience in data science, analytics, and model building roles.
Proficiency in programming in Python, R, Julia, MATLAB, and SAS.
Knowledge of other programming languages and analysis tools (e.g. Scala, Java, Ruby, JavaScript, shell scripting).
Strong familiarity with big data frameworks and tools (e.g. Hadoop, Spark, MapReduce, Hive, Pig, Luigi/Airflow, Kafka, data streaming, NoSQL, SQL).
Familiarity with cloud-based solutions (e.g. Azure, AWS EMR).
Relevant certifications in AI/ML, such as TensorFlow Developer Certificate, AWS Certified Machine Learning, etc.
Relevant certifications in data management or data quality are a plus.
Experience consuming REST-based APIs with JSON payloads preferred.
Performs work under general supervision. Handles moderately complex issues and problems, and refers more complex issues to higher-level staff.
Possess solid working knowledge of subject matter.
Practical knowledge of analytical techniques and methodologies (e.g. machine learning, segmentation, mix and time series modeling, response modeling, lift modeling, experimental design, neural networks, data mining, optimization techniques).
Understanding of data profiling and data cleansing techniques.
Strong written and verbal communication skills, including with senior business leaders.
Experience working with remote colleagues and teams.
Natural curiosity and passion for empirical research and problem solving.

Carpenter Technology Corporation offers employees a competitive salary and a comprehensive benefits package including life, medical, dental, and vision insurance, flexible spending accounts, disability coverage, and a 401k with company contributions, as well as many other options.

Carpenter Technology Corporation's policy is to fully and effectively maintain a program of equal employment opportunity and nondiscrimination for all employees, to employ affirmative action for all protected classes, and to recruit and develop the best qualified persons available regardless of age, race, color, religion, sex, gender identity, sexual orientation, marital status, national origin, political affiliation or any other characteristic protected by law. The Company also will recruit, develop and provide opportunities for qualified persons with disabilities and protected veterans.
