Member of Technical Staff, Data

Plus10 Recruitment

Town of Poland (NY)

Hybrid

USD 70,000 - 90,000

Full time

Yesterday

Job summary

A leading technical recruitment agency is seeking a Member of Technical Staff, Data for a non-profit organization focused on advancing scientific discovery using AI. The role requires leading data extraction processes and integrating various data sources. Candidates should have a Bachelor's degree and 1-3 years of experience in data transformation or ETL processes, along with proficiency in Python. This is a unique opportunity to contribute to groundbreaking research efforts, fostering innovation through collaboration.

Qualifications

  • 1-3 years of experience working with data transformation or ETL processes.
  • Proficiency in Python and/or similar languages for data processing.
  • Experience managing small to medium-sized data projects.

Responsibilities

  • Lead projects to design and implement data extraction processes.
  • Develop and maintain parsers for diverse data sources.
  • Identify and extract valuable features from complex raw data sets.

Skills

Python
Data transformation
Data parsing libraries
Analytical skills
Project management

Education

Bachelor's degree in computer science, data science, information systems, or related field

Tools

SQL
JSON

Job description

Plus10 is a technical recruitment agency with a focus on Engineering and Product professionals that build web applications using a modern stack. Plus10 recruiters are knowledge stewards that open doors for individuals looking to progress their career. We are working hand-in-hand with the following client to help find a Member of Technical Staff, Data.

The client is a non-profit organization building an autonomous AI Physicist designed to advance humanity's understanding of the fundamental laws of nature. The goal is for the AI Physicist to achieve a breakthrough that unifies quantum field theory and general relativity and to explain the deepest unresolved phenomena in our universe by 2035. They're pioneering a new approach to scientific discovery by creating an intelligent system that can explore theoretical frameworks, reason across disciplines, and generate novel insights. The organization operates like a tech start-up, moving quickly and continuously iterating to accelerate scientific progress. By combining AI, symbolic reasoning, and autonomous research capabilities, it is developing a platform that goes beyond analyzing existing knowledge to actively contribute to physics research.

Job Description:

Our client is seeking a skilled and detail-oriented Member of Technical Staff, Data to play a crucial role in its data pipeline development. In this position, you will lead projects to design and implement data extraction processes from various structured and unstructured sources, create robust parsing mechanisms, and develop sophisticated logic to extract meaningful features from raw data. Working in an agile environment, you'll iteratively refine extraction methods based on ongoing feedback.

Key Responsibilities:

Project Leadership:

  • Investigate and evaluate new data sources.
  • Create comprehensive extraction plans and strategies for each data source.
  • Lead the full lifecycle of data extraction projects from planning to implementation.
  • Work closely with peers and managers to iterate quickly and refine various approaches.
  • Progressively scale extraction processes from small test batches to full implementation.

Data Source Integration:

  • Develop and maintain parsers for diverse data sources including APIs, databases, web content, PDFs, and scientific literature.
  • Create reliable ETL processes to ensure data quality and consistency, including LLM-based extraction pipelines.
  • Design and refine prompts for LLMs to extract structured information from unstructured data sources, including text, images, and other multimodal inputs.
  • Implement error handling and logging systems to maintain data pipeline reliability.

Feature Engineering:

  • Identify and extract valuable features from complex raw data sets.
  • Develop logic and algorithms to transform unstructured information into structured, analyzable formats.
  • Create reproducible processes for data normalization and standardization.

Pipeline Architecture:

  • Design scalable data transformation workflows.
  • Optimize parsing procedures for performance and accuracy.
  • Document data lineage and transformation processes for transparency.

Collaboration:

  • Work closely with cross-functional teams to understand feature requirements.
  • Coordinate with the engineering team to integrate data pipelines into broader systems.
  • Communicate technical concepts clearly to non-technical stakeholders.
  • Engage directly with third-party data vendors to obtain technical specifications and integration details.
  • Demonstrate ability to work effectively both as part of a collaborative team and independently on self-directed tasks.

Qualifications:

  • Educational Background: Bachelor's degree in computer science, data science, information systems, or related field.
  • Experience: 1-3 years of experience working with data transformation, ETL processes, or similar roles.
  • Project Management Skills:
    • Experience managing small to medium-sized data projects from conception to completion.
    • Demonstrated ability to create technical plans and roadmaps for data extraction.
    • Experience working in agile environments with iterative development cycles.
  • Technical Skills:
    • Proficiency in Python and/or similar languages for data processing.
    • Experience with data parsing libraries and frameworks.
    • Knowledge of data storage systems and formats (SQL, JSON, etc.).
    • Familiarity with regular expressions and text processing techniques.
    • Experience with prompt engineering for LLMs and AI-assisted data extraction.
  • Analytical Skills: Strong problem-solving abilities and attention to detail.
  • Communication: Ability to document processes clearly and communicate technical concepts.
  • Bonus Skills:
    • Experience with natural language processing.
    • Knowledge of scientific literature and research data structures.
    • Familiarity with cloud-based data processing.
Discuss with your Plus10 Recruiter or complete the form below to apply for this role.