Junior PySpark Engineer – AWS/EMR

Big Resourcing

Massachusetts

Hybrid

USD 80,000 - 110,000

Full time

13 days ago

Job summary

An innovative technology consulting firm is seeking a Junior PySpark Engineer to join their dynamic team. In this role, you will leverage your strong programming skills in Python and hands-on experience with Apache Spark and AWS EMR to design and develop robust data pipelines. This coding-heavy position will allow you to work in a GxP-compliant environment, ensuring data integrity while collaborating with cross-functional teams. If you're passionate about data engineering and eager to contribute to cutting-edge solutions in a cloud-native setting, this is the perfect opportunity for you.

Qualifications

  • 2-4 years of experience in software or data engineering with a focus on distributed systems.
  • Deep hands-on experience with Apache Spark, PySpark, and AWS EMR.

Responsibilities

  • Design and maintain distributed ETL data pipelines using PySpark on AWS EMR.
  • Collaborate with teams to deliver end-to-end data solutions.

Skills

Python
PySpark
Apache Spark
AWS EMR
GxP compliance

Tools

Databricks
AWS

Job description

Location: Remote (EST time zone preferred), with 5 days a month in the office

Duration: 6-month contract

About BigRio:

BigRio is a remote-based technology consulting firm with headquarters in Boston, MA. We deliver software solutions ranging from custom development and software implementation to data analytics and machine learning/AI integrations. As a one-stop shop, we attract clients from a variety of industries due to our proven ability to deliver cutting-edge, cost-effective software solutions.

Job Overview:

We are seeking a Junior PySpark Engineer with strong hands-on experience in building distributed data pipelines using Apache Spark on AWS EMR. The ideal candidate is proficient in Python, has worked with Databricks, and has a solid understanding of GxP-compliant environments. This is a coding-heavy role — not DevOps or AWS administration — where you’ll contribute directly to the architecture and development of robust data solutions in a highly regulated, cloud-native environment.

Key Responsibilities:

  • Design, develop, and maintain distributed ETL data pipelines using PySpark on AWS EMR
  • Work within a GxP-compliant environment, ensuring data integrity and regulatory alignment
  • Write clean, scalable, and efficient PySpark code for large-scale data processing
  • Utilize AWS cloud services for pipeline orchestration, compute, and storage
  • Collaborate closely with cross-functional teams to deliver end-to-end data solutions
  • Participate in code reviews, testing, and deployment of pipeline components
  • Ensure performance optimization, fault tolerance, and scalability of data workflows

Qualifications:

  • 2–4 years of experience in software or data engineering with a focus on distributed systems
  • Deep hands-on experience with Apache Spark, PySpark, and AWS (especially EMR)
  • Strong programming skills in Python
  • Solid understanding of cloud-native architectures
  • Familiarity with GxP compliance and working in regulated data environments
  • Proven ability to independently design and develop data pipelines (not a DevOps/AWS admin role)
  • Experience with distributed computing and high-volume ETL pipelines

Equal Opportunity Statement:

BigRio is an equal-opportunity employer. We prohibit discrimination and harassment of any kind based on race, religion, national origin, sex, sexual orientation, gender identity, age, pregnancy, status as a qualified individual with disability, protected veteran status, or other protected characteristic as outlined by federal, state, or local laws. BigRio makes hiring decisions based solely on qualifications, merit, and business needs at the time. All qualified applicants will receive equal consideration for employment.

BigRio is a cutting-edge technology company committed to being your strategic partner in accelerating digital transformation and fostering innovation. With a relentless focus on delivering exceptional solutions, we empower businesses to thrive in the rapidly evolving digital landscape.

  • Harvard Square, One Mifflin Place
    Suite 400
    Cambridge, MA 02138
