Senior Systems Engineer

Core42 Technology Projects LLC

Baltimore (MD)

On-site

USD 109,000 - 165,000

Full time

2 days ago

Job summary

Core42 is hiring a Senior Systems Engineer to deploy and manage its AI infrastructure. The role involves provisioning large-scale GPU-based clusters and collaborating with a diverse team to keep the platform reliable. You will apply your Linux and troubleshooting expertise in an inclusive, innovative environment. If you are passionate about AI and want to make a meaningful impact, this role may be a good fit.

Benefits

Healthcare Options
401(k) Plan with Company Matching
Paid Time Off
Life Insurance
Short-term and Long-term Disability Coverage

Qualifications

  • 3+ years of experience in systems engineering or AI infrastructure management.
  • Proficiency in Linux system administration and troubleshooting.
  • Experience with GPU-based AI clusters and workload orchestration.

Responsibilities

  • Deploy and manage GPU-based AI compute nodes and storage systems.
  • Ensure high availability and reliability of AI workloads.
  • Diagnose and resolve hardware, software, and network issues.

Skills

Linux System Administration
Troubleshooting Skills
AI Infrastructure Management
HPC Environments
Collaboration with Vendors

Education

Bachelor's Degree in Computer Science
Equivalent Degree

Tools

Ansible
Terraform
Bash
Python
Kubernetes
Docker
Prometheus
Grafana
ELK Stack

Job description

Senior Systems Engineer

Core42, a leader in AI-powered cloud and digital infrastructure, is driving transformative technology solutions globally. Leveraging advanced resources and partnerships, Core42 empowers clients to harness sovereign AI infrastructure, especially in sectors with stringent regulatory needs. With a mission to redefine digital transformation, we combine sovereign capabilities with scalable, high-performance compute infrastructure, positioning ourselves at the forefront of AI innovation in the Middle East and beyond.

With a diverse team of 1,100+ employees from ~70 nationalities worldwide, we foster an inclusive, innovative, and collaborative environment. Our culture is grounded in trust, accountability, and high performance. We are united by three values: Grit, overcoming challenges with resilience and determination; Passion, which drives us to pursue excellence in everything we do; and Impact, as we aim to inspire progress and create meaningful change. Our team members thrive in an environment where each person’s contributions propel us forward, and together we commit to achieving extraordinary results.

The Opportunity

The Senior Systems Engineer will be responsible for the provisioning, rollout, and maintenance of large-scale GPU-based AI clusters. This role focuses on deploying and managing AI computing infrastructure, troubleshooting system issues, and coordinating with onsite personnel and vendors to ensure platform reliability and efficiency.

The ideal candidate will have experience in AI and HPC environments, strong Linux system administration skills, and expertise in hardware, software, and networking troubleshooting.

Key Responsibilities

  • Deploy and configure GPU-based AI compute nodes, storage systems, and networking components.
  • Manage firmware, BIOS, and driver updates to maintain system stability and performance.
  • Work with automation tools to streamline infrastructure provisioning and configuration.
  • Ensure high availability, reliability, and scalability of AI workloads.
  • Diagnose and resolve hardware, software, and network issues in collaboration with onsite teams.
  • Perform root cause analysis (RCA) and corrective actions to prevent recurring failures.
  • Work with vendors (NVIDIA, AMD, Intel, Dell, HPE, etc.) to escalate and resolve technical issues.
  • Provide hands-on troubleshooting support for compute, management, and storage fabrics.
  • Implement and maintain monitoring and alerting tools to track system health and performance.
  • Actively monitor GPU utilization, memory management, and workload distribution.
  • Assist in capacity planning and scaling to support growing AI workloads.
  • Work closely with networking, storage, and DevOps teams to ensure seamless integration of AI workloads.
  • Document procedures, system configurations, and troubleshooting guides.
  • Assist in developing best practices and SOPs for AI infrastructure operations.

Required Qualifications

  • Bachelor’s degree (BA/BS) or higher in Computer Science, or equivalent.
  • 3+ years of experience in systems engineering, HPC, or AI infrastructure management.
  • Proficiency in Linux system administration (RHEL, Ubuntu, Rocky Linux, etc.).
  • Experience with GPU-based AI clusters and workload orchestration.
  • Strong troubleshooting skills in compute, storage, and networking environments.
  • Familiarity with automation tools (Ansible, Terraform, Bash, Python, etc.).
  • Experience working with on-premises AI infrastructure and cloud-based AI platforms.
  • Ability to collaborate with onsite personnel and vendors for issue resolution.

Preferred Qualifications

  • Experience with containerized AI workloads (Kubernetes, Docker, Singularity).
  • Knowledge of high-speed Ethernet networking and distributed storage.
  • Familiarity with monitoring tools (Prometheus, Grafana, ELK stack, etc.).
  • Certifications such as RHCSA, NVIDIA DLI, or Kubernetes CKA.

Compensation & Benefits

The base salary for this full-time position ranges from $109,950 in our lowest geographic market to $164,900 in our highest geographic market. The actual base salary will be determined by various factors, including the position’s location, job-related skills, knowledge, experience, and relevant education or training.

Certain roles are eligible for additional rewards, such as merit-based salary increases, annual bonuses, and long-term incentive plans, which are contingent on individual and company performance. Additionally, some positions offer the opportunity to earn sales incentives based on revenue or utilization targets.

As a full-time employee, you will also have access to comprehensive benefits, including leading healthcare options (medical, dental, and vision insurance), a 401(k) plan with company matching, company-sponsored short-term and long-term disability coverage, life insurance, paid time off, and various well-being benefits, among others.

Equal Employment Opportunity

Core42 is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances.

If you need assistance and/or a reasonable accommodation to participate in the job application or interview process, or to perform the essential functions of the position, please contact us at USA-ExternalCandidates@core42.ai.

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Information Technology
  • Industries: IT Services and IT Consulting

