Principal Software Engineer - AIOPS (Hybrid)

Capital Markets Placement

Plano (TX)

Hybrid

USD 130,000 - 197,000

Full time

2 days ago

Job summary

An established industry player seeks a Principal Site Reliability Engineer with a focus on observability and AIOps. This role involves optimizing large-scale systems and ensuring reliability while collaborating to architect unified observability strategies. You will develop API-driven micro-services, enhance automation processes, and drive innovation in a dynamic environment. If you are passionate about solving complex problems and have a strong background in software scalability and performance, this opportunity is perfect for you. Join a team committed to excellence and make a significant impact on technology solutions.

Benefits

Comprehensive medical coverage
Dental coverage
Vision coverage
Retirement benefits
Maternity/Paternity leave
Flexible work arrangements
Education reimbursement
Wellness programs
Paid time off policy

Qualifications

  • 7+ years of application development experience.
  • Experience in large scale operations environments.
  • Ability to identify and implement automation tasks.

Responsibilities

  • Developing API-driven micro-services for complex platforms.
  • Maintaining automated test suites using CI/CD tools.
  • Participating in troubleshooting and capacity planning.

Skills

Python
Java
Go
Ruby
Linux/Unix
Networking Systems
Automation
Configuration Management

Education

BA/BS in Computer Science
Advanced Degrees

Tools

Puppet
Chef
SaltStack

Job description

3 days hybrid from any of our locations in RI, NJ, GA, MA, NC, TX or AZ

Role is not relocation eligible.

Principal Site Reliability Engineer - Observability / AIOps

Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that internally critical and externally visible systems have reliability and uptime appropriate to users' needs and a fast rate of improvement, while keeping an ever-watchful eye on capacity and performance. SRE is a mindset and a set of engineering approaches focused on optimizing existing systems, building infrastructure, and eliminating work through automation. As a Site Reliability Engineer with a focus on observability, you will build and operate next-generation observability platforms.

As an SRE with an observability focus, you will:

  • Explore the complex IT estates of our clients to understand their observability / AIOps opportunities
  • Collaborate to architect unified observability and AIOps strategies which employ leading AI technology
  • Implement enterprise observability / AIOps technology and processes
  • Amplify observability / AIOps outcomes by accelerating adoption across technology and business organizations

Responsibilities include:

  • Developing API-driven micro-services that combine into large and complex platforms
  • Planning and executing highly parallel distributed object storage transformations and migrations
  • Maintaining automated test suites using CI/CD tools
  • Participating in collaborative projects with small software engineering teams
  • Developing automation, processes, and tools designed to make our services simpler and more robust
  • Participating in troubleshooting, capacity planning and analysis, and performance analysis activities
  • Advise management on service onboarding strategies and execution

Critical Hiring Criteria

What we are looking for:

  • Entrepreneurs who seek challenging problems to solve
  • Creativity, initiative and acute attention to detail
  • Thirst for innovation and solving problems at lightning speed
  • Passion for automating everything repetitive
  • Obsession with software scalability and performance under high loads
  • Love for using and contributing to open-source software

Please bring to the table:

  • Development experience and comfort working in multiple languages (Python, Java, Go and Ruby a plus)
  • Experience working in collaborative coding environments (peer review, continuous integration, etc.)
  • 7+ years of application development
  • Experience in large scale operations environments
  • 7+ years of experience with Linux/Unix development or systems administration
  • 3+ years of experience with networking systems and technologies
  • Deep understanding of network performance and security
  • Ability to identify tasks that require automation and to implement that automation
  • Experience with configuration management tools such as Puppet, Chef, and SaltStack
  • Hands-on operational experience in a high-volume or critical production service environment - distributed systems, capacity planning, continuous deployment
  • BA/BS in Computer Science preferred, or equivalent experience (advanced degrees preferred)

We have opportunities to work with and learn:

  • Relational database technologies at large scale (Timescale, Vitess, Postgres, etc.)

Pay Transparency

The salary range for this position is $130,720 - $196,080 per year plus an opportunity to earn an annual discretionary bonus. Actual pay is based on various factors including, but not limited to, work location and relevant skills and experience.

We offer competitive pay, comprehensive medical, dental and vision coverage, retirement benefits, maternity/paternity leave, flexible work arrangements, education reimbursement, wellness programs and more. Note, Citizens' paid time off policy exceeds the mandatory, paid sick or paid time-away policy of every local and state jurisdiction in the United States. For an overview of our benefits, visit https://jobs.citizensbank.com/benefits.

About Us

Citizens, its parent, subsidiaries, and related companies (Citizens), provides equal employment and advancement opportunities to all colleagues and applicants for employment without regard to age, ancestry, color, citizenship, physical or mental disability, perceived disability, or history or record of a disability, ethnicity, gender, gender identity or expression, transgendered and transitioning individuals, genetic information, genetic characteristic, marital or domestic partner status, victim of domestic violence, family status/parenthood, medical condition, military or veteran status, national origin, pregnancy/childbirth/lactation, colleague's or a dependent's reproductive health decision making, race, religion, sex, sexual orientation, or any other category protected by federal, state and/or local laws. At Citizens we are committed to fostering an inclusive culture that enables colleagues to bring their best selves to work every day and where all are expected to be treated with respect and professionalism. Employment decisions are based solely on experience, performance, and ability. We perform our best so we can do more for our customers, colleagues, communities and shareholders.

Equal Employment and Opportunity Employer

Job Applicant Data Privacy Policy

Any offer of employment is conditioned upon the candidate successfully passing a background check, which may include initial credit, motor vehicle record, public record, prior employment verification, and criminal background checks. Results of the background check are individually reviewed based upon legal requirements imposed by our regulators and with consideration of the nature and gravity of the background history and the job offered. Any offer of employment will include further information.
