Fullstack Engineer, Observability & SRE - (Remote)

ArcheSys Inc

Baltimore (MD)

Remote

USD 100,000 - 125,000

Full time

3 days ago

Job summary

A leading technology firm specializing in cloud solutions is seeking a Fullstack Engineer, Observability & SRE to enhance its monitoring solutions. The role focuses on developing Grafana dashboards, building data pipelines, and managing AWS infrastructure in a fully remote work environment. Ideal candidates will have a strong background in software engineering, observability, and SRE practices, and will bring innovative solutions to the team.

Benefits

Health insurance
Dental insurance
Vision insurance
Retirement plan
Generous paid time off
Flexible work arrangements

Qualifications

  • 4-7 years of experience in a Fullstack Development or SRE role.
  • Experience designing and maintaining Grafana dashboards.
  • Proficient in AWS services (EC2, Lambda, RDS, etc.).

Responsibilities

  • Design, develop, and maintain Grafana dashboards that visualize key metrics.
  • Implement and manage ETL/ELT pipelines for data processing.
  • Design, deploy, and manage AWS infrastructure for observability.

Skills

Grafana
AWS
DevOps Automation
Data Engineering
Site Reliability Engineering
Programming (Python, Go, Java, Node.js)

Education

Bachelor's degree in Computer Science, Software Engineering, or related field

Tools

Terraform
CloudFormation
Ansible
Jenkins
GitLab CI

Job description

Fullstack Engineer, Observability & SRE - (Remote)

This range is provided by ArcheSys Inc. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more.

Base pay range

$100,000.00/yr - $125,000.00/yr

Archesys is a technology firm specializing in innovative cloud solutions and services for clients across various industries. We pride ourselves on our cutting-edge technologies, exceptional customer service, and collaborative work environment.

We're looking for a highly motivated and skilled Fullstack Engineer to join our team, focusing on Observability and Site Reliability Engineering (SRE). In this critical role, you'll be at the forefront of designing, developing, deploying, and ensuring the operational excellence of our Grafana dashboards and the vital data pipelines that feed them. You'll bridge the gap between intuitive front-end visualizations and robust back-end data engineering, ensuring our monitoring solutions are scalable, resilient, and provide actionable insights for our engineering and operations teams. You'll also play a key role in building and automating our DevOps pipelines, all while leveraging the power of AWS.

This position demands a blend of strong software engineering prowess, a deep understanding of SRE principles, expertise in leveraging Grafana to its fullest potential, and significant experience with AWS cloud services and DevOps build automation. You'll be instrumental in enhancing our system visibility, enabling proactive issue detection, and driving continuous improvement in our service reliability.

This is a fully remote, full-time position.

Key Responsibilities:

Grafana Dashboard Development:

  • Design, develop, and maintain comprehensive, intuitive, and real-time Grafana dashboards that visualize key operational metrics, business KPIs, and application logs.
  • Collaborate with SRE, development, and product teams to gather requirements and translate complex data into clear, actionable visualizations.
  • Optimize Grafana dashboards for performance, scalability, and usability, ensuring quick loading times and effective data presentation.
  • Implement alerting rules within Grafana to proactively notify teams of anomalies and potential issues.

Data Pipeline Engineering (Backend Focus):

  • Design and implement robust ETL/ELT pipelines to extract, transform, and load data from various sources (e.g., Prometheus, Splunk, CloudWatch, RDS, OpenTelemetry, custom APIs) into data stores consumable by Grafana.
  • Write and optimize complex queries (SQL, PromQL, Splunk SPL, etc.) to ensure data accuracy and efficiency.
  • Develop and maintain APIs to facilitate data exchange and integration between different system components and monitoring tools.
  • Implement data quality checks, performance tuning (indexing, partitioning), and backup/restore strategies for data sources.

AWS Infrastructure Management:

  • Design, deploy, and manage scalable and resilient AWS infrastructure to support Grafana instances, data sources, and related services.
  • Utilize AWS services such as EC2, ECS/EKS, Lambda, S3, RDS, CloudWatch, Kinesis, DynamoDB, and others to build and optimize our observability platform.
  • Implement security best practices within the AWS environment, including IAM roles, security groups, and network configurations.

DevOps Build Automation:

  • Design, implement, and maintain robust CI/CD pipelines for automating the build, testing, and deployment of Grafana dashboards, underlying data pipelines, and infrastructure as code.
  • Utilize tools like AWS CodePipeline, Jenkins, GitLab CI, or similar for continuous integration and continuous deployment.
  • Develop and maintain Infrastructure as Code (IaC) using Terraform, CloudFormation, or Ansible for managing all AWS resources.
  • Automate operational tasks, monitoring deployments, and testing processes to improve efficiency and reliability.

Site Reliability Engineering (SRE) Practices:

  • Apply SRE principles to ensure the reliability, scalability, and performance of our monitoring and observability infrastructure.
  • Participate in on-call rotations, responding to alerts and incidents related to dashboard functionality, data accuracy, and performance.
  • Conduct root cause analysis (RCA) for incidents and implement corrective actions to prevent recurrence.
  • Define and track Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for key services, ensuring dashboards reflect these metrics accurately.

Documentation and Collaboration:

  • Work closely with cross-functional teams (development, operations, product) to understand monitoring needs and provide expert guidance on observability best practices.
  • Create and maintain comprehensive documentation detailing dashboard designs, data sources, query logic, AWS architecture, and operational procedures.
  • Contribute to code reviews, promote best practices, and mentor junior team members.

Qualifications:

  • Bachelor's degree in Computer Science, Software Engineering, or a related technical field, or equivalent practical experience.
  • 4-7 years of experience in a Fullstack Development, Data Engineering, or SRE role with a strong focus on monitoring, observability, and AWS infrastructure.
  • Proven hands-on experience designing, developing, and maintaining complex Grafana dashboards.
  • Strong proficiency in at least one backend programming language (e.g., Python, Go, Java, Node.js).
  • Extensive experience with various data sources for Grafana (e.g., Prometheus, Loki, Splunk, SQL databases, CloudWatch).
  • Deep hands-on experience with AWS cloud services, including but not limited to EC2, ECS/EKS, Lambda, S3, RDS, CloudWatch, Kinesis, DynamoDB.
  • Proven experience designing and implementing robust CI/CD pipelines and DevOps automation using tools like AWS CodePipeline, Jenkins, GitLab CI, or similar.
  • Strong experience with Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, or Ansible.
  • Solid understanding of SRE principles, including SLOs, SLIs, error budgets, toil reduction, and incident management.
  • Experience with containerization technologies (Docker, Kubernetes).
  • Excellent analytical and problem-solving skills with a keen eye for detail.
  • Strong communication and interpersonal skills, with the ability to articulate complex technical concepts clearly to diverse audiences.
  • Ability to work independently and collaboratively in a fast-paced, dynamic environment.

Nice to Have:

  • AWS Certifications (e.g., Solutions Architect, DevOps Engineer).
  • Experience with other observability tools (e.g., Datadog, New Relic, OpenTelemetry).
  • Knowledge of distributed tracing concepts and tools (e.g., Jaeger, Tempo).
  • Experience with machine learning for anomaly detection in time-series data.
  • Contributions to open-source projects related to Grafana or observability.

What We Offer:

  • Competitive salary and benefits package, including health, dental, and vision insurance, retirement plan, and generous paid time off.
  • Opportunity to work with a talented team of professionals on exciting and innovative projects.
  • Flexible work arrangements, including remote work options.
  • Continuous learning and development opportunities, including access to training resources and professional development programs.
  • A collaborative, inclusive work environment that values diversity and encourages growth.

Join us at Archesys and be part of a team dedicated to delivering cutting-edge cloud solutions for clients in the public sector. Your expertise and passion for technology will help us continue to innovate and grow. We look forward to welcoming you to our team and supporting your success as a Fullstack Engineer, Observability & SRE.

Archesys participates in E-Verify. Upon hire, we will provide the federal government with your Form I-9 information to confirm that you are authorized to work in the U.S.

Archesys is an equal opportunity employer committed to creating a diverse and inclusive workplace. We welcome applications from all qualified candidates, regardless of race, color, religion, or sex.

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: IT Services and IT Consulting
