Distributed Systems Optimization Consultant

LeadStack

Pleasanton (CA)

Remote

USD 120,000 - 160,000

Full time

30+ days ago

Job summary

An established industry player seeks a Distributed Systems Optimization Consultant to improve the performance and resiliency of its distributed systems. The role involves optimizing Apache ZooKeeper, implementing fault-tolerance solutions, and collaborating with engineering teams to ensure seamless integration with technologies such as RabbitMQ and Kafka. The ideal candidate will have over 10 years of hands-on experience managing distributed systems and a strong background in performance optimization. Join this forward-thinking company to make a significant impact in the world of contingent workforce solutions!

Qualifications

  • 10+ years managing and optimizing Apache ZooKeeper in production.
  • Expertise in RabbitMQ, Redis, and Kafka in distributed systems.

Responsibilities

  • Optimize ZooKeeper performance and enhance resiliency.
  • Collaborate with engineering teams for system scalability.

Skills

Apache Kafka
Apache ZooKeeper
RabbitMQ
Redis
Bash scripting
Python scripting
Problem-solving
Communication

Tools

Prometheus
Grafana

Job description

LeadStack Inc. is an award-winning, certified minority-owned (MBE) staffing services provider and one of the nation's fastest-growing suppliers of contingent workforce. As a recognized industry leader in contingent workforce solutions that is Certified as a Great Place to Work, we're proud to partner with some of the most admired Fortune 500 brands in the world.


TITLE: Distributed Systems Optimization Consultant

LOCATION: Pleasanton, California 94588 (open to remote)

DURATION: 4 weeks

START DATE: 1/6/25

END DATE: 2/3/25


Description:


Technical skills (must have):

  • Apache Kafka
  • Apache ZooKeeper
  • RabbitMQ
  • Redis

Job Description:

About the Role:

  • We are seeking an experienced Apache ZooKeeper Optimization Consultant to enhance the resiliency and performance of our distributed systems infrastructure.
  • The ideal candidate will possess deep expertise in ZooKeeper configuration, tuning, and troubleshooting, with a strong understanding of distributed systems, high-availability requirements, and related technologies such as RabbitMQ, Redis, and Kafka.

Key Responsibilities:

Performance Optimization:

  • Analyze the current ZooKeeper setup and identify bottlenecks affecting performance.
  • Implement tuning measures for read/write latency, throughput, and leader election times.
  • Optimize JVM parameters (e.g., heap size) and ZooKeeper settings (e.g., tick time).

Resiliency Enhancement:

  • Architect solutions for fault tolerance and disaster recovery.
  • Design and implement multi-region and multi-data center deployments.
  • Establish robust configurations for quorum consistency and failover mechanisms.

Monitoring and Alerting:

  • Review the monitoring tools (e.g., Prometheus, Grafana) used to track ZooKeeper health and resiliency.
  • Develop custom alerts for potential issues such as latency spikes, memory usage, and connection limits.

Collaboration:

  • Work closely with engineering teams to ensure ZooKeeper is optimized and resilient alongside other components such as Kafka, RabbitMQ, Redis, and custom services.
  • Conduct capacity planning to ensure scalability for future workloads.

Qualifications:

Experience:

  • 10+ years of hands-on experience managing and optimizing Apache ZooKeeper in large-scale production environments.
  • Proven track record of designing resilient distributed systems.
  • Experience with RabbitMQ, Redis, and Kafka in distributed architectures.

Technical Expertise:

  • Deep understanding of distributed systems, including ZooKeeper internals (leader election, session management, quorum design).
  • Expertise in associated technologies like RabbitMQ, Redis, and Kafka, with an understanding of their integration into distributed environments.
  • Proficiency in monitoring and troubleshooting tools such as Prometheus, Grafana, or similar.

Skills:

  • Strong scripting skills (e.g., Bash, Python) for automation.
  • Excellent problem-solving and communication abilities.

Certifications (optional):

  • Relevant certifications in distributed systems, messaging technologies, or DevOps practices are a plus.

If you are interested, please share your updated resume along with the best time and phone number to reach you. If you are not available or not interested, we would appreciate it if you could share this opening with your friends and network. Your referrals are appreciated!


To know more about current opportunities at LeadStack, please visit us at https://leadstackinc.com/careers/

Should you have any questions, feel free to call me at 415 985-0816 or send an email to Nishanth.allam@leadstackinc.com
