Job Overview
We're looking for a hands-on Automation Architect to lead the next generation of quality engineering for our distributed storage platform. This role is for an innovator who writes world-class code and pioneers new approaches to testing with AI/ML and chaos engineering. You'll be the driving force behind designing automation as a self-service platform for all of engineering: solving real problems, accelerating test execution, and removing friction across environments. If you thrive on complex problems and on building tools that make engineering teams faster and smarter, this is your opportunity to make a massive impact.
Responsibilities
- Create Reusable Tools: Develop robust and reusable Python libraries and pytest fixtures to streamline testing across our APIs, CLIs, and complex workload orchestration scenarios.
- Build Automation as a Service: Design the framework as a self-service platform, creating a "paved road" that enables developers to easily write, run, and contribute to automation for their own features.
- Drive Adoption: Create clear documentation, examples, and onboarding paths to evangelize automation best practices and drive adoption of the framework across the entire engineering organization.
- Pioneer AI-Driven Testing: Research and implement modern testing strategies using lightweight AI/ML techniques, built on libraries such as NumPy, SciPy, and scikit-learn, to create more intelligent, adaptive, and realistic workloads for cluster, storage, and QoS validation.
- Uphold Code & Product Quality: Champion high standards by leading code reviews for all automation submissions, from core framework enhancements to individual test cases, ensuring a high bar for quality and maintainability in the repository.
- Test for Scale and Resilience: Architect and implement automation that validates complex distributed system behaviors, including clustering, service failover, and horizontal scaling.
- Champion Resilience & Chaos Engineering: Extend automation beyond simple failure injection to embrace the principles of chaos engineering, proactively discovering systemic weaknesses.
- Integrate Performance Testing: Build performance and stress testing into our CI/CD pipelines using tools like fio, IOR, MinIO Warp, Mongoose, and MLPerf to validate throughput, latency, and system resilience under pressure.
- Scale with Modern Infrastructure: Design and deploy automation that runs with high efficiency and throughput across Kubernetes, Docker, hypervisors, and bare‑metal systems, ensuring test execution scales seamlessly with development.
- Drive Telemetry-Driven Quality: Integrate test results with our observability stack (Grafana, Prometheus, ELK) to move beyond simple pass/fail and validate quality using rich system telemetry.
- Mentor & Lead: Act as a key technical leader and mentor for QE and Development engineers worldwide, elevating their skills in Python, pytest, and modern automation design patterns.
Qualifications
- Expert-Level Python: Deep, hands‑on mastery of Python, including pytest (fixtures, plugins, parametrization), asyncio, and building scalable frameworks.
- Distributed Systems: A strong understanding of clustering, fault tolerance, and horizontal scaling principles. Experience with machine orchestration is highly desirable.
- Linux & Storage Systems: Extensive experience with Linux (Ubuntu/RHEL) and a strong understanding of storage protocols like S3/Object, NVMe/iSCSI, and NFS/SMB.
- Performance & Orchestration: Proven ability to integrate performance tools (fio, IOR, MinIO Warp) and orchestrate tests within Docker and Kubernetes.
- CI/CD Expertise: A strong command of Jenkins or GitHub Actions for building, maintaining, and troubleshooting complex automation pipelines.
- Observability: Experience using Grafana, Prometheus, or the ELK Stack to analyze test results and system behavior.
- AI/ML for QA (Preferred): Experience applying data science or machine learning techniques to solve testing problems. Familiarity with libraries like pandas, NumPy, SciPy, and scikit‑learn is a strong plus.
- Scripting: Proficiency in Bash is a must. Bonus points for Go or C++ experience.
Leadership & Soft Skills
- A Builder's Mindset: You have a demonstrated history of writing and owning code, not just configuring off‑the‑shelf tools.
- A Passion for Enablement: You are dedicated to building tools that other engineers find intuitive and powerful, and you are driven to help them succeed.
- Commitment to Quality: You believe that rigorous code reviews are essential for building robust, maintainable automation and for sharing knowledge across the team.
- Strategic Thinker: You can design a high‑level automation strategy while also diving deep into the code to solve complex technical challenges.
- Natural Mentor: You find satisfaction in teaching others and helping your colleagues grow their technical skills.
- Excellent Communicator: You can clearly articulate complex technical ideas to both technical and non‑technical stakeholders.