Job Title: Site Reliability Engineer
Location: France (Remote)
Language: English
Experience: 8 Years
Duration: 6-month contract (extendable)
Note: Banking or BFSI domain experience mandatory
Job Description
Primary Responsibilities:
- Develop software to make infrastructure services self-managing and self-service
- Deliver continuous service improvement by developing Infrastructure as Code
- Eliminate manual, repetitive, tactical tasks that can be automated and add no lasting value
- Improve system performance, optimize resource use, distribute load, and reduce latency
- Identify SLOs (Service Level Objectives) to meet availability and latency goals
- Develop proactive monitoring solutions that alert on symptoms, not just outages
- Perform detailed root cause analyses on incidents and outages to prevent recurrence
- Partner with development teams to improve services through rigorous testing and release procedures
- Identify technical debt and collaborate with application teams on remediation plans
- Develop standard operational procedures and produce effective documentation
- Analyze workloads and devise suitable cloud migration strategies where appropriate
- Ensure project/workload delivery aligns with plans and budgets
- Liaise with infrastructure control and IT risk teams to satisfy audit requests
- Deputize for the team lead when required
- Identify cost-saving and optimization opportunities across the group
- Build strong relationships across the organization
Essential Skills and Knowledge:
- Exceptional knowledge of PowerShell, including automation, API integration, and modularization
- Strong skills in Microsoft Windows Server internals and related technologies
- Experience managing Active Directory, DHCP, DNS, LDAP, and Kerberos
- Advanced knowledge of Clustering, High-Availability, Replication, and Disaster Recovery techniques
- Ability to tune network, storage, server, and virtualization layers for performance and reliability
- Ability to interpret and implement CIS security hardening recommendations
- Experience in hardware performance monitoring and tuning low-latency systems
- Fluency in Backup and Recovery processes
- Excellent performance tuning skills and knowledge of system internals and performance analysis tools
- Awareness of security and auditing requirements in regulated environments
Highly Desirable Skills:
- Experience managing Ansible Tower / AWX playbooks
- Knowledge of networking protocols and concepts (TCP/IP, DNS, DHCP, VLANs)