Site Reliability Engineer (SRE)

bhft

Dubai

On-site

USD 60,000 - 120,000

Full time

15 days ago

Job summary

An innovative algorithmic trading firm is seeking a Site Reliability Engineer to ensure the reliable operation of its trading platform. This role involves enhancing production efficiency through metrics, managing incidents, and optimizing technical performance. The company values collaboration and transparency while providing flexible remote work options. Join a dynamic team of over 200 professionals, where your contributions will drive continuous improvement and innovation in trading strategies across diverse asset classes. If you are passionate about technology and trading, this opportunity is perfect for you.

Qualifications

  • Deep understanding of trading processes and market microstructure.
  • Experience with monitoring and incident management in high-load environments.

Responsibilities

  • Ensure compliance with regulations and internal standards for production stability.
  • Develop monitoring systems to detect anomalies and maintain strategy performance.

Skills

Trading Processes Understanding
Monitoring and Alerting Systems
Incident Management
Regulatory Compliance Knowledge
Python Scripting
Linux Systems Administration

Tools

Grafana
ClickHouse
Prometheus
Opsgenie
PagerDuty

Job description

BHFT is a proprietary algorithmic trading firm. Our team manages the full trading cycle, from software development to creating and coding strategies and algorithms.

Our trading operations cover key exchanges across a broad range of asset classes, including equities, equity derivatives, options, commodity futures, and rates futures. We employ diverse algorithmic trading strategies, utilizing both High-Frequency Trading (HFT) and Medium-Frequency Trading (MFT) approaches.

Looking ahead, we are expanding into new markets and products and continuously experimenting with new tools and technologies.

Our team consists of over 200 professionals, 70% of whom are technical specialists in development, infrastructure, testing, and analytics. The remaining team members support business functions such as Risk, Compliance, Legal, and Operations.

With a focus on innovation and performance, BHFT is actively expanding in traditional financial markets. We value a results-driven culture emphasizing collaboration, transparency, and continuous improvement, offering the flexibility of remote work within a globally distributed team.

Job Description

We are seeking a Site Reliability Engineer responsible for ensuring our platform's reliable operation, improving production process efficiency through metrics, and participating in testing new product versions.

Responsibilities:

  1. Production Stability Management: Ensure compliance with external regulations and internal standards, including risk, security, technology, and trader needs. Support and automate validation and monitoring processes to adhere to standards.
  2. Incident Monitoring & Management: Develop and enhance monitoring and alerting systems to detect anomalies in key metrics. Implement rapid response mechanisms to maintain strategy performance.
  3. Release & Change Management: Enforce standards for managing releases and changes, minimizing deployment risks. Conduct strict acceptance testing for all releases.
  4. Process Management: Develop and maintain SOPs, manage task queues, and organize shift schedules to ensure continuous support and high availability of trading strategies.
  5. Integration Projects: Lead initiatives to connect with new exchanges, brokers, and trading platforms, ensuring secure and smooth integrations.
  6. Technical Performance Optimization: Improve system availability, resilience (MTTR, MTBF), and latency, while optimizing data exchange and order routing to maximize profitability.

Qualifications

Requirements:

  • Deep understanding of trading processes and market microstructure, including colocation trading on native exchange protocols and algorithmic trading.
  • Experience with monitoring, alerting systems, and incident management in high-load environments.
  • Knowledge of regulatory compliance and security standards.
  • Proficiency with tools such as Grafana, ClickHouse, Prometheus, Opsgenie, Grafana OnCall, and PagerDuty.
  • Experience developing and managing SOPs and KPIs for service teams.
  • Experience managing integration projects with brokers and exchanges.

Technical Skills:

  • Linux systems administration and optimization.
  • Knowledge of FIX and native exchange protocols.
  • Colocation infrastructure setup and management.
  • Python scripting for automation and monitoring.
  • English proficiency at C1 level or higher.
