Job Description:
We are looking for a highly motivated and detail-oriented NOC Engineer to join our technical operations team. The ideal candidate will have strong experience with Linux systems, cloud platforms, and network troubleshooting, and be comfortable managing incidents across a wide variety of tools and technologies in a high-availability environment.
Key Responsibilities:
- Monitor and ensure the availability, reliability, and performance of infrastructure and services
- Troubleshoot incidents across platforms and escalate to the appropriate teams or vendors as required
- Use tools such as CloudWatch, Datadog, Zabbix, Opsgenie, PRTG, and MRTG for alerting and incident response
- Perform basic Linux administration and use system commands (e.g., ping, traceroute, df, ls, grep, ps, top, cat, sort, awk) for diagnostics
- Coordinate with vendors (e.g., Juniper, Cisco, Extreme, Huawei) for issue resolution
- Manage cloud infrastructure components on AWS, including EC2, Aurora, ElastiCache, CloudWatch, and OpenSearch, and open and track AWS Support cases
- Operate within virtualized environments such as QEMU, KVM, Proxmox, VMware, and AWS EC2
- Manage support workflows and knowledge documentation through Jira, Confluence, Microsoft Teams, and Rundeck
- Open and manage remote hands tickets for hardware replacements, server upgrades, and physical interventions in data centers such as Equinix, CoreSite, and Cirion
- Monitor system health by tracking CPU, RAM, storage, and process utilization
- Support patch management and security monitoring using tools like CrowdStrike
Requirements:
- Proficiency in Linux system administration (RHEL, CentOS, Ubuntu) and basic Windows Server knowledge
- Strong familiarity with network troubleshooting tools and methodologies
- Experience with cloud platforms, particularly Amazon Web Services (AWS)
- Knowledge of monitoring and incident management tools such as CloudWatch, Grafana, Datadog, Zabbix, and Opsgenie
- Hands-on experience with virtualization and hosting platforms such as QEMU, KVM, Proxmox, and VMware
- Familiarity with web and app stacks like NGINX, Apache, PHP, Python, and Perl
- Exposure to relational databases: MySQL, PostgreSQL, Microsoft SQL Server
- Experience coordinating with network hardware vendors and managing escalations
- Experience with data center operations and remote hands procedures
- Exposure to security platforms such as CrowdStrike and patch management systems
Soft Skills:
- Excellent problem-solving and troubleshooting skills
- Strong communication skills (written and verbal)
- Ability to prioritize and manage multiple tasks in a fast-paced environment
- Experience working in a global or distributed technical support team
About Us:
We at Think Future Technologies (TFT) provide technology services to our customers, enabling them to achieve superior business outcomes. We act as a trusted partner, owning the entire technology stack. We collaborate with our customers to understand their business challenges, develop appropriate solutions, deploy the right technical resources, and ensure successful project implementation.