Iraq

Job Details

Job Description

Roles & Responsibilities

  • System Monitoring
    Actively monitor applications, servers, and the network using dashboards and alerts; identify anomalies and take immediate action or escalate when thresholds are breached.
  • Incident Management
    Own incidents from detection to closure: validate alerts, classify severity, perform initial troubleshooting, apply fixes when possible, and escalate with proper details.
  • Troubleshooting & Support
    Perform first-level diagnosis using logs, metrics, and system checks; restart services, verify dependencies, and support L2 teams with accurate findings.
  • Communication & Reporting
    Provide clear incident updates to stakeholders, log all actions in the ticketing system, and ensure proper documentation and shift handover notes.
  • Service Availability
    Track system uptime and performance; proactively act on alerts to prevent outages and ensure SLA targets are met.
  • Change & Deployment Support
    Monitor systems during releases, validate service health post-deployment, and report or escalate any issues observed during change windows.
  • Shift Operations
    Work assigned shifts in a 24/7 rotation, respond to alerts within SLA, and ensure a smooth handover with complete status and pending actions.

Desired Candidate Profile

Education
Bachelor's degree in Computer Science, IT, Electronics, or a related field

  • Experience
    1 to 3 years in NOC / SOC / IT Operations with hands-on incident handling in production environments (24/7 support experience required)
  • Networking Knowledge
    Practical understanding of TCP/IP, DNS, HTTP/HTTPS, VPN (evidenced by troubleshooting or support roles)
  • Systems Administration
    Hands-on experience with Linux and/or Windows Server (service checks, logs, basic commands, system health)
  • Monitoring Tools Experience
    Proven use of monitoring/logging tools (e.g., Elastic Stack, Kibana, Zabbix, Grafana) for alerting and issue investigation
  • Log & Metrics Analysis
    Demonstrated ability to analyze logs and system metrics to detect issues and support root cause analysis
  • Incident Management
    Experience following structured incident management and escalation processes (ticketing systems, SLAs, severity handling)
