Job Summary:

SRE Engineers are typically responsible for the availability and reliability of critical platform services and applications running in the AWS cloud, ensuring they meet their SLI, SLO, and SLA requirements. SRE Engineers also take part in on-call duties to resolve escalated support incidents. SRE Engineers collaborate with cross-functional teams to build and run sustainable product systems.

Job Responsibilities:

  • Join a PagerDuty rotation to respond to availability incidents and support service engineers with customer incidents.
  • Debug production issues across services.
  • Propose ideas and solutions within the infrastructure team to reduce workload through automation.
  • Measure and optimize system performance, create dashboards, carry out capacity planning, and innovate to continually improve.
  • Improve the reliability, quality, and time-to-market of our suite of software solutions.

Qualifications:

  • Bachelor’s degree in computer science/engineering or another highly technical field.
  • Ability to work under pressure.
  • 1-3 years of experience with AWS cloud services: EC2, EKS, RDS, AWS Batch, and runbook scripts.
  • 1-3 years of experience with DevOps tools, e.g. Jira, GitLab, Confluence, Terraform.
  • 1-3 years of experience with monitoring and dashboard tools, e.g. Prometheus, Grafana, ELK.
  • Good knowledge of Python or RPA is preferable.