Site Reliability Engineer (Medellín)

19 May | AgileEngine | Medellín

AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries.

We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.

WHY JOIN US

If you're looking for a place to grow, make an impact, and work with people who care, we'd love to meet you!

ABOUT THE ROLE

We are looking for an SRE Operations Engineer to maintain reliability across a cloud-based SaaS platform.

You’ll handle live incidents, improve observability, and reduce toil through automation using Kubernetes, Terraform, Grafana, and AWS.

The role is hands-on and execution-focused, with real ownership across CI/CD pipelines, GitOps workflows, and on-call rotations.

WHAT YOU WILL DO

- Monitor and support production and staging environments to ensure availability, performance, and stability;

- Respond to incidents, perform triage and root cause analysis, and contribute to remediation efforts;

- Participate in on-call rotations with defined SLAs;

- Handle operational requests from internal teams;

- Maintain and improve monitoring, alerting, dashboards, logs, and metrics;

- Support CI/CD pipelines, production releases, and GitOps workflows;

- Contribute to automation initiatives to reduce operational overhead;

- Maintain and improve Kubernetes-based infrastructure and containerized workloads;

- Support Infrastructure as Code practices and environment improvements.

MUST HAVES

- 2+ years of experience in Site Reliability Engineering, DevOps, or Production Operations;

- Experience with AWS supporting production environments;

- Experience supporting production SaaS applications;

- Strong understanding of CI/CD systems (GitHub Actions, Jenkins, CircleCI);

- Experience with GitOps and Git fundamentals;

- Experience using GitHub, Jira, and Confluence;

- Experience with Kubernetes (EKS, kOps, or similar);

- Experience with Docker and containerization;

- Experience with observability tools (Grafana, Prometheus, Loki, PagerDuty);

- Proficiency in scripting (Bash, Python, or Go);

- Experience with Infrastructure as Code (Terraform, Helm);

- Ability to work within structured operational processes and SLAs;

- Strong written and verbal English communication skills;

- Self-driven with a growth mindset.

NICE TO HAVES

- AWS certifications such as Solutions Architect, DevOps Engineer, or SysOps Administrator;

- Experience with multi-tenant SaaS environments;

- Experience working in globally distributed teams;

- Familiarity with ChatOps practices;

- Experience improving monitoring quality and reducing alert fatigue.

PERKS AND BENEFITS

- Professional growth: Mentorship, TechTalks, and personalized growth roadmaps.

- Competitive compensation: USD-based pay with education, fitness, and team activity budgets.

- Exciting projects: Modern solutions with Fortune 500 and top product companies.

- Flextime: Adaptable schedule with remote and office options.

Required Skill Profession

Computer Occupations
