Site Reliability Engineer (SRE) (Santander)

May 30 | Michael Page Colombia | Santander

Build reliable, scalable systems through automation and engineering.
Improve service stability using SLOs, monitoring and incident response.
About our client
A U.S.-based e-commerce organization specializing in personalized products, operating high-volume digital platforms supported by integral teams.
The company emphasizes technology-driven operations, strong customer experience, and scalable infrastructure to support rapid growth and large production capacity.
Description
Reliability & Performance
Define and manage SLIs, SLOs, and error budgets.
Improve system reliability, scalability, and resilience.
Lead reliability reviews and prevent incidents proactively.
Observability & Monitoring
Build and maintain monitoring, logging, and alerting.
Ensure actionable alerts and effective dashboards.
Implement distributed tracing.
Automation & Tooling
Automate operational tasks to reduce toil.
Build tools for reliability and automated remediation.
CI/CD & Deployments
Improve CI/CD pipelines for safe deployments.
Implement canary, blue/green, and rollback strategies.
Ensure production readiness.
Incident Management
Join on-call rotations.
Lead incident response and post-incident reviews.
Promote a blameless culture.
Cloud & Infrastructure
Manage AWS/Azure cloud environments.
Work with containers, serverless, and event-driven systems.
Ensure scalable, secure, and cost-efficient infrastructure.
Infrastructure as Code
Build and manage infrastructure using Terraform.
Maintain automated and consistent provisioning.
Security & Compliance
Embed security in CI/CD pipelines.
Support audits and compliance activities.
Candidate profile (m/f)
4+ years of experience in SRE, DevOps, or Platform Engineering.
Strong software engineering mindset and programming/scripting skills (Python, Go, Bash, etc.).
Hands-on experience with AWS or Azure cloud environments.
Solid understanding of distributed systems and cloud-native architectures.
Proficiency with Terraform and Infrastructure as Code practices.
Experience defining and managing SLIs, SLOs, and error budgets.
Strong background in observability: monitoring, logging, alerting, and tracing.
Experience improving CI/CD pipelines and deployment strategies.
Ability to lead incident response and conduct blameless postmortems.
Familiarity with automation, reliability tooling, and reducing operational toil.
Strong analytical and problem-solving skills.
Excellent communication skills and ability to partner with engineering teams.
Proactive, detail-oriented, and focused on continuous improvement.
Advanced English (B2-C1) required for daily communication with international teams.
What we offer
100% remote role from Colombia.
Indefinite-term contract through Michael Page Colombia.
Exposure to modern SRE practices, automation frameworks, resilience engineering, and cloud-native tooling.
Professional growth through complex technical challenges and continuous learning.
Chance to work with global teams and cutting-edge cloud technologies across AWS and Azure.