28 May | EPAM Systems | Medellín
Apply on Kit Empleo: kitempleo.com.co/empleo/1as73y
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi‑national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting‑edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
We are looking for a Lead HPC Network Engineer to drive the strategy, architecture, and engineering excellence behind advanced AI, research, and Kubernetes‑based GPU infrastructure for a major global technology client.
The role focuses on defining the technical vision, leading architecture decisions, and setting engineering standards for high‑performance network fabrics supporting large‑scale LLM and distributed AI workloads, including InfiniBand/RDMA, high‑speed Ethernet, Kubernetes networking, host‑side GPU networking, SmartNIC/DPU technologies, and deep network observability. As a technical leader, you will mentor senior engineers, influence client roadmaps, and own end‑to‑end delivery of mission‑critical network platforms.
The ideal candidate combines deep expertise across InfiniBand NDR/HDR and next‑generation fabrics, RDMA/RoCE, NVIDIA/Mellanox networking, NCCL/MSCCL communication patterns, Linux host networking, PCIe/GPU/NIC topology, and Kubernetes networking for GPU clusters, with a proven track record of leading engineering teams and shaping large‑scale HPC/AI network platforms.
Responsibilities
- Own the architectural vision and long‑term roadmap for high‑performance InfiniBand/RDMA and Ethernet fabrics supporting large‑scale GPU clusters and distributed AI/LLM workloads
- Lead the design, evaluation, and selection of cluster network topologies, including Fat‑tree, Clos, Ra
📌 Lead HPC Network Engineer (Medellín)