Principal Site Reliability Engineer

2 days ago


Madrid, Madrid, Spain Groupon Full-time

Groupon is a marketplace where customers discover new experiences and services every day and local businesses thrive.
To date, we have worked with over a million merchant partners worldwide, connecting over 16 million customers with deals across various categories.
In a world often dominated by e-commerce giants, we stand out as one of the few platforms uniquely committed to helping local businesses succeed on a performance basis. Groupon is on a radical journey to transform our business with a relentless pursuit of results.
Even with thousands of employees spread across multiple continents, we still maintain a culture that inspires innovation, rewards risk-taking, and celebrates success.

The impact here can be immediate due to our scale and the speed of our transformation.
We're a "best of both worlds" kind of company. We're big enough to have the resources and scale, but small enough that a single person has a surprising amount of autonomy and can make a meaningful impact.

Principal Site Reliability Engineer

Role Overview:
Are you ready to take your expertise to the next level and make a meaningful impact on the reliability and scalability of mission-critical systems?
As a Principal Site Reliability Engineer (SRE Level V/VI), you will play a central role in ensuring the performance, availability, and resilience of our platforms. In this position, you will go beyond maintaining systems by leading initiatives that redefine operational excellence. You will collaborate with diverse teams to implement cutting-edge technologies and best practices, foster a culture of reliability, and mentor others in their growth as engineers.

This is an exceptional opportunity for someone passionate about solving complex challenges and shaping the future of platform reliability in a high-impact role.

Key Responsibilities:
  1. Architect and maintain fault-tolerant systems, ensuring uptime SLAs of 99.9% or higher.
  2. Drive automation in infrastructure management and deployment using Terraform, Ansible, Kubernetes, and similar tools.
  3. Create and optimize CI/CD pipelines to ensure reliable, secure, and efficient software delivery.
  4. Build and enhance comprehensive observability solutions, including monitoring, logging, and alerting systems using Prometheus, Grafana, and the ELK stack.
  5. Collaborate with stakeholders to define and achieve SLIs, SLOs, and error budgets aligned with business needs.
  6. Lead incident response during on-call rotations, ensuring rapid resolution and root cause analysis for critical issues.
  7. Design and execute performance testing, capacity planning, and scalability strategies for evolving workloads.
  8. Proactively identify and resolve bottlenecks, increasing system performance and developer efficiency.
  9. Mentor junior engineers, fostering a collaborative and growth-oriented team environment.
  10. Guide architectural decisions that drive innovation and enhance system reliability.
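
To make the 99.9% uptime target and the error budgets mentioned above concrete, here is a minimal sketch of the arithmetic. The function names and the 30-day window are illustrative assumptions, not something the posting specifies.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Return the downtime (in minutes) an availability target allows.

    `slo` is a fraction such as 0.999; `window_days` is an assumed
    rolling window (30 days here, chosen only for illustration).
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_remaining_fraction(
    slo: float, downtime_minutes: float, window_days: int = 30
) -> float:
    """Return the share of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo, window_days)
    return 1.0 - downtime_minutes / budget


if __name__ == "__main__":
    # A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # After 21.6 minutes of downtime, about half of the budget is left.
    print(round(budget_remaining_fraction(0.999, 21.6), 2))
```

Under these assumptions, a team that has already spent most of its budget in a window would typically slow risky releases until the window rolls over.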

  • Madrid, Madrid, Spain Oxigent Technologies Full-time

    Would you like to keep growing as a Site Reliability Engineer (SRE) at a leading logistics and courier company? At Oxigent Technologies we are recruiting a Site Reliability Engineer (SRE) to join a remote cloud infrastructure project. What will your main duties be? • Infrastructure management on Google...

  • Site Reliability Engineer

    2 weeks ago


    Madrid, Madrid, Spain Ll Oefentherapie Full-time

    Are you a creative person who loves a challenge? Solve the complex puzzles you've been dreaming of as our Site Reliability Engineer. If you have a passion for innovation in tech, we want you on our team. Thrive in this crucial role. Oracle is a technology leader that's changing how the world does business. We're looking for an experienced and self-motivated...


  • Madrid, Madrid, Spain PIXIE Full-time

    PIXIE is an agency focused on SAP, IT, and Salesforce projects. Founded in 2007, it serves numerous national and international clients. We are currently looking for several Site Reliability Engineer profiles. What do we offer? Freelance project. Full-time. Long-term project. Remote. What...

  • Site Reliability Engineer

    2 weeks ago


    Madrid, Madrid, Spain We Bring Full-time

    Would you like to be part of an agile, collaborative team that works with the latest technologies and is eager to learn every day? At We Bring we are recruiting, for an internationally recognized technology company that is strongly committed to innovation, a Site Reliability Engineer / SRE in...


  • Madrid, Madrid, Spain buscojobs España Full-time

    Would you like to be part of an agile, collaborative team that works with the latest technologies and is eager to learn every day? At We Bring we are recruiting, for an internationally recognized technology company that is strongly committed to innovation, a Site Reliability Engineer / SRE in Madrid....

  • Site Reliability Engineer

    2 weeks ago


    Madrid, Madrid, Spain Antal International Full-time

    Job Description Company Overview:  Join a leading international fintech company at the forefront of innovation, revolutionizing financial services for millions worldwide. Our client is looking for a Senior Site Reliability Engineer (SRE) to play a pivotal role in ensuring the scalability, reliability, and sustainability of their services. Position...


  • Madrid, Madrid, Spain TN Spain Full-time

    Job Overview: We are seeking a highly skilled Site Reliability Engineer to join our team in Madrid, Spain. As a key member of our engineering department, you will be responsible for designing, building, and maintaining the robust infrastructure that powers our products and services. This is an excellent opportunity for a passionate and experienced engineer to...


  • Madrid, Madrid, Spain Logicalis Spain Full-time

    At Logicalis Spain we are currently looking for someone with experience in tools related to containers and orchestrators, both on-premise and in the cloud, to work as a Site Reliability Engineer (SRE) for one of our clients in the financial sector. Required experience: Kubernetes administration (on-premise and/or cloud)....


  • Madrid, Madrid, Spain Grupo Digital Full-time

    Description: We are looking to fill a vacancy at a major nationwide company. Site Reliability Engineer. Tasks: - Continuous analysis of activity (tickets) to improve the application's availability to the business - Detailed functional monitoring and complex rules to identify problems - Runbook-based automation, such as...


  • Madrid, Madrid, Spain Logicalis Spain Full-time

    At Logicalis Spain we are currently looking for someone with experience in tools related to containers and orchestrators, both on-premise and in the cloud, to work as a Site Reliability Engineer (SRE) for one of our clients in the financial sector. Required experience: Kubernetes administration (on-premise and/or...


  • Madrid, Madrid, Spain Logicalis Spain Full-time

    At Logicalis Spain we are currently looking for someone with experience in tools related to containers and orchestrators, both on-premise and in the cloud, to work as a Site Reliability Engineer (SRE) for one of our clients in the financial sector. Main duties: Development of a PaaS (Platform as a Service) to host...


  • Madrid, Madrid, Spain Norconsulting Full-time

    Pozuelo de Alarcón, MD, Spain Site Reliability Engineer SRE 2 Job Description: Norconsulting is looking, on behalf of one of its clients, leading companies in the Security sector, for a Systems Administrator to join its systems and development team at its Madrid offices. SYSTEMS ADMINISTRATOR Experience with monitoring tools and Service...


  • Madrid, Madrid, Spain buscojobs España Full-time

    We are seeking a skilled Senior Site Reliability Engineer to join our team at buscojobs España. Job Summary: As a Senior Site Reliability Engineer, you will be responsible for ensuring the high availability and reliability of our systems, collaborating with our development teams to ensure scalable and reliable application design, and participating in the...


  • Madrid, Madrid, Spain Logicalis Spain Full-time

    At Logicalis Spain we are currently looking for someone with experience in tools related to containers and orchestrators, both on-premise and in the cloud, to work as a Site Reliability Engineer (SRE) for one of our clients in the financial sector. Main duties: Development of a PaaS (Platform as a Service) for...


  • Madrid, Madrid, Spain Celonis Full-time

    About Celonis: Celonis is the global leader in Process Mining technology, revolutionizing how businesses operate by uncovering hidden inefficiencies and optimizing processes. As a Site Reliability Engineer at Celonis, you will play a vital role in designing, writing, and delivering software that improves the availability, performance, scalability, and...


  • Madrid, Madrid, Spain Oracle Full-time

    Site Reliability Engineer. What you will do: As a Site Reliability Engineer (SRE), you will solve exciting technical challenges by defining, designing, deploying, and troubleshooting key Oracle Cloud services, platforms, and infrastructure, always thinking about reliability, scalability, resilience, security, and performance. You will be part of a team of SREs...


  • Madrid, Madrid, Spain Joinrs US Full-time

    We are seeking a Site Reliability Engineering Lead to join our team at Joinrs US. This is an exceptional opportunity for a seasoned engineer who can lead the development and implementation of our platform infrastructure, ensuring high availability, scalability, and reliability. Lead the design and implementation of scalable and fault-tolerant systems,...


  • Madrid, Madrid, Spain Groupon Full-time

    About Groupon: Groupon is a leading e-commerce platform that connects customers with local businesses worldwide. With over 16 million customers and a presence in multiple continents, we offer a unique opportunity for engineers to make a meaningful impact. Principal Site Reliability Engineer Role Overview: We are seeking an experienced Principal Site Reliability...


  • Madrid, Madrid, Spain Oracle Full-time

    Site Reliability Engineer. What you will do: As a Site Reliability Engineer (SRE), you will solve exciting technical challenges by defining, designing, deploying, and troubleshooting key Oracle Cloud services, platforms, and infrastructure, always thinking about reliability, scalability, resilience, security, and performance. You will be part of a team of SREs...


  • Madrid, Madrid, Spain Oracle Full-time

    Key Responsibilities: As a Site Reliability Engineer, you will be responsible for: Product Ownership: You will own the end-to-end configuration, technical dependencies, and overall behavioral characteristics of the presentation tier products. Ownership Scope: You will ensure that products and systems are designed and delivered to be available and incident-free...