Principal Site Reliability Engineer

3 weeks ago


Madrid, Spain · Groupon · Full-time

Groupon is a marketplace where customers discover new experiences and services every day and local businesses thrive. To date, we have worked with over a million merchant partners worldwide, connecting over 16 million customers with deals across various categories. In a world often dominated by e-commerce giants, we stand out as one of the few platforms uniquely committed to helping local businesses succeed on a performance basis.

Groupon is on a radical journey to transform our business with a relentless pursuit of results. Even with thousands of employees spread across multiple continents, we still maintain a culture that inspires innovation, rewards risk-taking, and celebrates success. The impact here can be immediate due to our scale and the speed of our transformation. We're a "best of both worlds" kind of company: big enough to have the resources and scale, but small enough that a single person has a surprising amount of autonomy and can make a meaningful impact.

**Principal Site Reliability Engineer**

**Role Overview**:
Are you ready to take your expertise to the next level and make a meaningful impact on the reliability and scalability of mission-critical systems? As a Principal Site Reliability Engineer (SRE Level V/VI), you will play a central role in ensuring the performance, availability, and resilience of our platforms. In this position, you will go beyond maintaining systems by leading initiatives that redefine operational excellence. You will collaborate with diverse teams to implement cutting-edge technologies and best practices, foster a culture of reliability, and mentor others in their growth as engineers. This is an exceptional opportunity for someone passionate about solving complex challenges and shaping the future of platform reliability in a high-impact role.

**Key Responsibilities**:

- Architect and maintain fault-tolerant systems, ensuring uptime SLAs of 99.9% or higher.
- Drive automation in infrastructure management and deployment using Terraform, Ansible, Kubernetes, and similar tools.
- Create and optimize CI/CD pipelines to ensure reliable, secure, and efficient software delivery.
- Build and enhance comprehensive observability solutions, including monitoring, logging, and alerting systems using Prometheus, Grafana, and the ELK stack.
- Collaborate with stakeholders to define and achieve SLIs, SLOs, and error budgets aligned with business needs.
- Lead incident response during on-call rotations, ensuring rapid resolution and root cause analysis for critical issues.
- Design and execute performance testing, capacity planning, and scalability strategies for evolving workloads.
- Proactively identify and resolve bottlenecks, increasing system performance and developer efficiency.
- Mentor junior engineers, fostering a collaborative and growth-oriented team environment.
- Guide architectural decisions that drive innovation and enhance system reliability.

**Qualifications**:

- 10+ years in systems engineering, including at least 5 years in SRE or DevOps roles.
- Expertise in cloud platforms (GCP, AWS) and container orchestration (Kubernetes, Docker).
- Proficiency in programming and scripting languages like Python, Go, and Bash.
- Advanced knowledge of Infrastructure as Code (IaC) tools such as Terraform and Ansible.
- Deep understanding of networking, DNS, load balancing, and security principles.
- Proven track record of managing high-availability systems in demanding environments.
- Exceptional analytical and problem-solving skills.

**Preferred Qualifications**:

- Certifications in cloud or container technologies (e.g., AWS/GCP/Azure, Kubernetes CKA).
- Experience in industries like eCommerce, FinTech, or SaaS.
- Familiarity with Agile development processes and frameworks.

**What We Offer**:

- The opportunity to work with cutting-edge technologies in a transformative environment.
- A collaborative and innovative work culture that values your expertise and contributions.
- Professional growth and leadership development pathways tailored to your aspirations.
- A chance to leave a lasting impact by shaping the future of reliable and scalable systems.

**Join us to push the boundaries of platform reliability and drive meaningful change in a fast-evolving digital world.**


