Principal Site Reliability Engineer

6 days ago


Spain Groupon Full-time

Groupon is a marketplace where customers discover new experiences and services every day and local businesses thrive. To date we have worked with over a million merchant partners worldwide, connecting over 16 million customers with deals across various categories. In a world often dominated by e-commerce giants, we stand out as one of the few platforms uniquely committed to helping local businesses succeed on a performance basis.

Groupon is on a radical journey to transform our business with a relentless pursuit of results. Even with thousands of employees spread across multiple continents, we still maintain a culture that inspires innovation, rewards risk-taking and celebrates success. The impact here can be immediate due to our scale and the speed of our transformation. We're a "best of both worlds" kind of company. We're big enough to have the resources and scale, but small enough that a single person has a surprising amount of autonomy and can make a meaningful impact.

**Principal Site Reliability Engineer**

**Role Overview**:

Are you ready to take your expertise to the next level and make a meaningful impact on the reliability and scalability of mission-critical systems? As a Principal Site Reliability Engineer (SRE Level V/VI), you will play a central role in ensuring the performance, availability, and resilience of our platforms. In this position, you will go beyond maintaining systems by leading initiatives that redefine operational excellence. You will collaborate with diverse teams to implement cutting-edge technologies and best practices, foster a culture of reliability, and mentor others in their growth as engineers. This is an exceptional opportunity for someone passionate about solving complex challenges and shaping the future of platform reliability in a high-impact role.

**Key Responsibilities**:

- Architect and maintain fault-tolerant systems, ensuring uptime SLAs of 99.9% or higher.
- Drive automation in infrastructure management and deployment using Terraform, Ansible, Kubernetes, and similar tools.
- Create and optimize CI/CD pipelines to ensure reliable, secure, and efficient software delivery.
- Build and enhance comprehensive observability solutions, including monitoring, logging, and alerting systems using Prometheus, Grafana, and the ELK stack.
- Collaborate with stakeholders to define and achieve SLIs, SLOs, and error budgets aligned with business needs.
- Lead incident response during on-call rotations, ensuring rapid resolution and root cause analysis for critical issues.
- Design and execute performance testing, capacity planning, and scalability strategies for evolving workloads.
- Proactively identify and resolve bottlenecks, increasing system performance and developer efficiency.
- Mentor junior engineers, fostering a collaborative and growth-oriented team environment.
- Guide architectural decisions that drive innovation and enhance system reliability.

**Qualifications**:

- 10+ years in systems engineering, with at least 5 years in SRE or DevOps roles.
- Expertise in cloud platforms (GCP, AWS) and container orchestration (Kubernetes, Docker).
- Proficiency in programming and scripting languages like Python, Go, and Bash.
- Advanced knowledge of Infrastructure as Code (IaC) tools such as Terraform and Ansible.
- Deep understanding of networking, DNS, load balancing, and security principles.
- Proven track record of managing high-availability systems in demanding environments.
- Exceptional analytical and problem-solving skills.

**Preferred Qualifications**:

- Certifications in cloud or container technologies (e.g., AWS/GCP/Azure, Kubernetes CKA).
- Experience in industries like eCommerce, FinTech, or SaaS.
- Familiarity with Agile development processes and frameworks.

**What We Offer**:

- The opportunity to work with cutting-edge technologies in a transformative environment.
- A collaborative and innovative work culture that values your expertise and contributions.
- Professional growth and leadership development pathways tailored to your aspirations.
- A chance to leave a lasting impact by shaping the future of reliable and scalable systems.

**Join us to push the boundaries of platform reliability and drive meaningful change in a fast-evolving digital world.**



  • Spain Datadope Full-time

    At DataDope we are transforming how organizations understand, monitor, and manage their systems through observability and intelligent data analysis. We work with leading clients in critical sectors, where we help define and deploy observability offices that deliver real business value. We are looking for...

  • Site Reliability Engineer

    2 weeks ago


    Spain Okta for Developers Full-time

    Get to know Okta Okta is The World’s Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our flexible and neutral products, Okta Platform and Auth0 Platform, provide secure access, authentication, and automation, placing identity at the core of business security and growth. Okta is The World’s Identity...

  • Reliability Engineer

    6 days ago


    Spain International Flavors & Fragrances Full-time

    It’s an exciting time to be part of the IFF family. We are a global leader in taste, scent and nutrition, offering our customers a broader range of natural solutions and accelerating our growth strategy. **About the role. What your work will taste like**: Located in our site in Benicarló (Spain), as a Reliability Engineer you’ll be our Subject Matter...


  • Spain The Contracts Management Group Full-time

    **Senior Site Reliability Engineer - Remote**: **Are you excited by the prospect of working with innovative security products?** **Do you enjoy creating innovative and strategic solutions to solve complex problems?** **Join Guardicore (now Akamai Enterprise Security Group)** Are you...

  • Site Reliability Engineer

    2 weeks ago


    Spain Nextiva Inc. Full-time

    Redefine the future of customer experiences. One conversation at a time. We’re changing the game with a first-of-its-kind, conversation-centric platform that unifies team collaboration and customer experience in one place. Powered by AI, built by amazing humans. Our culture is forward-thinking, customer-obsessed and built on an unwavering belief that...


  • Spain Akamai Full-time

    **Are you excited by the prospect of working with innovative security products?** **Do you enjoy creating innovative and strategic solutions to solve complex problems?** **Join Guardicore (now Akamai Enterprise Security Group)** Are you passionate about innovative security products and ready to solve complex challenges? Join us and help redefine how...


  • Spain ICEO - Venture Builder Full-time

    A leading tech company is seeking a Senior Site Reliability Engineer for a full-time, 100% remote position. You will shape the organization's reliability strategy, lead infrastructure development, and implement best practices. The ideal candidate has 5+ years of experience in a DevOps or SRE role, proficiency in programming languages like Python or Go, and...

  • Site Engineer

    2 weeks ago


    Spain Utopia Design Full-time

    About The Company: Utopia is a forward‑thinking global real estate development group dedicated to creating a distinctive portfolio of luxurious and high‑end hospitality assets. Founded in 2023, Utopia Design – the company’s in‑house architectural firm – drives bold vision and innovative thinking in every project. About The Role: Local Site Manager...



  • Spain Affirm Full-time

    Affirm is reinventing credit to make it more honest and friendly, giving consumers the flexibility to buy now and pay later without any hidden fees or compounding interest. Affirm is looking for a Senior Software Engineer for our Cloud Compute team to play a pivotal role in ensuring the robust and scalable foundation of our entire platform. As a fully remote...