Senior Site Reliability Engineer

2 weeks ago

Madrid, Spain Colliers Full-time

Senior Site Reliability Engineer (Hybrid)
Colliers – Madrid, Community of Madrid, Spain

We are looking for an experienced and passionate Senior Site Reliability Engineer for our newly established Technology Hub in Madrid. This is a unique opportunity to become part of the founding team that helps shape the culture, practices, and technical direction. Working in a hybrid model (2 days on-site), you will enjoy significant freedom to innovate and influence the future of one of the world’s largest commercial real estate companies.

With the leadership of the Global Technology Hub coming from a background of technology startups, you will benefit from a fast-paced learning environment, high visibility of your contributions and opportunities to shape processes, culture and technology strategies. At the same time, you will take advantage of the stability, resources, and reach of a successful global company. Guided by our global digital strategy, the Madrid Hub collaborates closely with international teams to deliver world-class technology solutions that power the future of commercial real estate.

As Senior Site Reliability Engineer, you will focus on ensuring the reliability, performance and availability of our applications and platforms across GCP and Azure, while enabling development teams to ship faster with confidence. As a senior member of the DevOps team, you will help design and implement observability systems, reliability practices, and incident response processes in collaboration with Software Engineering and Infrastructure teams. Your mission is to bring an engineering-first mindset to operations, continuously applying automation, data, and feedback loops to improve the resilience of our systems and platforms.

You will contribute to global products and platforms that serve both internal and external customers in Commercial Real Estate across multiple regions. Working closely with international Product, Engineering, DevOps, Data, QA and Architecture teams, you will ensure delivery excellence, engineering quality and a great consumer experience. You are a hands-on problem-solver with strong design principles who thrives in a collaborative and agile environment. As a senior member of the DevOps function, you will set technical direction, drive best practices, and mentor junior engineers while building solutions with real business impact.

Reliability Engineering and Operational Excellence

Define and maintain Service Level Indicators, Service Level Objectives and Service Level Agreements across critical services in partnership with Product Owners, Engineering and Infrastructure teams.
Identify resilience gaps and lead initiatives such as redundancy improvements and scaling strategies to address them.
Automate incident response, recovery and scaling where possible.
Build tooling for self-healing infrastructure and applications, reducing manual intervention.
Contribute to runbooks, playbooks and knowledge sharing for operations best practices.
Build mechanisms to ensure error budgets are respected and used to drive prioritization decisions.

Observability, Monitoring and Incident Management

Design, implement and evolve monitoring, logging, and tracing systems (Azure Monitor, GCP Operations Suite, Prometheus, Grafana, Datadog).
Develop dashboards and alerting systems that provide actionable insights for engineers and stakeholders.
Design and implement a comprehensive ChatOps strategy and ensure close integration with Teams.
Collaborate with QA and Engineering teams to integrate performance and availability testing into CI/CD pipelines.
Lead incident response and postmortems, ensuring learnings are captured and acted upon.
Partner with Engineering Ops to ensure metrics and trends are tracked, reported, and tied into continuous improvement initiatives.
Drive a blameless culture of reliability, focused on learning, prevention and continuous improvement.

Leadership and Collaboration

Work closely with Software Engineering teams and Engineering Ops to embed reliability best practices in the development lifecycle.
Partner with CloudOps Engineers to ensure resilient cloud architectures.
Collaborate with Platform Engineers to optimize container and Kubernetes workloads for reliability.
Support Product Owners with visibility into availability, reliability trade-offs and error budgets.
Mentor engineers across the organization in incident best practices.
Provide input into the global operating model for cloud management, balancing standardization with regional needs.

Qualifications

Required Skills/Experience

5-8+ years of professional experience in SRE, reliability engineering or production operations.
Strong knowledge of GCP and Azure services, with a focus on reliability, high availability, and scalability.
Hands-on experience with monitoring, logging and observability stacks (Azure Monitor, GCP Operations Suite, Prometheus, Grafana, Datadog).
Deep experience with incident response, postmortems and reliability reporting.
Proficiency in automation and scripting (Bash, PowerShell, Python).
Familiarity with Kubernetes, containers and service mesh technologies.
Agile development experience (Scrum/Kanban), including story estimation, code reviews and pair programming.
Experience working in distributed or remote teams and using tools such as Jira, GitHub, GitLab, Miro and Azure DevOps.
Excellent interpersonal and communication skills; fluent in English and Spanish.
Understanding of secure development practices and compliance frameworks (e.g. ISO 27001, GDPR, SOC 2).

Preferred Skills/Experience

Experience with reliability-focused testing (load, performance, failover).
Contributions to internal development platforms or developer experience initiatives.
Ability to work in a fast-paced, growing tech environment and to foster change in larger organizations.

Site Reliability Engineer

2 weeks ago

Madrid, Spain Switch Tech Talent Full-time

Role: Site Reliability Engineer
Location: Barcelona/Hybrid (3 days a week in office)
Salary: up to €85,000 per annum
Key Skills: AWS, IaC, Docker, Scripting
As a Site Reliability Engineer you will be at the forefront of maintaining robust, scalable, and secure cloud solutions that power this cutting-edge e-commerce platform. Your expertise will ensure...
Senior site reliability engineer

5 days ago

Madrid, Spain Trust In SODA Full-time

Senior Site Reliability Engineer | Spain (Hybrid) An opportunity to join a high growth, late stage technology company operating at significant scale. The business supports thousands of customers globally and is investing heavily in reliability, platform maturity and engineering quality as it continues to grow. This is a true senior SRE role for someone who has...
Senior site reliability engineer

7 days ago

Madrid, Spain Trust In SODA Full-time

Senior Site Reliability Engineer | Spain (Hybrid) An opportunity to join a high growth, late stage technology company operating at significant scale. The business supports thousands of customers globally and is investing heavily in reliability, platform maturity and engineering quality as it continues to grow. This is a true senior SRE role for someone who...
Senior Site Reliability Engineer

1 week ago

Madrid, Spain Trust In SODA Full-time

Senior Site Reliability Engineer | Spain (Hybrid) An opportunity to join a high growth, late stage technology company operating at significant scale. The business supports thousands of customers globally and is investing heavily in reliability, platform maturity and engineering quality as it continues to grow. This is a true senior SRE role for someone who has...
Senior Site Reliability Engineer

6 days ago

Madrid, Spain Trust In SODA Full-time

Senior Site Reliability Engineer | Spain (Hybrid) An opportunity to join a high growth, late stage technology company operating at significant scale. The business supports thousands of customers globally and is investing heavily in reliability, platform maturity and engineering quality as it continues to grow. This is a true senior SRE role for someone who has...
Senior Site Reliability Engineer

5 days ago

Madrid, Spain Trust In Soda Full-time

Senior Site Reliability Engineer | Spain (Hybrid) A range of interpersonal skills and experience may be required for the following role. Please make sure to read the description below carefully. An opportunity to join a high growth, late stage technology company operating at significant scale. The business supports thousands of...
Senior Site Reliability Engineer

7 days ago

Madrid, Spain Canonical Full-time

Senior Site Reliability Engineer Canonical is a leading provider of open source software and operating systems. The company’s platform, Ubuntu, powers breakthrough initiatives in public cloud, AI, engineering, and IoT. Location: Globally remote role. Role summary We are looking for a Senior Site Reliability Engineer to lead next‑gen operations at scale,...
Senior Site Reliability Engineer

6 days ago

Madrid, Spain Canonical Full-time

Senior Site Reliability Engineer Canonical is a leading provider of open source software and operating systems. The company’s platform, Ubuntu, powers breakthrough initiatives in public cloud, AI, engineering, and IoT. Location: Globally remote role. Role summary We are looking for a Senior Site Reliability Engineer to lead next-gen operations at scale,...
Senior Site Reliability Engineer

6 days ago

Madrid, Spain Circle Full-time

Circle is a financial technology company at the epicenter of the emerging internet of money, where value can finally travel like other digital data: globally, nearly instantly and less expensively than legacy settlement systems. This ground-breaking new internet layer opens up...
Senior Site Reliability Engineer

2 days ago

Madrid, Spain Okta Full-time

As a Site Reliability Engineer you will champion all things pertaining to reliability at Okta for our Customer Identity Cloud (formerly Auth0). Working closely with the product engineers, quality engineers, platform engineers and architecture teams, your primary focus will be on ensuring production systems remain operational at all times, while continually...

Senior Site Reliability Engineer