Senior Site Reliability Engineer
1 day ago
Manychat is a leading Chat Marketing platform. We help businesses engage with their customers on Instagram, Facebook Messenger, WhatsApp, and Telegram. Trusted by over 1 million brands in 170+ countries, we’re an official Meta Business Partner, backed by top investors, including Bessemer Venture Partners. With 200+ teammates across international offices in Barcelona, Austin, Amsterdam, São Paulo, and Yerevan, Manychat helps businesses across the globe improve their ROI and grow faster.

ABOUT THE ROLE
We’re looking for a Senior Site Reliability Engineer who thrives at the crossroads of classic Linux and AWS infrastructure and modern Site Reliability Engineering. This is a high-impact, hybrid role designed for someone who can manage cloud resources, harden Kubernetes clusters, and shape a more reliable and developer-friendly platform. We need you not just to maintain but to rethink and evolve our infrastructure, balancing hands-on operations with strategic improvements that future-proof our growing AI product landscape.

WHY THE ROLE IS SPECIAL
You won’t be a cog in a massive SRE org. You’ll be the bridge between Infrastructure and Engineering, shaping how we scale Kubernetes, how we approach platform reliability, and how developers ship fast without fear. You’ll get autonomy, ownership, and a smart, humble team excited to learn with you.

WHAT YOU’LL DO
Maintain and harden AWS infrastructure (EC2, ALB/NLB, WAF, IAM, CloudWatch)
Operate and evolve our EKS clusters powering Python-based AI services
Migrate existing services to Kubernetes using Terraform and Helm
Codify infrastructure with Terraform and manage host-level automation via Ansible
Build and improve CI/CD pipelines with GitHub Actions
Own observability efforts: Prometheus, Grafana, alerting, and on-call readiness
Support OS-level patching, certs, WAF rules, and general infra hygiene
Partner with engineers to guide best practices and drive platform reliability
Create clean, maintainable infrastructure documentation and playbooks
Occasionally support rare off-hours incidents

QUALIFICATIONS
5+ years of experience managing Linux in production (Ubuntu, Amazon Linux)
Strong experience with Kubernetes (ideally EKS), Helm, and Terraform
Comfort with running and debugging Python workloads in containers
Solid understanding of networking, IAM, and cloud security best practices
Hands-on Nginx experience (Ingress and reverse proxy setups)
Excellent communication skills; you can explain complex infra to devs clearly
Strong Ansible skills beyond the basics
PostgreSQL or Amazon RDS tuning and operations experience
Deep understanding of observability tools (Prometheus, Grafana, Loki, etc.)
Familiarity with PHP production environments
Experience with TDD, CI/CD best practices, and agile development
Any previous SRE-like exposure such as building resilience, automation, or incident tooling

WHAT WE OFFER
Comprehensive health insurance for both you and your family.
Professional development budget for conference tickets, online courses, and other relevant resources to help you grow.
-
Senior Site Reliability Engineer
4 hours ago
Plaza Catalunya, Spain | Trust In Soda | Full-time
Senior Site Reliability Engineer | Spain (Hybrid)
An opportunity to join a high-growth, late-stage technology company operating at significant scale. The business supports thousands of customers globally and is investing...
-
Senior Site Reliability Engineer
1 day ago
Catalunya, Spain | Okta | Full-time
Senior Site Reliability Engineer (Auth0)
Okta is The World’s Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our flexible and neutral products, Okta Platform and Auth0 Platform, provide secure access, authentication, and automation, placing identity at the core of business security and growth. At Okta, we...
-
Site Reliability Engineer
3 days ago
Plaza Catalunya, Spain | Bright Purple | Full-time
Site Reliability Engineer – Barcelona, Hybrid
Our client is an award-winning scale-up company seeking a talented Site Reliability Engineer (SRE) to help ensure its systems are fast, reliable, and ready to scale. Paying up to €90k and...
-
Site Reliability Engineer
6 days ago
Plaza Catalunya, Spain | Bright Purple | Full-time
Site Reliability Engineer – Barcelona, Hybrid
Our client is an award-winning scale-up company seeking a talented Site Reliability Engineer (SRE) to help ensure its systems are fast, reliable, and ready to scale. Paying up to €90k and...
-
Site Reliability Engineer
1 day ago
Plaza Catalunya, Spain | Bright Purple | Full-time
Site Reliability Engineer – Barcelona, Hybrid
Our client is an award-winning scale-up company seeking a talented Site Reliability Engineer (SRE) to help ensure its systems are fast, reliable, and ready to scale....
-
GCP Platform Site Reliability Engineer
4 hours ago
Plaza Catalunya, Spain | K2 Partnering Solutions | Full-time
For a prestigious manufacturing company, we are looking for a GCP Platform Site Reliability Engineer.
Location: Barcelona (Hybrid – 2 days/week in the office)
Contract Type: Permanent, full-time
We’re looking for a Senior GCP Platform Site Reliability...
-
Senior Site Reliability Engineer
4 hours ago
Plaza Catalunya, Spain | K2 Partnering Solutions | Full-time
We are seeking a highly skilled (Senior) Site Reliability Engineer to join our Platform Engineering team. In this role, you will be at the heart of our technical vision, designing and maintaining the scalable, reliable systems that power our global operations. This is a "code-first" SRE role where excellent programming skills are the foundation of everything...
-
Senior Site Reliability Engineer
1 day ago
Catalunya, Spain | Canonical | Full-time
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our customers include the world's...
-
Site Reliability Engineer
1 day ago
Plaza Catalunya, Spain | K2 Partnering Solutions | Full-time
Join a high-impact IT hub working on cloud-native platforms at massive scale. You’ll design, build, and operate reliable software on Kubernetes, contribute to architecture decisions, review code, and automate processes to improve system reliability....
-
Remote DevOps
1 day ago
Catalunya, Spain | Neara | Full-time
A leading AI start-up is seeking a Site Reliability Engineer to build, scale, and maintain cloud infrastructure. You will ensure systems are reliable and perform optimally as the company scales. Key responsibilities include managing AWS using Terraform, optimizing Kubernetes, and implementing monitoring tools like Grafana and Prometheus. The ideal candidate...