Senior Site Reliability Engineer
1 week ago
Red Hat is looking for a Platform Engineer to join its Platform Engineering team. In this role, you will help architect, implement, improve, and support the OpenShift-based platform that runs many of Red Hat's most important multi-tenant Software-as-a-Service (SaaS) and managed-service offerings. Using your expertise in SRE principles, you will help create an environment where reliability, scalability, and security come first and are not treated as an afterthought.
In this role, you will spend a portion of your time working across teams to define and iterate upon processes for onboarding new managed services at Red Hat and demonstrate good judgment in employing onboarding methods and techniques that can be repeated and iterated upon. You will also contribute to the codebase of command-and-control software that automates the building, deployment, monitoring, and alerting of Red Hat managed services. The remainder will be spent on various other tasks, such as diagnosing issues, planning, documenting, and mentoring.
What you will do
Design, write, and maintain software (primarily in Python and Golang) that automates the deployment, monitoring, and maintenance of Red Hat managed services.
Onboard new services onto our OpenShift-based platform, adhering to cloud-native design principles and best practices to ensure reliability, scalability, and security; contribute to documents, such as standard operating procedures (SOPs) and playbooks, that assist in issue resolution and new-service onboarding.
Proactively utilize AI-assisted development tools (e.g., GitHub Copilot, Cursor, Claude Code) for code generation, auto-completion, and intelligent suggestions to accelerate development cycles and enhance code quality.
Participate in an Agile Scrum team that scopes, prioritizes, and allocates work items.
Participate in an on-call rotation that is responsible for responding to service incidents.
Background writing object-oriented automation software in Python; experience with Golang is a plus
Background administering production cloud-native services, preferably containerized and deployed via a container-orchestration system like Kubernetes or OpenShift
Experience diagnosing service failures and carrying out incident response procedures
Familiarity with Linux operating system and its configuration
Ability to effectively work in a globally distributed team
Understanding of computer networking and protocols, including TCP/IP and DNS
Understanding of computer security and cryptography basics, including certificates, TLS, and credential-storage systems like Vault, is a plus
Familiarity with CI/CD pipeline concepts and systems, like Jenkins and Tekton/Argo, is a plus
Familiarity with observability tools like Prometheus and Grafana, and with defining metrics that measure service health and reliability, is a plus
About Red Hat
Red Hat is the world's leading provider of enterprise open source software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies. Spread across 40+ countries, our associates work flexibly across work environments, from in-office, to office-flex, to fully remote, depending on the requirements of their role. Red Hatters are encouraged to bring their best ideas, no matter their title or tenure. We're a leader in open source because of our open and inclusive environment. We hire creative, passionate people ready to contribute their ideas, help solve complex problems, and make an impact.
Inclusion at Red Hat
Red Hat's culture is built on the open source principles of transparency, collaboration, and inclusion, where the best ideas can come from anywhere and anyone. When this is realized, it empowers people from different backgrounds, perspectives, and experiences to come together to share ideas, challenge the status quo, and drive innovation. Our aspiration is that everyone experiences this culture with equal opportunity and access, and that all voices are not only heard but also celebrated. We hope you will join our celebration, and we welcome and encourage applicants from all the beautiful dimensions that compose our global village.
Equal Opportunity Policy (EEO)
Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.
Red Hat does not seek or accept unsolicited resumes or CVs from recruitment agencies. We are not responsible for, and will not pay, any fees, commissions, or any other payment related to unsolicited resumes or CVs except as required in a written contract between Red Hat and the recruitment agency or party requesting payment of a fee.
Red Hat supports individuals with disabilities and provides reasonable accommodations to job applicants. If you need assistance completing our online job application, email application- General inquiries, such as those regarding the status of a job application, will not receive a reply.
-
Site Reliability Engineer
2 days ago
Remote (within ± hours EST), Spain OpenFX Full-time
About Us: OpenFX is on a mission to move money as freely as data, unrestricted by time zones, banking hours, or legacy systems. We are building the infrastructure that powers the next generation of cross-border payment systems for institutions. Our early team comes with experience from J.P. Morgan, Goldman Sachs, FalconX, PayPal, Affirm, Kraken, and Nium, and...
-
Site Reliability Engineer, Technical Referent
2 days ago
Spain / Rome (Remote) dLocal Full-time
Why should you join dLocal? dLocal enables the biggest companies in the world to collect payments in 40 countries in emerging markets. Global brands rely on us to increase conversion rates and simplify payment expansion effortlessly. As both a payments processor and a merchant of record where we operate, we make it possible for our merchants to make inroads...
-
Senior Site Reliability Engineer
6 days ago
Spain K2 Partnering Solutions Full-time
We are seeking a highly skilled (Senior) Site Reliability Engineer to join our Platform Engineering team. In this role, you will be at the heart of our technical vision, designing and maintaining the scalable, reliable systems that power our global operations. This is a "code-first" SRE role where excellent programming skills are the foundation of everything...
-
Senior Site Reliability Engineer
2 days ago
Europe, Remote, Spain RetailNext Full-time
RetailNext is looking to expand our SRE team. We need people who have the skill set of good backend developers to focus on the operation and reliability of our SaaS retail analytics solution. We pull in and process data from thousands of brick-and-mortar stores to help our customers better understand and serve their customers. We actively develop in Go and...
-
Senior Site Reliability Engineer
1 week ago
P.º de la Castellana, , Tetuán, Madrid, Spain Colliers International EMEA Full-time
Company Description: Colliers is a leading diversified professional services and investment management company. With operations in 68 countries, our 22,000 enterprising people work collaboratively to provide expert advice to maximize the potential of property and real assets to accelerate the success of our clients, our investors and our people. We are at the...
-
Site Reliability Developer
2 days ago
Remote, Spain WatchGuard Technologies Full-time
WatchGuard embraces a Flexible Work Philosophy. Most of our employees can choose to work from the office, at home, or any combination of the two. We've built a global workforce of outstanding team members and a flexible culture built on trust, collaboration, and belonging. Who you are: You are a customer-focused, data-driven developer who has a passion for...
-
Site Reliability Engineer with
2 days ago
Barcelona, Catalonia, Spain Kodify Full-time
Job description: We're based on over 15 years of success, producing world-class video content and building, developing, and managing a number of high-traffic websites. Our award-winning content and websites are created exclusively by us and directly for the use of millions of users worldwide. At Kodify, we're not just pushing boundaries in online...
-
Site Manager
5 days ago
Spain Salas Montalvo Full-time
Salas Montalvo, a firm specialised in executive and strategic search, is looking to appoint a Site Operations Manager, based in Noblejas (Toledo), for an international company specialised in automated logistics and industrial warehousing solutions, currently in a phase of operational consolidation and international growth. The position has a senior,...
-
Advanced Lead Field Service Engineer
14 hours ago
Remote, Spain GE Aerospace Full-time
Job Description Summary: At GE Aerospace, our Field Service Engineers (FSEs) are the embodiment of our core behaviours. They are primarily based permanently at the customer's location, supporting the customer with on-site technical troubleshooting and fleet management problem solving.
Job Description: Key Roles & Responsibilities: Our FSEs embody our mission,...
-
Senior Backend Engineer
2 days ago
Remote, Spain BizAway Full-time
About BizAway: Here at BizAway, we Deliver the Future of Travel. We are a solid international company with strong ambitions and great expertise. With a focus on sustainability, on a daily basis we support companies, enabling them to improve their travel management through our constantly evolving services and solutions, always characterized by our tech...